`PythonPredictor` does in Cortex. Most `PythonPredictor`s can be converted to a custom predictor by copying and pasting the code and renaming some variables.
After installing `kubectl` and adding your CoreWeave Cloud access credentials, the following steps will deploy the Inference Service. Clone this repository and execute all commands from the example's folder; we'll be using all of the files in the `custom-predictor` directory. Build and push the Docker image. No modifications are needed to any of the files to follow along. The default Docker tag is `latest`. We strongly discourage using it, as containers are cached on the nodes and in other parts of the CoreWeave stack. Once you have pushed to a tag, do not push to that tag again. Below, we use simple versioning, with tag `1` for the first iteration of the image.
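As a sketch, building and pushing the first iteration of the image might look like this (the Docker Hub username and image name below are placeholders, not values from this example):

```shell
# Placeholder Docker Hub username and image name; substitute your own.
export DOCKER_USER=ex-user
docker build -t $DOCKER_USER/sentiment:1 .
docker push $DOCKER_USER/sentiment:1
```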
Create a `Secret` with the Docker Hub credentials. The secret will be named `docker-hub`; it will be used by the nodes to pull your private image. Refer to the Kubernetes Documentation for more details.
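One way to create the `docker-hub` Secret is with `kubectl create secret docker-registry`; the credentials below are placeholders:

```shell
# Placeholder credentials; substitute your own Docker Hub login.
kubectl create secret docker-registry docker-hub \
  --docker-username=ex-user \
  --docker-password='ex-password'
```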
Tell Kubernetes to use the newly created `Secret` by patching the `ServiceAccount` for your namespace to reference this `Secret`. Next, create a `PersistentVolumeClaim` to store the model. We'll also deploy a simple container that we can use to copy files to our newly created volume.
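The `ServiceAccount` patch can be sketched as follows, assuming you are patching the `default` ServiceAccount in your namespace:

```shell
# Reference the docker-hub Secret so nodes can pull the private image.
kubectl patch serviceaccount default \
  --patch '{"imagePullSecrets": [{"name": "docker-hub"}]}'
```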
Apply the `PersistentVolumeClaim` and the manifest for the `sleep` container. Download the pre-trained model locally, create a directory for it in the shared volume, and upload it there. The name of the sleep Pod is assigned to a variable using `kubectl`; you can also get the name with `kubectl get pods`.
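A sketch of capturing the Pod name and uploading the model follows; the label selector, mount path, and model filename are assumptions for illustration, not values from this example:

```shell
# Assumed label selector; `kubectl get pods` shows the actual Pod name.
export SLEEP_POD=$(kubectl get pod -l "app=sleep" \
  -o jsonpath='{.items[0].metadata.name}')
# Assumed mount path and model archive name.
kubectl exec $SLEEP_POD -- mkdir -p /models/sentiment
kubectl cp ./sentiment-model.zip $SLEEP_POD:/models/sentiment/
```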
Modify `sentiment-inferenceservice.yaml` to reference your Docker image.
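After updating the image reference, the manifest can be applied with:

```shell
kubectl apply -f sentiment-inferenceservice.yaml
```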
You can follow the logs of the predictor container with `kubectl logs sentiment-predictor-default-px8xk-deployment-85bb6787d7-h42xk kfserving-container` (the Pod name will differ in your deployment).
Once the Inference Service is ready, retrieve its endpoint URL with `kubectl get ksvc`.
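As an illustrative test, you could extract the URL and send a prediction request; the service name, model name, and request payload below are assumptions, not values from this example:

```shell
# Assumed ksvc name; check the output of `kubectl get ksvc`.
SERVICE_URL=$(kubectl get ksvc sentiment-predictor-default \
  -o jsonpath='{.status.url}')
# Assumed model name and payload shape.
curl -d '{"instances": ["CoreWeave is awesome!"]}' \
  "$SERVICE_URL/v1/models/sentiment:predict"
```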
Finally, delete the `InferenceService`. This will delete all the associated resources, except for your model storage and the `sleep` Deployment.
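Cleanup might look like this (the InferenceService name is an assumption; check it with `kubectl get inferenceservices`):

```shell
kubectl delete inferenceservice sentiment
```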