This example demonstrates deploying an auto-scaling Inference Service from a pre-existing Docker image. This can be useful when deploying off-the-shelf models that aren't available as, e.g., TensorFlow SavedModels. One example of this is the IBM COCO Based Object Detector; an example InferenceService for it also exists in this repository. The rest of this example focuses on a public wrapped version of the BASNet object detection model. This example and the test client are based on work by Cyril Diagne.
To follow along, clone the manifests from GitHub.
After installing kubectl and adding your CoreWeave Cloud access credentials, the following steps will deploy the Inference Service.
Apply the resources. The same command both creates new resources and updates existing ones.
$ kubectl apply -f basnet-inferenceservice.yaml
inferenceservice.serving.kubeflow.org/basnet configured
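When updating a service that already exists, it can help to preview the change first. The following is a minimal sketch, assuming a kubectl version and cluster that support `kubectl diff`; `apply_if_changed` is a hypothetical helper name, not something shipped in this repository. It relies on the documented exit codes of `kubectl diff` (0 for no differences, 1 for differences, greater than 1 for an error).

```shell
# Hypothetical helper: show the pending diff and apply only when something changed.
apply_if_changed() {
  manifest="$1"
  rc=0
  # kubectl diff exits 0 (no differences), 1 (differences found), >1 (error).
  kubectl diff -f "$manifest" || rc=$?
  case "$rc" in
    0) echo "no changes in $manifest" ;;
    1) kubectl apply -f "$manifest" ;;
    *) echo "kubectl diff failed for $manifest" >&2; return 1 ;;
  esac
}

# Usage:
#   apply_if_changed basnet-inferenceservice.yaml
```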
List pods to see that the Predictor has launched successfully
$ kubectl get pods
NAME                                                        READY   STATUS    RESTARTS   AGE
basnet-predictor-default-sj9kr-deployment-76b67d669-4gjrp   2/2     Running   0          34s
If the predictor fails to init, look in the logs for clues
$ kubectl logs basnet-predictor-default-sj9kr-deployment-76b67d669-4gjrp kfserving-container
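Pod names change with every revision, so copying them by hand gets tedious. The sketch below is one possible way to automate the check: it reads the `kubectl get pods` listing, picks out pods whose READY count is incomplete or whose STATUS is not `Running`, and prints the `kfserving-container` logs for each. The helper names `not_ready_pods` and `show_failing_logs` are hypothetical, and the parsing assumes the default column layout shown above.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and
# print the names of pods that are not fully ready.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")   # READY column looks like "2/2"
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Hypothetical helper: print kfserving-container logs for each such pod.
show_failing_logs() {
  kubectl get pods | not_ready_pods | while read -r pod; do
    echo "--- logs for $pod"
    kubectl logs "$pod" kfserving-container
  done
}

# Usage:
#   show_failing_logs
```

Note that pods in a `Completed` state would also be listed, since the filter only treats `Running` with a full READY count as healthy.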
Once all the Pods are running, we can get the API endpoint for our model. Since this model doesn't adhere to the TensorFlow V1 HTTP API, we can't use the API endpoint provided by kubectl get inferenceservices. Instead, we have to query the predictor directly.
$ kubectl get ksvc
NAME                       URL                                                                      LATESTCREATED                    LATESTREADY                      READY   REASON
basnet-predictor-default   https://basnet-predictor-default.tenant-test.knative.chi.coreweave.com   basnet-predictor-default-sj9kr   basnet-predictor-default-sj9kr   True
The URL in the output is the public API URL for your newly deployed model.
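For scripting, the URL can also be read without parsing the table. This is a small sketch, assuming the Knative Service exposes its address in the `.status.url` field (as Knative Serving services normally do); `ksvc_url` is a hypothetical helper name.

```shell
# Hypothetical helper: print the public URL of a Knative service by name.
ksvc_url() {
  kubectl get ksvc "$1" -o jsonpath='{.status.url}'
}

# Usage:
#   export SERVICE_URL=$(ksvc_url basnet-predictor-default)
#   echo "$SERVICE_URL"
```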
Enter the client directory. You can run the test client either locally or in Docker. The output will be in images/output.png.
$ cd client/
$ export SERVICE_URL=https://basnet-predictor-default.tenant-test.knative.chi.coreweave.com
$ docker build -t test .; docker run --rm -it -v $(pwd)/images:/app/images test --basnet_service_host $SERVICE_URL
INFO:root: > sending to BASNet...
INFO:root:200
INFO:root: > saving results...
INFO:root: > opening mask...
INFO:root: > compositing final image...
INFO:root: > saving final image...
$ open images/output.png
Remove the inference service
$ kubectl delete inferenceservices basnet
inferenceservice.serving.kubeflow.org "basnet" deleted