CoreWeave
PVC Model Serving

Introduction

This example is virtually the same as the S3-based one. Rather than repeating every step, this page is an abbreviated version, so please read that example first.

Getting Started

  1. Follow the guide to export the model through Jupyter first. The model can also be placed into a CoreWeave filesystem by other methods, such as an SSH or rsync container. Note that this repository does not include the PVC definition, which is defined in the Jupyter repository.
  2. Apply the resources. The same command both creates new resources and updates existing ones.
    $ kubectl apply -f gpt-pvc-inferenceservice.yaml
    inferenceservice.serving.kubeflow.org/gpt-pvc created
  3. List the Pods to confirm that the Transformer and Predictor have launched successfully.
    $ kubectl get pods
    NAME                                                            READY   STATUS    RESTARTS   AGE
    gpt-pvc-predictor-default-ljxhm-deployment-86596d6846-kwnkd     2/2     Running   0          34s
    gpt-pvc-transformer-default-f5lf9-deployment-86bc7b4fd8-g8m69   2/2     Running   0          34s
    If the predictor fails to start, the most likely cause is that the model was saved at the wrong path. In that case, the output of kubectl logs gpt-pvc-predictor-default-ljxhm-deployment-86596d6846-kwnkd kfserving-container will contain an error.
  4. Once all the Pods are running, get the API endpoint for the model.
    $ kubectl get ksvc
    NAME                          URL                                                                        LATESTCREATED                       LATESTREADY                         READY   REASON
    gpt-pvc-predictor-default     http://gpt-pvc-predictor-default.tenant-test.knative.chi.coreweave.com     gpt-pvc-predictor-default-ljxhm     gpt-pvc-predictor-default-ljxhm     True
    gpt-pvc-transformer-default   http://gpt-pvc-transformer-default.tenant-test.knative.chi.coreweave.com   gpt-pvc-transformer-default-f5lf9   gpt-pvc-transformer-default-f5lf9   True
    We want to use the transformer endpoint so that requests can be made in cleartext. Ensure that both services are listed as Ready.
  5. The model is now accessible over the Internet. Use curl to test it.
    $ curl -d '{"instances": ["That was easy"]}' http://gpt-pvc-transformer-default.tenant-test.knative.chi.coreweave.com/v1/models/gpt-pvc:predict
    {"predictions": ["That was easy to say, what else would you do, what would you do, would you say to your daughter and say to her, 'Where is the work you're doing, where is the work you're working on, and how are you doing it?' and she was like, 'I'm not going to be here, I can't do it!' and she became, you know, frustrated. And I think there's a different type of anxiety. There's this self-pity that comes in, and that's also why they call their child a 'brilliant' child.\n\nShe was always saying that when she was a little girl, there was something really important to do. But she really doesn't go. She knows that whatever she does, when she's ready, she's going to go into any school or program and that she's going to do. And she really needs to do that, because it's just so much more exciting to her now.\n\nIt made her less able to put her mind at the 'solution' to her child's difficulties \u2013 even as she had more opportunities than I or anyone could ever do, and at a time when we were trying a lot of things to find the balance in the world. And I"]}
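Steps 4 and 5 can be chained into a small script. The following is a minimal sketch, not part of the original example: the function names ready_ksvc_url, predict_url and predict are hypothetical, and the parsing assumes the column order shown in the kubectl get ksvc output above (NAME, URL, LATESTCREATED, LATESTREADY, READY). The prompt is pasted into the JSON body verbatim, so it is assumed to contain no double quotes or backslashes.

```shell
# Sketch only: helper functions for steps 4 and 5 (names are hypothetical).

# Read `kubectl get ksvc` output on stdin and print the URL of service "$1",
# but only when its READY column (5th field) is "True".
ready_ksvc_url() {
  awk -v name="$1" '$1 == name && $5 == "True" { print $2 }'
}

# Build the predict URL used in step 5 from a base URL and a model name.
predict_url() {
  printf '%s/v1/models/%s:predict\n' "${1%/}" "$2"
}

# Send one prompt to the transformer endpoint of the gpt-pvc model.
# Assumes the prompt contains no double quotes or backslashes.
predict() {
  url=$(kubectl get ksvc | ready_ksvc_url gpt-pvc-transformer-default)
  if [ -z "$url" ]; then
    echo "gpt-pvc-transformer-default is not Ready" >&2
    return 1
  fi
  curl -sS -d "{\"instances\": [\"$1\"]}" "$(predict_url "$url" gpt-pvc)"
}
```

After sourcing these functions in a shell that has kubectl configured for the namespace, running predict "That was easy" would send the same request as the curl command in step 5, and it fails early with a message if the transformer service is not yet Ready.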
Last modified 1yr ago