Dynamo provides deployment and orchestration for inference workloads. This tutorial shows you how to deploy Dynamo on CoreWeave Kubernetes Service (CKS) for cluster-wide inference, install its custom resources and platform components, and run an inference model. You will:
  1. Install the Dynamo custom resource definitions (CRDs).
  2. Install the Dynamo platform with the Kubernetes AI (Kai) scheduler and Grove enabled.
  3. Deploy an inference model using an example from the Dynamo repository and a Hugging Face token.
  4. List and delete deployments when you are done.

What you'll need

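Before you start, you must have:
  • A CKS cluster that you can reach with kubectl
  • Helm installed on your workstation
  • A Hugging Face token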

What you'll use

You’ll use these tools and components:
  • Dynamo: Cluster-wide inference orchestration from the Dynamo repository
  • Helm: To install Dynamo CRDs and platform charts from NVIDIA NGC
  • Kubernetes AI (Kai) scheduler and Grove: Enabled in the platform install for scheduling and routing

1. Set environment

Set the namespace where Dynamo will be installed and the Dynamo release version.
export NAMESPACE=dynamo-system
export RELEASE_VERSION=0.9.0
You will use these variables in the following steps. Verify with echo $NAMESPACE and echo $RELEASE_VERSION if needed.
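
Because every later command expands these variables, an unset value produces malformed chart URLs or installs into the wrong namespace. The following is a minimal sketch of a guard you could run before continuing; the function name require_env is hypothetical and not part of Dynamo or CKS.

```shell
# Return non-zero if any of the named variables is unset or empty.
# Usage: require_env NAME [NAME...]
require_env() {
  for name in "$@"; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "error: $name is not set" >&2
      return 1
    fi
  done
}

# Example call (uncomment to use):
# require_env NAMESPACE RELEASE_VERSION || exit 1
```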

2. Install CRDs

Skip this step if Dynamo CRDs are already installed on the cluster.
  1. Fetch and install the Dynamo CRDs chart:
    helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds-${RELEASE_VERSION}.tgz
    
    helm install dynamo-crds dynamo-crds-${RELEASE_VERSION}.tgz --namespace default
    
  2. To confirm that the CRDs are registered after the install completes, list them:
    kubectl get crd | grep dynamo
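
The "skip if already installed" condition for this step can be scripted. The following is a minimal sketch, assuming RELEASE_VERSION is exported as in step 1. The function name install_dynamo_crds_if_missing is hypothetical, and the check reuses the same grep dynamo match as the verification command, so it also matches any unrelated CRD whose name contains "dynamo".

```shell
# Install the Dynamo CRDs chart only when no CRD name contains "dynamo".
# Assumes RELEASE_VERSION is set as in step 1.
install_dynamo_crds_if_missing() {
  if kubectl get crd -o name | grep -q dynamo; then
    echo "Dynamo CRDs already installed; skipping."
    return 0
  fi
  # Same fetch and install commands as in the step above.
  helm fetch "https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds-${RELEASE_VERSION}.tgz" &&
    helm install dynamo-crds "dynamo-crds-${RELEASE_VERSION}.tgz" --namespace default
}
```

Call install_dynamo_crds_if_missing in place of the two helm commands if you want the step to be safe to re-run.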
    

3. Install platform

Install the Dynamo platform into the chosen namespace with the Kubernetes AI (Kai) scheduler and Grove enabled.
  1. Fetch the Dynamo platform chart:
    helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-${RELEASE_VERSION}.tgz
    
  2. Install the Dynamo platform:
    helm install dynamo-platform dynamo-platform-${RELEASE_VERSION}.tgz --namespace ${NAMESPACE} --create-namespace \
    --set "grove.enabled=true" \
    --set "kai-scheduler.enabled=true" \
    --set "dynamo-operator.controllerManager.manager.image.tag=${RELEASE_VERSION}" \
    --set "dynamo-operator.controllerManager.kubeRbacProxy.image.repository=registry.k8s.io/kubebuilder/kube-rbac-proxy" \
    --set "dynamo-operator.controllerManager.kubeRbacProxy.image.tag=v0.15.0"
    
    You should see output similar to the following:
     I0312 13:13:13.800378    5735 warnings.go:110] "Warning: tls: failed to find any PEM data in certificate input"
     NAME: dynamo-platform
     LAST DEPLOYED: Thu Mar 12 13:12:16 2026
     NAMESPACE: dynamo-system
     STATUS: deployed
     REVISION: 1
     TEST SUITE: None
     NOTES:
     SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
     SPDX-License-Identifier: Apache-2.0
    
     Licensed under the Apache License, Version 2.0 (the "License");
     you may not use this file except in compliance with the License.
     You may obtain a copy of the License at
    
     http://www.apache.org/licenses/LICENSE-2.0
    
     Unless required by applicable law or agreed to in writing, software
     distributed under the License is distributed on an "AS IS" BASIS,
     WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     See the License for the specific language governing permissions and
     limitations under the License.
    
  3. When the install completes, Dynamo platform components run in ${NAMESPACE}. Verify with:
    kubectl get pods -n ${NAMESPACE}
    
    Wait until every pod reports a Running or Completed status before you deploy a model.
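
To check pod status without reading the table by eye, you can filter the kubectl output. The following is a minimal sketch, assuming the default column order of kubectl get pods (NAME, READY, STATUS, ...). The function name pods_not_ready is hypothetical.

```shell
# Print the names of pods that are not yet usable.
# Reads the output of `kubectl get pods --no-headers` on stdin.
# A pod counts as usable when STATUS is Completed, or when STATUS is Running
# and the READY column shows all containers ready (for example 2/2).
pods_not_ready() {
  awk '{
    split($2, r, "/")
    ok = ($3 == "Completed") || ($3 == "Running" && r[1] == r[2])
    if (!ok) print $1
  }'
}
```

For example, kubectl get pods -n "${NAMESPACE}" --no-headers | pods_not_ready prints nothing once every pod is usable.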

4. Deploy an inference model

This tutorial uses the Qwen3-0.6B model, which you deploy using the agg.yaml from the Dynamo repo.
  1. Download agg.yaml and replace my-tag with 0.9.1 in the image values on lines 16 and 30, so that each reads:
    image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.1
    
  2. Create a Kubernetes secret with your Hugging Face token:
    kubectl create secret generic hf-token-secret --from-literal=HF_TOKEN="INSERT-TOKEN-HERE" -n ${NAMESPACE}
    
    Replace INSERT-TOKEN-HERE with your Hugging Face token.
  3. Apply the agg.yaml to your cluster:
    kubectl apply -f agg.yaml -n ${NAMESPACE}
    

    More example models

    More example models and manifests are available in the Dynamo repository. Whichever example you use, create a Kubernetes secret for your Hugging Face token first (same as step 2 above). If you deploy an example other than agg.yaml, you might also need to configure storage:
    • In your chosen example, edit model-cache/cache.yaml and replace every instance of your-storage-class with shared-vast so that the model cache uses CoreWeave storage.
    • Apply the model download manifest to download the model data:
      kubectl apply -f model-cache/model-download.yaml
      
    When the download job completes, the model is cached and ready for inference. Follow any additional instructions in the example for running or exposing the model.
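
Both manual edits in this section (my-tag in agg.yaml and your-storage-class in model-cache/cache.yaml) are plain placeholder substitutions, so they can be scripted. The following is a minimal sketch; the function name replace_placeholder is hypothetical, and it assumes the placeholder and its replacement contain no "/" or regular-expression metacharacters.

```shell
# Replace every occurrence of a placeholder string in a file, in place.
# Usage: replace_placeholder <file> <placeholder> <replacement>
# Returns non-zero if sed fails or the placeholder is still present afterwards.
replace_placeholder() {
  file=$1 old=$2 new=$3
  tmp=$(mktemp) || return 1
  if sed "s/${old}/${new}/g" "$file" > "$tmp"; then
    # Write back through the original path so file permissions are kept.
    cat "$tmp" > "$file"
  else
    rm -f "$tmp"
    return 1
  fi
  rm -f "$tmp"
  ! grep -qF -- "$old" "$file"
}
```

For example, replace_placeholder agg.yaml my-tag 0.9.1 covers step 1, and replace_placeholder model-cache/cache.yaml your-storage-class shared-vast covers the storage edit.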

5. Run inference

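In another terminal, forward the vllm-agg-frontend service to local port 8000. Environment variables do not carry over to a new terminal, so export NAMESPACE there again before you run the command: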
kubectl port-forward service/vllm-agg-frontend 8000:8000 -n $NAMESPACE
To test the deployment, run:
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-0.6B",
    "messages": [
      {
        "role": "user",
        "content": "Hello, how are you?"
      }
    ],
    "stream": false,
    "max_tokens": 100
  }'
The response includes a JSON body with a choices array containing the model reply.
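
If the request fails right after you start the port-forward, the frontend or model may still be starting. The following is a minimal sketch of a retry helper, assuming that a short wait is enough; the function name retry is hypothetical, and the attempt count and delay are arbitrary.

```shell
# Run a command repeatedly until it succeeds or the attempt limit is reached.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  attempts=$1 delay=$2
  shift 2
  n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, save the JSON body above to a file (request.json is a hypothetical name) and run retry 10 3 curl -sf -X POST http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d @request.json. The -f flag makes curl return a non-zero status on HTTP errors so that the helper retries.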

6. Clean up

To list Dynamo graph deployments in your namespace, run:
kubectl get dynamographdeployment -n $NAMESPACE
To delete a deployment, use the resource kind and the deployment name. For example, to delete a deployment named vllm-agg-router:
kubectl delete dynamographdeployment vllm-agg-router -n $NAMESPACE
The deployment is removed when the command succeeds.
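
If you created several deployments while following the examples, you can combine the list and delete commands. The following is a minimal sketch; the function name delete_graph_deployments is hypothetical, and it deletes every Dynamo graph deployment in the namespace, so use it only in a namespace you own.

```shell
# Delete every DynamoGraphDeployment in the given namespace.
# Usage: delete_graph_deployments <namespace>
delete_graph_deployments() {
  ns=$1
  # -o name prints one kind/name reference per line, which kubectl delete accepts.
  kubectl get dynamographdeployment -n "$ns" -o name |
    while read -r resource; do
      kubectl delete "$resource" -n "$ns" </dev/null
    done
}
```

Call it as delete_graph_deployments "$NAMESPACE".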
Last modified on April 20, 2026