About inference on CKS - CoreWeave Docs

Inference on CoreWeave Kubernetes Service (CKS) gives you full control over your inference deployment stack. You deploy inference runtimes, configure networking, and manage scaling directly through Kubernetes resources running on CoreWeave GPU infrastructure. The following tutorials walk through deploying common inference runtimes on CKS:

Deploy Red Hat AI inference and llm-d on CoreWeave Kubernetes Service (CKS)
Deploy NVIDIA Dynamo on CKS
Deploy vLLM for inference
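
To make "directly through Kubernetes resources" concrete, the following is a minimal sketch of what a GPU-backed inference Deployment can look like, built as a plain Python dictionary and printed as JSON. It is an illustration only, not taken from the tutorials above: the function name `build_inference_deployment`, the deployment name, the image reference, and the port are hypothetical placeholders, and the sketch assumes the cluster exposes GPUs under the standard `nvidia.com/gpu` resource name. Real deployments will typically also need a Service, model storage, and runtime-specific arguments, which the tutorials cover.

```python
import json


def build_inference_deployment(name, image, gpus=1, replicas=1, port=8000):
    """Return a Kubernetes apps/v1 Deployment manifest as a dict.

    All argument values are supplied by the caller; nothing here is
    specific to a particular inference runtime.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            # Scaling is controlled by the replica count.
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                            # Assumption: GPUs are requested through the
                            # standard NVIDIA device-plugin resource name.
                            "resources": {"limits": {"nvidia.com/gpu": gpus}},
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical name and image; substitute your own runtime image.
    manifest = build_inference_deployment(
        name="example-inference",
        image="registry.example.com/inference-runtime:latest",
    )
    print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the output could be piped into `kubectl apply -f -` once the placeholder values are replaced, for example `python build_manifest.py | kubectl apply -f -` (the script filename is also hypothetical).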