CoreWeave Kubernetes Service (CKS) provides high-performance computing and a flexible Kubernetes experience by running Kubernetes on bare metal. This page introduces the architecture of CKS clusters, explains how the managed control and data planes work together, and outlines the advantages of running Kubernetes directly on dedicated hardware. Read this page to understand the foundational concepts before creating or operating CKS clusters.

Kubernetes on bare metal

This section describes what running Kubernetes on bare metal means and the benefits it provides. CKS runs Kubernetes directly on bare metal GPU Nodes, with no intermediary software such as a hypervisor or virtual machine. This direct hardware access is the defining difference of CKS: Nodes in your cluster access their GPUs without any virtualization overhead.
[Figure: Difference between bare metal and hypervisor.]
CoreWeave maintains, scales, and updates Kubernetes clusters that run directly on Nodes hosted in CoreWeave regions. By eliminating most of the virtualized intermediary stages between customer and cluster, CKS provides the following benefits:
  • Better performance: Virtualization layers aren’t competing for hardware resources.
  • Increased efficiency: Quicker data access and processing.
  • Reduced latency: Applications have direct access to the underlying hardware.
  • Better observability: More fine-grained access to kernel logs and Node behavior.

CKS clusters

CKS clusters grant you privileged access to Kubernetes resources. You run workloads on your own dedicated hardware. These servers operate independently of other operations and connect through isolated Virtual Private Clouds (VPCs). Host permissions to Nodes are also securely managed within VPCs, so you don’t need to manage the underlying infrastructure. CKS uses clusters to provide efficient, scalable, secure Kubernetes workload orchestration. Containers access server hardware directly, which suits workloads such as model training and VFX rendering. The architecture and components are designed for efficiency, low latency, and customization.

Cluster architecture

The following sections describe the two planes that make up a CKS cluster and the role each plays in cluster operation. CKS clusters use two planes:
  • The Managed Data Plane
  • The Managed Control Plane
Each plane has distinct components responsible for different functions within the cluster.
The Managed Control Plane is a Kubernetes Control Plane managed by CoreWeave, providing a managed Kubernetes environment that is secure, stable, and optimized for high-performance computing tasks. CoreWeave builds the Managed Control Plane to take advantage of high-performance computing (HPC) hardware, and designs it to support compute-intensive applications. The Managed Control Plane includes components that manage the overall state and lifecycle of CKS clusters. These components handle cluster orchestration, scheduling, and control functionality.
The Managed Data Plane includes components responsible for handling the data traffic and operations within the Kubernetes cluster. These components ensure that the workloads are managed efficiently and securely, and can be customized for flexibility.
Learn more about the specific components within the Managed Control Plane and Managed Data Plane, and how they support CKS cluster performance.

Region-level image proxy

CoreWeave operates a region-level registry proxy that accelerates container image pulls and reduces exposure to public registry rate limits for your cluster’s Nodes. The proxy has the following advantages:
  • Caches some image layers to improve startup time and lower bandwidth consumption.
  • Caches image metadata (manifests) to mitigate external rate limiting.

Avoid mutable tags

Because the proxy caches image manifests, mutable tags behave as if they were immutable. For example, using :latest doesn’t necessarily pull the latest image, because the proxy serves the cached manifest, which might be out of date. Avoid reusing tags like :latest across different image builds; doing so can result in Nodes pulling an older cached manifest until the cache expires.
For production workloads, use immutable tags or pin by digest so that the correct image loads and proxy metadata caching doesn’t produce “sticky” results.
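The "sticky" behavior described above can be illustrated with a toy model. The following Python sketch is not CoreWeave's implementation; the class name, the 300-second time-to-live, and the digests are hypothetical and chosen only to show how a cached tag-to-digest mapping can keep serving an older build after the tag has been reused.

```python
class ToyManifestCache:
    """Toy model of a tag-to-digest cache with a fixed time-to-live (TTL).

    Hypothetical illustration only; it does not describe the real proxy.
    """

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._entries = {}  # tag -> (digest, time the entry was cached)

    def resolve(self, tag, upstream, now):
        """Return the digest for `tag`, preferring an unexpired cache entry."""
        cached = self._entries.get(tag)
        if cached is not None and now - cached[1] < self.ttl_seconds:
            return cached[0]  # served from cache; upstream is not consulted
        digest = upstream[tag]  # cache miss or expired entry: ask upstream
        self._entries[tag] = (digest, now)
        return digest


# The "upstream registry" is modeled as a plain dict from tag to digest.
upstream = {"myapp:latest": "sha256:aaa"}
cache = ToyManifestCache(ttl_seconds=300)  # assumed TTL, for illustration

first = cache.resolve("myapp:latest", upstream, now=0)
upstream["myapp:latest"] = "sha256:bbb"  # a new build reuses the same tag
stale = cache.resolve("myapp:latest", upstream, now=60)   # still the old digest
fresh = cache.resolve("myapp:latest", upstream, now=301)  # entry has expired
```

In this model, the second lookup still returns the first build's digest even though the tag now points at a newer build upstream. A reference that names the digest directly has no tag-to-digest mapping to go stale.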

Digest pinning example

Pinning by digest ensures every Node pulls the exact same artifact, independent of tag changes.
# Deployment snippet (example)
spec:
  template:
    spec:
      containers:
        - name: app
          image: registry.example.com/myapp@sha256:3f5e8b5c7ab2740c0f7b1e3e8d3d7b5f0c1a2b3c4d5e6f7081920a1b2c3d4e5f
          imagePullPolicy: IfNotPresent
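
One way to catch tag-based references before they reach a cluster is a small pre-deployment check. The following Python sketch is an assumption-laden illustration, not a CoreWeave tool: the function names are hypothetical, and the pattern accepts only sha256 digests, which is the form used in the snippet above.

```python
import re

# Assumption: a pinned reference ends with "@sha256:" followed by
# 64 lowercase hexadecimal characters, as in the Deployment snippet.
_DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image):
    """Return True if the image reference is pinned by a sha256 digest."""
    return _DIGEST_RE.search(image) is not None


def unpinned_images(images):
    """Return the image references that rely on a tag or an implicit tag."""
    return [image for image in images if not is_digest_pinned(image)]


images = [
    "registry.example.com/myapp@sha256:"
    "3f5e8b5c7ab2740c0f7b1e3e8d3d7b5f0c1a2b3c4d5e6f7081920a1b2c3d4e5f",
    "registry.example.com/myapp:latest",
    "registry.example.com/myapp",  # no tag given, so it is not pinned
]
flagged = unpinned_images(images)
```

Here, `flagged` contains the two references that are not pinned by digest. A check like this could run over the image fields of rendered manifests in a CI step; how the image strings are collected is left out of the sketch.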
If your workloads require it, CoreWeave can disable the transparent proxy (but not the metadata caching feature) on a per-cluster basis. To disable the transparent proxy, contact Support with your cluster details.

Advantages

The following list summarizes the key benefits CKS clusters offer for compute-intensive workloads:
  • Strong security: CKS clusters exist in their own VPCs, created by on-site networking. Isolation is built in from start to finish. Each server is equipped with a Data Processing Unit (DPU) that aids in VPC orchestration, further supporting security and performance.
  • Suited to compute-intensive workloads: CoreWeave clusters provide Reserved Node Pools for running large jobs with static Node requirements, such as model training. Nodes are available when and where they’re needed.
  • Visibility into systems: CKS lets you run your own metrics exporters, gather detailed logs, and use low-level access for insight into system health.
  • Observable infrastructure: CKS oversees your compute infrastructure, taking care of the heavy lifting while still allowing you to manage and observe resources at the Control Plane level.
Last modified on June 10, 2026