Select GPU driver versions in CKS Node Pools

This page explains how to select and update NVIDIA GPU driver versions on CoreWeave Kubernetes Service (CKS) Node Pools, and how to target specific driver versions from your workloads. Use this guidance when you need a workload to run against a known driver version or when you want to control which driver Nodes receive during configuration updates.

Limitations

GPU driver management in CKS has the following limitations:

No Cloud Console support: Configuration must be done through Kubernetes manifests.
Limited version selection: You can select only a major driver version. When a minor version update becomes available, CKS generates a new pending configuration on the Node Pool, which appears in the Node Pool’s status.pendingNodeConfiguration field. See Manage Node Pool configuration for more details.
Release channels are not supported: The latest and stable release channels are not supported in Node Pools.
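
To check whether CKS has staged a pending configuration, you can read the status.pendingNodeConfiguration field directly. The following is a minimal sketch: the helper name pending_node_config is hypothetical, and the shape of the returned value depends on your cluster.

```shell
# Hypothetical helper: print the pending configuration staged on a Node Pool.
# Prints nothing if the field is empty or unset.
# $1 = Node Pool name
pending_node_config() {
  kubectl get nodepool "$1" -o jsonpath='{.status.pendingNodeConfiguration}'
}

# Usage (replace [NODE-POOL-NAME] with the name of your Node Pool):
#   pending_node_config [NODE-POOL-NAME]
```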

Contact CoreWeave support if you have questions about GPU driver management.

Create a new Node Pool with a specific driver version

Driver versions are configured in the Node Pool manifest. To select a driver version, add the gpu section to your Node Pool manifest’s spec section, specifying the desired major version without dots. For example, for an H100 Node Pool, specify the driver version as 570:

apiVersion: compute.coreweave.com/v1alpha1
kind: NodePool
metadata:
  name: example-nodepool
spec:
  computeClass: default
  instanceType: gd-8xh100ib-i128
  targetNodes: 1
  gpu:
    version: "570"  # Specify driver version by major version

If no driver is specified, the Node Pool automatically uses the latest available driver.
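
After saving the manifest, you can apply it and read back the configured major version. This is a sketch only: the helper name apply_nodepool and the manifest file name in the usage line are hypothetical, and an empty result means that no driver version is set in the spec.

```shell
# Hypothetical helper: apply a Node Pool manifest, then print the driver
# major version recorded in its spec.
# $1 = path to the manifest file, $2 = Node Pool name
apply_nodepool() {
  kubectl apply -f "$1" &&
    kubectl get nodepool "$2" -o jsonpath='{.spec.gpu.version}{"\n"}'
}

# Usage:
#   apply_nodepool nodepool.yaml example-nodepool
```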

Update the driver version on an existing Node Pool

If an existing Node Pool already specifies a driver version, you can move it to a new major version by changing the gpu.version value in the Node Pool manifest and applying the manifest again. The following example starts from a Node Pool on driver version 570:

# Original Node Pool
apiVersion: compute.coreweave.com/v1alpha1
kind: NodePool
metadata:
  name: test-nodepool
spec:
  computeClass: default
  instanceType: gd-8xh100-i128
  nodeConfigurationUpdateStrategy:
    type: OnSpecUpdate
  targetNodes: 1
  gpu:
    version: "570"

Update the Node Pool manifest to:

# Updated Node Pool with new driver version
apiVersion: compute.coreweave.com/v1alpha1
kind: NodePool
metadata:
  name: test-nodepool
spec:
  computeClass: default
  instanceType: gd-8xh100-i128
  nodeConfigurationUpdateStrategy:
    type: OnSpecUpdate
  targetNodes: 1
  gpu:
    version: "580"
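
Instead of editing and re-applying the full manifest, you could change only the spec.gpu.version field with a merge patch. This sketch assumes the NodePool resource accepts kubectl patch --type merge, as Kubernetes custom resources generally do. The helper name set_driver_major is hypothetical.

```shell
# Hypothetical helper: set spec.gpu.version on an existing Node Pool.
# $1 = Node Pool name, $2 = driver major version (for example, 580)
set_driver_major() {
  kubectl patch nodepool "$1" --type merge \
    -p "{\"spec\":{\"gpu\":{\"version\":\"$2\"}}}"
}

# Usage:
#   set_driver_major test-nodepool 580
```

The version is sent as a JSON string, matching the quoted value in the manifests above.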

Apply GPU driver updates

With the default Node configuration update strategy, OnSpecUpdate, updating the driver version automatically stages the new configuration on the Node Pool. Existing Nodes keep their current driver until you queue a reconfigure reboot for them, which applies the change. For more information about configuration management, see Manage Node Pool configuration.

Target driver versions using Node labels and selectors

Once your Node Pool is configured with a driver version, you can identify and target Nodes by their driver version from within Kubernetes. Driver version information is exposed on Nodes through Kubernetes labels. You can use these labels to get information on current driver versions and to target specific driver versions in your workloads.

# Check the current driver version label on nodes
kubectl get nodes --show-labels | grep driver-version

Node labels are in the format gpu.coreweave.cloud/driver-version=[DRIVER-VERSION], where [DRIVER-VERSION] is the full driver version string. For example, a Node with the label gpu.coreweave.cloud/driver-version=570.172.08-0ubuntu1 is running driver version 570.172.08-0ubuntu1.
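
The label holds the full version string, while the Node Pool manifest takes only the major version, so it can help to map one to the other. The sketch below shows the label as its own column using kubectl's -L (--label-columns) flag, and extracts the major version with shell parameter expansion. Both helper names are hypothetical.

```shell
# List Nodes with the driver version label shown as an extra column.
list_driver_versions() {
  kubectl get nodes -L gpu.coreweave.cloud/driver-version
}

# Print the major version (the part before the first dot) of a full
# driver version string.
# $1 = full driver version string
driver_major() {
  printf '%s\n' "${1%%.*}"
}

# Usage:
#   list_driver_versions
#   driver_major 570.172.08-0ubuntu1   # prints 570
```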

The gpu.coreweave.cloud/driver-version label is always applied to Nodes, even if no driver version is specified in the Node Pool manifest.

Target specific driver versions in workloads

The gpu.coreweave.cloud/driver-version label lets you target Nodes with exact driver version matches.

For detailed information about scheduling workloads on Nodes with specific driver versions, see Scheduling Workloads. Avoid scheduling across multiple driver versions in a single Node Pool.

Schedule workloads on Nodes with specific driver versions

For workloads that require a specific driver version, use an exact match with the nodeSelector field:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload
spec:
  containers:
  - name: gpu-container
    image: nvidia/cuda:11.8.0-base-ubuntu22.04
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1
  nodeSelector:
    gpu.coreweave.cloud/driver-version: "570.172.08-0ubuntu1"

Troubleshoot scheduling issues

If Pods fail to schedule due to driver version constraints, check the available driver versions in your cluster. Replace [POD-NAME] with the name of your Pod.

# Check available driver versions
kubectl get nodes --show-labels | grep driver-version

# Check Pod events for scheduling failures
kubectl describe pod [POD-NAME] | grep -A 10 Events:

Common scheduling issues include:

No Nodes available with the exact driver version specified.
Nodes with the required driver version are unavailable due to resource constraints.
Driver version constraints conflict with other scheduling requirements.
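
To tell these causes apart, it can help to list only the Nodes that carry the exact label value your Pod selects on. This is a minimal sketch using kubectl's standard -l label selector. The helper name nodes_with_driver is hypothetical.

```shell
# Hypothetical helper: list Nodes whose driver version label matches exactly.
# $1 = full driver version string, as used in the Pod's nodeSelector
nodes_with_driver() {
  kubectl get nodes -l "gpu.coreweave.cloud/driver-version=$1"
}

# Usage:
#   nodes_with_driver 570.172.08-0ubuntu1
```

If the command returns no Nodes, no Node currently runs that exact driver version. If it returns Nodes but the Pod stays pending, check resource availability and any other scheduling constraints on the Pod.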

Troubleshooting

This section covers common error conditions and how to verify the active driver version on your Nodes.

Common error conditions

If you encounter issues with driver configuration, check the Node Pool status for error conditions:

Status:
  Conditions:
    Last Transition Time: 2025-06-30T19:25:16Z
    Message: unable to create configuration for NodePool
    Reason: InternalError
    Status: False
    Type: Validated

Node Pool errors: For more information about Node Pool events and possible error conditions, see Node Pool events.
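
To read only the Validated condition instead of scanning the full status, a JSONPath filter can help. This sketch assumes that the conditions are stored under status.conditions with type, status, and message fields. That is the usual Kubernetes convention, but it is not confirmed here for the NodePool resource. The helper name nodepool_validated is hypothetical.

```shell
# Hypothetical helper: print the status and message of the Validated
# condition on a Node Pool, one per line.
# $1 = Node Pool name
nodepool_validated() {
  kubectl get nodepool "$1" -o \
    jsonpath='{.status.conditions[?(@.type=="Validated")].status}{"\n"}{.status.conditions[?(@.type=="Validated")].message}{"\n"}'
}

# Usage (replace [NODE-POOL-NAME] with the name of your Node Pool):
#   nodepool_validated [NODE-POOL-NAME]
```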

Verify the driver version

To verify your Node Pool configuration and driver status, use any of the following methods.

Describe the Node Pool. Replace [NODE-POOL-NAME] with the name of your Node Pool.

# Check Node Pool status
kubectl describe nodepool [NODE-POOL-NAME]

Check the Node labels for driver version:

# Check node labels for driver version
kubectl get nodes --show-labels | grep driver-version

Or, check the GPU driver information on the Nodes by running nvidia-smi on a Pod running on the Node. Replace [POD-NAME] with the name of your Pod.

# Check GPU driver information on nodes
kubectl exec -it [POD-NAME] -- nvidia-smi
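
nvidia-smi can also print only the driver version, which is easier to compare against the Node label. The sketch below uses nvidia-smi's --query-gpu=driver_version option and assumes, based on the example label on this page, that the label value is the driver version followed by a hyphen and a package suffix. Both helper names are hypothetical.

```shell
# Print the driver version reported inside a running GPU Pod
# (one line per GPU visible to the container).
# $1 = Pod name
pod_driver_version() {
  kubectl exec "$1" -- nvidia-smi --query-gpu=driver_version --format=csv,noheader
}

# Strip the assumed package suffix (everything from the first hyphen)
# from a driver version label value.
# $1 = label value
label_to_driver() {
  printf '%s\n' "${1%%-*}"
}

# Usage (replace [POD-NAME] with the name of your Pod):
#   pod_driver_version [POD-NAME]
#   label_to_driver 570.172.08-0ubuntu1   # prints 570.172.08
```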

Next steps

Apply the new driver version to the Node Pool by queuing a reconfigure reboot for the Node Pool.
