Enabling IMEX Compute Domains with Dynamic Resource Allocation

Configuring Resource Claims for IMEX channels with Dynamic Resource Allocation (DRA)

With DRA, workloads can make better use of resources such as IMEX domains on rack-based instances like the NVIDIA GB200 and GB300.

Limited Availability feature

IMEX with Dynamic Resource Allocation (DRA) is currently a Limited Availability feature in CKS and has the following limitations:

  • Limited instance support: Currently only supported on rack-based instances such as the NVIDIA GB200 and GB300.
  • Active development: IMEX and the imex-dra components are still in active development.
  • Limited Kubernetes Version Support: Requires Kubernetes v1.30 or higher to support DRA for IMEX.
  • Manual enablement required: This feature is not enabled by default.

If you would like to have this feature enabled for your rack-based instances, please contact your CoreWeave account manager or reach out to our sales team to learn more.
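Once the feature has been enabled on your cluster, you can confirm that the DRA APIs it depends on are being served. For example:

```shell
# List the resource.k8s.io API resources (ResourceClaim, ResourceClaimTemplate,
# ResourceSlice, DeviceClass) that DRA requires
kubectl api-resources --api-group=resource.k8s.io
```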

Rack-based instances

A GB200/GB300 rack consists of 18 individual Nodes, each of which functions as an independent instance, but the Nodes are deployed and scheduled together as a Rack. The key difference from traditional GPU instances is that performance and throughput are tightly interconnected and optimized at the Rack level, so scheduling and orchestration layers often treat the Rack as a single, large-scale compute resource rather than as 18 discrete, isolated instances.

The image below shows the list of Kubernetes Node objects that represent all 18 instances of an NVIDIA GB200 rack.
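You can retrieve the same list of Nodes with a label selector on the rack's NVLink domain. For example, assuming the domain name `S0-011-US-WEST-01A` used throughout the examples on this page:

```shell
# List all Nodes that belong to a given NVLink domain (rack)
kubectl get nodes -l ds.coreweave.com/nvlink.domain=S0-011-US-WEST-01A
```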

When a rack-based instance joins the cluster, a ComputeDomain is created for it automatically.

Example

```yaml
apiVersion: resource.nvidia.com/v1beta1
kind: ComputeDomain
metadata:
  creationTimestamp: "2025-10-29T18:26:02Z"
  finalizers:
    - resource.nvidia.com/computeDomain
  generation: 1
  name: s0-011-us-west-01a
  namespace: cw-nvidia-gpu-operator
  resourceVersion: "107136503"
  uid: e6d3a22d-12a8-47f7-b55b-fdcbcf4e35d8
spec:
  channel:
    allocationMode: Single
    resourceClaimTemplate:
      name: imex-channel-s0-011-us-west-01a
  numNodes: 0
status:
  status: Ready
```

Scheduling workloads with rack-based instances

Note that scheduling currently only supports GB200 and GB300 instances.

Once a rack has been delivered to your cluster and the ComputeDomain created, you can schedule workloads to use IMEX with DRA by creating a ResourceClaim for the IMEX resource.

A ResourceClaimTemplate will exist for the ComputeDomain created for the rack-based instance, which can be used to create ResourceClaims for workloads.

Example

```yaml
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  creationTimestamp: "2025-10-29T18:26:02Z"
  finalizers:
    - resource.nvidia.com/computeDomain
  labels:
    resource.nvidia.com/computeDomain: e6d3a22d-12a8-47f7-b55b-fdcbcf4e35d8
    resource.nvidia.com/computeDomainTarget: Workload
  name: imex-channel-s0-011-us-west-01a
  namespace: cw-nvidia-gpu-operator
  resourceVersion: "107136502"
  uid: e9e2590e-f362-434a-a9a8-d48a864ca672
spec:
  metadata:
    creationTimestamp: null
  spec:
    devices:
      config:
        - opaque:
            driver: compute-domain.nvidia.com
            parameters:
              allocationMode: Single
              apiVersion: resource.nvidia.com/v1beta1
              domainID: e6d3a22d-12a8-47f7-b55b-fdcbcf4e35d8
              kind: ComputeDomainChannelConfig
          requests:
            - channel
      requests:
        - allocationMode: ExactCount
          count: 1
          deviceClassName: compute-domain-default-channel.nvidia.com
          name: channel
```

With this ResourceClaimTemplate in place, workloads can create ResourceClaims that allocate IMEX channels from the ComputeDomain.
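As a minimal sketch before the full MPIJob example below, a single Pod consumes an IMEX channel by declaring a `resourceClaims` entry that references the template, then listing that claim under the container's `resources.claims`. The Pod name, container name, and image here are hypothetical; the template name matches the ComputeDomain created for the rack in the examples above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: imex-example            # hypothetical name
spec:
  containers:
    - name: cuda-app            # hypothetical container/image
      image: nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04
      resources:
        claims:
          - name: imex-channel-0          # references the claim below
  resourceClaims:
    - name: imex-channel-0
      resourceClaimTemplateName: imex-channel-s0-011-us-west-01a
```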

Example

```yaml
apiVersion: kubeflow.org/v2beta1
kind: MPIJob
metadata:
  name: dra-example-gb200-4x
spec:
  slotsPerWorker: 4
  runPolicy:
    cleanPodPolicy: Running
  mpiReplicaSpecs:
    Launcher:
      replicas: 1
      template:
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: ds.coreweave.com/nvlink.domain
                        operator: In
                        values:
                          - S0-011-US-WEST-01A
          containers:
            - image: ghcr.io/nvidia/k8s-samples:nvbandwidth-v0.7-8d103163
              name: mpi-launcher
              securityContext:
                runAsUser: 1000
              command: ["/bin/bash", "-c"]
              args:
                - |
                  sleep infinity;
              resources:
                requests:
                  cpu: 2
                  memory: 128Mi
    Worker:
      replicas: 18
      template:
        metadata:
          labels:
            app: nvbandwidth-test-worker
        spec:
          containers:
            - image: ghcr.io/nvidia/k8s-samples:nvbandwidth-v0.7-8d103163
              name: nccl
              securityContext:
                privileged: false
              resources:
                claims:
                  - name: imex-channel-0
                requests:
                  cpu: 110
                  memory: 960Gi
                  nvidia.com/gpu: 4
                limits:
                  memory: 960Gi
                  nvidia.com/gpu: 4
              volumeMounts:
                - mountPath: /dev/shm
                  name: dshm
          affinity:
            podAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: app
                        operator: In
                        values:
                          - nvbandwidth-test-worker
                  topologyKey: nvidia.com/gpu.clique
          resourceClaims:
            - name: imex-channel-0
              resourceClaimTemplateName: imex-channel-s0-011-us-west-01a
          volumes:
            - emptyDir:
                medium: Memory
              name: dshm
```

After submitting the MPIJob, we can see the following resources created:

Example

```shell
$ kubectl get computedomain,resourceclaim,resourceclaimtemplate,resourceslice
NAME AGE
computedomain.resource.nvidia.com/s0-011-us-west-01a 22h
NAME STATE AGE
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-0-imex-channel-0-5vz5j allocated,reserved 4m3s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-1-imex-channel-0-szpkc allocated,reserved 4m3s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-10-imex-channel-0-57xxh allocated,reserved 4m2s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-11-imex-channel-0-qrzqq allocated,reserved 4m2s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-12-imex-channel-0-qz6ld allocated,reserved 4m2s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-13-imex-channel-0-gnrst allocated,reserved 4m2s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-14-imex-channel-0-r2ccv allocated,reserved 4m1s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-15-imex-channel-0-sffsc allocated,reserved 4m1s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-16-imex-channel-0-l4l2w allocated,reserved 4m1s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-17-imex-channel-0-5445b allocated,reserved 4m1s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-2-imex-channel-0-zdwrt allocated,reserved 4m3s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-3-imex-channel-0-g62bj allocated,reserved 4m3s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-4-imex-channel-0-nm2r8 allocated,reserved 4m3s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-5-imex-channel-0-sw48w allocated,reserved 4m3s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-6-imex-channel-0-7xq25 allocated,reserved 4m3s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-7-imex-channel-0-n8kxc allocated,reserved 4m3s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-8-imex-channel-0-snk6s allocated,reserved 4m3s
resourceclaim.resource.k8s.io/dra-example-gb200-4x-worker-9-imex-channel-0-68vvh allocated,reserved 4m2s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-2mhfk-compute-domain-daemon-hv8p9 allocated,reserved 4m2s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-46bj5-compute-domain-daemon-2plxp allocated,reserved 4m2s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-6cxtq-compute-domain-daemon-9z6tz allocated,reserved 4m2s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-7dj8z-compute-domain-daemon-gqznf allocated,reserved 4m1s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-d46g2-compute-domain-daemon-s6lxj allocated,reserved 4m2s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-dclxf-compute-domain-daemon-f8vwt allocated,reserved 4m2s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-dh4vl-compute-domain-daemon-mqzgz allocated,reserved 4m1s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-dnfb7-compute-domain-daemon-xfp5s allocated,reserved 4m2s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-f2c64-compute-domain-daemon-ckbts allocated,reserved 4m1s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-jnbct-compute-domain-daemon-zw9jg allocated,reserved 4m1s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-l2lzs-compute-domain-daemon-dlj7s allocated,reserved 74s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-l5tr8-compute-domain-daemon-nkth6 allocated,reserved 4m
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-nmg8q-compute-domain-daemon-cslk6 allocated,reserved 4m2s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-nvz88-compute-domain-daemon-7dxlp allocated,reserved 4m2s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-tpnkg-compute-domain-daemon-bzkpn allocated,reserved 4m
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-wg82q-compute-domain-daemon-m7xz2 allocated,reserved 4m2s
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-zc2wf-compute-domain-daemon-r4t9t allocated,reserved 4m
resourceclaim.resource.k8s.io/s0-011-us-west-01a-ws9p4-zsg4b-compute-domain-daemon-wmw9x allocated,reserved 4m1s
NAME AGE
resourceclaimtemplate.resource.k8s.io/imex-channel-s0-011-us-west-01a 22h
resourceclaimtemplate.resource.k8s.io/s0-011-us-west-01a-daemon-claim-template-nbscw 22h
NAME NODE DRIVER POOL AGE
resourceslice.resource.k8s.io/s1vqxs64-compute-domain.nvidia.com-n42kz s1vqxs64 compute-domain.nvidia.com s1vqxs64 10m
resourceslice.resource.k8s.io/s2cqxs64-compute-domain.nvidia.com-2vpk9 s2cqxs64 compute-domain.nvidia.com s2cqxs64 10m
resourceslice.resource.k8s.io/s3bqxs64-compute-domain.nvidia.com-6zpgz s3bqxs64 compute-domain.nvidia.com s3bqxs64 10m
resourceslice.resource.k8s.io/s58qxs64-compute-domain.nvidia.com-ggk54 s58qxs64 compute-domain.nvidia.com s58qxs64 10m
resourceslice.resource.k8s.io/s5fqxs64-compute-domain.nvidia.com-h7bct s5fqxs64 compute-domain.nvidia.com s5fqxs64 10m
resourceslice.resource.k8s.io/s67qxs64-compute-domain.nvidia.com-m7c7k s67qxs64 compute-domain.nvidia.com s67qxs64 10m
resourceslice.resource.k8s.io/s70qxs64-compute-domain.nvidia.com-6mqlg s70qxs64 compute-domain.nvidia.com s70qxs64 10m
resourceslice.resource.k8s.io/s7dqxs64-compute-domain.nvidia.com-6m6f5 s7dqxs64 compute-domain.nvidia.com s7dqxs64 10m
resourceslice.resource.k8s.io/s7vqxs64-compute-domain.nvidia.com-k6hd6 s7vqxs64 compute-domain.nvidia.com s7vqxs64 10m
resourceslice.resource.k8s.io/s9lqxs64-compute-domain.nvidia.com-hd4gk s9lqxs64 compute-domain.nvidia.com s9lqxs64 10m
resourceslice.resource.k8s.io/s9sqxs64-compute-domain.nvidia.com-7bt68 s9sqxs64 compute-domain.nvidia.com s9sqxs64 10m
resourceslice.resource.k8s.io/sbdqxs64-compute-domain.nvidia.com-shkz9 sbdqxs64 compute-domain.nvidia.com sbdqxs64 10m
resourceslice.resource.k8s.io/sf9qxs64-compute-domain.nvidia.com-tscd4 sf9qxs64 compute-domain.nvidia.com sf9qxs64 10m
resourceslice.resource.k8s.io/sg9qxs64-compute-domain.nvidia.com-zg89f sg9qxs64 compute-domain.nvidia.com sg9qxs64 10m
resourceslice.resource.k8s.io/sgcqxs64-compute-domain.nvidia.com-xf4hx sgcqxs64 compute-domain.nvidia.com sgcqxs64 10m
resourceslice.resource.k8s.io/sh0qxs64-compute-domain.nvidia.com-xt6ll sh0qxs64 compute-domain.nvidia.com sh0qxs64 75s
resourceslice.resource.k8s.io/sh7qxs64-compute-domain.nvidia.com-ckfns sh7qxs64 compute-domain.nvidia.com sh7qxs64 10m
resourceslice.resource.k8s.io/shrqxs64-compute-domain.nvidia.com-h4nkt shrqxs64 compute-domain.nvidia.com shrqxs64 10m
```
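As a sanity check, you can confirm that an IMEX channel device was actually injected into one of the workers. The NVIDIA driver exposes IMEX channels under `/dev/nvidia-caps-imex-channels`; the Pod name below follows the worker naming from the MPIJob example above (illustrative command):

```shell
# List the IMEX channel devices visible inside a worker Pod
kubectl exec dra-example-gb200-4x-worker-0 -- ls /dev/nvidia-caps-imex-channels
```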

Next steps

If you are interested in leveraging DRA for IMEX, please contact your CoreWeave account manager or reach out to our sales team to have this feature enabled for your CKS cluster.