NVLink domains and placement
An NVLink domain (sometimes called an NVLink cluster) is the set of Nodes that can reach each other over NVLink. Your Pods must land on Nodes that are part of the same domain, or cross-Node NVLink memory access won’t work as expected. In practice, that means you must align scheduling with physical connectivity.
On NVL72-powered instances, all Nodes in a single rack share the same NVLink domain. The nvidia.com/gpu.clique label identifies the NVLink partition within that domain. Workloads must use this label as a Pod affinity topologyKey so that all related Pods land on Nodes in the same partition. On CKS full-rack deployments, the default partition spans the entire domain, so partition and domain boundaries align. For the mechanics of setting this up, see IMEX with Dynamic Resource Allocation.
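The placement rule above can be sketched as a small Python helper that builds a Pod manifest with a required Pod affinity on the nvidia.com/gpu.clique topology key. This is a minimal sketch, not a reference configuration: the Pod name, the `app: nvlink-job` label, and the container image are hypothetical placeholders, and only the topology key comes from the text above. The affinity fields follow the standard Kubernetes Pod API.

```python
import json

# Label named in the text above; it identifies the NVLink partition of a Node.
CLIQUE_TOPOLOGY_KEY = "nvidia.com/gpu.clique"


def build_pod(name, group_labels, image):
    """Return a Pod manifest (as a dict) that must be co-located, per NVLink
    partition, with other Pods carrying the same group labels.

    All arguments are illustrative placeholders chosen by the caller.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": dict(group_labels)},
        "spec": {
            "affinity": {
                "podAffinity": {
                    # Hard requirement: schedule only onto a Node whose
                    # clique label value matches that of a Node already
                    # running a Pod selected by labelSelector.
                    "requiredDuringSchedulingIgnoredDuringExecution": [
                        {
                            "labelSelector": {"matchLabels": dict(group_labels)},
                            "topologyKey": CLIQUE_TOPOLOGY_KEY,
                        }
                    ]
                }
            },
            "containers": [{"name": "main", "image": image}],
        },
    }


if __name__ == "__main__":
    # Hypothetical names; kubectl accepts JSON manifests as well as YAML.
    pod = build_pod("worker-0", {"app": "nvlink-job"}, "example.com/trainer:latest")
    print(json.dumps(pod, indent=2))
```

Giving every Pod in the job the same group labels and the same affinity term is what ties them to one partition; a different label set would form a separate group.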
IMEX on CoreWeave Kubernetes Service
CoreWeave delivers IMEX channel access to workloads through Kubernetes Dynamic Resource Allocation and the NVIDIA DRA driver. You declare a ComputeDomain and attach ResourceClaims to Pods that need IMEX. The platform provisions the supporting IMEX-related components on the Node. You don’t manually configure low-level IMEX services inside your containers.
IMEX with DRA is in Limited Availability on supported instance types and continues to evolve. For prerequisites, enablement, and full YAML examples, read IMEX with Dynamic Resource Allocation.
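As a rough illustration of the Pod side of this wiring, the sketch below adds a ResourceClaim reference to an existing Pod manifest. It assumes a ResourceClaimTemplate already exists for your ComputeDomain; the template name `my-domain-channel`, the claim name `imex-channel`, and the Pod itself are hypothetical. The `resourceClaims` and `resources.claims` fields are part of the standard Kubernetes Pod API for Dynamic Resource Allocation; the ComputeDomain object is not shown here, so treat the linked page as the authoritative source for it.

```python
import json


def attach_claim(pod, claim_name, template_name, container_index=0):
    """Return a copy of `pod` that requests a ResourceClaim generated from
    the named ResourceClaimTemplate and exposes it to one container.

    `claim_name` and `template_name` are caller-chosen, illustrative values.
    """
    patched = json.loads(json.dumps(pod))  # deep copy of plain JSON data
    spec = patched["spec"]
    # Pod-level entry: ask Kubernetes to create a claim from the template.
    spec.setdefault("resourceClaims", []).append(
        {"name": claim_name, "resourceClaimTemplateName": template_name}
    )
    # Container-level entry: reference the Pod-level claim by its name.
    container = spec["containers"][container_index]
    container.setdefault("resources", {}).setdefault("claims", []).append(
        {"name": claim_name}
    )
    return patched


if __name__ == "__main__":
    # Hypothetical Pod and template names, for illustration only.
    base = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "worker-0"},
        "spec": {"containers": [{"name": "main", "image": "example.com/trainer:latest"}]},
    }
    print(json.dumps(attach_claim(base, "imex-channel", "my-domain-channel"), indent=2))
```

The helper returns a copy rather than mutating its input, so the same base manifest can be reused for several Pods in one job.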
Earlier clusters used a transparent nvidia-imex DaemonSet model for IMEX channels. New work should use DRA and ComputeDomain resources. See Cluster Components for how these pieces fit together.
Where to go next
The following resources provide more context and configuration guidance:
- IMEX with Dynamic Resource Allocation: create ComputeDomain objects, claim IMEX channels, and verify ResourceClaim state.
- NVIDIA IMEX guide: background on NVLink multi-Node architecture, IMEX behavior, and terminology such as Fabric Manager roles (for administrators who need NVIDIA’s full reference).