With the release of ncore-image v.2.10.1, you can use NVSHMEM and GDRCopy in your container image for high-performance GPU-to-GPU communication. This page is for cluster operators and workload engineers who need to enable direct GPU-to-GPU memory access on SUNK or CKS nodes.

Overview

NVSHMEM (NVIDIA SHMEM) and GDRCopy (GPU Direct RDMA Copy) enable direct memory access between GPUs without involving the CPU, reducing latency and increasing throughput for certain workloads. The following sections describe how to obtain the supported image, what modifications it includes, and how to configure your containers to use NVSHMEM and GDRCopy.

Access the image

To gain access to ncore-image v.2.10.1, contact CoreWeave Support.

Image modifications

ncore-image v.2.10.1 contains the following modifications to support NVSHMEM with the IBGDA (InfiniBand GPUDirect Async) transport. These modifications are applied at the node level and do not require changes from workload authors.
The image includes the following NVIDIA driver options:
  • nvidia.NVreg_EnableStreamMemOPs=1
  • nvidia.NVreg_RegistryDwords="PeerMappingOverride=1;"
The image also includes the GDRCopy driver gdrdrv-dkms_2.5-1.
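To confirm on a node that the driver options listed above are active, a check along the following lines may help. This is a minimal sketch, not an official tool: it assumes the NVIDIA driver exposes its module parameters at /proc/driver/nvidia/params as "Name: value" lines, and the function names parse_params and driver_options_ok are hypothetical.

```python
from pathlib import Path

# Assumption: the NVIDIA kernel driver lists its module parameters here,
# one "Name: value" pair per line.
PARAMS_PATH = Path("/proc/driver/nvidia/params")


def parse_params(text):
    """Parse 'Name: value' lines into a dict, dropping surrounding quotes."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            params[key.strip()] = value.strip().strip('"')
    return params


def driver_options_ok(params):
    """Return (stream_mem_ops_ok, peer_mapping_ok) for the parsed params."""
    stream_ok = params.get("EnableStreamMemOPs") == "1"
    # RegistryDwords is a ';'-separated list of key=value entries.
    dwords = [d.strip() for d in params.get("RegistryDwords", "").split(";")]
    peer_ok = "PeerMappingOverride=1" in dwords
    return stream_ok, peer_ok


if __name__ == "__main__":
    if PARAMS_PATH.exists():
        stream_ok, peer_ok = driver_options_ok(parse_params(PARAMS_PATH.read_text()))
        print(f"EnableStreamMemOPs=1: {stream_ok}")
        print(f"PeerMappingOverride=1: {peer_ok}")
    else:
        print(f"{PARAMS_PATH} not found; is the NVIDIA driver loaded?")
```

If either value is reported as False on a node that runs this image, the node may not have booted with the expected driver options.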

Use the image

When you use this image, you must complete the following steps so that your containers can access GDRCopy and NVSHMEM correctly.

Enable the GDRCopy environment variable

Set the NVIDIA_GDRCOPY environment variable to enabled in the container spec. This lets the container access the gdrdrv driver:
env:
  - name: NVIDIA_GDRCOPY
    value: enabled
If you’re using Slurm, this environment variable is already set.
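To sanity-check the setting from inside a running container, a short script like the one below can report whether the variable is set and whether the GDRCopy device is visible. This is a sketch under stated assumptions: /dev/gdrdrv is the device node that the gdrdrv driver normally creates, and the helper name gdrcopy_status is hypothetical.

```python
import os

# Assumption: the gdrdrv kernel driver exposes this device node, and it is
# mapped into the container when NVIDIA_GDRCOPY is enabled.
GDRDRV_DEVICE = "/dev/gdrdrv"


def gdrcopy_status(environ=None, device_path=GDRDRV_DEVICE):
    """Return (env_enabled, device_present) for the given environment."""
    environ = os.environ if environ is None else environ
    env_enabled = environ.get("NVIDIA_GDRCOPY") == "enabled"
    device_present = os.path.exists(device_path)
    return env_enabled, device_present


if __name__ == "__main__":
    env_enabled, device_present = gdrcopy_status()
    print(f"NVIDIA_GDRCOPY=enabled: {env_enabled}")
    print(f"{GDRDRV_DEVICE} present: {device_present}")
```

If the variable is set but the device is missing, the container was likely started without the variable in its spec, or the node is not running the supported image.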

Patch NVSHMEM ibgda

In NVSHMEM version 3.2.5, you must patch ibgda in one or more of your containers so that NVSHMEM recognizes the InfiniBand devices presented on CoreWeave nodes.
Download NVSHMEM version 3.2.5. In src/modules/transport/ibgda/ibgda.cpp, change the device-name check on line 3659 from mlx5 to ibp so that it works on SUNK and CKS.
Original code:
if (!strstr(name, "mlx5")) {
    ftable.close_device(device->context);
    device->context = NULL;
    NVSHMEMI_WARN_PRINT("device %s is not enumerated as an mlx5 device. Skipping...\n",
                        name);
    continue;
}
Modified code:
if (!strstr(name, "ibp")) {
    ftable.close_device(device->context);
    device->context = NULL;
    NVSHMEMI_WARN_PRINT("device %s is not enumerated as an mlx5 device. Skipping...\n",
                        name);
    continue;
}
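The edit above can also be scripted so that it does not depend on the exact line number. The following is a minimal sketch, not part of NVSHMEM or the image: the script name patch_ibgda.py and the function patch_ibgda are hypothetical, and the script assumes the check appears exactly once in ibgda.cpp, refusing to change the file otherwise.

```python
import sys
from pathlib import Path

# The condition shown in the original code and its replacement.
OLD_CHECK = 'if (!strstr(name, "mlx5")) {'
NEW_CHECK = 'if (!strstr(name, "ibp")) {'


def patch_ibgda(source):
    """Return source with the device-name check switched from mlx5 to ibp.

    Raises ValueError unless the original check occurs exactly once, so an
    unexpected file layout is never patched silently.
    """
    count = source.count(OLD_CHECK)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return source.replace(OLD_CHECK, NEW_CHECK)


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: patch_ibgda.py path/to/ibgda.cpp")
    target = Path(sys.argv[1])
    target.write_text(patch_ibgda(target.read_text()))
    print(f"patched {target}")
```

Run it against src/modules/transport/ibgda/ibgda.cpp in the downloaded NVSHMEM 3.2.5 source tree before building. Only the condition changes; the warning message is left untouched, matching the modified code above.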

Last modified on May 27, 2026