ghcr.io/coreweave/ml-containers and built from the public coreweave/ml-containers repository, where you can inspect the Dockerfiles to see exactly what each image installs.
Available images
The following PyTorch images are the recommended starting points for most customers.

| Image | Description | Recommended for |
|---|---|---|
| torch | A custom build of PyTorch, torchvision, and torchaudio tuned for the CoreWeave platform. | A smaller starting point with the core PyTorch stack. |
| torch-extras | The torch image plus a set of common PyTorch extensions. | Distributed training and LLM training. This is the recommended default. |
| nightly-torch | An experimental, daily release channel that tracks the latest development versions of PyTorch. | Testing the latest features, not production. |
| nightly-torch-extras | The PyTorch extensions built on top of nightly-torch. | Testing the latest features, not production. |
PyTorch base images (torch)
The ml-containers/torch image contains custom builds of PyTorch, torchvision, and torchaudio, each tuned for use on the CoreWeave platform.
Each image is built on an Ubuntu LTS release. The image tag indicates the Ubuntu version, which in turn determines the Python version.
Image variants
CoreWeave builds two variants of the torch image. Both variants are also available for torch-extras.
- base: Includes only the essentials (CUDA, torch, torchvision, and torchaudio). This variant has a small image size, which makes it fast to launch.
- nccl: Includes the development libraries and build tools, such as nvcc, that are required to compile other PyTorch extensions. This variant is larger than base.

The nccl variant is built on component libraries optimized for the CoreWeave platform. For more details, see coreweave/nccl-tests.
PyTorch extras (torch-extras)
The ml-containers/torch-extras image extends the torch image with a set of common PyTorch extensions, including DeepSpeed, xformers, and NVIDIA Apex. (FlashAttention is already included in the base torch image.) Each extension is compiled against the custom PyTorch builds in the torch image.
For the complete, current list of included extensions, see the coreweave/ml-containers repository.
Both the base and nccl variants are available for torch-extras, matching those provided for torch. The base variant stays small because it uses a multi-stage build that avoids including CUDA development libraries, even though those libraries are required to build the extensions.
Customers running supervised fine-tuning, reinforcement learning, pretraining, or any multi-node PyTorch training should start with torch-extras.
Nightly images
The nightly-torch image is an experimental, nightly release channel of the PyTorch base images, in the style of PyTorch’s own nightly preview builds. It features the latest development versions of torch, torchvision, and torchaudio, pulled daily and compiled from source. The nightly-torch-extras image builds the PyTorch extensions on top of nightly-torch.
Choose an image tag
Image tags encode the component versions in each build. For example:
- The variant, either base or nccl.
- The CUDA version, for example cuda12.9.1.
- The Ubuntu version, for example ubuntu22.04.
- The PyTorch, torchvision, and torchaudio versions, for example torch2.8.0, vision0.23.0, and audio2.8.0.
- The NCCL version and the ABI version.
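
Because each component carries its own prefix, a script can read individual versions out of a tag. The following shell sketch does this for a hypothetical tag assembled from the example components listed here. The order and separators of real tags are an assumption, so check the packages list for the actual tag strings.

```shell
# Hypothetical tag built from the example components; real tags may order or
# separate the fields differently, and may include extra fields.
tag="nccl-cuda12.9.1-ubuntu22.04-torch2.8.0-vision0.23.0-audio2.8.0"

# Match each field by its prefix instead of by position in the tag.
cuda_version="$(printf '%s\n' "$tag" | grep -oE 'cuda[0-9]+(\.[0-9]+)*' | head -n 1)"
ubuntu_version="$(printf '%s\n' "$tag" | grep -oE 'ubuntu[0-9]+\.[0-9]+' | head -n 1)"
torch_version="$(printf '%s\n' "$tag" | grep -oE 'torch[0-9]+(\.[0-9]+)*' | head -n 1)"

# Strip the prefixes when printing.
echo "CUDA:   ${cuda_version#cuda}"
echo "Ubuntu: ${ubuntu_version#ubuntu}"
echo "torch:  ${torch_version#torch}"
```

Matching by prefix keeps the sketch independent of field order, which matters here because the tag layout is not guaranteed.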
Match the CUDA version to your GPU driver
Choose an image whose CUDA version is compatible with the GPU driver on your nodes. Don’t assume the newest image is the right one. A recently published image can use a CUDA version that’s newer than your nodes’ driver supports. When this happens, workloads fail to start with driver-compatibility errors. You can check the driver version on a node by running nvidia-smi.
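
As a minimal sketch, the following shell snippet prints only the driver version by using the query options of nvidia-smi. The fallback value unknown is an illustrative choice for machines where nvidia-smi isn't installed or reports no GPU.

```shell
# Default used when nvidia-smi is missing or returns nothing (illustrative).
driver_version="unknown"

if command -v nvidia-smi >/dev/null 2>&1; then
  # One line per GPU; keep the first, since GPUs on a node share one driver.
  driver_version="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)"
fi

# Fall back again if the query produced no output.
driver_version="${driver_version:-unknown}"
echo "NVIDIA driver version: ${driver_version}"
```

Compare the reported driver version with the CUDA version in the image tag before you choose an image.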
Use an image
After you’ve chosen an image and a tag, you can use an ML container image as a base for your own custom image, or run it directly on CKS or SUNK. In the following examples, replace [TAG] with a tag from the packages list.
Build a custom image
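To add your own dependencies, use an ML container image as the base image in a Dockerfile by naming it in the FROM instruction, for example FROM ghcr.io/coreweave/ml-containers/torch-extras:[TAG], and then add your own layers on top.
Run on CKS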
Reference the image in the image field of a Pod specification.
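
The following shell sketch writes a minimal Pod manifest that uses torch-extras. The file name pod.yaml, the Pod and container names, the python command, and the nvidia.com/gpu limit of 1 are all illustrative assumptions, not values taken from CoreWeave's documentation. Adjust them for your cluster.

```shell
# Write a minimal Pod manifest. All names and the GPU count are hypothetical.
cat > pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: torch-extras-example
spec:
  restartPolicy: Never
  containers:
    - name: torch
      # Replace [TAG] with a tag from the packages list.
      image: "ghcr.io/coreweave/ml-containers/torch-extras:[TAG]"
      # Assumes python is on the PATH inside the image.
      command: ["python", "-c", "import torch; print(torch.__version__, torch.cuda.is_available())"]
      resources:
        limits:
          nvidia.com/gpu: 1
EOF

# After replacing [TAG], submit the manifest:
# kubectl apply -f pod.yaml
```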
Run on SUNK
SUNK uses Pyxis and enroot to run containers. Pass the image to srun with the --container-image flag. In the container URI, a # separates the registry host from the image path.
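
The following shell sketch assembles the container URI and passes it to srun. The node and task counts and the python command are illustrative assumptions. The guard skips srun when it isn't installed or when [TAG] hasn't been replaced.

```shell
# The registry host and image path are joined with '#', not '/'.
registry="ghcr.io"
image_path="coreweave/ml-containers/torch-extras"
tag="[TAG]"   # replace with a tag from the packages list

container_image="${registry}#${image_path}:${tag}"
echo "$container_image"

# Run a single task in the container (illustrative command and sizes).
if command -v srun >/dev/null 2>&1 && [ "$tag" != "[TAG]" ]; then
  srun --nodes=1 --ntasks=1 --container-image="$container_image" \
    python -c 'import torch; print(torch.__version__)'
fi
```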
Additional resources
For more information, see the following resources:
- coreweave/ml-containers repository: Dockerfiles and source for all images.
- Packages list: every published image and its current tags.
- Slurm images: the SUNK-built slurm-containers images for the Slurm control plane and nodes.
- Create custom images: customize a published image for SUNK.
- Introduction to third-party frameworks: frameworks supported on CKS and SUNK.