Slurm images
The SUNK team builds several images: base images (controller, slurmd-cw-*) and extras images (controller-extras, slurmd-cw-*-extras) built on top of the base images.
- controller: primarily for the control plane components, but can also be used by CPU nodes
- controller-extras: extras image for use on login nodes or CPU nodes
- slurmd-cw-*: images for GPU nodes
- slurmd-cw-*-extras: extras images for GPU nodes
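The naming scheme above can be read mechanically. The following is a minimal Python sketch, not part of SUNK: the ImageInfo class and the describe function are hypothetical names, and the sketch assumes that the cuNNN suffix encodes the CUDA major and minor version (for example, cu128 for CUDA 12.8), which the image names suggest but this page does not state outright.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches the four name patterns listed above:
# controller, controller-extras, slurmd-cw-cuNNN, slurmd-cw-cuNNN-extras.
_NAME = re.compile(
    r"^(?:controller|slurmd-cw-cu(?P<cuda>\d{3}))(?P<extras>-extras)?$"
)


@dataclass(frozen=True)
class ImageInfo:
    """Hypothetical summary of what an image name tells us."""
    name: str
    gpu: bool             # True for slurmd-cw-* images (GPU nodes)
    extras: bool          # True for the -extras variants
    cuda: Optional[str]   # e.g. "12.8"; None for controller images


def describe(name: str) -> ImageInfo:
    """Parse an image name into its parts; raise ValueError if unrecognized."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized image name: {name!r}")
    digits = match.group("cuda")
    # Assumption: the last digit is the CUDA minor version, the rest the major.
    cuda = f"{digits[:-1]}.{digits[-1]}" if digits else None
    return ImageInfo(
        name=name,
        gpu=digits is not None,
        extras=match.group("extras") is not None,
        cuda=cuda,
    )


if __name__ == "__main__":
    for image in ("controller", "controller-extras", "slurmd-cw-cu128-extras"):
        print(describe(image))
```

A helper like this could be used to validate an image name in a deployment script before it is referenced, so that a typo fails early instead of at pull time.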
The following table shows how the images relate to one another:
Image | Based On | Image Type |
---|---|---|
controller (22.04) | ubuntu:jammy-20240911.1 | Base |
controller-extras (22.04) | controller (22.04) | Extra |
slurmd-cw-cu122 (22.04) | nccl-tests:12.2.2-devel-ubuntu22.04-nccl2.23.4-1-3ef8839 | Base |
slurmd-cw-cu124 (22.04) | nccl-tests:12.4.1-devel-ubuntu22.04-nccl2.23.4-1-3ef8839 | Base |
slurmd-cw-cu128 (22.04) | nccl-tests:12.8.0-devel-ubuntu22.04-nccl2.25.1-1-57fa979 | Base |
slurmd-cw-cu122-extras (22.04) | slurmd-cw-cu122 (22.04) | Extra |
slurmd-cw-cu124-extras (22.04) | slurmd-cw-cu124 (22.04) | Extra |
slurmd-cw-cu128-extras (22.04) | slurmd-cw-cu128 (22.04) | Extra |
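To make the layering concrete, the relationships in the table can be expressed as a small lookup. This is an illustrative Python sketch only: the image names and upstream tags are copied from the table, while the BASED_ON mapping and the lineage and image_type helpers are hypothetical and not part of SUNK.

```python
# Image relationships copied from the table (Ubuntu 22.04 variants).
# Each key is an image; each value is the image it is based on.
BASED_ON = {
    "controller": "ubuntu:jammy-20240911.1",
    "controller-extras": "controller",
    "slurmd-cw-cu122": "nccl-tests:12.2.2-devel-ubuntu22.04-nccl2.23.4-1-3ef8839",
    "slurmd-cw-cu124": "nccl-tests:12.4.1-devel-ubuntu22.04-nccl2.23.4-1-3ef8839",
    "slurmd-cw-cu128": "nccl-tests:12.8.0-devel-ubuntu22.04-nccl2.25.1-1-57fa979",
    "slurmd-cw-cu122-extras": "slurmd-cw-cu122",
    "slurmd-cw-cu124-extras": "slurmd-cw-cu124",
    "slurmd-cw-cu128-extras": "slurmd-cw-cu128",
}


def lineage(image: str) -> list[str]:
    """Return the image followed by everything it is built on, ending upstream."""
    chain = [image]
    # Keep following the "Based On" column until we leave the table.
    while chain[-1] in BASED_ON:
        chain.append(BASED_ON[chain[-1]])
    return chain


def image_type(image: str) -> str:
    """Classify an image as 'Base' or 'Extra', matching the table's last column."""
    if image not in BASED_ON:
        raise KeyError(f"unknown image: {image}")
    # An Extra image is one whose parent is itself one of the listed images.
    return "Extra" if BASED_ON[image] in BASED_ON else "Base"


if __name__ == "__main__":
    print(" -> ".join(lineage("slurmd-cw-cu128-extras")))
    print(image_type("controller-extras"))
```

Walking the chain this way shows that every extras image sits exactly one layer above its base image, which in turn sits on an upstream Ubuntu or nccl-tests image.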
We support up to two Ubuntu LTS versions and several CUDA versions.
Note
Packages that are present in the images but not listed below may be dependencies of the listed packages and are subject to change.
Base images
Purpose of base images
The base images serve as the foundation for the Slurm control plane components as well as the login and CPU-only slurmd pods.
Main packages with pinned versions
- Slurm: Workload manager.
- s6-overlay: Process supervisor and service manager.
- Pyxis: Plugin for Slurm to enable containerized jobs.
- enroot: Container runtime.
- kubectl: Command-line tool for interacting with Kubernetes clusters.
Main packages from repositories
- End-user utilities: htop, ping, traceroute, net-tools, numactl, sudo, wget
- OpenSSH server: Enables secure remote access to the login nodes.
- SSSD (System Security Services Daemon): Provides integration with external authentication providers using LDAP.
- Munge: Authentication service for creating and validating credentials.
- Environment modules: Allows dynamic modification of the user's environment.
- libnvidia-container: Library from the NVIDIA Container Toolkit that provides GPU support in containers.
Extra images
Purpose of extra images
The extra images extend the base images with additional user-facing tools and utilities for convenience.
Additional packages with pinned versions
- Conda: Package, dependency and environment management for multiple languages.
- Micromamba: Faster and more robust alternative to Conda.