
June 13, 2025 - CKS Kubernetes v1.32 available

CKS now supports Kubernetes v1.32 with enhanced features and performance improvements

CKS now supports Kubernetes v1.32, bringing the latest features and security updates to your Kubernetes clusters. This release also includes significant upgrades to SUNK, including Slurm 24.11.05, NCCL 2.26.5, and new CUDA runtime images.

Overview

This release brings Kubernetes v1.32 support to CKS and significant upgrades to SUNK, including Slurm 24.11.05, NCCL 2.26.5, and new CUDA runtime images. These updates provide the latest features, security improvements, and performance enhancements.

CKS Kubernetes v1.32 support

CKS now supports Kubernetes v1.32, bringing the latest features and security updates to your Kubernetes clusters.

Key improvements in Kubernetes v1.32

Feature | Description
Enhanced security | Latest security patches and improvements
Performance optimizations | Better resource utilization and faster operations
New API features | Additional Kubernetes API capabilities
Improved stability | Better reliability and error handling
Extended support | Longer support lifecycle for production workloads

Migration considerations

When upgrading to Kubernetes v1.32:

  • Review the Kubernetes v1.32 changelog for detailed changes
  • Test your applications with the new version in a non-production environment
  • Update any custom controllers or operators to ensure compatibility
  • Review deprecated APIs and plan for future migrations
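
As a starting point for the API review above, the following minimal sketch scans manifest text for API versions that Kubernetes v1.32 no longer serves. It assumes manifests are available as plain YAML text. The only entry in the set is flowcontrol.apiserver.k8s.io/v1beta3, which the upstream deprecated API migration guide lists as removed in v1.32; the function name and the sample manifest are hypothetical, and the set should be extended from the changelog.

```python
import re

# API versions no longer served as of Kubernetes v1.32, per the upstream
# deprecated API migration guide. Extend this set from the changelog.
REMOVED_IN_1_32 = {"flowcontrol.apiserver.k8s.io/v1beta3"}

# Matches "apiVersion: <value>" at any indentation, with optional quotes.
API_VERSION_RE = re.compile(r"^\s*apiVersion:\s*['\"]?([^\s'\"#]+)", re.MULTILINE)


def find_removed_api_versions(manifest_text: str) -> list[str]:
    """Return removed apiVersion values found in the manifest text, in order."""
    return [v for v in API_VERSION_RE.findall(manifest_text) if v in REMOVED_IN_1_32]


# Hypothetical multi-document manifest used only for illustration.
SAMPLE = """\
apiVersion: flowcontrol.apiserver.k8s.io/v1beta3
kind: FlowSchema
metadata:
  name: example
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
"""

if __name__ == "__main__":
    for version in find_removed_api_versions(SAMPLE):
        print(f"removed in v1.32: {version}")
```

A text scan like this only covers manifests you hold in files; objects created by controllers or Helm releases in a live cluster need a separate check.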

SUNK major upgrades

Slurm 24.11.05 upgrade

SUNK has been upgraded to Slurm 24.11.05, bringing the latest upstream fixes and enhancements.

Key improvements

Improvement | Description
Bug fixes | Latest upstream bug fixes and stability improvements
Performance enhancements | Better job scheduling and resource management
Security updates | Latest security patches and improvements
Feature additions | New Slurm features and capabilities

NCCL 2.26.5 upgrade

NCCL has been upgraded to version 2.26.5, improving GPU communication performance.

Performance benefits

  • Faster GPU communication: Improved collective communication performance
  • Better scalability: Enhanced performance for large-scale GPU clusters
  • Reduced latency: Lower communication overhead between GPUs
  • Enhanced reliability: Better error handling and recovery mechanisms
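
To confirm that a job actually picks up the upgraded library, it can help to decode the integer version code that NCCL reports through ncclGetVersion(). The sketch below is illustrative only: how you obtain the integer (for example from a framework binding or NCCL's debug output) is outside its scope, and the function name is hypothetical.

```python
def decode_nccl_version(code: int) -> tuple[int, int, int]:
    """Decode an NCCL version code into (major, minor, patch)."""
    if code < 10000:
        # Releases up to 2.8.x encode as major*1000 + minor*100 + patch.
        major, rest = divmod(code, 1000)
    else:
        # Releases from 2.9 onward encode as major*10000 + minor*100 + patch.
        major, rest = divmod(code, 10000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


if __name__ == "__main__":
    # 22605 is the code that corresponds to NCCL 2.26.5.
    print(decode_nccl_version(22605))
```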

New CUDA runtime images

Added new CUDA runtime images for versions 12.8.1 and 12.9.0, providing the latest CUDA capabilities.

Available CUDA versions

Version | Description
CUDA 12.8.1 | Stable CUDA runtime with latest patches
CUDA 12.9.0 | Latest CUDA features and improvements

Additional SUNK enhancements

nsscache integration

Introduced nsscache as an alternative option to SSSD for user caching, providing more flexibility in directory service configuration.
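
One way to tell which user-lookup backend a node is using is to read the passwd line of its nsswitch.conf. The sketch below assumes the conventional NSS service names: sss for SSSD, and cache for the libnss-cache module that reads the files nsscache writes. Whether SUNK images configure nsswitch.conf exactly this way is an assumption, and the function names are hypothetical.

```python
import re


def nss_sources(nsswitch_text: str, database: str = "passwd") -> list[str]:
    """Return the lookup sources configured for one nsswitch.conf database."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        name, _, rest = line.partition(":")
        if name.strip() == database:
            # Remove action items such as [NOTFOUND=return]; keep service names.
            return re.sub(r"\[[^\]]*\]", " ", rest).split()
    return []


def user_cache_backend(nsswitch_text: str) -> str:
    """Classify the passwd backend as 'nsscache', 'sssd', or 'other'."""
    sources = nss_sources(nsswitch_text)
    if "cache" in sources:
        return "nsscache"
    if "sss" in sources:
        return "sssd"
    return "other"


if __name__ == "__main__":
    # Hypothetical file contents used only for illustration.
    print(user_cache_backend("passwd: files cache\ngroup: files cache\n"))
    print(user_cache_backend("passwd: files sss\n"))
```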

Enhanced configuration options

  • Timeout-based forced deletion: Added an option to force-delete compute pods after a timeout (disabled by default)
  • SlurmdSpecOverride: Backported Slurm 25.05 features for better container awareness
  • ConfigMap flexibility: Enhanced controller.etcConfigMap to accept multiple ConfigMaps
  • Segment-calc script: Added visualization tool for block-topology segment allocations
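
The idea behind timeout-based forced deletion can be sketched as a small decision function: a pod that has been terminating for longer than a configured timeout becomes a candidate for forced removal. This is not SUNK's implementation. The function and parameter names are hypothetical, and a timeout of None stands in for the disabled-by-default state; the only Kubernetes detail relied on is that a terminating pod carries a metadata.deletionTimestamp.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def should_force_delete(
    deletion_timestamp: Optional[datetime],
    now: datetime,
    timeout: Optional[timedelta],
) -> bool:
    """Decide whether a terminating pod has exceeded the forced-deletion timeout."""
    if timeout is None:
        return False  # feature disabled (the default)
    if deletion_timestamp is None:
        return False  # pod is not terminating
    return now - deletion_timestamp > timeout


if __name__ == "__main__":
    now = datetime(2025, 6, 13, 12, 0, tzinfo=timezone.utc)
    stuck_since = now - timedelta(minutes=15)
    print(should_force_delete(stuck_since, now, timedelta(minutes=10)))
    print(should_force_delete(stuck_since, now, None))
```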

VMPodScrape support

Added support for VMPodScrape as an alternative to PodMonitor for metric gathering, providing more flexibility in monitoring configuration.
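
For readers unfamiliar with the resource, the sketch below builds a minimal VMPodScrape object as a Python dictionary and prints it as JSON, which kubectl accepts in place of YAML. The apiVersion and the podMetricsEndpoints field come from the VictoriaMetrics operator CRD; the resource name, namespace, labels, and port name are hypothetical, and the SUNK chart setting that switches between PodMonitor and VMPodScrape is not shown because the source does not specify it.

```python
import json


def vm_pod_scrape(
    name: str,
    namespace: str,
    match_labels: dict[str, str],
    port: str,
    path: str = "/metrics",
) -> dict:
    """Build a minimal VMPodScrape manifest for the VictoriaMetrics operator."""
    return {
        "apiVersion": "operator.victoriametrics.com/v1beta1",
        "kind": "VMPodScrape",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Select the pods to scrape by label.
            "selector": {"matchLabels": match_labels},
            # Scrape the named container port at the given path.
            "podMetricsEndpoints": [{"port": port, "path": path}],
        },
    }


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    manifest = vm_pod_scrape("slurm-metrics", "slurm", {"app": "example"}, "metrics")
    print(json.dumps(manifest, indent=2))
```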

Configuration updates

Default security improvements

  • SSH password authentication: Defaulted PasswordAuthentication to no for improved security
  • Namespace management: Charts now manage the ns.coreweave.cloud/managed namespace label
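
To check what a given sshd_config resolves to for this setting, the following minimal sketch applies sshd's first-value-wins rule to the global section of the file. It deliberately ignores Match blocks and Include directives, so treat it as an approximation of what sshd itself would report; the function name is hypothetical. When the keyword is absent it falls back to "yes", which is OpenSSH's built-in default.

```python
def effective_password_auth(sshd_config_text: str) -> str:
    """Return the global PasswordAuthentication value from sshd_config text."""
    for raw in sshd_config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Keyword and argument may be separated by whitespace or a single "=".
        parts = line.replace("=", " ", 1).split()
        if not parts:
            continue
        keyword = parts[0].lower()  # keywords are case-insensitive
        if keyword == "match":
            break  # conditional blocks are out of scope for this sketch
        if keyword == "passwordauthentication" and len(parts) > 1:
            return parts[1].lower()  # the first value obtained wins
    return "yes"  # OpenSSH default when the keyword is not set


if __name__ == "__main__":
    print(effective_password_auth("PasswordAuthentication no\n"))
```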

Documentation

For detailed information about these updates, see:

Migration notes

CKS clusters

  • Existing clusters will continue to work without changes
  • New clusters can be created with Kubernetes v1.32
  • Plan for upgrading existing clusters to v1.32 when ready
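
When planning upgrades across several clusters, it can help to check each cluster's server version programmatically. The sketch below parses the JSON printed by `kubectl version -o json` and compares the server's major and minor version against 1.32. The sample payload is hypothetical and trimmed to the one field the code reads; the function names are illustrative.

```python
import json
import re


def server_version(kubectl_version_json: str) -> tuple[int, int]:
    """Extract (major, minor) from the serverVersion.gitVersion field."""
    git_version = json.loads(kubectl_version_json)["serverVersion"]["gitVersion"]
    match = re.match(r"v?(\d+)\.(\d+)", git_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def needs_upgrade(kubectl_version_json: str, target: tuple[int, int] = (1, 32)) -> bool:
    """Return True when the server version is older than the target."""
    return server_version(kubectl_version_json) < target


# Hypothetical, trimmed output of `kubectl version -o json`.
SAMPLE = '{"serverVersion": {"gitVersion": "v1.31.4"}}'

if __name__ == "__main__":
    print(server_version(SAMPLE), needs_upgrade(SAMPLE))
```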

SUNK deployments

  • Existing SUNK deployments will automatically benefit from the upgrades
  • New deployments will use the latest versions by default
  • Test new features in non-production environments before enabling in production