July 11, 2025 - SUNK v6.6.0 release
SUNK v6.6.0 released with SCIM provisioning, dashboard links, improved node reconciliation, new compute definitions, metrics fixes, segment-calc improvements, and base image upgrades
SUNK v6.6.0 has been released with SCIM provisioning support, enhanced monitoring capabilities, improved node management, new compute definitions, and various infrastructure improvements.
Overview
SUNK v6.6.0 introduces SCIM provisioning via nsscache, enhanced monitoring with dashboard links, improved node reconciliation, new GPU compute definitions, metrics fixes, segment-calc script improvements, and base image upgrades.
Key changes
SCIM provisioning via nsscache
SCIM provisioning for SUNK is now available via nsscache, enabling automated user and group management from your Identity Provider (IdP) to CoreWeave clusters.
Configuration requirements
The suggested SCIM settings can be found in the Slurm chart's values-cw.yaml, under nsscache.nsscacheConfig.
Required changes from the current defaults in the values.yaml files:
- nsscache does not use SSSD, so the SSSD container is best disabled:
  - In the Slurm chart: sssdContainer.enabled: false
  - In the Slurm-login chart: change to directoryCache.source: nsscache
- nsscache.existingSecret must reference an existing secret that holds the auth token:
  - The token is stored under the nsscache-scim-auth-token key inside the secret
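The overrides listed above might look roughly like the following in your own values files. This is a sketch under stated assumptions, not the contents of values-cw.yaml: the file layout and the secret name scim-token are placeholders, and it assumes nsscache.existingSecret takes the name of a Kubernetes Secret. Any further nsscache.nsscacheConfig settings should be taken from the suggested values in values-cw.yaml.

```yaml
# Two separate values files are shown in one block, split by "---".

# Overrides for the Slurm chart
sssdContainer:
  enabled: false              # nsscache does not use SSSD
nsscache:
  existingSecret: scim-token  # placeholder; assumed to be the Secret's name
---
# Overrides for the Slurm-login chart
directoryCache:
  source: nsscache
```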
Dashboard links in Slurm output
Dashboard links can now be added to Slurm output, providing direct access to monitoring dashboards for jobs and nodes.
Improved node reconciliation
Enhanced reconcile logic for nodes that are in a reservation, improving cluster stability and resource management.
New compute definitions
Added new GPU compute definitions:
- rtxp6000-8x: NVIDIA RTX Pro 6000 Blackwell Server Edition
- gb300-4x: NVIDIA GB300
Metrics improvements
- Fixed adding the slurm_cluster label on VMPodScrape metrics: resolved label consistency issues
- Added scraping of MySQL metrics via VMPodScrape: enhanced monitoring coverage for database components
Segment-calc script improvements
Improved segment-calc script for visualizing block scheduling, providing better insights into cluster resource allocation and utilization patterns.
Base image upgrades
Upgraded nccl-test base images for improved performance and compatibility with the latest NVIDIA software stack.
Configuration changes
SCIM setup
To enable SCIM provisioning:
- Set sssdContainer.enabled: false in the Slurm chart
- Change directoryCache.source: nsscache in the Slurm-login chart
- Create a secret with the nsscache-scim-auth-token key containing your SCIM authentication token
- Configure nsscache.existingSecret to reference your secret
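For the secret itself, a minimal Kubernetes manifest could look like the sketch below. Only the nsscache-scim-auth-token key comes from these release notes. The secret name, the namespace, and the token value are placeholders, and the manifest assumes the secret lives in the same namespace as your Slurm release.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: scim-token     # placeholder; should match nsscache.existingSecret
  namespace: slurm     # placeholder; use the namespace of your Slurm release
type: Opaque
stringData:
  # Key name required by the nsscache integration; value is your IdP-issued token
  nsscache-scim-auth-token: "<your SCIM authentication token>"
```

Using stringData lets you supply the token as plain text, and Kubernetes stores it base64-encoded under data. Avoid committing a manifest that contains a real token to version control.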
Compute definitions
The new compute definitions are automatically available and can be used in job submissions and node configurations.
Migration notes
Existing SUNK deployments will continue to work, but you may want to:
- Review your SSSD configuration if you plan to use SCIM provisioning
- Test the new compute definitions in a non-production environment
- Update any custom monitoring configurations to take advantage of the new metrics
- Verify your SCIM configuration with the new nsscache integration
Additional resources
For detailed information about configuring and using SUNK v6.6.0, see: