August 14, 2025 - SUNK v6.7.0 release

SUNK v6.7.0 released with CUDA 12.9 support, SCIM and nsscache updates, HDF5 plugin support, and various bug fixes including directory service integration and Slurm improvements

SUNK v6.7.0 has been released with CUDA 12.9 support, enhanced SCIM and nsscache functionality, HDF5 plugin support, and various bug fixes for improved system stability and performance.

Overview

CUDA 12.9 support

CUDA 12.9 is now supported in the compute node definition.

SCIM and nsscache updates

  • SCIM filtering: New configuration options are now available for SCIM user and group filtering, via nsscache.nsscacheConfig.default.scim_users_parameters.

  • Shadow map for SCIM: nsscache for SCIM will now create the shadow map by default.

  • Home directory override: nsscache can now override the home directory. Use the setting that matches your authentication method:

    • For SCIM: nsscache.nsscacheConfig.passwd.scim_override_home_directory
    • For LDAP: nsscache.nsscacheConfig.passwd.ldap_override_home_dir
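The dotted names above read as paths into the chart values. The following is a minimal sketch, assuming they map onto nested keys in the usual Helm fashion. The helper set_path and the placeholder values are illustrative only; these notes do not state the accepted value types or formats.

```python
import json


def set_path(values: dict, dotted: str, value) -> None:
    """Store value under a dotted path, creating nested dicts as needed."""
    *parents, leaf = dotted.split(".")
    node = values
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


# Placeholder values: replace them with whatever the chart actually expects.
overrides = {}
set_path(
    overrides,
    "nsscache.nsscacheConfig.default.scim_users_parameters",
    "<filter-parameters>",
)
set_path(
    overrides,
    "nsscache.nsscacheConfig.passwd.scim_override_home_directory",
    "<home-directory-setting>",
)

# Print the nested structure that the dotted paths describe.
print(json.dumps(overrides, indent=2))
```

With LDAP authentication, the second path would instead end in passwd.ldap_override_home_dir.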

CronJob scheduling

The default schedule for the nsscache CronJob has been changed to run every minute (* * * * *).
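For readers less familiar with cron syntax, this small sketch names the five fields of the new schedule. The function describe_schedule is a hypothetical helper, not part of SUNK, and it only splits the expression rather than validating it.

```python
CRON_FIELDS = ("minute", "hour", "day of month", "month", "day of week")


def describe_schedule(expression: str) -> dict:
    """Pair each field of a five-field cron expression with its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


fields = describe_schedule("* * * * *")
for name, value in fields.items():
    print(f"{name}: {value}")

# A wildcard in every field means the job is due every minute.
every_minute = all(value == "*" for value in fields.values())
```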

HDF5 plugin support

Support for the HDF5 plugin is now included, allowing for advanced data handling capabilities.

Bug fixes

  • Directory Service Integration: sudoGroups mounts are now enabled when nsscache is configured.

  • Correct Gres Type for NVIDIA RTX PRO 6000: Fixed the gres type in the rtxp6000-8x compute definition to ensure proper GPU detection.

  • Slurm Fixes for TaskProlog: Backported Slurm fixes that address errors when using TaskProlog.

  • Node Locking Improvements: Polling for pod information now continues within the context timeout, which prevents errors with node locking.

  • Switch to bitnamilegacy: Switched to bitnamilegacy images for the following resources, per Bitnami's recommendations:

Impact

During upgrades, MySQL may briefly enter an Error state. While the MySQL pod is down, components that depend on it, such as the Slurm accounting and controller pods, may have trouble starting. Depending on how long MySQL takes to come back up, this could also trigger a cascading crash. The problem should resolve once the bitnamilegacy image is pulled. If the pull is fast enough, none of these issues will be visible.
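If you script your upgrades, one way to ride out this window is to poll a readiness check until it succeeds or a deadline passes. The sketch below shows that bounded-polling pattern in general terms. The function wait_until and the simulated check are assumptions for illustration; a real check would query your cluster for the MySQL pod status, and nothing here reflects how SUNK itself is implemented.

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout: float,
    interval: float = 1.0,
) -> bool:
    """Poll check() until it returns True or the timeout elapses.

    Returns True if the check succeeded, False if the deadline passed first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))


# Simulated readiness check: reports "not ready" twice, then "ready".
# In practice this would be replaced by a real pod status lookup.
responses = iter([False, False, True])
ready = wait_until(lambda: next(responses), timeout=5.0, interval=0.01)
print("MySQL ready:", ready)
```

Using a monotonic clock keeps the deadline stable even if the system clock is adjusted while the upgrade is running.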