Manage users in SUNK
Enable automatic user provisioning for SUNK with nsscache
SUNK User Provisioning (SUP) uses nsscache, a lightweight directory service, to manage users across CoreWeave clusters. It supports two directory protocols: SCIM and LDAP. We recommend SCIM, as it enables automated user and group synchronization from your upstream identity provider (IdP) to SUNK. When a user is added or removed in your IdP, the change is reflected in your SUNK cluster within minutes, so access stays accurate, secure, and up to date.
This implementation replaces the previous directory service based on SSSD, securing access and improving reliability. Whether your users are managed in a third-party IdP or directly in CoreWeave IAM, SUP automatically provisions, updates, and removes POSIX users in your cluster without manual intervention.
This guide demonstrates how to configure nsscache to support SUP with the following steps:
- Set up SUNK User Provisioning (SUP).
- Create user groups with CoreWeave IAM or your upstream IdP.
- Create a Kubernetes Secret and add it to the slurm chart.
Prerequisites
Before making any changes to the nsscache configuration, you must first do the following:
- Set up SUP. SUP is required to provision cluster access to users, whether you are using a federated IdP or not.
- Create user groups with CoreWeave IAM or your upstream IdP. SUP provisions access to groups of users, rather than individual users. These groups must be created before configuring nsscache, and the group names must exactly match the group names specified in the nsscache configuration.
Configure a Kubernetes Secret for SUP
You will need to create a Kubernetes Secret that contains the configuration for your directory service, then add the Secret to the slurm chart's values.yaml file, as detailed below.
Create a Kubernetes Secret
Create a Kubernetes Secret that contains the configuration for your directory service.
CoreWeave strongly discourages including sensitive information, such as plaintext user credentials, directly in your values.yaml configuration. Instead, use a Kubernetes Secret for added security and manageability. CoreWeave provides encryption at rest for etcd data, which includes Secrets.
The Secret holds either a SCIM token or an LDAP service account password. Be sure to follow the appropriate naming conventions detailed below when creating your Secret.
Naming conventions for SCIM Secrets
For a SCIM Secret, the suggested Secret name is <release-name>-nsscache-scim-secret.
Inside the Secret, the key under data must be named nsscache-scim-auth-token, as shown in the example below. This contains the base64-encoded SCIM token.
    apiVersion: v1
    kind: Secret
    metadata:
      name: scim-auth-token
    data:
      nsscache-scim-auth-token: <base64-encoded-scim-token>
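The value under data must be base64-encoded. As a minimal sketch, assuming a POSIX shell with base64 and kubectl available, the first helper below encodes a raw token for use in a manifest. The second creates the Secret directly with kubectl create secret generic, which base64-encodes literal values itself. The function names, the placeholder token, and the namespace argument are hypothetical and not part of SUNK.

```shell
# Encode a raw value the way the `data` field of a Secret expects it.
# printf avoids the trailing newline that echo would add to the token.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Create the SCIM Secret without writing a manifest (hypothetical helper).
# Usage: create_scim_secret <secret-name> <namespace> <raw-token>
create_scim_secret() {
  kubectl create secret generic "$1" \
    --namespace "$2" \
    --from-literal=nsscache-scim-auth-token="$3"
}

# Example: print the encoded form of a placeholder token.
encode_secret_value 'example-token'
```

Passing the raw token to kubectl keeps the encoding step out of your hands, which avoids the common mistake of encoding a token together with a trailing newline.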
Naming conventions for LDAP Secrets
For an LDAP Secret, the suggested Secret name is <release-name>-nsscache-ldap-secret.
Inside the Secret, the key under data must be named nsscache-ldap-password. This contains the base64-encoded password for the LDAP service.
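For comparison with the SCIM example, the following sketch prints an LDAP Secret manifest. It assumes the same Secret layout with only the key name changed; the helper name render_ldap_secret and the example arguments are hypothetical.

```shell
# Print a Secret manifest for the LDAP bind password.
# Usage: render_ldap_secret <secret-name> <raw-password>
render_ldap_secret() {
  # Base64-encode without a trailing newline or line wrapping.
  encoded=$(printf '%s' "$2" | base64 | tr -d '\n')
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: $1
data:
  nsscache-ldap-password: $encoded
EOF
}

# Example: review the manifest, then pipe it to `kubectl apply -f -`.
render_ldap_secret nsscache-ldap-secret 'example-password'
```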
Update the slurm chart with your Kubernetes Secret
After creating the Secret, you will need to update the slurm chart's values.yaml file.
Edit the nsscache.existingSecret parameter with the name of your Secret, as shown in the examples below.
Provision cluster access to groups of users
SUP provisions cluster access to groups of users, rather than individual users. You must create the groups in CoreWeave IAM or your upstream IdP before configuring nsscache. The names of your created user groups must exactly match the group names specified in the nsscache configuration.
    sssdContainer:
      enabled: false
    # You may remove the `directoryService` key when using nsscache
    directoryService:
    nsscache:
      enabled: true
      existingSecret: scim-auth-token
      sudoGroups:
        - slurm-admins
      nsscacheConfig:
        default:
          source: scim
          scim_base_url: https://api.coreweave.com/scim/<org>
          scim_users_parameters: filter=active eq "true"&groups=slurm-users,slurm-admins
          scim_groups_parameters: excludeInactiveUsers=true&includeVirtualUserGroups=slurm-users,slurm-admins
The suggested nsscache configuration for SCIM is set by default in the values.yaml file. For a full list of configuration options, see the SCIM parameter reference.
The following parameters are used to configure the groups that will be provisioned:
- sudoGroups specifies the user groups that can run sudo commands on the nodes.
- scim_users_endpoint specifies the SCIM endpoint path for retrieving user data. The default value is Users.
- scim_groups_endpoint specifies the SCIM endpoint path for retrieving group data. The default value is Groups.
- scim_users_parameters specifies which users are provisioned: users in the listed groups receive a user ID in the cluster. By default, this filters out inactive users with filter=active eq "true".
- scim_groups_parameters specifies the groups to be provisioned. By default, this filters out inactive users with excludeInactiveUsers=true. If no user groups are specified, all groups are provisioned.
Filter specific user groups
scim_groups_parameters and scim_users_parameters allow optional parameters to be added to the groups and users endpoints, respectively. Special characters (spaces, quotes, etc.) will be automatically URL encoded.
To provision access to specific user groups, list the group names in both the scim_users_parameters and scim_groups_parameters parameters, as follows:
    scim_users_parameters: filter=active eq "true"&groups=slurm-users,slurm-admins
    scim_groups_parameters: excludeInactiveUsers=true&includeVirtualUserGroups=slurm-users,slurm-admins
The above example will do the following:
- Filter out inactive users.
- Provision access to the users in the slurm-users and slurm-admins groups.
- Provision the virtual user groups slurm-users and slurm-admins.
You must specify the group names exactly as they appear in your IdP, and list them in both the scim_users_parameters and scim_groups_parameters parameters to ensure that the users and groups are provisioned correctly.
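Because the same group list has to appear in both parameters, one way to keep them in sync is to generate both lines from a single value. The sketch below does this; the helper name scim_group_params is hypothetical, and the query strings follow the example above.

```shell
# Build both SCIM parameter strings from a single comma-separated group list,
# so the users filter and the groups filter cannot drift apart.
# Usage: scim_group_params <group1,group2,...>
scim_group_params() {
  groups_csv=$1
  printf 'scim_users_parameters: filter=active eq "true"&groups=%s\n' "$groups_csv"
  printf 'scim_groups_parameters: excludeInactiveUsers=true&includeVirtualUserGroups=%s\n' "$groups_csv"
}

# Example: print the two lines for the groups used in this guide.
scim_group_params 'slurm-users,slurm-admins'
```

The output can be pasted under nsscacheConfig.default in values.yaml.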
Example: Authentik LDAP values
    sssdContainer:
      enabled: false
    # Remove the `directoryService` key
    directoryService:
    nsscache:
      enabled: true
      existingSecret: nsscache-ldap-secret
      nsscacheConfig:
        default:
          source: ldap
          ldap_uri: ldap://authentik-outpost-ldap-outpost
          ldap_base: dc=coreweave,dc=cloud
          ldap_bind_dn: cn=ldapsvc,dc=coreweave,dc=cloud
          ldap_bind_password:
          ldap_rfc2307bis: 1
          ldap_default_shell: /bin/bash
        passwd:
          ldap_filter: (objectClass=user)
          ldap_override_home_dir: /mnt/home/%%u
        group:
          ldap_filter: (objectClass=group)
        shadow:
          ldap_filter: (objectClass=user)
        sshkey:
          ldap_filter: (objectClass=user)
Verify and troubleshoot nsscache
You do not need to manually sync data with nsscache. It takes about two minutes for data to sync from an identity provider to SUNK through nsscache. If you encounter an issue with user data not being available, wait a few minutes and check again.
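Rather than checking by hand, you could poll until a newly added user becomes resolvable. The sketch below assumes it runs inside a login pod where getent is available; the helper name wait_for_user and the default timeout and interval are illustrative choices, not SUNK settings.

```shell
# Poll until a user is resolvable through NSS, or give up after a timeout.
# Usage: wait_for_user <username> [timeout-seconds] [interval-seconds]
wait_for_user() {
  user=$1
  timeout=${2:-300}
  interval=${3:-15}
  elapsed=0
  while ! getent passwd "$user" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "user '$user' not found after ${timeout}s" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "user '$user' is available"
}
```

For example, wait_for_user alice 300 15 checks every 15 seconds for up to five minutes, where alice is a placeholder username.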
To validate that nsscache is working:

- Log in to the Login pod for your cluster.

      $ kubectl exec -it <LOGIN-POD> -c sshd -- /bin/bash

- Access the /etc/nsscache directory and list the files within:

      $ cd /etc/nsscache && ls

  This should return the following files:

      group.cache passwd.cache shadow.cache sshkey.cache

- Check the contents of the group.cache, passwd.cache, shadow.cache, and sshkey.cache files directly for information about the users and groups in your directory, as shown with the cat command below:

      $ cat sshkey.cache

  Alternatively, use the getent command with the group, passwd, or shadow options to list the system users and groups in addition to the data pulled by nsscache, as shown below:

      $ getent passwd

  Note that the getent command does not directly retrieve SSH keys. To view the contents of the sshkey.cache file, you will need to use the cat command, as demonstrated above.
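The file check can be scripted as a quick sanity test. This is a minimal sketch, assuming the four cache files named above live in /etc/nsscache; the helper name check_nsscache_files is hypothetical.

```shell
# Report which of the expected nsscache files are present in a directory.
# Returns non-zero if any of them is missing.
# Usage: check_nsscache_files [cache-dir]
check_nsscache_files() {
  dir=${1:-/etc/nsscache}
  missing=0
  for name in group passwd shadow sshkey; do
    file="$dir/$name.cache"
    if [ -f "$file" ]; then
      echo "ok:      $file"
    else
      echo "missing: $file" >&2
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}
```

Run check_nsscache_files with no arguments inside the login pod. A non-zero exit status means at least one cache file has not been written yet, which is expected for the first couple of minutes after enabling nsscache.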
Migrate to nsscache from SSSD
As of SUNK v7.0.0, nsscache is the default directory service. The following steps apply only if you are running an earlier version of SUNK and are migrating to nsscache for your directory service.
Disable SSSD in the slurm and slurm-login charts
To enable nsscache, you must edit the slurm Helm chart. If using individual login pods, you must also edit the slurm-login Helm chart.
Update the slurm chart
In the slurm chart's values.yaml file, update the sssdContainer.enabled parameter to false and remove the directoryService section, as shown below:
    sssdContainer:
      enabled: false
    # Remove the `directoryService` key
    directoryService:
This disables SSSD, which is incompatible with nsscache.
You may remove the directoryCache.directoryService section in the values.yaml of the slurm chart, as nsscache does not use this configuration.
You must disable SSSD to use nsscache. For SUNK versions v6.x and below, SSSD is enabled by default.
Update the slurm-login chart
If using individual login pods, you will also need to edit the slurm-login chart's values.yaml file.
Set the directoryCache.source parameter to nsscache, as shown below:
    directoryCache:
      source: nsscache
      # Remove the `directoryService` key
      directoryService:
You may remove the directoryCache.directoryService section in the values.yaml of the slurm-login chart, as nsscache does not use this configuration.