Configure individual login nodes

Enable and configure individual Slurm login nodes

Running jobs and management tasks in the Slurm cluster requires connecting to a Slurm login node. Login nodes are configured on a per-user basis and deployed as a sub-chart of the Slurm chart. Set these configuration values in the values.yaml of the Slurm chart to define how slurm-login integrates and functions.

Manage individual login pods

Use the following features to manage individual login pods:

  • Enable the deployment of the sub-chart: To enable the slurm-login functionality as part of the Slurm chart, set the slurm-login.enabled parameter to true.

  • Reboot individual login pods: If an individual login pod is out of sync with the underlying StatefulSet, run the reboot command from within the pod. This command deletes and recreates the pod using the updated version. If the pod is out of sync, a Message of the Day (MOTD) will appear on SSH login with instructions to reboot:

    Example
    **********************************************************************
    *                                                                    *
    * The login statefulset has been updated, please restart your login  *
    * pod to get the latest changes.                                     *
    *                                                                    *
    * To restart the login pod, issue the command "reboot".              *
    **********************************************************************
  • Access individual login pods: For instructions on accessing each individual Slurm login pod and running Slurm jobs, refer to Connect to a Slurm login node.
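
As a minimal sketch of the first item in the list above, enabling the sub-chart from the Slurm chart's values.yaml could look like the fragment below. It assumes the slurm-login values are nested under a top-level slurm-login key, as the parameter name slurm-login.enabled suggests.

```yaml
# values.yaml of the Slurm chart (fragment)
slurm-login:
  # Deploys the slurm-login sub-chart alongside the Slurm chart.
  enabled: true
```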

Manage user identities and provision resources

The slurm-login.directoryCache parameter defines the directory service configuration used for managing user identities and provisioning resources. This parameter includes multiple sub-values, with the key sub-values detailed below.

Select all users from a specified group

slurm-login.directoryCache.selectGroups provides a list of user groups from which the slurm-login chart retrieves all associated users. The list acts as a filter with OR logic: a user is selected if they belong to at least one of the listed groups.
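
The fragment below sketches how such a filter might be written. The group names research and ml-platform are hypothetical placeholders, not groups defined anywhere in this document.

```yaml
slurm-login:
  directoryCache:
    # Hypothetical group names. A user who belongs to either group
    # (or to both) is selected; users in neither group are skipped.
    selectGroups:
      - research
      - ml-platform
```

With this list, a user who is a member of ml-platform only would still be selected, because membership in one listed group is enough.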

Define the polling interval for detecting changes to users and user groups

slurm-login.directoryCache.interval defines the polling interval for detecting changes to users and user groups. It determines how frequently changes are picked up and the corresponding resources are modified.
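
As a small sketch, a one-minute polling interval could be set as follows. The 1m value matches the one used in this page's example configuration; whether other duration strings are accepted is not confirmed here.

```yaml
slurm-login:
  directoryCache:
    # Check the directory for user and group changes once per minute.
    interval: 1m
```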

Verify individual Slurm login resources

To verify the created resources, list the StatefulSets and Services as shown below.

Example
$ kubectl get sts
NAME                             READY   AGE
slurm-login                      1/1     20d
slurm-login-slurmuser1-7fcd49e   1/1     31d
slurm-login-slurmuser2-a19c153   1/1     31d
slurm-login-slurmuser3-4fe7f1d   1/1     31d
slurm-login-slurmuser4-4f705de   1/1     31d
slurm-login-slurmuser5-5e909b3   1/1     31d

Example
$ kubectl get svc
NAME                             TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
slurm-login                      LoadBalancer   10.96.46.53     <pending>     22:31408/TCP   35d
slurm-login-0                    LoadBalancer   10.96.146.233   <pending>     22:31276/TCP   35d
slurm-login-slurmuser1-7fcd49e   ClusterIP      10.96.61.246    <none>        22/TCP         32d
slurm-login-slurmuser2-a19c153   ClusterIP      10.96.160.110   <none>        22/TCP         32d
slurm-login-slurmuser3-4fe7f1d   ClusterIP      10.96.2.45      <none>        22/TCP         32d
slurm-login-slurmuser4-4f705de   ClusterIP      10.96.203.63    <none>        22/TCP         32d
slurm-login-slurmuser5-5e909b3   ClusterIP      10.96.184.148   <none>        22/TCP         32d

Optionally, to ensure that users only use their designated pods, you can disable the shared Slurm login StatefulSet and slurm-login Service.

List the directory services to be configured

slurm-login.directoryCache.directoryService.directories specifies a list of directory services to be configured. This is similar to the directoryService configuration in Slurm and can be duplicated or referenced using a YAML anchor for reuse.
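
The following sketch illustrates the YAML anchor approach. It assumes the Slurm chart reads its own directories from a top-level directoryService.directories key in the same values.yaml; that key path and the anchor name shared-directories are illustrative assumptions, so check them against your chart's values.

```yaml
# Assumed location of the Slurm chart's own directory configuration.
directoryService:
  directories: &shared-directories   # anchor defined once
    - name: google-example.com
      enabled: true
      ldapUri: ldaps://ldap.google.com:636
      schema: rfc2307bis

slurm-login:
  enabled: true
  directoryCache:
    directoryService:
      # The alias expands to the same list when the file is parsed.
      directories: *shared-directories
```

Anchors and aliases are resolved when the YAML file is parsed, so both keys end up with identical lists. The anchor must appear earlier in the file than any alias that refers to it.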

Example configuration

See an example of a typical configuration below:

Example
slurm-login:
  enabled: true
  directoryCache:
    selectGroups: ["group1", "group2"]
    interval: 1m
    directoryService:
      # Google Secure LDAP
      directories:
        - name: google-example.com
          enabled: true
          ldapUri: ldaps://ldap.google.com:636
          user:
            defaultShell: "/bin/bash"
            fallbackHomeDir: "/home/%u"
            overrideHomeDir: /mnt/nvme/home/%u
          ldapsCert: google-ldaps-cert
          schema: rfc2307bis

Parameter reference table

| Parameter | Description |
| --- | --- |
| slurm-login.enabled | Set this value to true in the Slurm chart to enable the slurm-login chart. |
| slurm-login.directoryCache | Defines the directory service configuration used for managing user identities and provisioning resources. The key sub-values include: |
| slurm-login.directoryCache.selectGroups | Provides a list of user groups that the slurm-login chart will use to retrieve all associated users. This acts as a filter, meaning only users belonging to any of the specified groups will be included. It uses an OR logic, so a user needs to be in at least one of the listed groups to be selected. |
| slurm-login.directoryCache.interval | Defines the polling interval for detecting changes to users and user groups. This interval determines how frequently updates are applied, and resources are modified accordingly. |
| slurm-login.directoryCache.directoryService.directories | Specifies a list of directory services to be configured. This is similar to the directoryService configuration in Slurm and can be duplicated or referenced using a YAML anchor for reuse. |

For extra customizations, refer to the full parameters list.