Connect to the Slurm Login Node

Connect to an individual Slurm login node using SSH

Running jobs and management tasks in the Slurm cluster requires connecting to the Slurm login node. You can access the login node through SSH or kubectl exec, depending on your directory service configuration. Connecting through SSH requires a directory service pre-configured for SSH access, while kubectl exec does not.

For information about initial setup of Slurm login nodes, see Configure Slurm individual login nodes.

Connect through SSH

Note

Accessing the login node through SSH requires a directory service with users configured for SSH access.

First, use the kubectl get svc slurm-login command to identify the login service's IP address or DNS record. The EXTERNAL-IP field in the command output contains the relevant IP address.

In the following example, the target IP address is 203.0.113.100:

Obtain the External IP address
$
kubectl get svc slurm-login
NAME          TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)   AGE
slurm-login   LoadBalancer   192.0.2.100   203.0.113.100   22/TCP    2d21h
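
If you only need the address itself, for example in a script, a jsonpath query can extract it directly. This is a minimal sketch, assuming the service is named slurm-login and has been assigned an external IP:

Obtain only the external IP address
$
kubectl get svc slurm-login -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
203.0.113.100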

Then, use SSH to log in with either the IP address or the DNS record created for the node.

Log in with SSH
$
ssh example-user@203.0.113.100
Welcome to Ubuntu 22.04.1 LTS (GNU/Linux 5.13.0-40-generic x86_64)
example-user@slurm-login-0:~$
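
If a DNS record has been created for the login service, you can use it in place of the IP address. The hostname below is a hypothetical example; substitute the record configured for your cluster:

Log in with a DNS record
$
ssh example-user@slurm-login.example.com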

You are now logged into the Slurm login node and can run Slurm commands.

Warning

SSH is the preferred method of access for Slurm login nodes. However, we do not recommend directly accessing Slurm compute nodes through SSH to run tasks. Bypassing Slurm can interfere with currently running jobs and may cause nodes to drain unintentionally, leading to temporary loss of resources. SSH to Slurm compute nodes should only be used for debugging existing jobs on those nodes.

Running Slurm commands

After logging in, you have access to all standard Slurm operations for submitting jobs and managing the cluster. SchedMD provides extensive documentation for Slurm commands, as well as handy printable cheat sheets.
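
For a quick view of the cluster, sinfo lists partitions and node states, and squeue shows pending and running jobs. The partition name in the sample output below is illustrative; yours will reflect your cluster's configuration:

Check the cluster state
$
sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
all*         up   infinite      6   idle slurm-cpu-epyc-[0-1],slurm-rtx4000-[0-3]
$
squeue
JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)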

To verify that the cluster is working, run a simple job. For example, print the hostname of 6 nodes, as shown below:

Example
$
srun -N 6 hostname
slurm-rtx4000-3
slurm-rtx4000-1
slurm-rtx4000-0
slurm-cpu-epyc-0
slurm-cpu-epyc-1
slurm-rtx4000-2
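
srun runs interactively and blocks until the job completes. For longer work, submit a batch script with sbatch instead. The following is a minimal sketch; the job name, node count, and output path are assumptions to adapt for your workload:

Submit a batch job
$
cat <<'EOF' > hello.sbatch
#!/bin/bash
# Minimal example; adjust the resources below for your cluster.
#SBATCH --job-name=hello
#SBATCH --nodes=2
#SBATCH --output=hello-%j.out
srun hostname
EOF
$
sbatch hello.sbatch
Submitted batch job 42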

Port forwarding

If no public IP address is allocated for the service, you can port-forward the login service to your local machine and then log in through the forwarded port. Note that kubectl port-forward blocks, so run it in a separate terminal or in the background.

Log in with port-forwarding
$
kubectl port-forward svc/slurm-login 10022:22 &
ssh example-user@localhost -p 10022
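
To avoid retyping the port and username, you can add a host alias to your SSH configuration. This is a minimal sketch; the alias name slurm-login-local is arbitrary:

Add an SSH config alias
$
cat >> ~/.ssh/config <<'EOF'
Host slurm-login-local
    # hypothetical alias for the forwarded login service
    HostName localhost
    Port 10022
    User example-user
EOF
$
ssh slurm-login-local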

Troubleshooting

In cases where SSH is not possible, you can use kubectl exec to access the Slurm login node as root. This method is useful for debugging and maintenance tasks.

Access the Slurm login node with kubectl exec
$
kubectl exec -it slurm-login-0 -c sshd -- bash
root@slurm-login-0:/tmp#
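
From inside the exec session you can run the same Slurm commands as over SSH. For example, scontrol ping verifies that the Slurm controller is reachable; the controller hostname in the sample output is hypothetical:

Verify the controller
root@slurm-login-0:/tmp# scontrol ping
Slurmctld(primary) at slurm-controller-0 is UP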