
Running jobs and management tasks in the Slurm cluster requires connecting to the Slurm login node. You can access the login node through SSH or kubectl exec, depending on your directory service configuration. Connecting through SSH requires a directory service pre-configured for SSH access, while kubectl exec does not. For information about initial setup of Slurm login nodes, see Configure Slurm individual login nodes.

Connect through SSH

Accessing the login node through SSH requires a directory service with users configured for SSH access.
First, use the kubectl get svc slurm-login command to identify the login service’s IP address or DNS record. The EXTERNAL-IP field in the command output contains the relevant IP address. In the following example, the target IP address is 203.0.113.100:
Obtain the External IP address
kubectl get svc slurm-login
You should see output similar to the following:
NAME          TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)   AGE
slurm-login   LoadBalancer   192.0.2.100      203.0.113.100    22/TCP    2d21h
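If you need the address in a script, one approach (a sketch, assuming the service name slurm-login and the LoadBalancer output shown above) is to pull the EXTERNAL-IP column directly:

```shell
# A hedged helper for scripting. With kubectl, jsonpath can read the
# address straight from the Service status (assumes a LoadBalancer
# Service named slurm-login):
#   kubectl get svc slurm-login -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
#
# The same column can also be extracted from plain `kubectl get svc`
# output with awk; sample output is inlined here for illustration.
sample='NAME          TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)   AGE
slurm-login   LoadBalancer   192.0.2.100      203.0.113.100    22/TCP    2d21h'
ip=$(printf '%s\n' "$sample" | awk 'NR==2 {print $4}')
echo "$ip"   # 203.0.113.100
```

In practice, replace the inlined sample with the real command, e.g. `ip=$(kubectl get svc slurm-login | awk 'NR==2 {print $4}')`.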
Then, use SSH to log in with either the IP address or the DNS record created for the node.
Log in with SSH
ssh example-user@203.0.113.100
You should see output similar to the following:
Welcome to Ubuntu 22.04.1 LTS (GNU/Linux 5.13.0-40-generic x86_64)

example-user@slurm-login-0:~$
You are now logged into the Slurm login node and can run Slurm commands.
SSH is the preferred method of access for Slurm login nodes. However, we do not recommend directly accessing Slurm compute nodes through SSH to run tasks. Bypassing Slurm can interfere with currently running jobs and may cause nodes to drain unintentionally, leading to temporary loss of resources. SSH to Slurm compute nodes should only be used for debugging existing jobs on the nodes.

Connect through port forwarding

If there is no public IP address allocated for the node, first port-forward the service with the kubectl port-forward command, then log in through SSH using the port-forwarded address. Each login pod has an associated headless service, allowing users to refer to the pod by name without specifying a Fully Qualified Domain Name (FQDN). To access an individual login pod with port-forwarding, use the kubectl port-forward and ssh commands, as demonstrated below:
Log in with port-forwarding
kubectl port-forward svc/slurm-login-slurmuser1 10022:22
ssh example-user@localhost -p 10022
The port-forwarding command in this example, kubectl port-forward svc/slurm-login-slurmuser1 10022:22, works as follows:
  • The kubectl port-forward command creates a port-forward.
  • svc/ specifies that the targeted resource is a Service.
  • slurm-login-slurmuser1 is the exact name of the targeted Kubernetes Service. Replace this value with the name used within your namespace.
  • 10022:22 defines the port mapping. In this case, it forwards traffic from local port 10022 to port 22 on the target Service.
The SSH command, ssh example-user@localhost -p 10022, then connects to the local port 10022. Due to the port-forwarding performed in the prior command, this traffic is sent to port 22 of the specified Kubernetes Service. You are now logged into the Slurm login node and can run Slurm commands.
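For repeated sessions, the port-forwarded endpoint can be captured in an SSH client configuration entry. The host alias slurm-login-local below is hypothetical, and the entry assumes the kubectl port-forward command above is running in another terminal:

```
# ~/.ssh/config — hypothetical alias for the port-forwarded login node
Host slurm-login-local
    HostName localhost
    Port 10022
    User example-user
```

With this entry in place, running ssh slurm-login-local connects through the forwarded port without retyping the user, host, and port each time.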

Run Slurm commands

After logging in, you will have access to all normal Slurm operations to submit jobs or manage the cluster. SchedMD provides extensive documentation for Slurm commands, along with some handy printable cheat sheets. To verify that the cluster is working, run a simple job. For example, print the hostname on 6 nodes, as shown below:
root@slurm-login-0:~# srun -N 6 hostname
slurm-rtx4000-3
slurm-rtx4000-1
slurm-rtx4000-0
slurm-cpu-epyc-0
slurm-cpu-epyc-1
slurm-rtx4000-2
If you run into errors such as “Invalid partition name specified” or “Invalid account or account/partition combination specified”, it is likely that you have not been added as a Slurm user. To add yourself, follow the steps below:
Add yourself as a Slurm user
sudo su
sacctmgr create user -i account=root adminlevel=admin name=YOUR_USERNAME
exit
If your Slurm cluster uses accounts other than root, run the command above for each account you need to be added to.
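Beyond interactive srun jobs, work is typically submitted as batch scripts with sbatch. The script below is a minimal sketch; the job name, node count, and output path are placeholder assumptions to adapt for your cluster:

```bash
#!/bin/bash
# minimal-job.sbatch — hypothetical minimal batch script
#SBATCH --job-name=hello        # placeholder job name
#SBATCH --nodes=2               # placeholder node count
#SBATCH --output=hello-%j.out   # %j expands to the Slurm job ID

# Run one task per allocated node, printing each node's hostname
srun hostname
```

Submit it with sbatch minimal-job.sbatch, then check the generated hello-&lt;jobid&gt;.out file for the results.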

Troubleshooting

For troubleshooting purposes in cases where SSH is not possible, kubectl exec can be used to access the Slurm login node as root. This method is useful for debugging and maintenance tasks.
Access the Slurm login node with kubectl exec
kubectl exec -it slurm-login-0 -c sshd -- bash
root@slurm-login-0:/tmp#
Last modified on April 20, 2026