Schedule flow
The following description explains how a Pod moves from submission to a running state under the SUNK Pod Scheduler. When a Pod is marked to schedule through the SUNK Pod Scheduler (.spec.schedulerName), the SUNK Pod Scheduler processes the Pod and attempts to schedule it. The SUNK Pod Scheduler validates the Pod to confirm the annotations can be passed to the Slurm Controller through RPC. It doesn’t validate that the values are correct, only that they can be passed. It also verifies that the resource requests are non-zero.
A validation error stops the scheduler from retrying the Pod, and the scheduler records an event on the Pod that states the reason.
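The validation step can be pictured with a small sketch. This is not the scheduler's actual code. It assumes that "can be passed through RPC" means annotation keys and values are plain strings, and that a container with no resource requests fails the non-zero check. The function names and the simplified quantity parser are hypothetical.

```python
import re

# Simplified matcher for Kubernetes-style quantities such as "500m" or "1Gi".
# Assumption: exponent forms like "1e3" are not handled in this sketch.
_QUANTITY = re.compile(r"^([0-9]*\.?[0-9]+)([a-zA-Z]*)$")


def is_nonzero_quantity(quantity: str) -> bool:
    """Return True when the numeric part of a quantity is greater than zero."""
    match = _QUANTITY.match(quantity.strip())
    if match is None:
        return False
    return float(match.group(1)) > 0


def validate_pod(pod: dict) -> list[str]:
    """Collect validation errors for a Pod manifest given as a dict.

    Hypothetical rules: annotations must be string pairs (passable as-is),
    and every container must request a non-zero amount of each resource.
    The annotation values themselves are not checked for correctness.
    """
    errors = []

    annotations = pod.get("metadata", {}).get("annotations", {})
    for key, value in annotations.items():
        if not isinstance(key, str) or not isinstance(value, str):
            errors.append(f"annotation {key!r} cannot be passed as a string")

    for container in pod.get("spec", {}).get("containers", []):
        name = container.get("name", "<unnamed>")
        requests = container.get("resources", {}).get("requests", {})
        if not requests:
            errors.append(f"container {name}: no resource requests")
        for resource, quantity in requests.items():
            if not is_nonzero_quantity(str(quantity)):
                errors.append(f"container {name}: {resource} request is zero or invalid")

    return errors
```

An empty list means the Pod would move on to scheduling. In this sketch, each returned string corresponds to the reason that would be reported in the event on the Pod.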
A Slurm job isn’t considered running until it completes the prolog stage and the placeholder job script starts. The script appends “: started” to SLURM_JOB_EXTRA when it starts. The SUNK Pod Scheduler uses Node locking to ensure Nodes are ready for the placeholder jobs. This process is similar to normal Slurm jobs, but with strict checking.
Pods can become stuck after binding due to other Kubernetes constraints, such as taints or required affinities on the Kubernetes Node, that the SUNK Pod Scheduler isn’t aware of.
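The started marker on SLURM_JOB_EXTRA can be illustrated with a short sketch. It assumes the marker is a literal suffix on the existing value. The helper names are hypothetical, and the real placeholder script may write and read the field differently.

```python
# Marker text as given in the description above.
STARTED_MARKER = ": started"


def mark_started(job_extra: str) -> str:
    """Return the job's extra field with the started marker appended."""
    return job_extra + STARTED_MARKER


def is_running(job_extra: str) -> bool:
    """Treat the job as running only once the marker is present at the end."""
    return job_extra.endswith(STARTED_MARKER)
```

Under this assumption, a job that has been allocated but is still in its prolog stage has no marker yet, so `is_running` returns False for it.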
Unschedule flow
The following description explains how the SUNK Pod Scheduler tears down a workload when either side initiates termination. The SUNK Pod Scheduler handles both Pod deletion and Slurm job cancellation flows, so you can stop the workload from either Kubernetes or Slurm. When the placeholder job receives a termination signal, it contacts the SUNK Pod Scheduler’s hook API and blocks until the Pod is deleted. This prevents Slurm from running another job before the Kubernetes Pod is fully deleted. If the Pod is still running just before KillWait is reached, the script places the Node into Drain. This prevents scheduling further workloads on the Node until you resolve the issue. After the KillWait timeout value is reached, Slurm forcibly terminates the job.
Because the SUNK Pod Scheduler validates terminationGracePeriodSeconds when scheduling Pods, the Node is unlikely to be drained as a result of the Pod taking too long to delete.
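The termination handling described above can be sketched as a wait loop with a deadline. This is an illustration only. The polling callback stands in for the blocking call to the hook API, the `drain_node` callback stands in for placing the Node into Drain, and the safety margin before KillWait is a hypothetical value.

```python
import time
from typing import Callable


def wait_for_pod_deletion(
    pod_deleted: Callable[[], bool],
    drain_node: Callable[[], None],
    kill_wait_seconds: float,
    safety_margin_seconds: float = 5.0,   # hypothetical margin before KillWait
    poll_interval_seconds: float = 1.0,   # hypothetical polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Block until the Pod is deleted or KillWait is about to expire.

    Returns True if the Pod was deleted in time. Otherwise it drains the
    Node and returns False, after which Slurm would forcibly end the job.
    """
    deadline = clock() + kill_wait_seconds - safety_margin_seconds
    while clock() < deadline:
        if pod_deleted():
            return True
        sleep(poll_interval_seconds)

    # Final check just before KillWait would be reached.
    if pod_deleted():
        return True
    drain_node()
    return False
```

Passing `clock` and `sleep` as parameters lets the loop be exercised with a fake clock instead of real waiting. In the common case the Pod is deleted well before the deadline and the Node is never drained.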