Manage resource binding with task plugins

Optimize performance by binding tasks to common resources within a node

Using task plugins, you can set Slurm parameters to bind a task to specific subsets of resources on a node, like CPU cores or GPUs. You can optimize the performance of a task in Slurm by selecting resources within a common Non-Uniform Memory Access (NUMA) node. Resources can access memory within their own NUMA node much more quickly than memory in a separate NUMA node, which can reduce data transfer latency and improve job performance.
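
Before choosing a binding strategy, it can help to inspect how a Compute node's CPUs, memory, and GPUs are distributed across NUMA nodes. The commands below are standard Linux and NVIDIA tools, not SUNK-specific:

Example
lscpu | grep -i numa    # NUMA node count and the CPU IDs in each node
nvidia-smi topo -m      # GPU-to-CPU affinity matrix, if GPUs are present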

This guide explains how to effectively manage resource binding in SUNK with task plugins.

Enable resource binding

To enable resource binding, you'll need to modify the TaskPlugin variable in the Slurm configuration section of the SUNK Helm chart.

In the slurmConfig section of the Slurm values.yaml file, set the TaskPlugin variable to task/affinity,task/cgroup, as shown below:

Example
slurmConfig:
  slurmCtld:
    TaskPlugin: "task/affinity,task/cgroup"

This enables the task/affinity and task/cgroup plugins, which work together to optimize resource allocations in the SUNK cluster. The task/affinity plugin controls how processes bind to CPU resources on a Compute node. The task/cgroup plugin uses the cgroup filesystem and its controllers to enforce the resource limits and binding policies specified by Slurm.
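
After applying the updated chart, you can verify that both plugins are active by querying the running configuration with scontrol, which is available wherever the Slurm client tools are installed:

Example
scontrol show config | grep -i taskplugin

If the change took effect, the output lists both task/affinity and task/cgroup.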

Configure the task cgroup plugin

SUNK supports Linux cgroups through the slurmConfig.cgroupConfig value, which populates Slurm's cgroup.conf file.

To use Linux cgroups in SUNK, do the following:

  1. Add the task/cgroup value to the TaskPlugin variable, as shown in the section above.
  2. In the slurmConfig section of the Slurm values.yaml, set the procTrackType variable to proctrack/cgroup. If this parameter is not set correctly, your Linux cgroups settings will not be applied. Both settings are combined in the sketch after this list.
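
Combined with the TaskPlugin setting from the previous section, the relevant part of values.yaml looks roughly like the following sketch. It assumes procTrackType sits alongside TaskPlugin under slurmCtld; check the placement against your chart version:

Example
slurmConfig:
  slurmCtld:
    TaskPlugin: "task/affinity,task/cgroup"
    procTrackType: "proctrack/cgroup"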

This will enable cgroups with the following settings:

Example
slurmConfig:
  cgroupConfig: |
    CgroupPlugin=autodetect
    IgnoreSystemd=yes
    ConstrainCores=yes
    ConstrainDevices=yes
    ConstrainRAMSpace=yes

The Constrain settings enforce binding and limits for different resources, as follows:

  • ConstrainCores=yes enforces CPU binding.
  • ConstrainDevices=yes enforces limits on GPU devices.
  • ConstrainRAMSpace=yes enforces memory limits.
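
One way to spot-check these constraints is to request a small allocation and inspect the limits a task actually receives. This sketch uses standard Linux interfaces:

Example
# Request 2 CPU cores and 1 GB of RAM, then print the CPUs the task
# may run on and the cgroup it was placed in.
srun --cpus-per-task=2 --mem=1G sh -c 'grep Cpus_allowed_list /proc/self/status; cat /proc/self/cgroup'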

Bind tasks to GPUs

Use the --gpu-bind parameter in your job script's #SBATCH directives to manage how tasks are assigned to GPUs, as shown below:

Example
#SBATCH --gpu-bind=verbose,single:1

To print information about which GPU resources each task is bound to, add the verbose option to your other binding options, separated by a comma. This can be helpful when debugging or checking your binding strategy.
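
Putting this together, a minimal batch script might look like the following sketch. The job name and task count are placeholders; adjust them to your workload:

Example
#!/bin/bash
#SBATCH --job-name=gpu-bind-demo        # hypothetical job name
#SBATCH --ntasks-per-node=8             # placeholder task count
#SBATCH --gpus-per-task=1
#SBATCH --gpu-bind=verbose,single:1     # one GPU per task; print each binding

# With ConstrainDevices=yes, each task should see exactly one GPU.
srun nvidia-smi -L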

Performance optimization

To optimize performance, use the --gpu-bind=single:1 option when starting your Slurm job. This assigns each task on a node a single GPU and places the task's CPU cores and GPU on the same NUMA node whenever possible. If you omit --gpu-bind, a task may be assigned a GPU with a different NUMA affinity than its allocated CPU cores, which can lead to suboptimal performance.

The --gpu-bind parameter supports multiple options, including:

  • none: No GPU binding; Slurm will not enforce any specific binding between tasks and GPUs. This option may be suitable if your application handles GPU selection internally.
  • single:<count>: Binds each task to exactly one GPU, chosen like closest so that the GPU and the task's CPU cores share a NUMA node where possible. <count> sets how many tasks share each GPU, so single:1 gives every task its own GPU.
  • closest: Attempts to bind a task to the GPUs that are "closest" to the CPU cores the task is running on, based on the system topology. This setting may assign multiple GPUs to a task if they share the same NUMA node, regardless of what you have specified in the --gpus-per-task setting.
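
For example, to let Slurm pick the topologically nearest GPUs for each task rather than enforcing one GPU per task, you could launch with closest; ./my_app is a placeholder for your application:

Example
srun --ntasks-per-node=2 --gpus-per-node=2 --gpu-bind=verbose,closest ./my_app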

For a complete list of available --gpu-bind options, see SchedMD's Slurm documentation.

Bind tasks to CPU cores

Use the --cpu-bind parameter in your job script's #SBATCH directives to control which CPU cores your tasks are bound to. For example:

Example
#SBATCH --cpu-bind=verbose,map_cpu:1

To print information about which CPU resources each task is bound to, add the verbose option to your other binding options, separated by a comma. This can be helpful when debugging or checking your binding strategy.
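
As a concrete sketch, the following two-task job pins task 0 to CPU 0 and task 1 to CPU 1, then has each task print the CPUs it is allowed to run on:

Example
#!/bin/bash
#SBATCH --ntasks=2

# map_cpu assigns CPU IDs to tasks by rank: task 0 -> CPU 0, task 1 -> CPU 1.
srun --cpu-bind=verbose,map_cpu:0,1 grep Cpus_allowed_list /proc/self/status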

The --cpu-bind parameter supports multiple options, including:

  • none or no: No CPU binding; the operating system will schedule tasks on any available CPU resources. This can lead to suboptimal resource usage, and is not generally recommended when performance is critical.
  • map_cpu:<list>: Allows you to provide a comma-separated list of CPU IDs to specify the exact CPUs on which to run your task.
  • mask_cpu:<list>: Functions similarly to map_cpu, but uses a hexadecimal CPU mask for even more granular control.
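
For example, the mask 0x3 selects CPUs 0 and 1, and 0xC selects CPUs 2 and 3; masks in the list are applied to tasks in rank order. The following sketch gives each of two tasks its own pair of cores, with ./my_app as a placeholder:

Example
srun --ntasks=2 --cpu-bind=verbose,mask_cpu:0x3,0xC ./my_app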

For a full list of available --cpu-bind options, see SchedMD's Slurm documentation.