Manage resource binding with task plugins

Optimize performance by binding tasks to common resources within a node

Using task plugins, you can set Slurm parameters to bind a task to specific subsets of resources on a node, like CPU cores or GPUs. You can optimize the performance of a task in Slurm by selecting resources within a common Non-Uniform Memory Access (NUMA) node. Resources can access memory within their own NUMA node much more quickly than memory in a separate NUMA node, which can reduce data transfer latency and improve job performance.
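
Before choosing a binding strategy, it can help to inspect how a Compute node's CPUs, memory, and GPUs are distributed across NUMA nodes. The commands below are standard Linux and NVIDIA tools, not SUNK-specific:

Example
lscpu | grep -i numa    # NUMA node count and the CPU IDs in each node
nvidia-smi topo -m      # GPU-to-CPU affinity matrix, if GPUs are present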

This guide explains how to effectively manage resource binding in SUNK with task plugins.

Enable resource binding

To enable resource binding, you'll need to modify the TaskPlugin variable in the Slurm configuration section of the SUNK Helm chart.

In the slurmConfig section of the Slurm values.yaml file, set the TaskPlugin variable to task/affinity,task/cgroup, as shown below:

Example
slurmConfig:
  slurmCtld:
    TaskPlugin: "task/affinity,task/cgroup"

This enables the task/affinity and task/cgroup plugins, which work together to optimize resource allocations in the SUNK cluster. The task/affinity plugin controls how processes bind to CPU resources on a Compute node. The task/cgroup plugin uses the cgroup filesystem and its controllers to enforce the resource limits and binding policies specified by Slurm.
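
After applying the updated chart, you can verify that both plugins are active by querying the running configuration with scontrol, which is available wherever the Slurm client tools are installed:

Example
scontrol show config | grep -i taskplugin

If the change took effect, the output lists both task/affinity and task/cgroup.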

Configure the task cgroup plugin

SUNK supports Linux cgroups through the slurmConfig.cgroupConfig value, which populates Slurm's cgroup.conf file.

To use Linux cgroups in SUNK, do the following:

  1. Add the task/cgroup value to the TaskPlugin variable, as shown in the section above.
  2. In the slurmConfig section of the Slurm values.yaml, set the procTrackType variable to proctrack/cgroup. If this parameter is not set correctly, your Linux cgroups settings will not be applied. Both settings are combined in the sketch after this list.
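
Combined with the TaskPlugin setting from the previous section, the relevant part of values.yaml looks roughly like the following sketch. It assumes procTrackType sits alongside TaskPlugin under slurmCtld; check the placement against your chart version:

Example
slurmConfig:
  slurmCtld:
    TaskPlugin: "task/affinity,task/cgroup"
    procTrackType: "proctrack/cgroup"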

This will enable cgroups with the following settings:

Example
slurmConfig:
  cgroupConfig: |
    CgroupPlugin=autodetect
    IgnoreSystemd=yes
    ConstrainCores=yes
    ConstrainDevices=yes
    ConstrainRAMSpace=yes

The Constrain settings enforce binding and limits for different resources, as follows:

  • ConstrainCores=yes enforces CPU binding.
  • ConstrainDevices=yes enforces limits on GPU devices.
  • ConstrainRAMSpace=yes enforces memory limits.
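
One way to spot-check these constraints is to request a small allocation and inspect the limits a task actually receives. This sketch uses standard Linux interfaces:

Example
# Request 2 CPU cores and 1 GB of RAM, then print the CPUs the task
# may run on and the cgroup it was placed in.
srun --cpus-per-task=2 --mem=1G sh -c 'grep Cpus_allowed_list /proc/self/status; cat /proc/self/cgroup'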

Bind tasks to GPUs

Use the --gpu-bind parameter in your job script's #SBATCH directives to manage how tasks are assigned to GPUs, as shown below:

Example
#SBATCH --gpu-bind=verbose,single:1

To print information about which GPU resources each task is bound to, add the verbose option to your other binding options, separated by a comma. This can be helpful when debugging or checking your binding strategy.
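
Putting this together, a minimal batch script might look like the following sketch. The job name and task count are placeholders; adjust them to your workload:

Example
#!/bin/bash
#SBATCH --job-name=gpu-bind-demo        # hypothetical job name
#SBATCH --ntasks-per-node=8             # placeholder task count
#SBATCH --gpus-per-task=1
#SBATCH --gpu-bind=verbose,single:1     # one GPU per task; print each binding

# With ConstrainDevices=yes, each task should see exactly one GPU.
srun nvidia-smi -L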

Performance optimization

To optimize performance, use the --gpu-bind=single:1 option when starting your Slurm job. This assigns each task on a node a single GPU and places the task's CPU cores and GPU on the same NUMA node whenever possible. If you omit --gpu-bind, a task may be assigned a GPU with a different NUMA affinity than its allocated CPU cores, which can lead to suboptimal performance.

The --gpu-bind parameter supports multiple options, including:

  • none: No GPU binding; Slurm will not enforce any specific binding between tasks and GPUs. This option may be suitable if your application handles GPU selection internally.
  • single:<count>: Binds each task to exactly one GPU, chosen like closest so that the GPU and the task's CPU cores share a NUMA node where possible. <count> sets how many tasks share each GPU, so single:1 gives every task its own GPU.
  • closest: Attempts to bind a task to the GPUs that are "closest" to the CPU cores the task is running on, based on the system topology. This setting may assign multiple GPUs to a task if they share the same NUMA node, regardless of what you have specified in the --gpus-per-task setting.
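
For example, to let Slurm pick the topologically nearest GPUs for each task rather than enforcing one GPU per task, you could launch with closest; ./my_app is a placeholder for your application:

Example
srun --ntasks-per-node=2 --gpus-per-node=2 --gpu-bind=verbose,closest ./my_app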

For a complete list of available --gpu-bind options, see SchedMD's Slurm documentation.

Bind tasks to CPU cores

Use the --cpu-bind parameter in your job script's #SBATCH directives to control which CPU cores your tasks are bound to. For example:

Example
#SBATCH --cpu-bind=verbose,map_cpu:1

To print information about which CPU resources each task is bound to, add the verbose option to your other binding options, separated by a comma. This can be helpful when debugging or checking your binding strategy.
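
As a concrete sketch, the following two-task job pins task 0 to CPU 0 and task 1 to CPU 1, then has each task print the CPUs it is allowed to run on:

Example
#!/bin/bash
#SBATCH --ntasks=2

# map_cpu assigns CPU IDs to tasks by rank: task 0 -> CPU 0, task 1 -> CPU 1.
srun --cpu-bind=verbose,map_cpu:0,1 grep Cpus_allowed_list /proc/self/status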

The --cpu-bind parameter supports multiple options, including:

  • none or no: No CPU binding; the operating system will schedule tasks on any available CPU resources. This can lead to suboptimal resource usage, and is not generally recommended when performance is critical.
  • map_cpu:<list>: Allows you to provide a comma-separated list of CPU IDs to specify the exact CPUs on which to run your task.
  • mask_cpu:<list>: Functions similarly to map_cpu, but uses a hexadecimal CPU mask for even more granular control.
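
For example, the mask 0x3 selects CPUs 0 and 1, and 0xC selects CPUs 2 and 3; masks in the list are applied to tasks in rank order. The following sketch gives each of two tasks its own pair of cores, with ./my_app as a placeholder:

Example
srun --ntasks=2 --cpu-bind=verbose,mask_cpu:0x3,0xC ./my_app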

For a full list of available --cpu-bind options, see SchedMD's Slurm documentation.