Manage resource binding with task plugins
Optimize performance by binding tasks to common resources within a node
Using task plugins, you can set Slurm parameters to bind a task to specific subsets of resources on a node, like CPU cores or GPUs. You can optimize the performance of a task in Slurm by selecting resources within a common Non-Uniform Memory Access (NUMA) node. Resources can access memory within their own NUMA node much more quickly than memory in a separate NUMA node, which can reduce data transfer latency and improve job performance.
This guide explains how to effectively manage resource binding in SUNK with task plugins.
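Before choosing a binding strategy, it can help to inspect the NUMA topology of a Compute node. The commands below are a minimal sketch, assuming a node image that includes the standard `numactl` and NVIDIA driver utilities; run them from an interactive Slurm session:

```bash
# Start an interactive shell on a Compute node
srun --pty bash

# List NUMA nodes and the CPU cores that belong to each
numactl --hardware

# Show GPU-to-CPU affinity and NUMA node assignments on NVIDIA nodes
nvidia-smi topo -m
```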
Enable resource binding
To enable resource binding, you'll need to modify the `TaskPlugin` variable in the Slurm configuration section of the SUNK Helm chart. In the `slurmConfig` section of the Slurm `values.yaml` file, set the `TaskPlugin` variable to `task/affinity,task/cgroup`, as shown below:

```yaml
slurmConfig:
  slurmCtld:
    TaskPlugin: "task/affinity,task/cgroup"
```
This enables the `task/affinity` and `task/cgroup` plugins, which work together to optimize resource allocations in the SUNK cluster. The `task/affinity` plugin controls how processes bind to CPU resources on a Compute node. The `task/cgroup` plugin uses the cgroup filesystem and its controllers to enforce the resource limits and binding policies specified by Slurm.
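As a quick check that the configuration took effect, you can query the active plugins from a login or Compute node; this is a hedged example, and the exact output keys may vary by Slurm version:

```bash
# Print the task and process-tracking plugins reported by the running controller
scontrol show config | grep -E 'TaskPlugin|ProctrackType'
```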
Configure the task cgroup plugin
SUNK supports Linux cgroups through the `cgroup.conf` value, `slurmConfig.cgroupConfig`.
To use Linux cgroups in SUNK, do the following:
- Add the `task/cgroup` value to the `TaskPlugin` variable, as shown in the section above.
- In the `slurmConfig` section of the Slurm `values.yaml`, set the `procTrackType` variable to `proctrack/cgroup`, as shown in the example after this list. If this parameter is not set correctly, your Linux cgroups settings will not be applied.
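The snippet below is a minimal sketch of these steps, assuming `procTrackType` sits alongside `TaskPlugin` under the `slurmCtld` block in your chart version; check your chart's values for the exact key placement:

```yaml
slurmConfig:
  slurmCtld:
    # Enable the affinity and cgroup task plugins (step 1)
    TaskPlugin: "task/affinity,task/cgroup"
    # Required for cgroup settings to be applied (step 2)
    procTrackType: "proctrack/cgroup"
```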
This will enable cgroups with the following settings:
```yaml
slurmConfig:
  cgroupConfig: |
    CgroupPlugin=autodetect
    IgnoreSystemd=yes
    ConstrainCores=yes
    ConstrainDevices=yes
    ConstrainRAMSpace=yes
```
The `Constrain` settings enforce binding and limits for different resources, as follows:

- `ConstrainCores=yes` enforces CPU binding.
- `ConstrainDevices=yes` enforces limits on GPU devices.
- `ConstrainRAMSpace=yes` enforces memory limits.
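For example, with `ConstrainRAMSpace=yes`, a job's memory request becomes a hard limit enforced through cgroups. The job script below is an illustrative sketch (the memory, CPU, and executable values are placeholders): a process in this job that tries to use more than 4 GB of RAM is stopped by the cgroup rather than taking memory away from other jobs on the node.

```bash
#!/bin/bash
#SBATCH --job-name=cgroup-limits-demo   # example job name
#SBATCH --cpus-per-task=4               # CPU binding enforced by ConstrainCores=yes
#SBATCH --mem=4G                        # enforced as a cgroup memory limit by ConstrainRAMSpace=yes

# The workload is confined to the cores and memory allocated above
srun ./my_workload                      # placeholder executable
```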
Bind tasks to GPUs
Use the `--gpu-bind` parameter in your job script's `#SBATCH` directives to manage how tasks are assigned to GPUs, as shown below:

```bash
#SBATCH --gpu-bind=single:1,verbose
```
To print information about which GPU resources each task is bound to, add the `verbose` option to your other binding options, separated by a comma. This can be helpful when debugging or checking your binding strategy.
To optimize performance, use the `--gpu-bind=single:1` option when starting your Slurm job. This ensures that each task on a node is assigned a single GPU, and that the CPU cores and GPU are on the same NUMA node. Matching CPU and GPU NUMA affinities is crucial for maximum performance, so use the appropriate parameters when launching tasks with Slurm. If you do not use the `--gpu-bind` parameter, your task could be assigned a GPU with a different NUMA affinity than the assigned CPU cores, which could lead to suboptimal performance.
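Putting this together, the script below is a minimal sketch of a multi-task GPU job that uses `--gpu-bind=single:1`; the node shape, task counts, and executable are placeholder values to adapt to your cluster:

```bash
#!/bin/bash
#SBATCH --job-name=gpu-numa-demo     # example job name
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8          # one task per GPU on an 8-GPU node (example)
#SBATCH --gpus-per-node=8            # placeholder GPU count
#SBATCH --cpus-per-task=16           # example CPU allocation per task
#SBATCH --gpu-bind=single:1          # bind each task to one GPU with matching NUMA affinity

srun ./my_gpu_workload               # placeholder executable
```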
The `--gpu-bind` parameter supports multiple options, including:

| Option | Meaning |
| --- | --- |
| `none` | No GPU binding; Slurm will not enforce any specific binding between tasks and GPUs. This option may be suitable if your application handles GPU selection internally. |
| `single:<count>` | Ensures that a task receives the number of GPUs specified with `<count>`, and attempts to place the task on the same NUMA node as the CPU cores allocated to that task. |
| `closest` | Attempts to bind a task to the GPUs that are "closest" to the CPU cores the task is running on, based on the system topology. This setting may assign multiple GPUs to a task if they share the same NUMA node, regardless of what you have specified in the `--gpus-per-task` setting. |
For a complete list of available `--gpu-bind` options, see SchedMD's Slurm documentation.
Bind tasks to CPU cores
Use the `--cpu-bind` parameter in your job script's `#SBATCH` directives to control which CPU cores your tasks are bound to. For example:

```bash
#SBATCH --cpu-bind=map_cpu:1,verbose
```
To print information about which CPU resources each task is bound to, add the `verbose` option to your other binding options, separated by a comma. This can be helpful when debugging or checking your binding strategy.
The `--cpu-bind` parameter supports multiple options, including:

| Option | Meaning |
| --- | --- |
| `none` or `no` | No CPU binding; the operating system will schedule tasks on any available CPU resources. This can lead to suboptimal resource usage, and is not generally recommended when performance is critical. |
| `map_cpu:<list>` | Allows you to provide a comma-separated list of CPU IDs to specify the exact CPUs on which to run your task. |
| `mask_cpu:<list>` | Functions similarly to `map_cpu`, but uses a hexadecimal CPU mask for even more granular control. |
For a full list of available `--cpu-bind` options, see SchedMD's Slurm documentation.
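As a worked example of the `mask_cpu` form (with hypothetical masks and a placeholder executable), the hexadecimal mask `0xF` has bits 0 through 3 set and so pins the first task to cores 0-3, while `0xF0` pins the second task to cores 4-7:

```bash
# Launch two tasks, each pinned to an explicit set of cores via per-task hex masks
srun --ntasks=2 --cpu-bind=mask_cpu:0xF,0xF0 ./my_cpu_workload
```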