# Use Docker in SUNK

> Enable Docker-in-Docker inside SUNK clusters by configuring AppArmor, privileged pods, and s6 services

This guide explains how to use Docker inside a SUNK cluster with a technique known as Docker-in-Docker (DinD). This approach enables automated evaluations and lets you run benchmarks, [such as SWE-bench](/products/sunk/tutorials/swe-bench-in-sunk), within your SUNK cluster.

This guide is for cluster administrators who need to enable container-based workflows inside SUNK compute nodes. Before you begin, review the [known security risks](#known-security-risks) section so you understand the trade-offs this configuration introduces.

<Warning>
  The solution presented in this guide requires enabling privileged Pods and disabling the AppArmor profile recommended for use in SUNK. This process grants elevated kernel capabilities and weakens isolation guarantees. See the [known security risks](#known-security-risks) section for more details.

  You are responsible for verifying that third-party code is safe to execute alongside your other workloads.
</Warning>

## Enable Docker-in-Docker

To run Docker inside SUNK compute nodes, you must disable AppArmor and run privileged Pods. The following changes to the Slurm Helm chart install the Docker daemon on each compute node, mount persistent storage for Docker, and relax the security context so the daemon can operate.

Make these changes in the `compute` section of the `values.yaml` file for the `slurm` Helm chart:

1. Create an `s6` block to install and enable Docker.

2. Add a volume mount for Docker using the `volumeMounts` and `volumes` blocks.

3. Under `pyxis`, apply the following settings:

   * Set `pyxis.enabled` to `true`.
   * Set `pyxis.appArmorProfile` to `null` to disable the AppArmor profile.
   * Set `pyxis.podSecurityContext` to `null` to remove the pod security context.

4. Under `securityContext`, set `securityContext.privileged` to `true`.

   The result should resemble the following:

   ```yaml theme={"system"}
   compute:
     s6:
       docker:
         type: longrun
         script: |
           #!/bin/sh
           curl -fsSL https://get.docker.com | sh
           dockerd
     volumeMounts:
       - mountPath: /var/lib/docker
         name: docker-storage
     volumes:
       - name: docker-storage
         emptyDir: {}
     pyxis:
       enabled: true
       appArmorProfile: null
       podSecurityContext: null
     securityContext:
       privileged: true
   ```

5. Verify that the AppArmor profile is correctly disabled.

   Replace `[SLURMD-NODE-NAME]` with the name of your slurmd Pod.

   ```bash theme={"system"}
   kubectl describe po [SLURMD-NODE-NAME] | grep LocalhostProfile
   ```

   This should return no output. If it outputs `LocalhostProfile: profiles/enroot`, then the configuration isn't applied correctly.
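
To script the AppArmor check, for example in a post-deployment test, you can wrap it in a small helper. The following is a minimal sketch, not part of SUNK: the function name `apparmor_disabled` is hypothetical, and it assumes only that the `kubectl describe` output is piped to it on standard input.

```bash
# Hypothetical helper (not part of SUNK): reads `kubectl describe` output on
# stdin and succeeds only if no LocalhostProfile line is present.
apparmor_disabled() {
  if grep -q 'LocalhostProfile'; then
    echo "AppArmor profile is still applied" >&2
    return 1
  fi
  echo "AppArmor profile is not set"
}

# Usage (replace the placeholder with the name of your slurmd Pod):
# kubectl describe po [SLURMD-NODE-NAME] | apparmor_disabled
```

The helper returns a non-zero exit status when a profile such as `profiles/enroot` is still attached, so a calling script can stop before it schedules Docker workloads.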

## Grant users Docker access

With the Helm chart changes applied, Docker is available on the compute nodes, but only `root` can use it by default. Next, grant non-root users the permissions they need to run Docker commands.

Users who aren't running as `root` must belong to the `sudoers` group. If the SUNK cluster manages users with SUNK User Provisioning (SUP), set `nsscache.sudoGroups` to the groups that should have sudo privileges.
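
Once the group configuration is in place, a user can confirm from a compute node that they belong to a sudo-enabled group before they try Docker. The following is a minimal sketch under stated assumptions: the helper `has_sudo_group` is hypothetical, and the group names `slurm-sudo` and `additional-group` are the example values used in this guide, not defaults.

```bash
# Hypothetical helper: succeeds if any group in the space-separated list
# given as the first argument matches one of the remaining arguments.
has_sudo_group() {
  user_groups="$1"
  shift
  for allowed in "$@"; do
    for g in $user_groups; do
      [ "$g" = "$allowed" ] && return 0
    done
  done
  return 1
}

# Usage on a compute node (group names are examples):
# has_sudo_group "$(id -nG)" slurm-sudo additional-group \
#   && sudo docker run --rm hello-world
```

If the check fails, compare the output of `id -nG` with the configured sudo groups, because the names must match exactly.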

### SUP-based clusters

For SUP-based clusters, configure `nsscache:` to provision access through SCIM, and list the sudo-enabled groups under `sudoGroups:`. Group names must exactly match the group names in CoreWeave IAM or your upstream IdP. Refer to [the SUP documentation on creating user groups](/products/sunk/manage_sunk/manage_cluster_access/sunk_user_provisioning#create-sunk-user-groups).

```yaml theme={"system"}
nsscache:
  sudoGroups:
    - slurm-sudo
    - additional-group
```

### SSSD-based clusters

For SSSD-based clusters, set the `sudoGroups:` configuration in `directoryService:` to a list of the Unix groups, across all directories, that should have sudo privileges.

* For the default directory, don't fully qualify group names. For example, use `group1` instead of `group1@example.com`.
* For any additional directories, fully qualify group names.

```yaml theme={"system"}
directoryService:
  sudoGroups: ["slurm-sudo", "additional-group@example.com"]
```

## Known security risks

Enabling Docker-in-Docker requires loosening several of SUNK's default isolation guarantees. Before you apply the preceding configuration, review the risks in this section so you can decide whether the trade-offs are acceptable for your environment.

Kubernetes provides a list of [pod security standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/). The following sections outline the subset most relevant to SUNK, but consider all of the standards listed in the Kubernetes documentation when assessing risk.

The setup described in this guide:

* Runs Docker inside a Pod.
* Requires privileged mode.
* Disables AppArmor protections.

<Danger>
  This setup grants full access to the `/proc` and `/sys` directories, and allows kernel-level operations, including mounting, pivoting `root`, and process tracing.
</Danger>

### Item at risk: `cgroups`

`cgroups` handle management and monitoring of resources used by Slurm jobs. In privileged mode, users can bypass `cgroup` restrictions, reconfigure the resource limits set by Slurm, and move processes outside of their assigned `cgroups`.
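
When you audit this risk, one useful check is which `cgroup` a process currently belongs to. The following is a minimal sketch that assumes cgroup v2, where `/proc/<pid>/cgroup` holds a line of the form `0::/path`. The helper name `cgroup_path` is hypothetical, and the layout of the path that Slurm assigns isn't specified in this guide.

```bash
# Hypothetical helper: reads the content of /proc/<pid>/cgroup on stdin and
# prints the path of the unified (cgroup v2) hierarchy entry.
# Returns non-zero if no "0::" entry is found (for example, on cgroup v1).
cgroup_path() {
  while IFS= read -r line; do
    case "$line" in
      0::*)
        printf '%s\n' "${line#0::}"
        return 0
        ;;
    esac
  done
  return 1
}

# Usage inside a job step:
# cgroup_path < /proc/self/cgroup
```

If you compare this path at the start and end of a job, you can detect a process that has moved outside the `cgroup` it started in.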

### Item at risk: System and security

Disabling AppArmor on a privileged Pod removes additional security features, including protections against kernel exploits. This allows unrestricted access to the `/proc` and `/sys` directories. Without AppArmor protections in place, users can gain access to and tamper with other containers in the Pod, or access other devices outside the intended allocation.
