Run GitHub Actions Runners on SUNK
Deploy self-hosted GitHub Actions runners using Actions Runner Controller (ARC)
This guide shows you how to deploy self-hosted GitHub Actions runners on SUNK using Actions Runner Controller (ARC).
This documentation describes deploying ARC version 0.13.0, which uses the modern autoscaling architecture with AutoscalingRunnerSet resources. This is the recommended approach maintained by GitHub.
Prerequisites
- Access to a SUNK cluster with Kubernetes
- Cluster admin access or permissions to create namespaces and deploy applications
- A GitHub organization or repository where you want to register runners
- GitHub organization admin permissions (for creating GitHub Apps)
- For GPU runners: A SUNK cluster with the SUNK Scheduler deployed
Create a GitHub App
ARC authenticates to GitHub using a GitHub App. Follow these steps to create one:
- Navigate to your GitHub organization settings, select Developer settings, then GitHub Apps, and then select New GitHub App.
- Configure the GitHub App:
  - GitHub App name: Choose a name, for example, `arc-controller`.
  - Homepage URL: Use your organization URL.
  - Webhook: Uncheck "Active" (not needed for basic setup).
- Set the following permissions:

  Repository permissions

  | Permission | Access Level | Notes |
  |---|---|---|
  | Actions | Read | |
  | Administration | Read & write | Required for managing self-hosted runners. |
  | Metadata | Read | |

  Organization permissions

  | Permission | Access Level | Notes |
  |---|---|---|
  | Self-hosted runners | Read and write | Required for organization-level runners. |

- Select Create GitHub App.
- Note and save the App ID from the app details page.
- Generate and download a private key by going to the Private keys section and selecting Generate a private key. Save the downloaded `.pem` file.
- Install the GitHub App:
  - In the left sidebar, select Install App.
  - Select your organization.
  - Choose All repositories or specific repositories.
  - Note the Installation ID from the URL (for example, `https://github.com/organizations/YOUR_ORG/settings/installations/12345678`; the trailing number is your Installation ID).
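You will reuse the App ID, Installation ID, and private key when you create the Kubernetes secret in the next section. As an optional convenience, you can keep them in shell variables; the variable names, values, and key path below are placeholders for illustration only:

```bash
# Placeholder values: substitute your own App ID, Installation ID,
# and the path to the .pem file you downloaded above.
export GH_APP_ID="123456"
export GH_APP_INSTALLATION_ID="12345678"
export GH_APP_PRIVATE_KEY_PATH="./arc-controller.private-key.pem"

# Restrict permissions on the private key file.
chmod 600 "$GH_APP_PRIVATE_KEY_PATH"
```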
Deploy ARC Controller
Create the configuration files for deploying the ARC controller in your GitOps repository.
- Create the namespace manifest at `namespaces/actions-runner-system.yaml`:

  ```yaml
  # namespaces/actions-runner-system.yaml
  apiVersion: v1
  kind: Namespace
  metadata:
    name: actions-runner-system
  ```

- Deploy the manifest:

  ```bash
  $ kubectl apply -f namespaces/actions-runner-system.yaml
  ```

- Create an authentication Secret with your GitHub App credentials at `secrets/arc/controller-manager.yaml`:

  ```yaml
  # secrets/arc/controller-manager.yaml
  apiVersion: v1
  kind: Secret
  metadata:
    name: controller-manager
    namespace: actions-runner-system
  type: Opaque
  stringData:
    github_app_id: "YOUR_APP_ID"
    github_app_installation_id: "YOUR_INSTALLATION_ID"
    github_app_private_key: |
      YOUR_PRIVATE_KEY_CONTENT
  ```

  Replace the following with your values:
  - `YOUR_APP_ID` with your GitHub App ID
  - `YOUR_INSTALLATION_ID` with the installation ID from the URL
  - `YOUR_PRIVATE_KEY_CONTENT` with your actual private key content
- Create the Helm values file for the controller at `arc/controller-values.yaml`:

  ```yaml
  # arc/controller-values.yaml
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node.coreweave.cloud/class
                operator: In
                values:
                  - cpu
  resources:
    requests:
      cpu: 100m
      memory: 90Mi
    limits:
      cpu: 16
      memory: 64Gi
  ```

- Add the ARC controller to your ArgoCD Applications manifest:

  ```yaml
  # Apps.yaml
  arc-controller:
    enabled: true
    namespace: actions-runner-system
    clusters:
      - name: your-cluster-name
    source:
      repoURL: "ghcr.io/actions/actions-runner-controller-charts"
      chart: "gha-runner-scale-set-controller"
      targetRevision: "0.13.0"
      helm:
        releaseName: "arc-controller"
        valueFiles:
          - "arc/controller-values.yaml"
  ```

- Commit and push the changes to deploy the controller. Verify it's running:

  ```bash
  $ kubectl get pods -n actions-runner-system
  ```

  You should see the controller pod with status `Running`.
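If you prefer not to commit the private key to your GitOps repository, one alternative is to create the same Secret imperatively with `kubectl`. The sketch below reuses the placeholder variables from the earlier snippet; whichever approach you use, the key names must remain `github_app_id`, `github_app_installation_id`, and `github_app_private_key`:

```bash
# Creates the same Secret as secrets/arc/controller-manager.yaml,
# sourcing the private key directly from the downloaded .pem file.
$ kubectl create secret generic controller-manager \
    --namespace actions-runner-system \
    --from-literal=github_app_id="$GH_APP_ID" \
    --from-literal=github_app_installation_id="$GH_APP_INSTALLATION_ID" \
    --from-file=github_app_private_key="$GH_APP_PRIVATE_KEY_PATH"
```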
Configure Runner Scale Sets
You can deploy multiple runner scale sets for different workload types. Below are examples for CPU and GPU runners.
CPU Runners
- Create a values file for CPU runner scale sets at `arc/cpu-runner-values.yaml`:

  ```yaml
  # arc/cpu-runner-values.yaml
  githubConfigUrl: "https://github.com/YOUR_ORG"
  githubConfigSecret: "controller-manager"
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node.coreweave.cloud/class
                    operator: In
                    values:
                      - cpu
      containers:
        - name: runner
          image: ghcr.io/actions/actions-runner:latest
          command: ["/home/runner/run.sh"]
          resources:
            requests:
              cpu: "16"
              memory: 32Gi
            limits:
              cpu: "62"
              memory: 220Gi
  ```

  Replace the following with your values:
  - `YOUR_ORG` with your GitHub organization name.
  - For repository-level runners, use: `https://github.com/YOUR_USERNAME/YOUR_REPO`
  - For enterprise runners, use: `https://github.com/enterprises/YOUR_ENTERPRISE`
- Add the CPU runner scale set to your ArgoCD Applications:

  ```yaml
  # Apps.yaml
  arc-runner-cpu:
    enabled: true
    namespace: actions-runner-system
    clusters:
      - name: your-cluster-name
    source:
      repoURL: "ghcr.io/actions/actions-runner-controller-charts"
      chart: "gha-runner-scale-set"
      targetRevision: "0.13.0"
      helm:
        releaseName: "arc-runner-cpu"
        valueFiles:
          - "arc/cpu-runner-values.yaml"
  ```
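If you want to try a scale set outside of ArgoCD, a roughly equivalent direct Helm install looks like the sketch below. It assumes the same chart version and values file, with the OCI reference mirroring the `repoURL` and `chart` fields above:

```bash
# Sketch of a manual install equivalent to the ArgoCD Application above.
$ helm install arc-runner-cpu \
    oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set \
    --version 0.13.0 \
    --namespace actions-runner-system \
    -f arc/cpu-runner-values.yaml
```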
GPU Runners with SUNK Scheduler
For GPU workloads that need SUNK scheduler integration, you'll need to create both a pod template ConfigMap and a values file.
- Configure your SUNK scheduler name. Replace `YOUR_SCHEDULER_NAME` with your SUNK scheduler name. The default format is typically `<namespace>-<releaseName>-slurm-scheduler` (for example, `tenant-slurm-slurm-scheduler`).

  To find your scheduler name, run the following command:

  ```bash
  $ kubectl get pods -l app.kubernetes.io/name=sunk-scheduler -n tenant-slurm -o json | \
      jq -r '.items[0].spec.containers[] | select(.name=="scheduler") | .args[] | select(startswith("--scheduler-name=")) | sub("^--scheduler-name="; "")'
  ```

  For details on SUNK scheduler configuration and annotations, see Schedule Kubernetes Pods with Slurm.
- Create a pod template ConfigMap. ARC supports container hooks that allow you to customize the pod specification for job containers. Create a ConfigMap with pod templates at `arc/supporting/podtemplates.yaml`:

  ```yaml
  # arc/supporting/podtemplates.yaml
  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: arc-podtemplates
    namespace: actions-runner-system
  data:
    gpu.yaml: |
      metadata:
        annotations:
          # These options are specific to SUNK and need to be adjusted for your environment.
          sunk.coreweave.com/account: root
          sunk.coreweave.com/comment: github-actions
          sunk.coreweave.com/partition: hpc-high
      spec:
        nodeName: "" # This is not a typo, leave this a blank string
        schedulerName: YOUR_SCHEDULER_NAME # Replace with your scheduler's name
        nodeSelector:
          node.coreweave.cloud/class: gpu
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                    - key: gpu.nvidia.com/class
                      operator: In
                      values:
                        - H100 # Update to whatever GPU class you have in your cluster
        tolerations:
          - key: is_gpu_compute
            operator: Exists
        containers:
          - name: $job # This needs to be the literal string '$job'
            resources:
              requests:
                cpu: "16"
                memory: "32Gi"
                nvidia.com/gpu: "8"
              limits:
                cpu: "64"
                memory: "256Gi"
                nvidia.com/gpu: "8"
  ```
- Add this as a supporting application to deploy the ConfigMap:

  ```yaml
  # Apps.yaml
  arc-supporting:
    enabled: true
    namespace: actions-runner-system
    clusters:
      - name: your-cluster-name
    source:
      repoURL: "your-gitops-repo-url"
      path: arc/supporting
      targetRevision: "main"
      directory:
        recurse: true
    syncPolicy:
      automated:
        prune: true
        selfHeal: true
  ```
- Create a values file for GPU runners at `arc/gpu-runner-values.yaml`. This configuration uses ARC's container mode with Kubernetes container hooks to apply the SUNK scheduler pod template to job containers:

  ```yaml
  # arc/gpu-runner-values.yaml
  githubConfigUrl: "https://github.com/YOUR_ORG"
  githubConfigSecret: "controller-manager"
  containerMode:
    type: "kubernetes"
    kubernetesModeWorkVolumeClaim:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 10Gi # How much scratch space each runner will have by default
      storageClassName: shared-vast
  template:
    spec:
      containers:
        - name: runner
          image: ghcr.io/actions/actions-runner:latest
          command: ["/home/runner/run.sh"]
          env:
            - name: ACTIONS_RUNNER_CONTAINER_HOOKS
              value: /home/runner/k8s/index.js
            - name: ACTIONS_RUNNER_CONTAINER_HOOK_TEMPLATE
              value: /etc/arc/gpu.yaml
            - name: ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER
              value: "true"
          volumeMounts:
            - name: arc-podtemplates
              mountPath: /etc/arc
              readOnly: true
      volumes:
        - name: arc-podtemplates
          configMap:
            name: arc-podtemplates
  ```

  The container mode configuration enables the following:

  | Value | Description |
  |---|---|
  | `ACTIONS_RUNNER_CONTAINER_HOOKS` | Activates Kubernetes container hooks for the runner. |
  | `ACTIONS_RUNNER_CONTAINER_HOOK_TEMPLATE` | Points to the pod template that contains the SUNK scheduler configuration. |
  | `ACTIONS_RUNNER_REQUIRE_JOB_CONTAINER` | Ensures jobs run in containers with the pod template applied. This requires you to use the `container:` feature in your workflow; see Running jobs in a container. |
  | `kubernetesModeWorkVolumeClaim` | Provides persistent storage for job workspaces. |

  This allows GitHub Actions jobs to run as separate pods with SUNK scheduler annotations and GPU resources applied via the pod template.
- Add the GPU runner scale set to your ArgoCD Applications:

  ```yaml
  # Apps.yaml
  arc-runner-gpu:
    enabled: true
    namespace: actions-runner-system
    clusters:
      - name: your-cluster-name
    source:
      repoURL: "ghcr.io/actions/actions-runner-controller-charts"
      chart: "gha-runner-scale-set"
      targetRevision: "0.13.0"
      helm:
        releaseName: "arc-runner-gpu"
        valueFiles:
          - "arc/gpu-runner-values.yaml"
  ```

- Commit and push to deploy the runner scale sets.
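Before triggering a GPU workflow, it can help to confirm the supporting pieces are in place. A couple of quick checks (the object names assume the manifests above):

```bash
# The pod template ConfigMap should exist and contain the gpu.yaml key.
$ kubectl get configmap arc-podtemplates -n actions-runner-system -o yaml

# Both runner scale sets should be registered with the controller.
$ kubectl get autoscalingrunnersets -n actions-runner-system
```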
Use Runners in Workflows
Target your self-hosted runners in GitHub Actions workflows using the `runs-on` key with the runner scale set name.
The `runs-on` value must match your Helm release name:
- `arc-runner-cpu` corresponds to `helm install arc-runner-cpu ...`
- `arc-runner-gpu` corresponds to `helm install arc-runner-gpu ...`
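If you'd rather not tie workflow labels to Helm release names, the gha-runner-scale-set chart also accepts a `runnerScaleSetName` value that overrides the name runners register under. A minimal sketch, where `gpu-runners` is only an example label:

```yaml
# arc/gpu-runner-values.yaml (excerpt)
# When set, workflows reference this name in runs-on instead of the release name.
runnerScaleSetName: "gpu-runners"
```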
CPU runners:
```yaml
name: CPU Runner
on: [push]
jobs:
  sample:
    runs-on: arc-runner-cpu # Name of RunnerSet
    steps:
      - uses: actions/checkout@v5
      - run: echo "Running on CPU runner"
```
GPU runners:
```yaml
name: GPU Runner
on: [push]
jobs:
  sample:
    runs-on: arc-runner-gpu # Name of RunnerSet
    container:
      image: "nvidia/cuda:13.0.1-cudnn-devel-ubuntu24.04"
    steps:
      - uses: actions/checkout@v5
      - name: nvidia-smi
        run: |
          nvidia-smi
      - name: Install the latest version of uv
        uses: astral-sh/setup-uv@v7
      - name: Test PyTorch GPU Access
        if: always()
        run: |
          UV_TORCH_BACKEND=auto uv run --with torch python -c "import torch; print(f'PyTorch version: {torch.__version__}'); print(f'CUDA available: {torch.cuda.is_available()}'); print(f'CUDA device count: {torch.cuda.device_count()}'); [print(f'GPU {i}: {torch.cuda.get_device_name(i)}') for i in range(torch.cuda.device_count())] if torch.cuda.is_available() else print('No GPUs detected by PyTorch')"
```
Verify Installation
Check that the runner scale set is deployed:
```bash
$ kubectl get autoscalingrunnersets -n actions-runner-system
```
Check the listener pod:
```bash
$ kubectl get pods -n actions-runner-system -l app.kubernetes.io/name=gha-rs-listener
```
View listener logs to verify GitHub connection:
```bash
$ kubectl logs -n actions-runner-system -l app.kubernetes.io/name=gha-rs-listener --tail=20
```
When a workflow is triggered, you should see ephemeral runner pods created:
```bash
$ kubectl get pods -n actions-runner-system -w
```
For GPU runners, verify the SUNK scheduler is working:
```bash
$ kubectl get pods -n actions-runner-system -l app.kubernetes.io/component=runner -o yaml
```
Check that pods created by GPU workflows have the schedulerName set to your SUNK scheduler.
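For a quicker check than scanning full pod YAML, you can print each pod alongside the scheduler it was assigned to. The workflow job pods created by the container hook (rather than the runner pod itself) are the ones that should show your SUNK scheduler name; the command below is a generic kubectl sketch:

```bash
# Prints each pod in the namespace with its assigned scheduler.
$ kubectl get pods -n actions-runner-system \
    -o custom-columns=NAME:.metadata.name,SCHEDULER:.spec.schedulerName
```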
Scaling Configuration
Runner scale sets automatically scale based on workflow job demand:
| Setting | Value |
|---|---|
| Default minimum runners | 0 (no idle runners) |
| Default maximum runners | Unlimited |
| Scaling behavior | Runners are ephemeral and terminate after job completion |
To customize scaling limits, add to your values file:
```yaml
minRunners: 0
maxRunners: 10
```
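These are top-level keys in a runner scale set values file, alongside `githubConfigUrl`. A sketch for the CPU scale set, where the numbers are only examples:

```yaml
# arc/cpu-runner-values.yaml (excerpt)
githubConfigUrl: "https://github.com/YOUR_ORG"
githubConfigSecret: "controller-manager"
minRunners: 1   # keep one idle runner warm to reduce queue time
maxRunners: 10  # cap concurrent jobs for this scale set
```

After committing the change, you can confirm the limits on the AutoscalingRunnerSet resource, for example with `kubectl describe autoscalingrunnerset -n actions-runner-system`.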
Additional Resources
- Actions Runner Controller GitHub Repository
- GitHub Actions Documentation
- ARC Documentation
- Schedule Kubernetes Pods with Slurm - SUNK scheduler configuration and annotations