NodeSet resources, how to structure a GitOps repository with Helm charts and values files for SUNK and Slurm, and how to apply an app of apps pattern so a single ArgoCD Application manages the full deployment.
Use this guide if you maintain a SUNK environment and want declarative, repeatable cluster deployments that align with GitOps practices.
ArgoCD for application management
SUNK uses ArgoCD, a GitOps continuous delivery tool, to manage Kubernetes applications declaratively. ArgoCD provides a streamlined deployment workflow that aligns with the GitOps model. This automates the deployment process to maintain consistency across different environments. ArgoCD defines two main applications:
- SUNK Application: Manages the deployment of the SUNK cluster, including all necessary configurations and dependencies.
- Slurm Application: Handles the Slurm cluster deployment to manage and schedule compute resources properly.
- Persistent Volume Claims (PVCs) for storage.
- Prolog and epilog scripts for preparing and cleaning up compute nodes before and after job execution.
NodeSet sync customization
Update the ArgoCD configuration to sync NodeSet definitions. The resource customization feature lets ArgoCD sync the NodeSet spec in the same manner as PodSpec. Without this customization, ArgoCD doesn’t process NodeSet fields correctly, which can cause drift between the desired and actual state of the cluster.
The method for applying configuration changes varies depending on the cluster’s ArgoCD installation. The following sections describe two ways to apply the customization: edit the argocd-cm ConfigMap directly, or patch it with kubectl.
Apply configuration changes with a ConfigMap patch
To apply configuration changes with a ConfigMap patch, edit the data section of the ConfigMap.
To open the argocd-cm ConfigMap for editing, use the following command:
The kubectl edit command opens the entire ConfigMap and sends the entire modified YAML back to the API server, which replaces the existing ConfigMap with your modified version.
In the data section of the ConfigMap, add the following key-value pairs:
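As a rough sketch, the added entry could look like the following. It assumes ArgoCD's `resource.customizations.knownTypeFields.<group>_<Kind>` setting and assumes that the NodeSet resource belongs to the `sunk.coreweave.com` API group. Neither the key name nor the group is stated in this guide, so verify both against your cluster (for example, with `kubectl api-resources`) before applying.

```yaml
# Fragment of the argocd-cm ConfigMap; only the added key is shown.
data:
  # Key format: resource.customizations.knownTypeFields.<API group>_<Kind>
  # ASSUMPTION: sunk.coreweave.com is the API group of the NodeSet resource.
  resource.customizations.knownTypeFields.sunk.coreweave.com_NodeSet: |
    - field: spec.template.spec
      type: core/v1/PodSpec
```

The block scalar (`|`) keeps the value a plain string, which is what ConfigMap data entries must be.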
Apply configuration changes with kubectl
To apply configuration changes with kubectl, use the following command:
The kubectl patch command updates only the fields you specify and merges them with the existing resource.
The -p '...' flag specifies the patch content. This example uses the data, - field, and type parameters.
The "data":{...} parameter specifies the section of the ConfigMap to modify. The kubectl patch command only modifies the specified section.
" - field: spec.template.spec\n" and type: core/v1/PodSpec\n are the specific key-value pairs to add to the ConfigMap.
The --type=merge flag specifies the patch type as a JSON Merge Patch, which operates on the following logic:
- If a field exists in the patch, it replaces the existing field in the target object.
- If a field exists in the patch with a null value, it deletes the field from the target object.
- If a field does not exist in the patch, it remains unchanged in the target object.
- If you provide a list, it replaces the entire existing list with the one provided in the patch.
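The four rules above can be illustrated with a small worked example. The object and its key names are hypothetical and chosen only to exercise each rule; they are not part of the argocd-cm ConfigMap.

```yaml
# 1. Existing object on the server (hypothetical keys).
settings:
  keep: "a"
  replace: "old"
  remove: "x"
  items: ["one", "two", "three"]
---
# 2. Merge patch sent by the client.
settings:
  replace: "new"     # present in the patch: replaces the existing value
  remove: null       # null value: deletes the field
  items: ["four"]    # list: replaces the whole list, no element-wise merge
  # "keep" is absent from the patch, so it stays unchanged
---
# 3. Resulting object after the merge.
settings:
  keep: "a"
  replace: "new"
  items: ["four"]
```

Because lists are replaced wholesale, a merge patch that touches a list has to carry every element that should remain.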
Configuration in git (GitOps)
This section shows an example of how to keep a git repository synced to ArgoCD with Helm and the app of apps pattern. The following subsections walk through the Helm chart and values files for SUNK and Slurm, optional custom configurations, and the ArgoCD Application definitions that tie everything together.
Create a git repository with the contents described in the following sections.
SUNK
The following sections describe the SUNK Helm chart and values file used to manage the SUNK deployment in your GitOps repository.
SUNK Helm chart
sunk/Chart.yaml
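One common way to manage a vendor chart from a GitOps repository is a thin wrapper chart that declares the upstream chart as a dependency. The following is a minimal sketch under that assumption. The upstream chart name, the wrapper version, and the bracketed placeholders are illustrative and must be replaced with the values that apply to your environment.

```yaml
# sunk/Chart.yaml: a thin wrapper chart (sketch, not the official listing).
apiVersion: v2
name: sunk
description: Wrapper chart for the SUNK deployment
type: application
version: 0.1.0                 # version of this wrapper chart (hypothetical)
dependencies:
  - name: sunk                 # ASSUMPTION: name of the upstream chart
    version: "[CHART-VERSION]"            # placeholder
    repository: "[CHART-REPOSITORY-URL]"  # placeholder
```

With this layout, Helm expects values intended for the dependency to be nested under a top-level key that matches the dependency name.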
SUNK values file
Use the SUNK Values Reference to customize this file. The following shows an example sunk/values.yaml.
sunk/values.yaml
Slurm
The following sections describe the Slurm Helm chart and values file used to manage the Slurm deployment in your GitOps repository.
Slurm Helm chart
slurm/Chart.yaml
Slurm values file
Use the Slurm Values Reference to customize this file. The following shows an example slurm/values.yaml.
slurm/values.yaml
Optional: Custom configurations
This section shows an example of how to define custom Slurm deployment configurations and keep them synchronized with ArgoCD.
Slurm controller config
Use the following ConfigMap to customize the Slurm controller configurations.
slurm/templates/etc-slurmctld-configmap.yaml
Prolog and epilog
For more configuration options, see the Slurm Values Reference and Prolog and Epilog Scripts pages. The following is an example of a prolog ConfigMap, slurm/templates/prolog-configmap.yaml:
slurm/templates/prolog-configmap.yaml
The following is an example of an epilog ConfigMap, slurm/templates/epilog-configmap.yaml:
slurm/templates/epilog-configmap.yaml
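A ConfigMap template like this one can load every script in a chart directory with Helm's `.Files.Glob` and `.AsConfig` helpers. The following is a minimal sketch, assuming the scripts live under `scripts/epilog.d/` relative to the chart root. The ConfigMap name is hypothetical and must match whatever name your Slurm values reference.

```yaml
# slurm/templates/epilog-configmap.yaml (sketch)
apiVersion: v1
kind: ConfigMap
metadata:
  # Hypothetical name; align it with the name your Slurm values expect.
  name: {{ .Release.Name }}-epilog
data:
# Each file under scripts/epilog.d/ becomes one key (the file's base name)
# whose value is the file content.
{{ (.Files.Glob "scripts/epilog.d/*").AsConfig | indent 2 }}
```

Adding another script to the directory then requires no template change, only a new commit and sync.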
The following is an example epilog script, slurm/scripts/epilog.d/test.sh, to use with the preceding epilog ConfigMap:
slurm/scripts/epilog.d/test.sh
ArgoCD app of apps
This section shows an example of how to define multiple ArgoCD Application resources to manage SUNK and Slurm with the app of apps pattern and GitOps principles. The following subsections define the individual SUNK and Slurm Application resources, then a parent Application that references both.
SUNK app definition
The apps/sunk.yaml file describes where ArgoCD can find and synchronize the Helm manifests for SUNK. Replace the [REPO-URL] placeholder with your GitOps repository URL. Follow the ArgoCD Specs for more customization options.
apps/sunk.yaml
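As a minimal sketch, an Application for the SUNK chart might look like the following. The `argocd` namespace, the `default` project, and the `sunk` destination namespace are assumptions for illustration; only the `sunk` path and the [REPO-URL] placeholder come from this guide.

```yaml
# apps/sunk.yaml (sketch)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sunk
  namespace: argocd            # ASSUMPTION: namespace where ArgoCD runs
spec:
  project: default             # ASSUMPTION: ArgoCD project
  source:
    repoURL: "[REPO-URL]"      # replace with your GitOps repository URL
    targetRevision: HEAD
    path: sunk                 # directory containing Chart.yaml and values.yaml
    helm:
      valueFiles:
        - values.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: sunk            # hypothetical target namespace
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
```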
Slurm app definition
The apps/slurm.yaml file describes where ArgoCD can find and synchronize the Helm manifests for Slurm. Replace the [REPO-URL] placeholder with your GitOps repository URL. Follow the ArgoCD Specs for more customization options.
The spec.ignoreDifferences key contains recommended values to keep ArgoCD synchronized.
apps/slurm.yaml
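The following sketch shows the general shape of such an Application, including where `spec.ignoreDifferences` sits. The single `ignoreDifferences` entry is a placeholder that only illustrates the syntax; it is not the recommended set of values, which this guide's original listing defines. The namespaces and project name are also assumptions.

```yaml
# apps/slurm.yaml (sketch)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: slurm
  namespace: argocd            # ASSUMPTION: namespace where ArgoCD runs
spec:
  project: default             # ASSUMPTION: ArgoCD project
  source:
    repoURL: "[REPO-URL]"      # replace with your GitOps repository URL
    targetRevision: HEAD
    path: slurm
    helm:
      valueFiles:
        - values.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: slurm           # hypothetical target namespace
  ignoreDifferences:
    # PLACEHOLDER entry showing the syntax only: ignore a field that a
    # controller changes at runtime so it is not reported as drift.
    - group: apps
      kind: StatefulSet
      jsonPointers:
        - /spec/replicas
```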
App of apps definition
The app-of-apps.yaml file describes where ArgoCD can find and synchronize the custom Helm charts defined for SUNK and Slurm in the preceding sections. Replace the [REPO-URL] placeholder with your GitOps repository URL. Follow the ArgoCD Specs for more customization options.
app-of-apps.yaml
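In the app of apps pattern, the parent Application points at a directory of Application manifests rather than at a chart. A minimal sketch, assuming the child manifests live in the `apps` directory of the same repository and that ArgoCD runs in the `argocd` namespace, could look like this:

```yaml
# app-of-apps.yaml (sketch)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: app-of-apps            # hypothetical name
  namespace: argocd            # ASSUMPTION: namespace where ArgoCD runs
spec:
  project: default             # ASSUMPTION: ArgoCD project
  source:
    repoURL: "[REPO-URL]"      # replace with your GitOps repository URL
    targetRevision: HEAD
    path: apps                 # contains sunk.yaml and slurm.yaml
  destination:
    server: https://kubernetes.default.svc
    # Child Application objects are created in the ArgoCD namespace.
    namespace: argocd
```

Syncing this one Application creates or updates the SUNK and Slurm Applications, which in turn sync their own charts.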
Apply to ArgoCD
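After you follow the preceding steps, your GitOps repository should be structured as follows:
- app-of-apps.yaml
- apps/sunk.yaml
- apps/slurm.yaml
- sunk/Chart.yaml
- sunk/values.yaml
- slurm/Chart.yaml
- slurm/values.yaml
- slurm/templates/etc-slurmctld-configmap.yaml (optional)
- slurm/templates/prolog-configmap.yaml (optional)
- slurm/templates/epilog-configmap.yaml (optional)
- slurm/scripts/epilog.d/test.sh (optional)
Additional notes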
The following sections describe behaviors and recommendations to understand when operating SUNK and Slurm under ArgoCD. They cover how syncs affect running jobs, how to handle login node updates, and the lifecycle of the secret jobs created by the Slurm chart.
ArgoCD impact on Slurm jobs
ArgoCD syncs are job-safe. Syncing in Argo doesn’t affect running jobs in the cluster. The RollingUpdate strategy updates compute nodes, and you can configure the maximum percentage of nodes unavailable during an update with compute.maxUnavailable in the chart values. See the Slurm Values Reference for details.
Login node updates
The login nodes might contain user state that you don’t want to delete during an update. In this case, we recommend setting login.updateStrategy to OnDelete. With this setting, a sync in ArgoCD doesn’t replace a running login pod; the updated login node is created only after you manually delete the existing pod, so user state isn’t deleted unexpectedly. See the Slurm Values Reference for details.
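The two settings discussed in this section and the previous one could appear in the Slurm values file roughly as follows. This is a sketch only: the `10%` value is a hypothetical example, and the exact value types are an assumption to check against the Slurm Values Reference.

```yaml
# Fragment of slurm/values.yaml (sketch)
compute:
  # Hypothetical example: at most 10% of compute nodes unavailable
  # at any time during a rolling update.
  maxUnavailable: "10%"
login:
  # Login pods are replaced only after you delete them manually.
  updateStrategy: OnDelete
```

If the Slurm chart is pulled in as a dependency of a wrapper chart, Helm expects these keys to be nested under the dependency's name.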
Secret job lifecycle
On each sync, the Slurm chart schedules two Kubernetes Jobs to create the secrets the Slurm cluster needs to operate. When you install or upgrade, the chart replaces any existing jobs and initiates new job runs. If a job succeeds, an Argo hook deletes the Job object, and ArgoCD reports In Sync to indicate the job is complete. If a job fails, the Job object remains in Argo as Failed until you resolve the issue with the job run or the next sync occurs, which then follows this same process.
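The lifecycle described above matches how ArgoCD resource hooks behave when they use the `BeforeHookCreation` and `HookSucceeded` delete policies. The following sketch shows a Job annotated that way. It is an illustration of the mechanism, not the Slurm chart's actual manifest; the chart may use different annotations (for example, Helm hook annotations), and the Job name, image placeholder, and command are hypothetical.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-secret-job     # hypothetical name
  annotations:
    # Run this Job as part of each sync.
    argocd.argoproj.io/hook: Sync
    # Delete a leftover Job before creating the new one, and delete the
    # Job after it succeeds. A failed Job is left in place for inspection.
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation,HookSucceeded
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: create-secret
          image: "[IMAGE]"     # placeholder
          command: ["/bin/sh", "-c", "echo 'create the secret here'"]
```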