Kubernetes Interaction with SUNK

Kubernetes is a powerful container orchestration platform that manages the underlying infrastructure for SUNK, including compute resources, networking, and storage. SUNK leverages Kubernetes to provide a scalable, flexible, and high-performance computing environment for running Slurm workloads.

When using SUNK, you can interact with Kubernetes using the kubectl command-line tool. This guide explains the key concepts, benefits, and best practices for using kubectl to interact with Kubernetes in the context of SUNK.

Key concepts

First, some terminology and concepts to understand the relationship between Kubernetes and Slurm in the context of SUNK:

| Term | Description |
| --- | --- |
| Kubernetes cluster and Nodes | A Kubernetes cluster is a collection of Kubernetes Nodes, which are (in CKS) physical machines that run Kubernetes components and containerized applications. Kubernetes Nodes are capitalized as proper nouns. |
| Kubernetes Pod | A Kubernetes Pod is the smallest deployable unit in Kubernetes, representing a single instance of a running process, such as a Slurm node. Multiple Pods can run on a single Kubernetes Node. |
| Slurm cluster and nodes | A Slurm cluster is a collection of Slurm nodes, where each node is a Kubernetes Pod running a slurmd container. Slurm nodes are lowercase. |
| kubectl | kubectl is the command-line tool for interacting with Kubernetes clusters. It allows you to inspect cluster resources, create, delete, and update objects, and view logs and events. |

When you use a Slurm cluster deployed by SUNK, you're operating within a Kubernetes environment: the underlying infrastructure is managed by Kubernetes, and many aspects of SUNK's operation can be observed and controlled with kubectl.
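To make the Node and Pod relationship above concrete, here is a minimal sketch that wraps two read-only commands in shell functions. The namespace `tenant-slurm` and the function names are hypothetical placeholders, not names defined by SUNK; substitute the namespace where your Slurm cluster is deployed.

```shell
# Hypothetical namespace; replace with the namespace where SUNK is deployed.
SUNK_NAMESPACE="${SUNK_NAMESPACE:-tenant-slurm}"

# List the Kubernetes Nodes in the cluster (read-only).
list_k8s_nodes() {
  kubectl get nodes
}

# List Pods in the SUNK namespace (read-only). With -o wide, the NODE column
# shows which Kubernetes Node each Pod is scheduled on.
list_slurm_pods() {
  kubectl get pods --namespace "$SUNK_NAMESPACE" -o wide
}
```

After sourcing the snippet, running `list_slurm_pods` should show one row per Pod. Several Pods may share the same value in the NODE column, since multiple Pods can run on a single Kubernetes Node.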

Benefits

Here are some reasons why you might want to interact with Kubernetes via kubectl:

  • Visibility: You can use kubectl to see the status of the Kubernetes Pods where your Slurm jobs are running, providing insight into the underlying execution environment. Additionally, SUNK is deployed using Helm charts, and the resources those charts create are ordinary Kubernetes objects that kubectl can inspect.
  • Debugging: If you encounter issues, kubectl can help you inspect logs, events, and the state of the Pods, aiding in troubleshooting.
  • Familiarity: If you already know Kubernetes, kubectl lets you observe SUNK with the tools and workflows you already use.
  • Configuration: Many aspects of SUNK's configuration are managed as Kubernetes resources (such as ConfigMaps and Secrets), which you can interact with using kubectl.
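The visibility, debugging, and configuration points above map onto a handful of read-only kubectl subcommands. The following is a sketch under stated assumptions: the namespace `tenant-slurm` and the function names are hypothetical, and the default container name `slurmd` is an assumption based on the container described in the key concepts, so check the actual container names in your Pods.

```shell
# Hypothetical namespace; replace with the namespace where SUNK is deployed.
SUNK_NAMESPACE="${SUNK_NAMESPACE:-tenant-slurm}"

# Visibility: show one Pod's state, containers, and related events.
describe_pod() {
  kubectl describe pod "$1" --namespace "$SUNK_NAMESPACE"
}

# Debugging: print the last 100 log lines from one container in a Pod.
# The container defaults to "slurmd" (an assumption); pass a second
# argument to choose a different container.
pod_logs() {
  kubectl logs "$1" --namespace "$SUNK_NAMESPACE" --container "${2:-slurmd}" --tail=100
}

# Debugging: list events in the namespace, sorted by last-seen timestamp.
recent_events() {
  kubectl get events --namespace "$SUNK_NAMESPACE" --sort-by=.lastTimestamp
}

# Configuration: list ConfigMaps and Secrets by name.
# The default output shows names and key counts, not Secret values.
list_config() {
  kubectl get configmaps,secrets --namespace "$SUNK_NAMESPACE"
}
```

For example, `pod_logs <pod-name>` prints recent slurmd output for one Slurm node, where `<pod-name>` is a name taken from `kubectl get pods`. None of these functions modify cluster state.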

Best practices

Namespace awareness: In Kubernetes, namespaces help organize resources. Make sure you're working in the namespace where SUNK is deployed so that your commands act on the intended resources.
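One way to confirm which namespace kubectl will use is to read it from the active kubeconfig context. This is a minimal sketch; the function names are hypothetical, and the fallback to `default` reflects kubectl's behavior when a context sets no namespace.

```shell
# Print the namespace set on the current context, or "default" if none is set.
current_namespace() {
  ns="$(kubectl config view --minify --output 'jsonpath={..namespace}')"
  echo "${ns:-default}"
}

# Set the default namespace on the current context so later commands
# do not need an explicit --namespace flag.
# This edits your local kubeconfig only; it does not change the cluster.
use_namespace() {
  kubectl config set-context --current --namespace="$1"
}
```

Passing `--namespace` explicitly on each command, as in `kubectl get pods --namespace <name>`, is an equally valid habit and avoids depending on kubeconfig state.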

Context: Kubernetes contexts specify which cluster, user, and namespace kubectl commands interact with. If you manage multiple clusters, always ensure you're using the correct context to avoid mistakes. Using the wrong context can lead to changes in the wrong cluster or environment, so double-check before acting.
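A small guard can make the context check routine before any command runs. The sketch below assumes a context named `sunk-cluster`, which is a hypothetical placeholder, as is the function name; `kubectl config get-contexts` lists the names available in your kubeconfig.

```shell
# Hypothetical context name; replace with the context for your SUNK cluster.
EXPECTED_CONTEXT="${EXPECTED_CONTEXT:-sunk-cluster}"

# Succeed only if kubectl currently points at the expected context.
# On a mismatch, print a message to stderr and return a non-zero status.
require_context() {
  current="$(kubectl config current-context 2>/dev/null)"
  if [ "$current" != "$EXPECTED_CONTEXT" ]; then
    echo "current context is '${current:-unset}', expected '$EXPECTED_CONTEXT'" >&2
    return 1
  fi
}
```

Chaining the guard in front of a command, as in `require_context && kubectl get pods`, skips the command when the context does not match. To switch contexts, use `kubectl config use-context <name>`.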

Use caution with kubectl: Be careful when using kubectl, especially in production. For customers running jobs, stick to read-only commands like kubectl get or kubectl describe to monitor resources without making changes. Infrastructure engineers should be cautious when modifying the cluster. Changes to the SUNK deployment can impact stability or cause downtime. To minimize risk, double-check commands that modify resources before running them.
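For engineers who do need to modify resources, kubectl offers ways to preview a change before applying it. The sketch below is illustrative only: the namespace `tenant-slurm`, the function names, and the idea of applying a manifest file directly are assumptions, and this guide does not describe how changes to a SUNK deployment are actually rolled out.

```shell
# Hypothetical namespace; replace with the namespace where SUNK is deployed.
SUNK_NAMESPACE="${SUNK_NAMESPACE:-tenant-slurm}"

# Ask the API server whether the current user may perform a verb on a
# resource type, for example: can_i delete pods. Prints "yes" or "no".
can_i() {
  kubectl auth can-i "$1" "$2" --namespace "$SUNK_NAMESPACE"
}

# Show what applying a manifest would change, without changing anything.
preview_change() {
  kubectl diff --filename "$1" --namespace "$SUNK_NAMESPACE"
}

# Have the API server validate a manifest without persisting it.
validate_change() {
  kubectl apply --filename "$1" --namespace "$SUNK_NAMESPACE" --dry-run=server
}
```

Note that `kubectl diff` exits with a non-zero status when it finds differences, so a non-zero result from `preview_change` does not by itself indicate an error.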