The Pod Controller is responsible for managing data synchronized to the Pods from the Kubernetes Nodes. The Pod Controller is deployed cluster-wide as part of the sunk-controller-manager, and performs operations that are not specifically tied to an instance of a Slurm cluster. In contrast, the Syncer performs synchronization of information between Pods and an associated Slurm cluster instance.

Information flow and operations

The Pod Controller handles the flow of information from the Nodes to the Pods. Some possible information flows are described below.

Node cordon

When a Node is marked as unschedulable, the Pod Controller propagates that state to the corresponding NodeSet Pod through an annotation. The cordon reason is read from the Node annotation node.coreweave.cloud/cordonReason.
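
The lookup described above can be sketched in a few lines of Python. This is only an illustration, not the controller's implementation: it assumes the Node is available as a plain dictionary shaped like the Kubernetes API object, and the function name cordon_info is hypothetical. The name of the annotation written to the Pod is not stated here, so the sketch stops at extracting the state and the reason.

```python
from typing import Optional, Tuple

# Annotation key taken from the text above.
CORDON_REASON_ANNOTATION = "node.coreweave.cloud/cordonReason"


def cordon_info(node: dict) -> Tuple[bool, Optional[str]]:
    """Return (is_cordoned, reason) for a Node given as a plain dict.

    Hypothetical helper: the real controller works on typed API objects.
    """
    spec = node.get("spec") or {}
    if not spec.get("unschedulable", False):
        # A schedulable Node has no cordon state to propagate.
        return False, None
    annotations = (node.get("metadata") or {}).get("annotations") or {}
    # The reason may be missing even when the Node is cordoned.
    return True, annotations.get(CORDON_REASON_ANNOTATION)
```

A caller would pass the Node object as a dictionary and write the returned state and reason into whichever Pod annotation the controller uses.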

Node lock

The Node Controller handles the majority of the logic for the lock operation. The Pod Controller is only responsible for copying the lock annotation, sunk.coreweave.com/lock, from the associated Node to the Pod.
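
As a rough sketch of that copy step, the Python below mirrors the lock annotation from a Node's annotations onto a Pod's annotations. The function name sync_lock_annotation is hypothetical, and removing the annotation from the Pod when the Node no longer carries it is an assumption; the text above only states that the annotation is copied.

```python
from typing import Dict, Optional

# Annotation key taken from the text above.
LOCK_ANNOTATION = "sunk.coreweave.com/lock"


def sync_lock_annotation(
    node_annotations: Optional[Dict[str, str]],
    pod_annotations: Optional[Dict[str, str]],
) -> Dict[str, str]:
    """Return a copy of the Pod annotations with the lock mirrored from the Node."""
    updated = dict(pod_annotations or {})
    node_annotations = node_annotations or {}
    if LOCK_ANNOTATION in node_annotations:
        updated[LOCK_ANNOTATION] = node_annotations[LOCK_ANNOTATION]
    else:
        # Assumption: a lock released on the Node is also cleared on the Pod.
        updated.pop(LOCK_ANNOTATION, None)
    return updated
```

The function leaves every other Pod annotation untouched and returns a new dictionary rather than mutating its input.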

Features

The Pod Controller is responsible for propagating the defined NodeSet features to the Pod. When changes are made to the feature spec or, in the case of dynamic features, to the feature values, the Pod Controller updates the feature annotations on the Pod. These annotations are used by the Syncer to propagate the features onto the Slurm nodes. The features are specified in the NodeSet spec:
spec:
  features:
    staticFeatures:
      - example
    dynamicFeatures:
      gpu.nvidia.com/class: {}
      gpu.coreweave.cloud/driver-version:
        prefix: "driver-"
staticFeatures and dynamicFeatures both define feature strings that are applied to the Slurm nodes. staticFeatures is a list of feature strings that changes only when the NodeSet spec is updated. dynamicFeatures is a map in which each key names a Node label whose value supplies the feature string, and each value holds additional configuration options for that feature. A dynamic feature string is updated either when the NodeSet spec changes or when the labels on the associated Node change. The Pod Controller creates two annotations on the Pod to reflect these feature lists:
  • sunk.coreweave.com/dynamic-features
  • sunk.coreweave.com/static-features
Each list is comma-separated, sorted, and deduplicated.
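
To make the list format concrete, here is a minimal Python sketch that builds both annotation values from a features spec and a set of Node labels. It rests on several assumptions: the prefix option is prepended to the label value, labels missing from the Node are skipped, and the function names and the label values in the usage note are hypothetical. Only the annotation keys and the sorted, unique, comma-separated format come from the text above.

```python
from typing import Dict, Iterable


def join_features(features: Iterable[str]) -> str:
    """Deduplicate, sort, and comma-join feature strings."""
    return ",".join(sorted(set(features)))


def build_feature_annotations(features_spec: dict, node_labels: Dict[str, str]) -> Dict[str, str]:
    """Build the two feature annotations for a Pod (illustrative only)."""
    static = features_spec.get("staticFeatures") or []
    dynamic = []
    for label_key, options in (features_spec.get("dynamicFeatures") or {}).items():
        value = node_labels.get(label_key)
        if value is None:
            # Assumption: a label absent from the Node contributes no feature.
            continue
        # Assumption: "prefix" is prepended to the label value.
        prefix = (options or {}).get("prefix", "")
        dynamic.append(prefix + value)
    return {
        "sunk.coreweave.com/static-features": join_features(static),
        "sunk.coreweave.com/dynamic-features": join_features(dynamic),
    }
```

With the spec shown earlier and hypothetical Node labels gpu.nvidia.com/class=H100 and gpu.coreweave.cloud/driver-version=535.1, this sketch produces the static list "example" and the dynamic list "H100,driver-535.1".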

Conditions

The Pod Controller copies a selected set of conditions from the associated Node to the Pod and keeps them up to date as they change. The Syncer later uses these conditions to propagate Node information into Slurm.
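
The copy-and-update behavior can be pictured with the Python sketch below. The set of copied condition types is not listed here, so SELECTED_CONDITION_TYPES is a hypothetical allow-list, and the function name copy_selected_conditions is likewise an assumption. The sketch also assumes that a copied condition keeps its type name on the Pod.

```python
from typing import List, Optional, Set

# Hypothetical allow-list; the actual set of copied conditions is not documented here.
SELECTED_CONDITION_TYPES = {"MemoryPressure", "DiskPressure"}


def copy_selected_conditions(
    node_conditions: Optional[List[dict]],
    pod_conditions: Optional[List[dict]],
    selected: Set[str] = SELECTED_CONDITION_TYPES,
) -> List[dict]:
    """Merge the selected Node conditions into the Pod's condition list."""
    # Index existing Pod conditions by type so that updates replace in place.
    by_type = {cond["type"]: cond for cond in (pod_conditions or [])}
    for cond in node_conditions or []:
        if cond["type"] in selected:
            by_type[cond["type"]] = dict(cond)
    return list(by_type.values())
```

Running the merge again after a Node condition changes replaces the stale entry on the Pod instead of appending a duplicate.
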
Last modified on March 24, 2026