Version: 0.1.0 Type: application

Requirements

| Repository | Name | Version |
| --- | --- | --- |
| file://../library | library | 0.1.0 |
| https://cybozu-go.github.io/moco/ | moco | 0.20.0 |
| oci://ghcr.io/coreweave/k8s-device-plugin/charts | nvidia-device-plugin | 0.17.0-5c8a50df |

Parameters

Key & Description | Type | Default
imagePullSecrets
Image pull secrets to configure if using custom private images.
list
[]
moco
Options for the MOCO MySQL Operator.
object
See individual settings below.
moco.enabled
Enable the MOCO MySQL Operator.
bool
true
moco.image.repository
The repository for the MOCO image.
string
"ghcr.io/cybozu-go/moco"
moco.image.tag
The tag for the MOCO image.
string
null
moco.imagePullSecrets
Image pull secrets to configure if using custom private images.
list
[]
moco.monitoring.podMonitors.enabled
Enable monitoring via the Prometheus operator PodMonitor CRD.
bool
false
moco.monitoring.vmPodScrapes.enabled
Enable monitoring via the VictoriaMetrics operator VMPodScrape CRD.
bool
true
moco.priorityClassName
The priority class name for the MOCO pod.
string
"sunk-control-plane"
moco.replicaCount
The number of replicas of the MOCO instance to run.
int
1
moco.resources
Resources for the MOCO container.
object
limits:
    memory: 1Gi
requests:
    cpu: 100m
    memory: 256Mi
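
The `moco.*` keys above nest under a single `moco` block in a Helm values file. The following is a minimal sketch of an override, assuming a cluster that runs the Prometheus operator rather than the VictoriaMetrics operator. The file name is hypothetical, and this page does not state whether both MOCO monitoring options may be enabled together, so the sketch enables only one.

```yaml
# values-moco.yaml (hypothetical file name)
moco:
  enabled: true
  replicaCount: 1
  monitoring:
    podMonitors:
      enabled: true    # Prometheus operator PodMonitor CRD
    vmPodScrapes:
      enabled: false   # VictoriaMetrics operator VMPodScrape CRD (default is true)
  resources:           # same as the chart defaults listed above
    limits:
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 256Mi
```

Keys that are omitted from an override file keep their chart defaults.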

nvidia-device-plugin
Options for the CoreWeave fork of the NVIDIA device plugin chart. This chart builds on the default configuration provided by NVIDIA, and uses these default chart values.
object
See default chart values.
operator.affinity
The affinity for the operator deployment.
object
nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
            - matchExpressions:
                - key: node.coreweave.cloud/class
                  operator: In
                  values:
                    - cpu

operator.config.operator.nodeSet.failedPodsBackoffGCInterval
The interval between iterations of the backoff GC that checks for failed pods.
string
"1m"
operator.config.operator.nodeSet.forceScalingDeleteKnownConditionTimeout
The delay before deleting a pod that has a known Slurm state during nodeset scaling. “0” disables this feature.
string
0
operator.config.operator.nodeSet.forceScalingDeleteUnknownConditionTimeout
The delay before deleting a pod that has an unknown Slurm state during nodeset scaling. “0” disables this feature.
string
0
operator.config.operator.nodeSet.maxBurstReplicas
A rate limit for booting pods when there are many pods. Setting this value too high can cause registry DoS issues.
int
250
operator.config.operator.nodeSet.scaleDownPriorityOrdering
Enable priority ordering for scale-down. Pods are deleted in the following order: non-ready pods, drained pods without running workloads, idle pods, draining pods with running workloads, then running pods.
bool
true
operator.config.operator.nodeSet.statusUpdateBackoffGCInterval
The interval between iterations of the backoff GC that checks node status updates.
string
"1m"
operator.config.operator.nodeSlice.maxNodesPerNodeSlice
The maximum number of nodes that can be in a single nodeSlice.
int
100
operator.image
The image to use for the operator.
object
repository: registry.gitlab.com/coreweave/sunk/operator
tag:
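
The `operator.config.operator.nodeSet.*` and `operator.config.operator.nodeSlice.*` keys listed above are deeply nested, so it may help to see them laid out in a values file. The following is only a sketch: the `5m` and `10m` timeouts and the lower `maxBurstReplicas` are illustrative values, not recommendations, and the duration format is assumed to match the `"1m"` style used by the GC interval defaults.

```yaml
operator:
  config:
    operator:
      nodeSet:
        # Lower than the default of 250 to reduce load on the image registry.
        maxBurstReplicas: 100
        scaleDownPriorityOrdering: true
        # Assumed duration strings; the default "0" disables these features.
        forceScalingDeleteKnownConditionTimeout: "5m"
        forceScalingDeleteUnknownConditionTimeout: "10m"
        failedPodsBackoffGCInterval: "1m"
        statusUpdateBackoffGCInterval: "1m"
      nodeSlice:
        maxNodesPerNodeSlice: 100
```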

operator.leaderElection.enabled
This forces the operator to use leader election even if the number of replicas is set to 1.
Useful if planning to scale after deployment.
bool
false
operator.leaderElection.leaderElectionID
The string value to use as the leader election id.
string
null
operator.logLevel
The log level.
Uses integers or zap log level strings:
  • debug
  • info
  • warn
  • error
  • dpanic
  • panic
  • fatal
string
"info"
operator.maxConcurrentReconciles
int
50
operator.podMonitor.enabled
Enable monitoring via the Prometheus operator PodMonitor CRD.
bool
false
operator.priorityClassName
The priority class name for the operator.
string
"sunk-control-plane"
operator.replicas
The number of replicas of the operator pod to run.
Leader election will be enabled if this is greater than 1 or leader election is explicitly enabled.
int
1
operator.resources
The resources to request for the operator.
object
limits:
    memory: 32Gi
    cpu: 16
requests:
    cpu: 8
    memory: 32Gi
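
As described above, leader election turns on automatically when `operator.replicas` is greater than 1, and can also be forced on for a single replica. The following sketch shows the single-replica case for a deployment that is expected to scale later. The `debug` log level is an illustrative choice, and `leaderElectionID` is left unset so that it keeps its `null` default.

```yaml
operator:
  replicas: 1
  leaderElection:
    enabled: true      # force leader election even with one replica
  logLevel: "debug"    # any zap level string listed above
  maxConcurrentReconciles: 50
  resources:           # same as the chart defaults listed above
    limits:
      cpu: 16
      memory: 32Gi
    requests:
      cpu: 8
      memory: 32Gi
```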

operator.tolerations
The tolerations for the operator deployment.
list
[]
operator.vmPodScrape.enabled
Enable monitoring via the VictoriaMetrics operator VMPodScrape CRD. Note: To enable this, podMonitor must be disabled in addition to setting vmPodScrape.enabled to true.
bool
true
priorityClass.enabled
Enable the priority class for the control plane components.
bool
true
priorityClass.value
The value of the priority class. This should generally be high relative to other priority classes because these are critical components.
int
1000000000
scheduler.podMonitor.enabled
Enable monitoring via the Prometheus operator PodMonitor CRD.
bool
false
scheduler.vmPodScrape.enabled
Enable monitoring via the VictoriaMetrics operator VMPodScrape CRD. Note: To enable this, podMonitor must be disabled in addition to setting vmPodScrape.enabled to true.
bool
true
syncer.podMonitor.enabled
Enable monitoring via the Prometheus operator PodMonitor CRD.
bool
false
syncer.vmPodScrape.enabled
Enable monitoring via the VictoriaMetrics operator VMPodScrape CRD. Note: To enable this, podMonitor must be disabled in addition to setting vmPodScrape.enabled to true.
bool
true
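
The operator, scheduler, and syncer each expose the same pair of monitoring toggles. The following sketch switches all three from the default VMPodScrape to PodMonitor, assuming the cluster runs the Prometheus operator. This page only states that `podMonitor` must be disabled for `vmPodScrape` to take effect; disabling `vmPodScrape` when enabling `podMonitor` is an assumption made here to keep exactly one option active per component.

```yaml
operator:
  podMonitor:
    enabled: true
  vmPodScrape:
    enabled: false
scheduler:
  podMonitor:
    enabled: true
  vmPodScrape:
    enabled: false
syncer:
  podMonitor:
    enabled: true
  vmPodScrape:
    enabled: false
```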
Last modified on March 24, 2026