SUNK parameter reference

Requirements

Repository	Name	Version
file://../library	library	0.1.0
oci://ghcr.io/coreweave/k8s-device-plugin/charts	nvidia-device-plugin	0.17.0-5c8a50df

Parameters

Key & Description	Type	Default
imagePullSecrets Image pull secrets to configure if using custom private images.	list	[]
nvidia-device-plugin Options for the CoreWeave fork of the NVIDIA device plugin chart. This chart builds on the default configuration provided by NVIDIA and uses these default chart values.	object	See default chart values.
operator.affinity The affinity for the operator deployment.	object	null
operator.config.operator.nodeSet.failedPodsBackoffGCInterval The time that must pass before the next iteration of backoff GC runs to check failed pods.	string	"1m"
operator.config.operator.nodeSet.forceScalingDeleteKnownConditionTimeout The delay to wait before deleting a pod with a known Slurm state during nodeset scaling. "0" disables this feature.	string	0
operator.config.operator.nodeSet.forceScalingDeleteUnknownConditionTimeout The delay to wait before deleting a pod with an unknown Slurm state during nodeset scaling. "0" disables this feature.	string	0
operator.config.operator.nodeSet.maxBurstReplicas A rate limit for booting pods when there are many pods. Too high a value can cause registry DoS issues.	int	250
operator.config.operator.nodeSet.statusUpdateBackoffGCInterval The time that must pass before the next iteration of backoff GC runs to check node status updates.	string	"1m"
operator.config.operator.nodeSlice.maxNodesPerNodeSlice The maximum number of nodes that can be in a single nodeSlice.	int	100
operator.image The image to use for the operator.	object	repository: registry.gitlab.com/coreweave/sunk/operator tag:
operator.leaderElection.enabled Forces the operator to use leader election even if the number of replicas is set to 1. Useful if you plan to scale after deployment.	bool	false
operator.leaderElection.leaderElectionID The string value to use as the leader election id.	string	null
operator.logLevel The log level. Uses integers or zap log level strings: `debug` `info` `warn` `error` `dpanic` `panic` `fatal`	string	"info"
operator.maxConcurrentReconciles	int	10
operator.podMonitor.enabled Enable monitoring via the Prometheus operator `PodMonitor` CRD.	bool	true
operator.priorityClassName The priority class name for the operator.	string	"sunk-control-plane"
operator.replicas The number of replicas of the operator pod to run. Leader election will be enabled if this is greater than 1 or leader election is explicitly enabled.	int	1
operator.resources The resources to request for the operator.	object	limits: memory: 2Gi requests: cpu: 2 memory: 2Gi
operator.tolerations The tolerations for the operator deployment.	list	[]
operator.vmPodScrape.enabled Enable monitoring via the VictoriaMetrics operator `VMPodScrape` CRD. Note: To enable this, `podMonitor` must be disabled in addition to setting `vmPodScrape.enabled` to `true`.	bool	false
priorityClass.enabled Enable the priority class for the control plane components.	bool	true
priorityClass.value The value of the priority class. This should generally be high relative to other priority classes because these are critical components.	int	1000000000
scheduler.podMonitor.enabled Enable monitoring via the Prometheus operator `PodMonitor` CRD.	bool	true
scheduler.vmPodScrape.enabled Enable monitoring via the VictoriaMetrics operator `VMPodScrape` CRD. Note: To enable this, `podMonitor` must be disabled in addition to setting `vmPodScrape.enabled` to `true`.	bool	false
syncer.podMonitor.enabled Enable monitoring via the Prometheus operator `PodMonitor` CRD.	bool	true
syncer.vmPodScrape.enabled Enable monitoring via the VictoriaMetrics operator `VMPodScrape` CRD. Note: To enable this, `podMonitor` must be disabled in addition to setting `vmPodScrape.enabled` to `true`.	bool	false
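
The keys in the table above are dotted paths into the chart's values, so an override file nests them as YAML maps. The following is a minimal sketch, not a recommended configuration: it assumes a two-replica operator and a cluster monitored by the VictoriaMetrics operator rather than the Prometheus operator. The file name and the specific values chosen are illustrative assumptions; only the key names and defaults come from the table.

```yaml
# values-override.yaml (hypothetical file name)

operator:
  # Leader election is enabled automatically when replicas > 1.
  replicas: 2
  logLevel: "debug"
  config:
    operator:
      nodeSet:
        # Lower than the default of 250 to reduce load on the image registry
        # (illustrative value, not a tuned recommendation).
        maxBurstReplicas: 100
  # Per the notes in the table, podMonitor must be disabled
  # in addition to enabling vmPodScrape.
  podMonitor:
    enabled: false
  vmPodScrape:
    enabled: true

scheduler:
  podMonitor:
    enabled: false
  vmPodScrape:
    enabled: true

syncer:
  podMonitor:
    enabled: false
  vmPodScrape:
    enabled: true
```

A file like this would typically be passed to Helm with the `-f` flag on install or upgrade. Any key left out keeps the default shown in the table.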
