How do CPU and memory requests work with GPU Pods?

GPU Pods request GPUs through the nvidia.com/gpu resource, but their CPU and memory requests behave as they do for any other Kubernetes Pod. Set CPU and memory requests to what your workload needs in steady state, and set limits to the maximum it may consume. The scheduler places Pods by their requests: the sum of CPU and memory requests across all Pods on a GPU Node cannot exceed the Node’s allocatable capacity. A Pod that requests more CPU or memory than any single Node has available therefore stays Pending, even when a GPU is free. For full details, see Manage CKS Nodes.
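
The fit check described above can be sketched in a few lines of Python. This is a simplified illustration, not the Kubernetes scheduler's actual implementation: the Resources type, the blocking_resources helper, and all Node and Pod sizes are hypothetical, and the real scheduler also weighs factors such as taints and affinity rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical container for resource amounts (not a Kubernetes API type)."""
    cpu_millicores: int
    memory_mib: int
    gpus: int


def blocking_resources(allocatable, already_requested, pod_request):
    """Return the resources whose summed requests would exceed allocatable."""
    blocked = []
    for name in ("cpu_millicores", "memory_mib", "gpus"):
        free = getattr(allocatable, name) - getattr(already_requested, name)
        if getattr(pod_request, name) > free:
            blocked.append(name)
    return blocked


# Assumed Node: 32 CPUs, 256 GiB memory, 8 GPUs allocatable.
node = Resources(cpu_millicores=32_000, memory_mib=262_144, gpus=8)
# Assumed requests already placed on the Node by other Pods.
in_use = Resources(cpu_millicores=24_000, memory_mib=131_072, gpus=4)

# Fits: 1 GPU, 4 CPUs, 32 GiB.
small_pod = Resources(cpu_millicores=4_000, memory_mib=32_768, gpus=1)
# Does not fit: GPUs are free, but only 8 CPUs remain unrequested.
cpu_heavy_pod = Resources(cpu_millicores=16_000, memory_mib=32_768, gpus=1)

print(blocking_resources(node, in_use, small_pod))      # []
print(blocking_resources(node, in_use, cpu_heavy_pod))  # ['cpu_millicores']
```

In this sketch the second Pod would stay Pending on this Node because of its CPU request alone, even though four GPUs are unclaimed. Only requests enter the comparison; limits do not affect whether a Pod fits.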
