Node Types
Due to high demand, A100 NVLINK (HGX) and H100 NVLINK (HGX) nodes are currently fully committed to client contracts and are therefore not available for on-demand use. We recommend a conversation with the CoreWeave team to build a strategic plan that makes use of available infrastructure and accounts for your future capacity requirements. Contact CoreWeave Sales to get started.
CoreWeave offers a "Standard Instance" for each GPU type on CoreWeave Cloud, suitable for most workloads. These instances are a starting point, and can be configured entirely to suit your use case or compute needs.
You can view Standard Instance configurations on our website's pricing page.
For more information about à la carte pricing of compute components on CoreWeave Cloud, see Resource Based Pricing.
Component availability
The following table outlines the maximum number of GPUs available per instance for each GPU type when customizing your instances on CoreWeave Cloud.
Class | Generation | VRAM | Max per Instance | Label |
---|---|---|---|---|
H100 HGX | Hopper | 80 GB | 8 | H100_NVLINK_80GB |
H100 PCIe | Hopper | 80 GB | 8 | H100_PCIE |
A100 HGX | Ampere | 80 GB | 8 | A100_NVLINK_80GB |
A100 HGX | Ampere | 40 GB | 8 | A100_NVLINK |
A100 PCIe | Ampere | 40 GB | 8 | A100_PCIE_40GB |
A100 PCIe | Ampere | 80 GB | 8 | A100_PCIE_80GB |
A40 | Ampere | 48 GB | 8 | A40 |
RTX A6000 | Ampere | 48 GB | 8 | RTX_A6000 |
RTX A5000 | Ampere | 24 GB | 8 | RTX_A5000 |
RTX A4000 | Ampere | 16 GB | 7 | RTX_A4000 |
Tesla V100 NVLINK | Volta | 16 GB | 8 | Tesla_V100_NVLINK |
RTX 5000 | Turing | 16 GB | 4 | Quadro_RTX_5000 |
RTX 4000 | Turing | 8 GB | 7 | Quadro_RTX_4000 |
If a workload requests more peripheral compute resources (vCPU, RAM) than a standard instance size offers, additional costs are incurred.
CPU availability
CPU-only nodes are best suited for tasks such as control-plane services, databases, ingresses, and CPU rendering.
CPU Model | Max RAM per vCPU | Max vCPU per Workload | Label |
---|---|---|---|
Intel Xeon v3 | 4 GB | 70 | intel-xeon-v3 |
Intel Xeon v4 | 4 GB | 60 | intel-xeon-v4 |
Intel Xeon Ice Lake | 4 GB | 94 | intel-xeon-icelake |
Intel Xeon Scalable | 6 GB | 94 | intel-xeon-scalable |
AMD Epyc Milan | 4 GB | 46 | amd-epyc-milan |
AMD Epyc Rome | 4 GB | 46 | amd-epyc-rome |
Workloads without GPU requests are always scheduled on CPU nodes.
Requesting compute in Kubernetes
A combination of resource requests and node affinity is used to select the type and amount of compute for your workload. CoreWeave Cloud relies only on these native Kubernetes methods for resource allocation, allowing maximum flexibility. The label used to select the GPU type is `gpu.nvidia.com/class`, and the CPU type is selected using the label `node.coreweave.cloud/cpu`.

These labels are mutually exclusive: a specific CPU type cannot be explicitly selected for GPU nodes.
Example specs
- Single A100 80GB
- 8x A100 NVLINK
- A100 With Fallback To A40
- 16 Core Xeon v3/v4 CPU
- Single Epyc CPU
Single A100 80GB

```yaml
spec:
  containers:
  - name: example
    resources:
      limits:
        cpu: 15
        memory: 97Gi
        nvidia.com/gpu: 1
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: gpu.nvidia.com/class
            operator: In
            values:
            - A100_PCIE_80GB
```
8x A100 NVLINK

```yaml
spec:
  containers:
  - name: example
    resources:
      requests:
        cpu: 90
        memory: 700Gi
      limits:
        nvidia.com/gpu: 8
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: gpu.nvidia.com/class
            operator: In
            values:
            - A100_NVLINK
```
A100 With Fallback To A40

```yaml
spec:
  containers:
  - name: example
    resources:
      limits:
        cpu: 12
        memory: 24Gi
        nvidia.com/gpu: 1
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: gpu.nvidia.com/class
            operator: In
            values:
            - A100_PCIE_40GB
            - A40
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 20
        preference:
          matchExpressions:
          - key: gpu.nvidia.com/class
            operator: In
            values:
            - A100_PCIE_40GB
```
16 Core Xeon v3/v4 CPU

```yaml
spec:
  containers:
  - name: example
    resources:
      limits:
        cpu: 16
        memory: 48Gi
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node.coreweave.cloud/cpu
            operator: In
            values:
            - intel-xeon-v3
            - intel-xeon-v4
```
Single Epyc CPU

```yaml
spec:
  containers:
  - name: example
    resources:
      limits:
        cpu: 1
        memory: 4Gi
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node.coreweave.cloud/cpu
            operator: In
            values:
            - amd-epyc-rome
            - amd-epyc-milan
```
Kubernetes allows resources to be scheduled with `requests` and `limits`. When only `limits` are specified, the `requests` are set to the same amount as the limits.
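As an illustration, the following hypothetical container spec specifies only limits; Kubernetes therefore treats the effective requests as equal to those limits when scheduling the Pod:

```yaml
spec:
  containers:
  - name: example
    resources:
      limits:
        cpu: 4
        memory: 16Gi
      # No requests block is given, so Kubernetes defaults the
      # requests to the same values as the limits above
      # (equivalent to requests of cpu: 4 and memory: 16Gi).
```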