Node Types
We have configured a "Standard Instance" for each GPU type offered on CoreWeave Cloud that we have found to be useful for most workloads. These instances are a starting point, but can be fully customized to suit your use case or compute needs.
You can view Standard Instance configurations on the pricing page.
For more information about the à la carte pricing of compute components on CoreWeave Cloud, see the pricing documentation.

Component Availability

When custom configuring your instances on CoreWeave Cloud, the following table outlines the physical limit on the number of GPUs available per instance.
| Vendor | Class | Generation | CUDA Cores | VRAM | Max per Instance | Label |
| ------ | ----- | ---------- | ---------- | ---- | ---------------- | ----- |
| NVIDIA | A100 NVLINK | Ampere | 6,912 | 40 GB | 8 | A100_NVLINK |
| NVIDIA | A100 PCIe | Ampere | 6,912 | 40 GB | 8 | A100_PCIE_40GB |
| NVIDIA | A100 PCIe | Ampere | 6,912 | 80 GB | 8 | A100_PCIE_80GB |
| NVIDIA | A40 | Ampere | 10,752 | 48 GB | 8 | A40 |
| NVIDIA | A6000 | Ampere | 10,752 | 48 GB | 8 | RTX_A6000 |
| NVIDIA | RTX A5000 | Ampere | 8,192 | 24 GB | 4 | RTX_A5000 |
| NVIDIA | RTX A4000 | Ampere | 6,144 | 16 GB | 7 | RTX_A4000 |
| NVIDIA | Tesla V100 NVLINK | Volta | 5,120 | 16 GB | 8 | Tesla_V100_NVLINK |
| NVIDIA | RTX 5000 | Turing | 3,072 | 16 GB | 4 | Quadro_RTX_5000 |
| NVIDIA | RTX 4000 | Turing | 2,304 | 8 GB | 7 | Quadro_RTX_4000 |
If a workload requests more peripheral compute resources (vCPU, RAM) than offered in a standard instance size, additional costs will be incurred.

CPU Availability

CPU-only nodes are available for tasks such as control-plane services, databases, ingresses, and CPU rendering.
| CPU Model | RAM per vCPU | Max CPU per Workload | Label |
| --------- | ------------ | -------------------- | ----- |
| Intel Xeon v3 | 4 GB | 71 | intel-xeon-v3 |
| Intel Xeon v4 | 4 GB | 71 | intel-xeon-v4 |
| Intel Xeon Scalable | 4 GB | 31 | intel-xeon-scalable |
| AMD Epyc Rome | 4 GB | 46 | amd-epyc-rome |
| AMD Epyc Milan | 4 GB | 46 | amd-epyc-milan |
Workloads without GPU requests are always scheduled on CPU nodes.

Requesting Compute in Kubernetes

A combination of resource requests and node affinity is used to select the type and amount of compute for your workload. CoreWeave Cloud relies solely on these native Kubernetes methods for resource allocation, allowing maximum flexibility. The label used to select the CPU type is `node.coreweave.cloud/cpu`.
**Single A100 80GB**

```yaml
spec:
  containers:
  - name: example
    resources:
      limits:
        cpu: 15
        memory: 97Gi
        nvidia.com/gpu: 1

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: gpu.nvidia.com/class
                operator: In
                values:
                  - A100_PCIE_80GB
```
**8x A100 NVLINK**

```yaml
spec:
  containers:
  - name: example
    resources:
      requests:
        cpu: 90
        memory: 700Gi
      limits:
        nvidia.com/gpu: 8

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: gpu.nvidia.com/class
                operator: In
                values:
                  - A100_NVLINK
```
**A100 With Fallback To A40**

```yaml
spec:
  containers:
  - name: example
    resources:
      limits:
        cpu: 12
        memory: 24Gi
        nvidia.com/gpu: 1

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: gpu.nvidia.com/class
                operator: In
                values:
                  - A100_PCIE_40GB
                  - A40
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 20
          preference:
            matchExpressions:
              - key: gpu.nvidia.com/class
                operator: In
                values:
                  - A100_PCIE_40GB
```
**16 Core Xeon v3/v4 CPU**

```yaml
spec:
  containers:
  - name: example
    resources:
      limits:
        cpu: 16
        memory: 48Gi

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node.coreweave.cloud/cpu
                operator: In
                values:
                  - intel-xeon-v3
                  - intel-xeon-v4
```
**Single Epyc CPU**

```yaml
spec:
  containers:
  - name: example
    resources:
      limits:
        cpu: 1
        memory: 4Gi

  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node.coreweave.cloud/cpu
                operator: In
                values:
                  - amd-epyc-rome
                  - amd-epyc-milan
```
Kubernetes allows resources to be scheduled with requests and limits. When only limits are specified, the requests are automatically set to the same amount as the limits.
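To illustrate, a limits-only manifest like the examples above is equivalent to one that spells out matching requests explicitly. This is a sketch; the container name and the specific values are illustrative, not part of any standard instance:

```yaml
spec:
  containers:
  - name: example  # illustrative container name
    resources:
      requests:    # implied automatically when only limits are given
        cpu: 15
        memory: 97Gi
      limits:
        cpu: 15
        memory: 97Gi
```

Setting requests lower than limits is also possible, but it allows the scheduler to place the workload on a node that may not have headroom for the full limit.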