GPU Instance Details

CoreWeave provides a diverse portfolio of GPU instances tailored to a wide spectrum of computational workloads, from graphics rendering to large-scale AI model training and inference. The following section provides a technical overview of our GPU offerings and outlines their suitability for various machine learning tasks, with a specific focus on deploying Large Language Models (LLMs).

GPU comparison

The following summary table provides a side-by-side comparison of the GPU compute instances available at CoreWeave. Available instance types vary by Region, and pricing is available in the Pricing section. GPU RAM is listed per GPU.

  Instance ID         GPUs   GPU RAM (GB)   vCPUs   RAM (GB)   GPU Connectivity      Cost per Hour
  gb300-4x            4      279            144     960        InfiniBand & NVLink   $52.00
  gb200-4x            4      186            144     960        InfiniBand & NVLink   $42.00
  b200-8x             8      180            128     2048       InfiniBand & NVLink   $68.80
  gd-8xh200ib-i128    8      141            128     2048       InfiniBand & NVLink   $50.44
  gd-8xh100ib-i128    8      80             128     2048       InfiniBand & NVLink   $49.24
  rtxp6000-8x         8      96             128     1024       PCIe                  $20.00
  gd-8xl40s-i128      8      48             128     1024       PCIe                  $18.00
  gd-8xl40-i128       8      48             128     1024       PCIe                  $10.00
  gd-1xgh200          1      96             72      480        PCIe                  $6.50
  gd-8xa100-i128      8      80             128     2048       NVLink                $21.60

GPU compute tiers

CoreWeave's GPU instances are categorized into three distinct tiers, each optimized for specific workloads and performance requirements:

  • State-of-the-Art Compute Tier: For unprecedented scale, these instances are designed for the most demanding AI workloads, including training and inference of multi-trillion parameter models.
  • Mid-large Size Model Training & Inference Tier: For training and inference of large models, these instances provide a balanced combination of GPU memory and compute power.
  • Professional AI & Graphics Tier: For professional AI and graphics workloads, these instances are optimized for high-performance graphics rendering, AI inference, and smaller-scale model training.

See Selecting an Instance for help choosing the right instance for your workload.
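
A quick way to narrow down a tier is to estimate how much GPU memory a model needs just for its weights at a given precision, then compare that against the per-GPU and per-instance memory listed in the sections below. The following sketch is a rough rule of thumb only; it ignores activations, KV cache, and optimizer state, and the parameter counts and precisions are illustrative assumptions, not CoreWeave recommendations.

```python
# Rough rule of thumb: weight memory = parameter count x bytes per parameter.
# Training adds optimizer state and activations on top of this (often 3-4x more),
# and inference adds KV cache, so treat these numbers as a lower bound.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2, "fp8": 1, "int4": 0.5}

def weight_memory_gb(params_billions: float, precision: str) -> float:
    """Approximate GPU memory (GB) needed just to hold the model weights."""
    return params_billions * BYTES_PER_PARAM[precision]

# Example: does a 70B-parameter model fit on a single GPU?
for precision in ("fp16", "fp8", "int4"):
    print(f"70B @ {precision}: ~{weight_memory_gb(70, precision):.0f} GB of weights")
# 70B @ fp16: ~140 GB -> needs multiple GPUs (tensor parallelism) or quantization
# 70B @ fp8:  ~70 GB  -> fits on one 80 GB-class GPU (weights only)
# 70B @ int4: ~35 GB  -> fits comfortably on a 48 GB GPU
```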

State-of-the-Art Compute Tier: For unprecedented scale

GB300 NVL72-powered instances

Built around NVIDIA GB300 Superchips, each of these instances pairs four "Blackwell Ultra" GPUs, each with an unprecedented 279 GB of memory, with two NVIDIA Grace CPUs, and represents the pinnacle of our high-performance computing offerings. These instances form part of a larger NVL72 rack architecture that provides 21 TB of total GPU memory, interconnected by fifth-generation NVLink for a seamless, rack-scale memory fabric. For clustering, they are equipped with next-generation NVIDIA Quantum-X800 InfiniBand and ConnectX-8 SuperNICs, delivering 800 Gbps of network bandwidth, double that of the previous generation.

  • Primary Use Cases: Training next-generation foundation models in the trillion-parameter class, massive-scale and high-fidelity inference on the most complex AI models, and scientific simulations requiring maximum memory capacity and the fastest data throughput.
  • Recommended Models: Future frontier models (1T+ parameters), next-generation multimodal systems, and large-scale scientific discovery models.
  • Note: Instances must be requested in multiples of 18 to leverage the full NVL72 rack architecture.

Specifications: GB300 (InfiniBand)

  Specification       Value
  Instance ID         gb300-4x
  GPU Count           4
  GPU RAM (GB)        279
  CPU Model           2x NVIDIA Grace (Arm v9)
  vCPU Count          144
  RAM (GB)            960
  Local Disk (TB)     61.44
  Network Speed       Dual-port 200GbE
  GPU Connectivity    InfiniBand & NVLink
  Cost per Hour       $52.00
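
Because each gb300-4x instance contributes four GPUs to a 72-GPU NVL72 rack, requests are sized in multiples of 18 instances. The sketch below shows that arithmetic; the target GPU count is an illustrative assumption.

```python
import math

GPUS_PER_INSTANCE = 4      # gb300-4x
INSTANCES_PER_RACK = 18    # 18 instances x 4 GPUs = one 72-GPU NVL72 rack

def instances_for(target_gpus: int) -> int:
    """Round a desired GPU count up to whole NVL72 racks, in instances."""
    racks = math.ceil(target_gpus / (GPUS_PER_INSTANCE * INSTANCES_PER_RACK))
    return racks * INSTANCES_PER_RACK

print(instances_for(100))  # 36 instances (2 racks, 144 GPUs)
```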

GB200 NVL72-powered instances

Built around NVIDIA GB200 Grace Blackwell Superchips with InfiniBand connectivity, these instances deliver a monumental leap in performance over the previous generation, setting the new standard for large-scale AI and high-performance computing. The GB200 instances are equipped with 13 TB of high-bandwidth GPU memory per NVL72 rack and fifth-generation NVLink with up to 130 TB/s of total bandwidth. Blackwell GPUs feature a second-generation Transformer Engine with new FP4 and FP6 precisions, drastically accelerating LLM inference and training. The NVL72 architecture provides unprecedented memory capacity and ultra-fast GPU-to-GPU communication for distributed computing at the largest scale.

  • Primary Use Cases: Training foundation models from scratch (500B+ parameters), massive-scale inference, and developing frontier AI systems.
  • Recommended Models: GPT-5-class and future state-of-the-art models, complex Mixture-of-Experts (MoE) models, and large multimodal systems.
  • Note: Instances must be requested in multiples of 18 to leverage the full NVL72 rack architecture.

Specifications: GB200 (InfiniBand)

  Specification       Value
  Instance ID         gb200-4x
  GPU Count           4
  GPU RAM (GB)        186
  CPU Model           2x NVIDIA Grace (Arm v9)
  vCPU Count          144
  RAM (GB)            960
  Local Disk (TB)     30.72
  Network Speed       Dual-port 100GbE
  GPU Connectivity    InfiniBand & NVLink
  Cost per Hour       $42.00

B200 instances

Powered by eight NVIDIA B200 Blackwell GPUs, each with 180 GB of HBM3e memory, these instances deliver outstanding performance for enterprise AI workloads, including training and real-time inference. They offer more than twice the performance per GPU compared to previous-generation NVIDIA Hopper GPUs. The GPUs are interconnected using NVIDIA NVLink and NVLink Switch to enable fast data transfer. For networking, B200 instances feature an NVIDIA BlueField-3 DPU and eight NVIDIA ConnectX-7 InfiniBand host channel adapters (HCAs), supporting high throughput and lossless scaling. A 400G NDR non-blocking NVIDIA Quantum-2 InfiniBand fabric ensures seamless, high-speed connectivity.

  • Primary Use Cases: Training foundation models from scratch (100B+ parameters), massive-scale inference, and developing frontier AI systems.
  • Recommended Models: GPT-5-class and future state-of-the-art models, complex Mixture-of-Experts (MoE) models, and large multimodal systems.

Specifications: B200 (InfiniBand)

  Specification       Value
  Instance ID         b200-8x
  GPU Count           8
  GPU RAM (GB)        180
  CPU Model           Intel Emerald Rapids (8562Y+, 2.80 GHz)
  vCPU Count          128
  RAM (GB)            2048
  Local Disk (TB)     61.44
  Network Speed       Dual-port 100GbE
  GPU Connectivity    InfiniBand & NVLink
  Cost per Hour       $68.80

Mid-large Size Model Training & Inference Tier

H200 instances

Powered by eight NVIDIA H200 Hopper GPUs, each equipped with 141 GB of ultra-fast HBM3e memory, these instances are purpose-built for memory-intensive AI and HPC workloads. The significant memory per GPU allows for training larger models with bigger batch sizes and longer context windows compared to its predecessor. The GPUs are interconnected with NVIDIA NVLink for a high-speed, unified memory pool within the server. For clustering, they feature NVIDIA ConnectX-7 adapters providing a 400G NDR InfiniBand fabric, making them ideal for large, distributed training jobs that are memory-bound.

  • Primary Use Cases: Training and fine-tuning large models (~70B parameters) with high precision, high-throughput inference with long context lengths, and memory-intensive scientific computing.
  • Recommended Models: Llama 3 70B (at full precision), Falcon 180B (quantized), and other large models requiring substantial GPU memory for a single Node.

Specifications: H200 (InfiniBand)

  Specification       Value
  Instance ID         gd-8xh200ib-i128
  GPU Count           8
  GPU RAM (GB)        141
  CPU Model           Intel Emerald Rapids (8562Y+, 2.80 GHz)
  vCPU Count          128
  RAM (GB)            2048
  Local Disk (TB)     61.44
  Network Speed       Dual-port 100GbE
  GPU Connectivity    InfiniBand & NVLink
  Cost per Hour       $50.44
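
As an illustration of the "large model on a single Node" pattern this tier targets, the sketch below serves Llama 3 70B at bf16 (roughly 140 GB of weights, about 17.5 GB per GPU) across all eight H200 GPUs with tensor parallelism using vLLM. The model name and sampling settings are illustrative assumptions, and any serving engine with tensor parallelism would follow the same shape; this is a minimal sketch, not a tuned deployment.

```python
# Minimal sketch: tensor-parallel inference across the 8 GPUs of one H200 Node.
# Assumes vLLM is installed and the model weights are accessible (illustrative model name).
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-70B-Instruct",  # ~140 GB of bf16 weights
    tensor_parallel_size=8,                        # shard the weights across all 8 GPUs
    dtype="bfloat16",
)

outputs = llm.generate(
    ["Summarize the difference between NVLink and PCIe GPU connectivity."],
    SamplingParams(max_tokens=256, temperature=0.2),
)
print(outputs[0].outputs[0].text)
```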

H100 instances

These instances are powered by eight NVIDIA H100 Hopper GPUs, each with 80 GB of HBM3 memory. As the industry-defining accelerator of the previous generation, the H100 remains a powerful and reliable workhorse for a wide range of large-scale AI workloads. GPUs are fully connected with NVLink, and Nodes are linked with a 400G NDR InfiniBand fabric, providing a robust architecture for distributed training and inference.

  • Primary Use Cases: Large-scale AI training, high-performance inference serving, and a wide variety of HPC simulations.
  • Recommended Models: Llama 3 70B (with quantization or ZeRO), Mixtral 8x7B, Qwen2 72B, and fine-tuning most open-source models.

Specifications: H100 (InfiniBand)

  Specification       Value
  Instance ID         gd-8xh100ib-i128
  GPU Count           8
  GPU RAM (GB)        80
  CPU Model           Intel Sapphire Rapids (8462Y+, 2.80 GHz)
  vCPU Count          128
  RAM (GB)            2048
  Local Disk (TB)     61.44
  Network Speed       Dual-port 100GbE
  GPU Connectivity    InfiniBand & NVLink
  Cost per Hour       $49.24
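
For distributed training across these Nodes, the common pattern is NCCL-backed data parallelism launched with a tool such as torchrun: traffic inside a Node rides NVLink, and traffic between Nodes rides the InfiniBand fabric. The skeleton below is a minimal PyTorch DDP sketch under those assumptions, not a CoreWeave-specific recipe; the placeholder model, node count, and rendezvous endpoint are illustrative.

```python
# Minimal multi-GPU / multi-node DDP skeleton. Launch the same script on every Node, e.g.:
#   torchrun --nnodes=2 --nproc_per_node=8 --rdzv_backend=c10d \
#            --rdzv_endpoint=<head-node>:29500 train.py
# NCCL typically detects the InfiniBand HCAs automatically and routes inter-Node
# gradient all-reduce over the IB fabric, intra-Node traffic over NVLink.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")      # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])   # set by torchrun
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(4096, 4096).cuda()   # placeholder model
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):                          # placeholder training loop
        x = torch.randn(32, 4096, device="cuda")
        loss = model(x).pow(2).mean()
        loss.backward()                          # gradients all-reduced via NCCL
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```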

Professional AI & Graphics Tier

RTX Pro 6000 (Blackwell) instances

Powered by eight NVIDIA RTX Pro 6000 GPUs based on the Blackwell architecture, these instances provide a versatile and efficient solution for a mix of AI and professional graphics workloads. Each GPU has 96 GB of GDDR7 memory and connects via a high-speed PCIe interface. While they lack NVLink for memory pooling across GPUs, their large individual GPU memory makes them excellent for running multiple inference jobs in parallel or handling high-resolution rendering tasks.

  • Primary Use Cases: High-performance inference and fine-tuning for mid-sized models and high-fidelity rendering.
  • Recommended Models: Inference and fine-tuning of models up to ~70B parameters, such as Llama 3 70B and Mixtral 8x7B, on a per-GPU basis (the largest of these require quantization to fit within 96 GB).

Specifications: RTX Pro 6000 Blackwell Server Edition

  Specification       Value
  Instance ID         rtxp6000-8x
  GPU Count           8
  GPU RAM (GB)        96
  CPU Model           Intel Emerald Rapids (8562Y+, 2.80 GHz)
  vCPU Count          128
  RAM (GB)            1024
  Local Disk (TB)     7.68
  Network Speed       Dual-port 100GbE
  GPU Connectivity    PCIe
  Cost per Hour       $20.00
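
Because these GPUs are not NVLink-connected, the typical deployment pattern is one independent inference worker per GPU rather than one model sharded across GPUs. The sketch below pins workers with CUDA_VISIBLE_DEVICES; "serve_model.py" and the port scheme are placeholders for whatever inference server you actually run.

```python
# Launch one independent inference worker per GPU on an 8-GPU PCIe instance.
# Each worker sees exactly one GPU via CUDA_VISIBLE_DEVICES, so any model that
# fits in 96 GB can be served eight times in parallel with no cross-GPU traffic.
import os
import subprocess

NUM_GPUS = 8
workers = []
for gpu in range(NUM_GPUS):
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    # "serve_model.py" is a placeholder for your inference server entry point.
    workers.append(subprocess.Popen(
        ["python", "serve_model.py", "--port", str(8000 + gpu)], env=env
    ))

for proc in workers:
    proc.wait()
```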

L40S instances

These instances feature eight NVIDIA L40S GPUs, each with 48 GB of GDDR6 memory. The L40S is a compute-optimized version of the L40, specifically tuned to deliver higher performance for AI and data science workloads. Connected via PCIe, these GPUs are a cost-effective choice for scaling out inference and for fine-tuning mainstream models. They provide a strong balance of performance and value for enterprise AI deployment.

  • Primary Use Cases: High-throughput inference, efficient fine-tuning, video analytics, and other compute-focused AI tasks.
  • Recommended Models: Inference for models up to ~40B parameters like Qwen2 32B or LLaVA 34B; efficient fine-tuning of 7B-13B models.

Specifications: L40S

  Specification       Value
  Instance ID         gd-8xl40s-i128
  GPU Count           8
  GPU RAM (GB)        48
  CPU Model           Intel Sapphire Rapids (8462Y+, 2.80 GHz)
  vCPU Count          128
  RAM (GB)            1024
  Local Disk (TB)     7.68
  Network Speed       Dual-port 100GbE
  GPU Connectivity    PCIe
  Cost per Hour       $18.00

L40 instances

Powered by eight NVIDIA L40 GPUs, each with 48 GB of GDDR6 memory, these instances are designed for exceptional versatility across AI and visual computing. Connected via PCIe, they offer a balanced profile for workloads that include AI inference, rendering, and video processing. They are an ideal entry point for deploying AI-powered services that also have a significant graphics component.

  • Primary Use Cases: General-purpose AI inference, small-scale model fine-tuning, high-resolution rendering, and virtual desktop infrastructure (VDI).
  • Recommended Models: Inference for models up to ~40B parameters; ideal for models like Stable Diffusion and various computer vision models.

Specifications: L40

  Specification       Value
  Instance ID         gd-8xl40-i128
  GPU Count           8
  GPU RAM (GB)        48
  CPU Model           Intel Sapphire Rapids (8462Y+, 2.80 GHz)
  vCPU Count          128
  RAM (GB)            1024
  Local Disk (TB)     7.68
  Network Speed       Dual-port 100GbE
  GPU Connectivity    PCIe
  Cost per Hour       $10.00

GH200 instances

Powered by a single NVIDIA GH200 Grace Hopper Superchip, this instance offers a unique architecture that combines a 72-core Grace CPU with a Hopper GPU. Its defining feature is the 576 GB unified memory pool (96 GB HBM3 + 480 GB LPDDR5X) connected via the high-speed NVLink-C2C interconnect. This design eliminates traditional PCIe bottlenecks and allows the GPU to access a massive memory space, making it unparalleled for running extremely large models on a single machine.

  • Primary Use Cases: Low-latency inference for models that are too large to fit in the memory of a standard GPU, large-scale graph analytics, and memory-intensive data science.
  • Recommended Models: Inference on models like Llama 3.1 405B, Nemotron-4 340B, and other models in the 100B-500B parameter range.

Specifications: GH200

  Specification       Value
  Instance ID         gd-1xgh200
  GPU Count           1
  GPU RAM (GB)        96
  CPU Model           NVIDIA Grace (Arm v9)
  vCPU Count          72
  RAM (GB)            480
  Local Disk (TB)     7.68
  Network Speed       Dual-port 100GbE
  GPU Connectivity    PCIe
  Cost per Hour       $6.50
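
To make the "larger than one GPU's memory" use case concrete, the sketch below estimates whether a 405B-parameter model's weights fit in the 96 GB of HBM alone versus the full 576 GB unified pool. The bytes-per-parameter figures are rule-of-thumb assumptions that ignore KV cache and runtime overhead, so treat the results as rough guidance only.

```python
# Weights-only footprint of a 405B-parameter model at different precisions,
# compared against GH200 memory: 96 GB of HBM vs. the 576 GB unified pool
# (HBM + LPDDR5X reachable over NVLink-C2C).
HBM_GB, UNIFIED_GB = 96, 576
PARAMS_B = 405

for name, bytes_per_param in [("fp16", 2), ("fp8", 1), ("int4", 0.5)]:
    gb = PARAMS_B * bytes_per_param   # billions of params x bytes/param = GB
    if gb <= HBM_GB:
        verdict = "fits in HBM"
    elif gb <= UNIFIED_GB:
        verdict = "needs the unified pool"
    else:
        verdict = "too large even for the unified pool"
    print(f"405B @ {name}: ~{gb:.0f} GB -> {verdict}")
# 405B @ fp16: ~810 GB -> too large even for the unified pool
# 405B @ fp8:  ~405 GB -> needs the unified pool
# 405B @ int4: ~202 GB -> needs the unified pool
```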

A100 instances

These instances are powered by eight NVIDIA A100 Tensor Core GPUs, each providing 80 GB of high-bandwidth HBM2e memory. As a foundational accelerator for the previous generation of AI, these are versatile instances that deliver excellent performance for mixed-precision workloads. The GPUs are interconnected with NVIDIA NVLink, making them a powerful and proven choice for distributed training. Their configuration offers a cost-effective solution for a wide range of demanding AI and data analytics tasks.

  • Primary Use Cases: AI model training, high-performance inference, and various HPC applications including scientific simulations.
  • Recommended Models: Training models up to 40B parameters, such as Qwen2 32B, from scratch; a standard choice for fine-tuning most large open-source models. For inference, they can handle models up to 70B, such as Llama 3 70B (with quantization), on a per-GPU basis.

Specifications: A100

  Specification       Value
  Instance ID         gd-8xa100-i128
  GPU Count           8
  GPU RAM (GB)        80
  CPU Model           Intel Ice Lake (8358, 2.60 GHz)
  vCPU Count          128
  RAM (GB)            2048
  Local Disk (TB)     7.68
  Network Speed       Single-port 100GbE
  GPU Connectivity    NVLink
  Cost per Hour       $21.60
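
As an example of the fine-tuning role this instance often plays, the sketch below attaches LoRA adapters to an open-weight model with Hugging Face transformers and peft. The model name, target modules, and LoRA hyperparameters are illustrative assumptions rather than a tuned recipe; because only the small adapter matrices are trained, a mid-sized model fits comfortably within a single 80 GB A100.

```python
# Minimal LoRA fine-tuning setup: the base model stays frozen and only the
# low-rank adapter matrices are trained.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Meta-Llama-3-8B"  # illustrative; any open-weight model works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.bfloat16, device_map="auto"
)

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attach adapters to attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of the base parameters
# From here, train with the usual Trainer or a plain PyTorch loop on your dataset.
```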