AI models with trillions of parameters are becoming increasingly common, and the demand for computational power is surging. Traditional GPU solutions struggle to meet these demands, leading to development bottlenecks, high energy consumption, and escalating costs. CoreWeave's GB200 and GB300 NVL72-powered instances address these challenges by harnessing the groundbreaking architecture of NVIDIA's Grace Blackwell Superchip and NVLink Switch System. Liquid cooling improves overall efficiency by consuming less energy than traditional air-cooled systems. These instances represent the pinnacle of our high-performance computing offerings. Choose them when you need maximum performance for large-scale AI training and inference, unprecedented memory capacity for massive datasets, and ultra-fast GPU-to-GPU communication for distributed computing.
GB200 instances
CoreWeave's GB200 instances are powered by 4x NVIDIA GB200 GPUs and connected with 400 Gb/s NDR InfiniBand, built on the NVIDIA Quantum-2 InfiniBand fabric.

GB300 instances
CoreWeave's GB300 instances are offered in two specialized networking configurations, each delivering a groundbreaking 800 Gbps of bandwidth:

- GB300 instances with Quantum-X InfiniBand are optimized for the lowest possible latency in traditional HPC and massive-scale AI training.
- GB300 instances with Spectrum-X RoCE (RDMA over Converged Ethernet) leverage BlueField-3 and ConnectX-8 SuperNICs for large-scale AI in Ethernet-based cloud environments.
Deploy NVL72-powered instances as full racks
Because NVL72-powered instances must be deployed as full racks, CoreWeave's Day 2+ automation cannot automatically replace a misbehaving Node with one from a different rack. Instead, NVL72-powered Nodes must be physically exchanged within the same rack. As a best practice, workloads should tolerate up to two unavailable Nodes per rack for maintenance purposes. If a rack experiences more than two unavailable Nodes, the entire rack is cordoned and drained for service.

Manage Pod affinity
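One way to express the "tolerate up to two unavailable Nodes" guidance in Kubernetes is with a PodDisruptionBudget that caps voluntary disruptions. The sketch below is illustrative only: the budget name and the `app: nvl72-training` selector label are assumptions, not names from CoreWeave's documentation.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nvl72-training-pdb        # hypothetical name
spec:
  # Allow at most two of the job's Pods (one per Node) to be
  # disrupted at a time, matching the per-rack maintenance guidance.
  maxUnavailable: 2
  selector:
    matchLabels:
      app: nvl72-training         # hypothetical label on the job's Pods
```

Note that a PodDisruptionBudget governs only voluntary evictions (such as drains); workloads should still be designed to make progress if up to two Nodes in the rack fail unexpectedly.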
To fully leverage the capabilities of the NVL72 architecture's shared NVLink fabric, all Nodes for the same job must be scheduled onto the same rack, within the same NVLink domain, for optimal performance. This is especially important for large-scale distributed computing tasks, where efficient communication between GPUs can dramatically reduce processing times.

Slurm users should use the Topology/Block Plugin for Slurm to control job placement.
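As a rough illustration of how the Topology/Block Plugin groups Nodes into rack-sized blocks, a configuration might look like the fragment below. The node names, block names, and block size are assumptions for this sketch (an NVL72 rack of 4-GPU GB200 Nodes would contain 18 Nodes); consult the Slurm topology documentation for the authoritative syntax.

```
# slurm.conf (excerpt)
TopologyPlugin=topology/block

# topology.conf (excerpt) -- one block per NVL72 rack
BlockName=rack1 Nodes=gb200-rack1-[01-18]   # hypothetical node names
BlockName=rack2 Nodes=gb200-rack2-[01-18]
BlockSizes=18
```

With blocks defined this way, Slurm can keep a job's Nodes within a single rack, and therefore within a single NVLink domain.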
Control placement with NVLink domain
Kubernetes controls Pod placement with affinity rules that steer Pods toward Nodes with specific labels. In CKS, all Nodes are labeled with their NVLink domain, allowing precise control over Pod placement. To ensure multiple Pods are scheduled onto the same NVL72 rack, set their affinity toward Nodes within the same NVLink domain. In the NVL72 architecture, all Nodes within the same rack share the same unique `ds.coreweave.com/nvlink.domain` label. If a Node Pool spans multiple racks, Pods can reference multiple NVLink domains in `matchExpressions.values`.
For example, this Pod affinity rule targets a single NVLink domain:
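A minimal sketch of such a rule follows. The Pod name, container image, and the NVLink domain value are placeholders for this example; substitute the actual domain value from your Nodes' `ds.coreweave.com/nvlink.domain` label (visible via `kubectl get nodes -L ds.coreweave.com/nvlink.domain`).

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nvl72-worker                       # hypothetical Pod name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: ds.coreweave.com/nvlink.domain
                operator: In
                values:
                  - <nvlink-domain-id>     # replace with your rack's domain value
  containers:
    - name: worker
      image: <your-training-image>         # placeholder image
```

Applying the same affinity rule to every Pod in the job ensures they all land on Nodes sharing one NVLink domain, and therefore one NVL72 rack.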