A capacity plan defines how you secure and pay for compute capacity on the CoreWeave platform. It governs your commitment level, billing structure, and capacity guarantees. The right choice affects your costs, availability, and flexibility.

CoreWeave Kubernetes Service (CKS) offers four capacity plans: Flex Reservations, Reserved Instances, Spot Instances, and On-Demand. The table below compares them by capacity, commitment, and pricing:
|                    | Flex Reservations                       | Reserved Instances | Spot Instances                  | On-Demand        |
| Capacity guarantee | Yes                                     | Yes                | No                              | No               |
| Commitment         | Term duration                           | Term duration      | None                            | None             |
| Preemptible        | No                                      | No                 | Yes                             | No               |
| Pay-as-you-go      | Yes for usage rate; no for holding rate | No                 | Yes                             | Yes              |
| Ideal for          | Variable usage curves                   | Fixed workloads    | Ad-hoc, interruptible workloads | Ad-hoc workloads |

Each capacity plan balances cost and flexibility differently. Workloads with uneven capacity demands benefit from the Flex model. Fixed workloads work best with the low flat rate of a Reserved Instance.

Flex Reservations

Flex Reservations provide guaranteed peak capacity while reducing the cost of idle capacity. This model suits critical workloads that need guaranteed capacity but have low enough utilization to make full-time Reserved Instances cost-prohibitive.

Dual-rate billing structure

Flex uses two rates: a Holding rate for reserved capacity and a Usage rate when instances are active.
  • Holding rate: A baseline fee that guarantees capacity is held for you. This rate applies to all reserved capacity throughout the billing cycle.
  • Usage rate: An incremental fee applied only when instances are active in your cluster.
With Flex Reservations, reserved Nodes are billed at the Holding rate throughout the billing cycle, while used Nodes are billed at the Usage rate.

[Figure: area chart of Node usage by day of month against a constant peak capacity line. Usage varies; the peak line is the committed capacity ceiling.]
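
The dual-rate structure reduces to simple arithmetic. The following is a minimal sketch, not CoreWeave's billing code: the function name `flex_cost` and both rate values are hypothetical placeholders chosen for illustration, and actual rates come from your contract.

```python
def flex_cost(reserved_node_hours, active_node_hours, holding_rate, usage_rate):
    """Return the total Flex cost for one billing period.

    The Holding rate is charged on every reserved node-hour; the Usage rate
    is charged only on node-hours when Flex Nodes were active.
    """
    if active_node_hours > reserved_node_hours:
        # Usage beyond the reservation is not Flex usage; it would be
        # attributed elsewhere (for example, as On-Demand).
        raise ValueError("active hours cannot exceed reserved hours")
    holding_fee = reserved_node_hours * holding_rate
    usage_fee = active_node_hours * usage_rate
    return holding_fee + usage_fee


# Hypothetical rates (currency units per node-hour), for illustration only.
total = flex_cost(
    reserved_node_hours=720,  # one Node held for a 30-day month
    active_node_hours=300,    # hours the Node was actually active
    holding_rate=2.0,
    usage_rate=3.0,
)
print(total)  # 2340.0
```

With these placeholder numbers, the holding fee is 1440.0 and the usage fee is 900.0, so idle hours cost only the Holding rate.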

Reserved Instances

Reserved Instances (RI) provide a guaranteed capacity reservation. During your term commitment, the reserved instances are available in your chosen region regardless of platform-wide demand. You pay a flat rate for every hour in the billing period, regardless of whether instances are running or idle.
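
One way to reason about the choice between a flat Reserved Instance rate and Flex is a break-even utilization. The sketch below assumes costs are compared per node-hour, that utilization is the fraction of hours a Node is active, and that Flex costs the Holding rate plus utilization times the Usage rate. The function name and all three rates are hypothetical; this is an illustration under those assumptions, not a CoreWeave pricing formula.

```python
def break_even_utilization(ri_rate, holding_rate, usage_rate):
    """Return the utilization (0.0 to 1.0) at which Flex and RI cost the same.

    Per node-hour, Flex costs holding_rate + u * usage_rate and a Reserved
    Instance costs ri_rate. Setting them equal and solving for u gives
    u = (ri_rate - holding_rate) / usage_rate.
    Below the returned value Flex is cheaper; above it the RI is cheaper.
    """
    if usage_rate <= 0:
        raise ValueError("usage_rate must be positive")
    u = (ri_rate - holding_rate) / usage_rate
    # Clamp to the meaningful range: 0.0 means the RI is never more
    # expensive, 1.0 means Flex is never more expensive.
    return max(0.0, min(1.0, u))


# Hypothetical rates (currency units per node-hour), for illustration only.
u = break_even_utilization(ri_rate=4.0, holding_rate=2.0, usage_rate=3.0)
print(f"{u:.2f}")  # 0.67
```

Under these placeholder rates, a Node that is active less than about two thirds of the time would cost less on Flex than on a Reserved Instance.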

Spot Instances

Spot Instances provide access to unused CoreWeave capacity at steep discounts. Spot Instances are preemptible: CoreWeave can reclaim them at any time if the capacity is required elsewhere. They work well for fault-tolerant or stateless workloads like AI inference. Spot Instances are managed through Spot Node Pools.

On-Demand

On-Demand instances provide burst capacity for unpredictable workloads that need to scale up to meet demand. These instances are billed based on actual use, without a long-term commitment. They carry no capacity guarantee, so On-Demand instances may not be available during peak demand periods.

Understand billing and usage attribution

Combining capacity plans lets you reserve a “floor” of capacity for steady-state workloads while keeping headroom for spikes. The billing pipeline attributes usage at the SKU level and produces an itemized invoice from your contract terms and real-time usage, with separate line items for Reserved Instances, Flex holding fees, and Flex usage fees.

What appears on your invoice

Billing reconciles all consumption into one invoice. Each line item shows the SKU, quantity (hours), and the rate type that was applied. The following table illustrates how a typical month might be broken out for a customer with a B200 Flex Reservation who also used On-Demand B200 capacity and RTX Pro 6000 Spot capacity.
| Line item                     | Quantity  | Rate type     | How usage is attributed                                                           |
| B200 (Flex reservation)       | 720 hours | Holding rate  | Applied to every hour in the billing period to hold the reserved capacity.        |
| B200 (Active usage)           | 300 hours | Usage rate    | Applied only to hours when Flex Nodes were active in your cluster.                |
| B200 (On-Demand)              | 12 hours  | Standard rate | Usage exceeded Flex capacity during a peak; attributed as On-Demand overage.      |
| RTX Pro 6000 (Spot node pool) | 100 hours | Spot rate     | Separate compute class; attributed outside the RI/Flex/On-Demand filling logic.   |
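
To make the invoice structure concrete, the sketch below totals the four line items from the table. The quantities come from the table; the `LineItem` class and every rate are hypothetical placeholders, since the source gives no prices. Treat it as an illustration of "quantity times rate per line item", not as the real invoice format.

```python
from dataclasses import dataclass


@dataclass
class LineItem:
    name: str
    hours: float
    rate_type: str
    rate: float  # hypothetical price per hour; real rates come from your contract

    @property
    def amount(self) -> float:
        # Each line item is billed as quantity (hours) times the applied rate.
        return self.hours * self.rate


# Quantities match the table above; all rates are made up for illustration.
items = [
    LineItem("B200 (Flex reservation)", 720, "Holding rate", 2.0),
    LineItem("B200 (Active usage)", 300, "Usage rate", 3.0),
    LineItem("B200 (On-Demand)", 12, "Standard rate", 6.0),
    LineItem("RTX Pro 6000 (Spot node pool)", 100, "Spot rate", 1.5),
]

for item in items:
    print(f"{item.name}: {item.hours} h x {item.rate} ({item.rate_type}) = {item.amount}")

total = sum(item.amount for item in items)
print(total)  # 2562.0
```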

Order of attribution

Attribution runs every 30 seconds. For each SKU, usage is applied in this order:
  1. Reserved Instances: Usage fills all available Reserved Instance capacity first.
  2. Flex: Remaining usage is applied to the Flex capacity band.
  3. On-Demand: Usage beyond the sum of Reserved Instance and Flex capacity is billed as On-Demand overage.
Nodes with Spot compute class are handled separately and appear as their own line items on the invoice.
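
The attribution order above can be modeled as a simple waterfall. The sketch below is an assumption-laden illustration for a single SKU, not the actual billing pipeline: the function names, the capacity values, and the idea of feeding it one active-Node count per 30-second interval are all hypothetical. Spot Nodes are excluded because they are attributed separately.

```python
INTERVAL_SECONDS = 30  # attribution cadence stated in the text


def attribute_interval(active_nodes, ri_capacity, flex_capacity):
    """Split one interval's active Nodes across RI, Flex, and On-Demand."""
    reserved = min(active_nodes, ri_capacity)   # 1. fill Reserved Instances first
    remaining = active_nodes - reserved
    flex = min(remaining, flex_capacity)        # 2. then the Flex capacity band
    on_demand = remaining - flex                # 3. anything left is On-Demand
    return {"reserved": reserved, "flex": flex, "on_demand": on_demand}


def attribute_period(samples, ri_capacity, flex_capacity):
    """Accumulate node-hours per band from per-interval active-Node counts."""
    node_seconds = {"reserved": 0, "flex": 0, "on_demand": 0}
    for active_nodes in samples:
        split = attribute_interval(active_nodes, ri_capacity, flex_capacity)
        for band, nodes in split.items():
            node_seconds[band] += nodes * INTERVAL_SECONDS
    # Convert accumulated node-seconds to node-hours.
    return {band: seconds / 3600 for band, seconds in node_seconds.items()}


# One hour (120 intervals) with 7 active Nodes, against hypothetical
# capacities of 4 Reserved Instance Nodes and 2 Flex Nodes.
hours = attribute_period([7] * 120, ri_capacity=4, flex_capacity=2)
print(hours)  # {'reserved': 4.0, 'flex': 2.0, 'on_demand': 1.0}
```

Accumulating integer node-seconds and dividing once at the end avoids floating-point drift from adding many small fractions of an hour.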

How each line item is applied

  • Flex (Holding + Usage): The Holding rate applies to every hour in the period (for example, 720 hours in a 30-day month) to guarantee the reservation. The Usage rate applies only to the hours when Flex Nodes were actually active in the cluster. Total Flex cost is the sum of those two components.
  • On-Demand: When usage for a SKU exceeds your Reserved Instance and Flex capacity, the excess is attributed at the standard On-Demand rate. In the table above, the 12 On-Demand hours represent a peak that went beyond the Flex ceiling.
  • Spot: Workloads scheduled on Spot Node Pools are billed at the Spot rate for that SKU and region. Spot is tracked as a separate compute class, so it does not consume or affect your RI or Flex capacity.
For SKU-specific pricing or to convert a Flex Reservation to a Reserved Instance, contact your CoreWeave Account Executive.
Last modified on April 2, 2026