

The CoreWeave Inference API provides programmatic control over inference gateways, model deployments, and capacity claims. The API is available at api.coreweave.com. This page covers cross-cutting topics: authentication, protocols, status values, error formats, and the OpenAI-compatible inference endpoint. For per-endpoint request and response schemas, see the per-operation pages under each service in the left sidebar.
The Inference API is versioned as v1alpha1. APIs may change before general availability.

Authentication

All API requests must include a CoreWeave API access token in the Authorization header as a Bearer token. The token must belong to a user with the Inference Viewer or Inference Admin role, depending on the operation.
curl "https://api.coreweave.com/v1alpha1/inference/gateways" \
  -H "Authorization: Bearer [API-TOKEN]"
For details on obtaining an API token, see Manage API access tokens.
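The same authenticated request can be built programmatically. The sketch below uses Python's standard library; the token value is a placeholder you must supply yourself.

```python
import urllib.request

API_TOKEN = "..."  # your CoreWeave API access token (placeholder)

# Build the same request as the curl example above, attaching the
# token as a Bearer credential in the Authorization header.
req = urllib.request.Request(
    "https://api.coreweave.com/v1alpha1/inference/gateways",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
)

# To send it:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```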

Protocol support

The Inference API supports multiple protocols:
| Protocol | Description |
| --- | --- |
| REST/JSON | Standard HTTP/1.1 with JSON request and response bodies. All examples in this documentation use REST. |
| gRPC | Protocol buffers over HTTP/2 for high-performance programmatic access. |
| Connect | gRPC-compatible protocol with improved browser and HTTP/1.1 support. |

Query parameters

List endpoints support the following query parameter:
| Parameter | Type | Description |
| --- | --- | --- |
| updatedAfter | date-time | Filter resources to those updated after the specified timestamp (ISO 8601 format). |
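As a sketch, a filtered list URL can be assembled like this; the timestamp shown is an arbitrary example value, and percent-encoding the colons is handled by the standard library.

```python
from urllib.parse import urlencode

base = "https://api.coreweave.com/v1alpha1/inference/gateways"

# Filter to resources updated after an ISO 8601 timestamp
# (example value). urlencode percent-escapes the colons.
params = {"updatedAfter": "2025-01-01T00:00:00Z"}
url = f"{base}?{urlencode(params)}"
```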

Status values

All resources share a common set of status values:
| Status | Description |
| --- | --- |
| STATUS_CREATING | Resource is being provisioned. |
| STATUS_READY | Resource is active and operational. |
| STATUS_UPDATING | Resource configuration is being updated. |
| STATUS_DELETING | Resource is being removed. |
| STATUS_ERROR | Resource encountered a recoverable error. |
| STATUS_FAILED | Resource encountered a terminal error. |
Each resource includes a conditions array in its status with detailed information about the current state, including timestamps, reasons, and human-readable messages.
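A client polling a resource typically treats these values in two groups: states to keep waiting through, and states to stop on. The helpers below are a minimal sketch; the `is_settled` grouping and the shape assumed for the conditions array (a list of objects with a `message` field) are this example's assumptions, not part of the API contract.

```python
# Grouping assumed by this sketch: STATUS_READY and STATUS_FAILED end
# polling; the others (including the recoverable STATUS_ERROR) do not.
SETTLED = {"STATUS_READY", "STATUS_FAILED"}

def is_settled(status: str) -> bool:
    """Return True once a resource has reached a state a poller can stop on."""
    return status in SETTLED

def latest_message(resource: dict) -> str:
    """Read the human-readable message from the most recent entry in the
    resource's status.conditions array (field layout assumed)."""
    conditions = resource.get("status", {}).get("conditions", [])
    return conditions[-1]["message"] if conditions else ""
```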

Error responses

Error responses follow the standard format:
{
  "code": 400,
  "message": "name is required",
  "details": []
}
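A client can unpack this format generically. The helper below is a small sketch assuming only the `code` and `message` fields shown above.

```python
import json

def parse_error(body: str) -> tuple[int, str]:
    """Extract the numeric code and message from an error response body."""
    err = json.loads(body)
    return err["code"], err["message"]

code, message = parse_error(
    '{"code": 400, "message": "name is required", "details": []}'
)
```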

OpenAI-compatible endpoint

Deployed models expose an OpenAI-compatible completions endpoint through their associated gateway. The endpoint URL depends on the gateway’s routing strategy. See Gateways for details on how each routing strategy constructs the request URL. For a complete walkthrough, see the Getting started guide.
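To illustrate the shape of a request against such an endpoint, here is a hedged sketch using the standard OpenAI chat-completions payload. The gateway URL and model name are placeholders; the real URL depends on your gateway's routing strategy as described in Gateways.

```python
import json
import urllib.request

# Placeholders: substitute your gateway's URL (per its routing
# strategy) and your deployment's model name.
GATEWAY_URL = "https://my-gateway.example.com/v1/chat/completions"
API_TOKEN = "..."  # CoreWeave API access token

# Standard OpenAI-style chat-completions request body.
payload = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "Hello"}],
}

req = urllib.request.Request(
    GATEWAY_URL,
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# To send it:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```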
Last modified on May 6, 2026