# Inference API reference

> Authentication, protocols, status values, and error formats for the CoreWeave Inference API

The CoreWeave Inference API provides programmatic control over inference [gateways](/products/inference/gateways), model [deployments](/products/inference/models), and [capacity claims](/products/inference/scaling#capacity-claims). The API is available at `api.coreweave.com`.

This page covers cross-cutting topics: authentication, protocols, status values, error formats, and the OpenAI-compatible inference endpoint. For per-endpoint request and response schemas, see the per-operation pages under each service in the left sidebar:

* [CapacityClaimService](/products/inference/reference/api-overview/capacityclaimservice/list-capacity-claims)
* [DeploymentService](/products/inference/reference/api-overview/deploymentservice/list-deployments)
* [GatewayService](/products/inference/reference/api-overview/gatewayservice/list-gateways)

<Note>
  The Inference API is versioned as `v1alpha1`. APIs may change before general availability.
</Note>

## Authentication

Every request must include a CoreWeave API access token as a Bearer token in the `Authorization` header. The API uses this token to identify the caller and authorize the request. The token must belong to a user with the [Inference Viewer or Inference Admin role](/security/iam/access-policies), depending on the operation.

Replace `[API-TOKEN]` with your CoreWeave API access token.

```bash theme={"system"}
curl "https://api.coreweave.com/v1alpha1/inference/gateways" \
  -H "Authorization: Bearer [API-TOKEN]"
```
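The same header can be attached from Python. The following is a minimal sketch using only the standard library. The helper name `authorized_request` and the `Accept` header are illustrative assumptions, not part of the API. The `gateways` path is taken from the curl example above.

```python
import urllib.request

API_BASE = "https://api.coreweave.com/v1alpha1/inference"


def authorized_request(path: str, token: str) -> urllib.request.Request:
    """Return a GET request for `path` that carries the bearer token.

    `authorized_request` is a hypothetical helper, not a CoreWeave SDK call.
    """
    if not token:
        raise ValueError("an API access token is required")
    return urllib.request.Request(
        f"{API_BASE}/{path.lstrip('/')}",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: asking for JSON explicitly is harmless for REST calls.
            "Accept": "application/json",
        },
        method="GET",
    )


# Sending the request performs a network call, so it is left commented out:
# with urllib.request.urlopen(authorized_request("gateways", "[API-TOKEN]")) as resp:
#     print(resp.status)
```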

For details on obtaining an API token, see [Manage API access tokens](/security/authn-authz/manage-api-access-tokens).

## Protocol support

You can call the Inference API over several transport protocols, depending on your client tooling and performance needs:

| Protocol  | Description                                                                                           |
| --------- | ----------------------------------------------------------------------------------------------------- |
| REST/JSON | Standard HTTP/1.1 with JSON request and response bodies. All examples in this documentation use REST. |
| gRPC      | Protocol buffers over HTTP/2 for high-performance programmatic access.                                |
| Connect   | gRPC-compatible protocol with improved browser and HTTP/1.1 support.                                  |

## Query parameters

List endpoints support the following query parameter:

| Parameter      | Type        | Description                                                                        |
| -------------- | ----------- | ---------------------------------------------------------------------------------- |
| `updatedAfter` | `date-time` | Filter resources to those updated after the specified timestamp (ISO 8601 format). |
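As an illustration, the sketch below builds a list URL with the `updatedAfter` filter. It assumes that a UTC timestamp with a `Z` suffix is an accepted ISO 8601 form. The helper name `list_url` is hypothetical, and only the `gateways` path is confirmed by the examples on this page.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode

API_BASE = "https://api.coreweave.com/v1alpha1/inference"


def list_url(resource: str, updated_after: Optional[datetime] = None) -> str:
    """Build a list URL, optionally filtered by `updatedAfter`."""
    url = f"{API_BASE}/{resource}"
    if updated_after is None:
        return url
    if updated_after.tzinfo is None:
        # Reject naive datetimes so the timestamp is never ambiguous.
        raise ValueError("updated_after must be timezone-aware")
    # Normalize to UTC and format as ISO 8601 (assumed form: 2025-01-01T00:00:00Z).
    stamp = updated_after.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # urlencode percent-encodes the colons in the timestamp.
    return f"{url}?{urlencode({'updatedAfter': stamp})}"


print(list_url("gateways", datetime(2025, 1, 1, tzinfo=timezone.utc)))
```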

## Status values

All resources share a common set of status values. Use them to determine the lifecycle state of a resource when polling or reconciling state in your application:

| Status            | Description                               |
| ----------------- | ----------------------------------------- |
| `STATUS_CREATING` | Resource is being provisioned.            |
| `STATUS_READY`    | Resource is active and operational.       |
| `STATUS_UPDATING` | Resource configuration is being updated.  |
| `STATUS_DELETING` | Resource is being removed.                |
| `STATUS_ERROR`    | Resource encountered a recoverable error. |
| `STATUS_FAILED`   | Resource encountered a terminal error.    |

Each resource includes a `conditions` array in its status with detailed information about the current state, including timestamps, reasons, and human-readable messages.
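One way to act on these values when polling is sketched below. The mapping from status to action is an assumption based on the descriptions in the table: `STATUS_ERROR` is treated as worth waiting on because it is described as recoverable, and `STATUS_FAILED` stops polling because it is terminal. The names `next_action` and `wait_until_ready` are hypothetical, and the caller supplies `fetch_status`, since this page does not specify where the status field sits in a response.

```python
import time
from typing import Callable

IN_PROGRESS = {"STATUS_CREATING", "STATUS_UPDATING", "STATUS_DELETING"}


def next_action(status: str) -> str:
    """Map a status value to 'done', 'fail', or 'wait' (assumed policy)."""
    if status == "STATUS_READY":
        return "done"
    if status == "STATUS_FAILED":
        return "fail"  # terminal error: polling further will not help
    # In-progress states, the recoverable STATUS_ERROR, and any value this
    # sketch does not recognize are all treated as "keep waiting".
    return "wait"


def wait_until_ready(
    fetch_status: Callable[[], str],
    interval: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll `fetch_status` until the resource is ready or has failed."""
    for _ in range(max_polls):
        status = fetch_status()
        action = next_action(status)
        if action == "done":
            return status
        if action == "fail":
            raise RuntimeError(f"resource reached terminal status {status}")
        sleep(interval)
    raise TimeoutError(f"resource not ready after {max_polls} polls")
```

When polling stops on `STATUS_FAILED`, the resource's `conditions` array is the place to look for the reason.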

## Error responses

When a request fails, the API returns a structured error body that your client can parse to surface details to users or trigger retries. All error responses use the following format:

```json theme={"system"}
{
  "code": 400,
  "message": "name is required",
  "details": []
}
```
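A client can turn this body into an exception. The sketch below relies only on the three fields shown above. The class name `InferenceApiError` and the function `parse_error` are hypothetical, and entries in `details` are kept as opaque values because their shape is not documented here.

```python
import json
from typing import Any, List


class InferenceApiError(Exception):
    """Client-side wrapper for an error body (hypothetical, not an SDK type)."""

    def __init__(self, code: int, message: str, details: List[Any]):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details


def parse_error(body: str) -> InferenceApiError:
    """Parse a JSON error body into an InferenceApiError."""
    data = json.loads(body)
    return InferenceApiError(
        code=int(data.get("code", 0)),
        message=str(data.get("message", "")),
        # Tolerate a missing or null `details` field.
        details=list(data.get("details") or []),
    )


err = parse_error('{"code": 400, "message": "name is required", "details": []}')
print(err)  # prints "400: name is required"
```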

## OpenAI-compatible endpoint

Deployed models expose an OpenAI-compatible completions endpoint through their associated gateway. The endpoint URL depends on the gateway's routing strategy. See [Gateways](/products/inference/gateways#routing-strategies) for details on how each routing strategy constructs the request URL.
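The sketch below shows what such a request might look like. It makes several assumptions: the caller already knows the full endpoint URL for their gateway's routing strategy, the endpoint accepts the same bearer token as the management API, and the body follows the OpenAI completions format (`model`, `prompt`, `max_tokens`). The function name `completion_request` is hypothetical.

```python
import json
import urllib.request


def completion_request(
    endpoint_url: str, token: str, model: str, prompt: str, max_tokens: int = 64
) -> urllib.request.Request:
    """Build a POST request with an OpenAI-style completions body.

    `endpoint_url` must be the full URL for your gateway; this sketch does
    not try to construct it, because it depends on the routing strategy.
    """
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the inference endpoint uses bearer token auth.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Sending the request performs a network call, so it is left commented out:
# with urllib.request.urlopen(completion_request(url, token, model, "Hello")) as resp:
#     print(json.load(resp))
```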

For a complete walkthrough, see the [Getting started](/products/inference/getting-started) guide.