curl --request POST \
  --url https://api.example.com/v1alpha1/inference/deployments \
  --header 'Content-Type: application/json' \
  --data '
{
  "name": "<string>",
  "gatewayIds": [
    "<string>"
  ],
  "runtime": {
    "engine": "<string>",
    "version": "<string>",
    "engineConfig": {}
  },
  "resources": {
    "instanceType": "<string>",
    "gpuCount": 123
  },
  "model": {
    "name": "<string>",
    "bucket": "<string>",
    "path": "<string>"
  },
  "autoscaling": {
    "min": 123,
    "max": 123,
    "priority": 123,
    "capacityClasses": [
      123
    ],
    "concurrency": 123
  },
  "traffic": {
    "weight": 123
  },
  "id": "<string>",
  "disabled": true
}
'

{
  "deployment": {
    "spec": {
      "id": "<string>",
      "name": "<string>",
      "gatewayIds": [
        "<string>"
      ],
      "runtime": {
        "engine": "<string>",
        "version": "<string>",
        "engineConfig": {}
      },
      "resources": {
        "instanceType": "<string>",
        "gpuCount": 123
      },
      "model": {
        "name": "<string>",
        "bucket": "<string>",
        "path": "<string>"
      },
      "autoscaling": {
        "min": 123,
        "max": 123,
        "priority": 123,
        "capacityClasses": [
          123
        ],
        "concurrency": 123
      },
      "traffic": {
        "weight": 123
      },
      "organizationId": "<string>",
      "disabled": true
    },
    "status": {
      "createdAt": "2023-11-07T05:31:56Z",
      "updatedAt": "2023-11-07T05:31:56Z",
      "status": 123,
      "conditions": [
        {
          "type": "<string>",
          "status": 123,
          "lastUpdateTime": "2023-11-07T05:31:56Z",
          "reason": "<string>",
          "message": "<string>",
          "zone": "<string>"
        }
      ]
    }
  }
}

CreateDeployment creates a new deployment.
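For readers calling the endpoint from code rather than curl, here is a minimal sketch using only Python's standard library. It builds the POST request without sending it. The host api.example.com is the placeholder from the curl example, the field values are hypothetical, and the helper name build_create_request is made up for this illustration. Any authentication headers your environment requires are not shown, because the source does not specify them.

```python
import json
import urllib.request

# Placeholder host taken from the curl example; replace with the real API host.
URL = "https://api.example.com/v1alpha1/inference/deployments"


def build_create_request(body: dict) -> urllib.request.Request:
    """Return a POST request carrying `body` as JSON. Nothing is sent here."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        URL,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical values for illustration only; which fields are required
# is not stated in the source.
request = build_create_request(
    {"name": "example-deployment", "gatewayIds": ["example-gateway-id"]}
)

# To actually send it, pass the request to urllib.request.urlopen(request).
```

Building the request separately from sending it makes it easy to inspect the method, URL, and body before any network call is made.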
Request for CreateDeployment
name: The name of the deployment.
gatewayIds: The gateways to associate the deployment with.
runtime: Runtime selection and configuration (engine, version, engineConfig).
resources: Resource configuration for the deployment (instanceType, gpuCount).
model: The model configuration (name, bucket, path).
autoscaling: The autoscaling configuration (min, max, priority, capacityClasses, concurrency).
traffic: The traffic configuration for the deployment (weight).
id: The unique identifier of the deployment, in UUID format.
disabled: Set to true to disable the deployment.
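The request fields can be assembled programmatically. The sketch below is one possible helper, not part of any official client: the function name create_deployment_body and its parameter names are hypothetical, the check that min does not exceed max is an assumption of this sketch rather than a documented API rule, and the optional-looking fields (priority, capacityClasses, concurrency, traffic, id) are omitted for brevity.

```python
import json


def create_deployment_body(name, gateway_ids, engine, version, instance_type,
                           gpu_count, model_name, bucket, path,
                           autoscaling_min, autoscaling_max, disabled=False):
    """Assemble a request body using the field names from the example request."""
    # Sanity check assumed by this sketch; the source does not document it.
    if autoscaling_min > autoscaling_max:
        raise ValueError("autoscaling min must not exceed max")
    return {
        "name": name,
        "gatewayIds": list(gateway_ids),
        "runtime": {"engine": engine, "version": version, "engineConfig": {}},
        "resources": {"instanceType": instance_type, "gpuCount": gpu_count},
        "model": {"name": model_name, "bucket": bucket, "path": path},
        "autoscaling": {"min": autoscaling_min, "max": autoscaling_max},
        "disabled": disabled,
    }


# All values below are hypothetical placeholders.
body = create_deployment_body(
    name="example-deployment",
    gateway_ids=["example-gateway-id"],
    engine="example-engine",
    version="example-version",
    instance_type="example-instance-type",
    gpu_count=1,
    model_name="example-model",
    bucket="example-bucket",
    path="example/path",
    autoscaling_min=1,
    autoscaling_max=2,
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to curl's --data option or sent as the body of any HTTP client's POST request.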
OK
Response for CreateDeployment
deployment: The deployment that was created (spec, status).
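To show how the response might be consumed, the following sketch parses a response shaped like the example and pulls out a few fields. The sample text and its values are made up, and summarize_deployment is a hypothetical helper. The meaning of the numeric status values is not documented in the source, so the sketch does not interpret them.

```python
import json

# Placeholder response text shaped like the example response; values are made up.
SAMPLE_RESPONSE = """
{
  "deployment": {
    "spec": {"id": "example-id", "name": "example-deployment", "disabled": false},
    "status": {
      "createdAt": "2023-11-07T05:31:56Z",
      "updatedAt": "2023-11-07T05:31:56Z",
      "status": 1,
      "conditions": [
        {"type": "ExampleType", "status": 1, "reason": "ExampleReason",
         "message": "example message", "zone": "example-zone"}
      ]
    }
  }
}
"""


def summarize_deployment(response_text: str) -> dict:
    """Extract the identifier, name, creation time, and conditions."""
    deployment = json.loads(response_text)["deployment"]
    spec, status = deployment["spec"], deployment["status"]
    return {
        "id": spec["id"],
        "name": spec["name"],
        "createdAt": status["createdAt"],
        # Each condition is reduced to (type, reason, message) for display.
        "conditions": [
            (c["type"], c["reason"], c["message"])
            for c in status.get("conditions", [])
        ],
    }


summary = summarize_deployment(SAMPLE_RESPONSE)
print(summary["id"], summary["createdAt"])
```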