Object Storage
Use CoreWeave's S3-compatible Object Storage for flexible and efficient data storage
CoreWeave Object Storage is an S3-compatible storage system that allows data to be stored and used in a flexible and efficient way, with features that include:
- CoreWeave Object Storage supports multiple regions, allowing you to utilize regionally optimized clusters for your needs.
- Any S3-compatible CLI tool or SDK integration may be used in tandem with Object Storage.
- To get started with Object Storage, generate a token in the Cloud UI, or deploy with kubectl. Then, manage your data with Kubernetes, Python, or tools like s3cmd, s5cmd, and rclone.
- Accelerated Object Storage uses anycasted NVMe-backed storage caches for blazing-fast downloads of model weights, training data, and other frequently accessed objects.
Get started
Object Storage is easily deployed and configured with the Cloud UI. Advanced users who need more control, encryption, or fine-grained bucket policies should deploy with Kubernetes.
Using the Cloud UI
Use the Object Storage section of the Cloud UI to generate tokens and s3cmd configuration files. A token is a key-pair used by utilities such as s3cmd, s5cmd, rclone, Boto3, and more.
To create one, click Create a New Token. This brings up the Create Token modal, which prompts you to assign a name, default region (which can be changed later), and an access level.
If using s3cmd, select Automatically download s3cmd config.
Once created, the Access and Secret keys are available on the Object Storage page. Click the icon to the right of each key to copy it to the clipboard.
The download icon on the far right generates a new s3cmd configuration, with an opportunity to set the default region.
Please see our s3cmd documentation to learn how to use the config file.
About access levels
When creating the token, choose the desired access level: Read, Write, Read/Write, or Full. To learn more about the capabilities of each level, please see Identity and Access Management (IAM) and access levels below.
From the Object Storage page, the Access Level field displays the key's current access level.
Using Custom Resource Definitions (CRDs)
CoreWeave provides Kubernetes Custom Resource Definitions (CRDs) for programmatic and automated methods of generating access to Object Storage clusters.
In most cases, a single user is sufficient. If separate credentials are needed, generate them by deploying a user CRD for each user. More granular permissions can be created with bucket policies.
The user CRD
Deploying a user CRD creates a user with access to the Object Storage clusters. Each user has an access key and a secret key, which are stored in a Kubernetes Secret in the namespace.
The Secret's name can be controlled with spec.secretName, as shown below. If not specified, the Secret is named <namespace>-<metadata.name>-obj-store-creds. The Secret is associated with the user, and deleted when the user is deleted.
Tip
As a best practice, do not share a secretName between multiple users. Because Secrets are associated with users, they are automatically deleted when the user is deleted, which disrupts other users that share the same Secret name. Unless setting the secretName is important for the use case, consider leaving it blank to generate the default name.
After creating the user CRD, it can be viewed and deleted with kubectl. It can also be viewed and deleted in the Cloud UI as described above, because both methods operate on the same resource.
Create the user CRD
This CRD creates the my-example user with full permissions and the my-example-secret Secret.

To create the user with kubectl:
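A minimal sketch of the manifest and the kubectl command. The apiVersion, kind, and namespace shown here are assumptions (verify the exact group, version, and fields with `kubectl api-resources` and `kubectl explain`); only spec.secretName and the access level come from this page:

```shell
# Hypothetical manifest for the user CRD; apiVersion, kind, and the
# spec.access values are assumptions. my-example and my-example-secret
# match the example above.
cat > my-example-user.yaml <<'EOF'
apiVersion: objectstorage.coreweave.com/v1alpha1
kind: User
metadata:
  name: my-example
  namespace: tenant-example
spec:
  access: full                  # assumed keys: read | write | readwrite | full
  secretName: my-example-secret
EOF

# Create the user in the cluster
kubectl apply -f my-example-user.yaml
```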
View the user CRD and Secret
To view the user:
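For example, assuming the resource is exposed as users.objectstorage.coreweave.com (check the real name with `kubectl api-resources | grep -i objectstorage`):

```shell
# List user resources in the current namespace, then inspect one;
# the fully qualified resource name is an assumption.
kubectl get users.objectstorage.coreweave.com
kubectl describe users.objectstorage.coreweave.com my-example
```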
To view my-example-secret:
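Since the credentials live in an ordinary Kubernetes Secret, the standard kubectl commands apply:

```shell
# Show the Secret created alongside the user
kubectl get secret my-example-secret
kubectl describe secret my-example-secret
```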
To view the access key and secret key in my-example-secret, extract and Base64-decode them. These are used by tools like rclone and s3cmd:
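A sketch of the extraction, assuming the Secret's data keys are named accessKey and secretKey (list the actual keys first if unsure):

```shell
# The data keys "accessKey" and "secretKey" are assumptions; list the real
# ones with: kubectl get secret my-example-secret -o jsonpath='{.data}'
kubectl get secret my-example-secret -o jsonpath='{.data.accessKey}' | base64 --decode; echo
kubectl get secret my-example-secret -o jsonpath='{.data.secretKey}' | base64 --decode; echo
```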
Delete the user
To delete the user with kubectl:
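Assuming the same resource name as above:

```shell
# Deleting the user also deletes its associated Secret
kubectl delete users.objectstorage.coreweave.com my-example
```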
The corresponding Secret is deleted automatically.
Object Storage endpoints
When retrieving files with HTTPS, use the endpoint for the region where the bucket is located.
Region | Endpoint |
---|---|
New York - LGA1 | object.lga1.coreweave.com |
Chicago - ORD1 | object.ord1.coreweave.com |
Las Vegas - LAS1 | object.las1.coreweave.com |
Each endpoint represents an independent, exclusive object store. This means that objects stored in ORD1 buckets are not accessible from the LAS1 region, and so on.
Bucket names must be unique per region. It is a recommended practice to include the region name (such as ord1) in the name of the bucket.
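For example, with s3cmd configured for a region, a bucket whose name embeds that region might be created like this (the bucket name is illustrative):

```shell
# "my-dataset-ord1" is an illustrative name that embeds the region;
# bucket names must be unique per region.
s3cmd mb s3://my-dataset-ord1
s3cmd ls
```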
Users may use any regional Object Storage endpoint and create and use buckets as they wish, but each region comes with its own quota limit. The default quota limit is 30TB of data per region.
Note
Should you require an increase in your quota limit, please contact support.
Accelerated Object Storage
CoreWeave also offers Accelerated Object Storage, a series of Anycasted NVMe-backed storage caches that provide blazing fast download speeds. Accelerated Object Storage is best suited for frequently accessed data that doesn't change often, such as model weights and training data.
One of the biggest advantages of Anycasted Object Storage Caches is that data can be pulled from across data center regions, then cached in the data center where your workloads are located.
For example, if your models are hosted in ORD1 (Chicago) but a deployment scales to all regions (ORD1, LAS1, LGA1), your call to https://accel-object.ord1.coreweave.com will be routed to the cache located closest to the workload: if you are calling from LGA1, it hits the cache in LGA1; if you are calling from LAS1, it hits the cache in LAS1. This drastically reduces spin-up times for workloads where scaling is a concern.
Note
When using Accelerated Object Storage, there's no need to change the endpoint for every region your application is deployed in - this is the beauty of it!
Use of CoreWeave's Accelerated Object Storage is available at no additional cost. To use Accelerated Object Storage, simply modify your Object Storage endpoint to one of the addresses that corresponds to your Data Center region.
Region | Endpoint |
---|---|
Las Vegas - LAS1 | accel-object.las1.coreweave.com |
New York - LGA1 | accel-object.lga1.coreweave.com |
Chicago - ORD1 | accel-object.ord1.coreweave.com |
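For instance, an existing s3cmd configuration can be pointed at the accelerated ORD1 endpoint (taken from the example above) with the standard --host and --host-bucket overrides; the bucket and object names here are illustrative:

```shell
# Fetch an object through the anycast cache; only the endpoint is from
# this page, the bucket and object names are illustrative.
s3cmd --host=accel-object.ord1.coreweave.com \
      --host-bucket='%(bucket)s.accel-object.ord1.coreweave.com' \
      get s3://my-models-ord1/weights.bin ./weights.bin
```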
Server Side Encryption
Note
Server Side Encryption is implemented according to AWS SSE-C standards.
CoreWeave supports Server Side Encryption via customer-provided encryption keys. The client passes an encryption key along with each request to read or write encrypted data. No modifications to your bucket need to be made to enable Server Side Encryption (SSE-C); simply specify the required encryption headers in your requests.
Important
It is the client’s responsibility to manage all keys, and to remember which key is used to encrypt each object.
SSE with customer-provided keys (SSE-C)
The following headers are used to specify SSE-C options.
Name | Description |
---|---|
x-amz-server-side-encryption-customer-algorithm | Use this header to specify the encryption algorithm. The header value must be AES256. |
x-amz-server-side-encryption-customer-key | Use this header to provide the 256-bit, Base64-encoded encryption key to encrypt or decrypt your data. |
x-amz-server-side-encryption-customer-key-MD5 | Use this header to provide the Base64-encoded, 128-bit MD5 digest of the encryption key according to RFC 1321. This header is used as a message integrity check to ensure that the encryption key was transmitted without error or interference. |
Server Side Encryption example
The following example demonstrates using an S3 tool to configure Server Side Encryption for Object Storage.
Note
Because SSE with static keys is not supported by s3cmd at this time, the AWS CLI tool is used for this example. For a full explanation of the parameters used with the s3 tool in this example, review the AWS CLI s3 documentation.
First, run aws configure to set up access and to configure your Access and Secret Keys.
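The configuration step is interactive:

```shell
# Enter the Access Key and Secret Key generated in the Cloud UI when
# prompted; the region can be left as "default".
aws configure
```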
Separately, generate a key using your preferred method. In this case, OpenSSL is used to print a new key to the file sse.key.
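One way to do this with OpenSSL:

```shell
# Write a random 256-bit (32-byte) key to sse.key
openssl rand 32 > sse.key
```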
Important
The generated key must be 32 bytes in length.
Once the process of aws configure is complete and your new key has been configured for use, run the following s3 commands to upload a file with Server Side Encryption.
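A sketch of the upload, using the standard AWS CLI --sse-c and --sse-c-key options; the bucket name and endpoint URL are illustrative:

```shell
# Upload example.txt with SSE-C using the key generated above;
# bucket name and endpoint are illustrative.
aws s3 cp ./example.txt s3://my-bucket/example.txt \
    --sse-c AES256 \
    --sse-c-key fileb://sse.key \
    --endpoint-url https://object.ord1.coreweave.com
```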
Finally, to retrieve the file, pass the path of the encryption key used (sse-customer-key) to aws s3 to decrypt the file:
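The download mirrors the upload, supplying the same key; again, the bucket name and endpoint are illustrative:

```shell
# Retrieve and decrypt the object with the same SSE-C key
aws s3 cp s3://my-bucket/example.txt ./example.txt \
    --sse-c AES256 \
    --sse-c-key fileb://sse.key \
    --endpoint-url https://object.ord1.coreweave.com
```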
Identity and Access Management (IAM) and access levels
When an initial key pair is created for Object Storage access, that key pair is given the permissions specified on creation in order to read, write, and modify policies of the buckets which it owns. Each key pair is considered an individual user for access, and can be used to provide granular access to applications or users.
Permission levels that may be granted are:
Permission level | CRD key | Description |
---|---|---|
Read | read | Grants access to only read from buckets you own and have created |
Write | write | Grants access to only write to buckets you own and have created |
Read/Write | readwrite | Grants access to both read and write to buckets you own and have created |
Full | full | Grants Read/Write access, as well as admin access to create buckets and apply policies to buckets |
IAM actions
Currently, CoreWeave Cloud supports the following IAM bucket policy actions:
Important
CoreWeave Cloud does not yet support setting policies on users, groups, or roles. Currently, account owners need to grant access directly to individual users. Granting an account access to a bucket grants access to all users in that account.
For all requests, the condition keys CoreWeave currently supports are:
aws:CurrentTime
aws:EpochTime
aws:PrincipalType
aws:Referer
aws:SecureTransport
aws:SourceIp
aws:UserAgent
aws:username
Certain S3 condition keys for bucket and object requests are also supported. In the following tables, <perm> may be replaced with read, write/read-acp, or write-acp/full-control for read, write/read, or full control access, respectively.
Supported S3 Bucket Operations
Permission | Condition Keys |
---|---|
|
|
|
|
| N/A |
| N/A |
| N/A |
|
|
Supported S3 Object Operations
Permission | Condition Keys |
---|---|
|
|
| N/A |
| N/A |
| N/A |
| Use |
| N/A |
|
|
|
|
| N/A |
|
|
|
|
| N/A |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Note
When using AWS SDKs, the variable AWS_REGION is defined within the V4 signature headers. The Object Storage region for CoreWeave is named default.
Bucket policies
Another access control mechanism is bucket policies, which are managed through standard S3 operations. A bucket policy may be set or deleted by using s3cmd, as shown below.
In this example, a bucket policy is created to make the bucket downloads public:
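A policy of this shape, assuming a bucket named my-bucket, allows anonymous reads of every object in the bucket:

```shell
# Write a public-read bucket policy; "my-bucket" is illustrative.
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::my-bucket/*"]
    }
  ]
}
EOF
```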
The policy is then applied using s3cmd setpolicy:
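Assuming the policy file and bucket name from above:

```shell
# Apply the policy file to the bucket
s3cmd setpolicy policy.json s3://my-bucket
```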
Once the policy is applied, the data in your bucket may be accessed without credentials, for example by using curl:
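For instance, with a path-style URL against the regional endpoint (the bucket, object, and endpoint here are illustrative):

```shell
# Anonymous download once the public-read policy is in place
curl -O https://object.ord1.coreweave.com/my-bucket/example.txt
```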
Finally, the policy is deleted using s3cmd delpolicy:
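Again assuming the bucket name from above:

```shell
# Remove the bucket policy, restoring credential-only access
s3cmd delpolicy s3://my-bucket
```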
Note
Bucket policies do not yet support string interpolation.
Frequently asked questions
Are buckets accessible between tokens?
Yes. All Object Storage tokens for an organization can access all buckets in the organization. Tokens can have different access levels.
Is data deleted when the tokens are deleted?
No. Even when all tokens in the organization are deleted, the data is untouched. To delete all data, use a tool like s3cmd or rclone to purge the buckets.
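With s3cmd, a purge might look like this (the bucket name is illustrative):

```shell
# Remove all objects, then the now-empty bucket itself
s3cmd rm --recursive --force s3://my-bucket
s3cmd rb s3://my-bucket
```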
Pricing
The current price for Object Storage is $0.03 per GB per month.