Pre-stage the LOTA cache - CoreWeave Docs

The Pre-stage Cache feature lets you proactively warm the LOTA (Local Object Transport Accelerator) cache for CoreWeave AI Object Storage. By issuing a HeadObject call with a Range: bytes=0-0 header against an object through the LOTA endpoint, you instruct LOTA to fetch the complete object from backend storage and place it into the distributed NVMe cache on your Nodes, without transferring a response body to your client. This eliminates the “cold start” latency penalty that occurs on first reads. Training jobs with a warm cache can see 30-50% faster first epochs, inference services can achieve sub-second Time-To-First-Token for large models, and checkpoint restores run at full cache speed from the very first byte. Other benefits include:

Zero client bandwidth. HeadObject returns only headers. All cache-fill traffic stays inside the backend CoreWeave network.
No SDK changes required. Works with any S3-compatible tool or SDK (AWS CLI, Boto3, s3cmd) with no custom headers or proprietary extensions.

Use cases

The following use cases benefit from pre-staging the LOTA cache:

Use Case	Description
Pre-warm datasets before distributed training	Issue HeadObject calls for every shard of the training dataset before launching distributed training across hundreds of GPUs. When the first epoch starts, all data-loading workers read directly from cache.
Pre-load model weights for inference scale-up	Pre-stage model files while compute pods are initializing during an inference deployment scale-up. By the time pods are ready to serve traffic, model weights are already in cache, ensuring optimal Time-To-First-Token.
Accelerate checkpoint-restore after failure	After a node failure interrupts training, pre-stage the most recent checkpoint shards into cache before replacement pods start. This reduces downtime between failure and resumed training.
Warm data for multi-region workflows	Before running a production workload in a specific CoreWeave region, pre-stage required objects into that region’s LOTA cache. This enables cache-speed performance from the first read, even if data resides in another region.

How it works

Pre-staging uses a HeadObject call with a Range: bytes=0-0 header as the trigger. When LOTA receives this request:

The object storage gateway authenticates the request.
The object storage service returns the object’s metadata (size, content type, ETag, last modified date) in the response headers, exactly as a normal HeadObject call.
LOTA fetches the complete object from the backend storage and writes it into the distributed NVMe cache across the Nodes in your CKS cluster.
Future GET requests for that object are served directly from the LOTA cache.

Since HeadObject does not return a response body, pre-staging consumes minimal client-side resources and bandwidth. The cache fill happens entirely within the CoreWeave infrastructure.

Whole object caching

Pre-staging always caches the entire object. The Range: bytes=0-0 header on the HeadObject call acts only as the trigger and does not limit how much of the object is cached. Byte-range pre-fetching of partial objects is not supported. However, range reads are served from cache as usual once the object is cached.

LOTA endpoint required

Pre-staging only works through the LOTA endpoint (http://cwlota.com). HeadObject calls sent to the primary endpoint (https://cwobject.com) return metadata but do not trigger a cache fill.
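
Because a request sent to the wrong endpoint succeeds silently (it returns metadata but fills no cache), it can help to check the endpoint before pre-staging. The following is a minimal sketch, not part of any CoreWeave SDK: the helper names uses_lota_endpoint and require_lota_endpoint are hypothetical, and the check assumes that virtual-hosted-style bucket subdomains of cwlota.com should also be accepted.

```python
from urllib.parse import urlparse

LOTA_HOST = "cwlota.com"  # LOTA endpoint host named in this guide


def uses_lota_endpoint(endpoint_url: str) -> bool:
    """Return True if the URL points at the LOTA endpoint.

    Accepts the bare host and, as an assumption, virtual-hosted-style
    bucket subdomains such as my-bucket.cwlota.com.
    """
    host = (urlparse(endpoint_url).hostname or "").lower()
    return host == LOTA_HOST or host.endswith("." + LOTA_HOST)


def require_lota_endpoint(endpoint_url: str) -> None:
    """Raise before pre-staging if the endpoint would not trigger a cache fill."""
    if not uses_lota_endpoint(endpoint_url):
        raise ValueError(
            f"{endpoint_url!r} is not the LOTA endpoint; "
            "HeadObject would return metadata without filling the cache"
        )
```

Calling require_lota_endpoint("https://cwobject.com") raises ValueError, while require_lota_endpoint("http://cwlota.com") returns without error.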

Billing and storage tier behavior

A HeadObject request that includes Range: bytes=0-0 counts as an object access, which transitions the object from the Cold or Warm tier to the Hot Storage Pricing tier. A HeadObject request without the bytes=0-0 range does not trigger a tier transition.

Pre-staging a large number of objects transitions them all to the Hot pricing tier. Review the Storage Pricing documentation to understand the cost implications before pre-staging at scale.
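
To gauge how many objects and bytes a pre-stage run would touch before committing to it, you can total a listing first. The sketch below is illustrative only: summarize_listing is a hypothetical helper, and it assumes its input has the shape of ListObjectsV2 response pages (dictionaries with an optional "Contents" list whose entries carry "Key" and "Size"), such as those yielded by a Boto3 list_objects_v2 paginator. The example pages are hand-written stand-ins, not real bucket data.

```python
from typing import Iterable, Mapping, Tuple


def summarize_listing(pages: Iterable[Mapping]) -> Tuple[int, int]:
    """Return (object_count, total_bytes) for ListObjectsV2 response pages."""
    count = 0
    total_bytes = 0
    for page in pages:
        # "Contents" is absent from a page when no keys match the prefix.
        for obj in page.get("Contents", []):
            count += 1
            total_bytes += obj["Size"]
    return count, total_bytes


# Hand-written pages standing in for paginator.paginate(Bucket=..., Prefix=...).
example_pages = [
    {"Contents": [
        {"Key": "datasets/a.tar", "Size": 500},
        {"Key": "datasets/b.tar", "Size": 1500},
    ]},
    {"Contents": [{"Key": "datasets/c.tar", "Size": 1000}]},
    {},  # a page with no matching keys
]

count, total_bytes = summarize_listing(example_pages)
print(f"{count} objects, {total_bytes:,} bytes would be accessed")
```

Listing objects does not include the Range: bytes=0-0 trigger, so this dry run only reads metadata from the listing itself.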

Finite cache capacity

Total LOTA cache capacity scales with cluster size, with each Node contributing 1 TiB to the distributed cache by default (for example, a 30-Node cluster has approximately 30 TiB of cache). Pre-stage only the objects your workload will access in the near term. The cache uses LRU (Least Recently Used) eviction, so pre-staging more data than the cache can hold evicts objects that other active workloads may depend on. Use the CAIOS LOTA dashboard to monitor cache utilization and hit rates. To request a larger cache allocation, contact CoreWeave support.
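
The capacity arithmetic above can be sketched as a quick pre-flight check. This is a rough estimate under stated assumptions: it uses the default of 1 TiB per Node from this section, and the headroom parameter is a hypothetical safety margin (not a CoreWeave setting) that leaves part of the cache for other workloads. The function names are illustrative.

```python
TIB = 1024 ** 4  # bytes in one tebibyte


def cache_capacity_bytes(node_count: int, per_node_tib: float = 1.0) -> int:
    """Approximate total LOTA cache size, assuming 1 TiB per Node by default."""
    return int(node_count * per_node_tib * TIB)


def fits_in_cache(total_bytes: int, node_count: int, headroom: float = 0.8) -> bool:
    """Return True if total_bytes fits within a fraction of the cache.

    headroom is a hypothetical margin: 0.8 means the data set may use
    at most 80% of the estimated capacity.
    """
    return total_bytes <= cache_capacity_bytes(node_count) * headroom


# A 30-Node cluster has roughly 30 TiB of cache; with 20% headroom,
# a 20 TiB data set fits and a 25 TiB data set does not.
print(fits_in_cache(20 * TIB, node_count=30))
print(fits_in_cache(25 * TIB, node_count=30))
```

The actual usable capacity depends on what other workloads already hold in the cache, so treat the result as a planning aid and confirm with the CAIOS LOTA dashboard.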

Prerequisites

An active CoreWeave organization with at least one AI Object Storage bucket containing objects.
A valid API access key pair (Access Key ID and Secret Access Key) configured for your organization. See Get Started with AI Object Storage for setup instructions.
A CKS cluster where your workload will run. Your workload must run inside a CoreWeave CKS cluster to reach the LOTA endpoint at http://cwlota.com.
An S3-compatible client, such as AWS CLI, boto3 (Python), or s3cmd, installed and configured for CoreWeave AI Object Storage.

Configure CoreWeave credentials

Use a separate profile for CoreWeave AI Object Storage to avoid conflicts with your other AWS profiles and S3-compatible services. Without a dedicated profile, you may encounter errors when using AI Object Storage.

Create a new credentials file and profile in your CoreWeave configuration directory.
Create a new credentials file and profile
```
AWS_SHARED_CREDENTIALS_FILE=~/.coreweave/cw.credentials aws configure --profile cw
```
When prompted for information, provide the following values:
- AWS Access Key ID: The Access Key ID of your CoreWeave AI Object Storage Access Key.
- AWS Secret Access Key: The Secret Key of your CoreWeave AI Object Storage Access Key.
- Default region name: Optional. To set a default region, refer to the CoreWeave Availability Zones.
- Default output format: Use json for JSON output.
Set the default endpoint URL to the appropriate endpoint for your use case:
- The primary endpoint, https://cwobject.com, for use when running outside of a CoreWeave cluster.
- The LOTA endpoint, http://cwlota.com, for use when running inside a CoreWeave cluster. The LOTA endpoint routes to the LOTA path for best performance.
Set the primary endpoint for local development
```
AWS_CONFIG_FILE=~/.coreweave/cw.config aws configure set endpoint_url https://cwobject.com --profile cw
```

Set the S3 addressing_style to virtual:

Set virtual addressing style

AWS_CONFIG_FILE=~/.coreweave/cw.config aws configure set s3.addressing_style virtual --profile cw

Pre-stage a single object

To pre-stage an object, send a HeadObject request to the LOTA endpoint for the target bucket and key. Run these commands from within the same CKS cluster where your training or inference Pods will run, so that data is cached on the correct set of Nodes. A response is returned after the object is written into the LOTA cache. Before completing and running these commands, make sure you have configured your CoreWeave credentials.

AWS CLI

To pre-stage a single object, fill in the following parameters:

[BUCKET-NAME] with the name of the bucket containing the object you want to pre-stage.
[OBJECT-KEY] with the key of the object you want to pre-stage, for example, datasets/imagenet/shard-00001.tar.

Pre-stage a single object

aws s3api head-object \
  --bucket [BUCKET-NAME] \
  --key [OBJECT-KEY] \
  --endpoint-url http://cwlota.com \
  --range "bytes=0-0"

The command returns the object’s metadata and triggers a cache fill:

Response

{
  "ContentLength": 524288000,
  "ContentType": "application/x-tar",
  "ETag": "\"a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6\"",
  "LastModified": "2025-11-01T08:30:00+00:00",
  "Metadata": {}
}

Boto3

Replace [BUCKET-NAME] and [OBJECT-KEY] with the same values described for the AWS CLI example.

Pre-stage a single object

import boto3
from botocore.client import Config

s3 = boto3.client(
    "s3",
    endpoint_url="http://cwlota.com",
    config=Config(
        s3={"addressing_style": "virtual"},
    ),
)

response = s3.head_object(
    Bucket="[BUCKET-NAME]",
    Key="[OBJECT-KEY]",
    Range="bytes=0-0",
)

print(f"Object size: {response['ContentLength']:,} bytes")
print(f"ETag: {response['ETag']}")

s3cmd

Replace [BUCKET-NAME] and [OBJECT-KEY] with the same values described for the AWS CLI example.

Pre-stage a single object

s3cmd info s3://[BUCKET-NAME]/[OBJECT-KEY] \
  --host=cwlota.com \
  --host-bucket='%(bucket)s.cwlota.com' \
  --no-ssl \
  --add-header='Range: bytes=0-0'

The s3cmd info command issues an HTTP HEAD request internally, which triggers the pre-stage cache fill.

Pre-stage multiple objects in parallel

Parallelizing the pre-stage requests significantly reduces the total time to warm a large dataset. For guidance on tuning connection pool size and concurrency for high-throughput workloads, see Maximize parallelism.

Pre-staging a large number of objects transitions them all to the Hot pricing tier. Review the Storage Pricing documentation to understand the cost implications before pre-staging at scale.

Before completing and running these commands, make sure you have configured your CoreWeave credentials.

AWS CLI

To pre-stage all objects under a prefix, fill in the following parameters:

[BUCKET-NAME] with the name of the bucket containing the objects you want to pre-stage.
[PREFIX] with the prefix of the objects you want to pre-stage, for example, datasets/imagenet/.

Pre-stage all objects under a prefix

aws s3api list-objects-v2 \
  --bucket [BUCKET-NAME] \
  --prefix [PREFIX] \
  --endpoint-url http://cwlota.com \
  --query 'Contents[].Key' \
  --output text | tr '\t' '\n' | {
  # Start one background HeadObject request per key.
  while IFS= read -r key; do
    aws s3api head-object \
      --bucket [BUCKET-NAME] \
      --key "$key" \
      --endpoint-url http://cwlota.com \
      --range "bytes=0-0" &
  done
  # Wait inside the pipeline's subshell. The background jobs are children
  # of this subshell, so a wait placed after the pipeline returns immediately.
  wait
}
echo "Pre-staging complete."

Boto3

Replace [BUCKET-NAME] and [PREFIX] with the same values described for the AWS CLI example.

Pre-stage all objects under a prefix

import boto3
from botocore.client import Config
from concurrent.futures import ThreadPoolExecutor, as_completed

BUCKET = "[BUCKET-NAME]"
PREFIX = "[PREFIX]"
MAX_WORKERS = 32

s3 = boto3.client(
    "s3",
    endpoint_url="http://cwlota.com",
    config=Config(
        max_pool_connections=MAX_WORKERS,
        s3={"addressing_style": "virtual"},
    ),
)

def prestage_object(key):
    """Issue a HeadObject call (range 0-0) to pre-stage a single object."""
    s3.head_object(
        Bucket=BUCKET,
        Key=key,
        Range="bytes=0-0",
    )
    return key

paginator = s3.get_paginator("list_objects_v2")
keys = []
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        keys.append(obj["Key"])

print(f"Pre-staging {len(keys)} objects...")

succeeded = 0
failed = 0

with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
    futures = {executor.submit(prestage_object, k): k for k in keys}
    for future in as_completed(futures):
        key = futures[future]
        try:
            future.result()
            succeeded += 1
        except Exception as exc:
            failed += 1
            print(f"  Failed to pre-stage {key}: {exc}")
        if (succeeded + failed) % 100 == 0:
            print(f"  {succeeded + failed}/{len(keys)} objects processed")

print(f"Done. {succeeded} succeeded, {failed} failed out of {len(keys)} objects.")
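
The script above reports failed keys but does not retry them. One possible way to handle transient failures is a small retry wrapper with exponential backoff, sketched below. This is an illustration rather than part of the CoreWeave tooling: prestage_with_retries and its parameters are hypothetical, the prestage argument is assumed to be a function like prestage_object that raises on failure, and the loop runs sequentially for clarity instead of using a thread pool.

```python
import time
from typing import Callable, Iterable, List


def prestage_with_retries(
    prestage: Callable[[str], object],
    keys: Iterable[str],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call prestage(key) for each key, retrying failures with backoff.

    Returns the keys that still failed after max_attempts rounds.
    """
    pending = list(keys)
    for attempt in range(1, max_attempts + 1):
        still_failing = []
        for key in pending:
            try:
                prestage(key)
            except Exception:
                still_failing.append(key)
        pending = still_failing
        if not pending or attempt == max_attempts:
            break
        # Back off before the next round: base_delay, 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
    return pending
```

For example, collecting the failed keys from the script into a list and passing them as prestage_with_retries(prestage_object, failed_keys) returns only the keys that never succeeded, which you can log or investigate separately.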

s3cmd

Replace [BUCKET-NAME] and [PREFIX] with the same values described for the AWS CLI example.

Pre-stage all objects under a prefix

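# --recursive lists every object under the prefix, including nested keys.
s3cmd ls --recursive s3://[BUCKET-NAME]/[PREFIX] \
  --host=cwlota.com \
  --host-bucket='%(bucket)s.cwlota.com' \
  --no-ssl | \
awk '{print $4}' | {
  while IFS= read -r uri; do
    s3cmd info "$uri" \
      --host=cwlota.com \
      --host-bucket='%(bucket)s.cwlota.com' \
      --no-ssl \
      --add-header='Range: bytes=0-0' &
  done
  # Wait inside the pipeline's subshell, where the background jobs were started.
  wait
}
echo "Pre-staging complete."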

Pre-stage before your training job starts

Use a Kubernetes Job as an initial step in your training pipeline. The Job pre-stages all training data, then your training pods start with a fully warmed cache. Replace [BUCKET-NAME] with the name of the bucket containing your training data and [PREFIX] with the object prefix to pre-stage (for example, datasets/imagenet/). The Job reads credentials from a Kubernetes Secret named storage-credentials.

job-prestage-training-data.yaml

apiVersion: batch/v1
kind: Job
metadata:
  name: prestage-training-data
spec:
  template:
    spec:
      containers:
        - name: prestage
          image: amazon/aws-cli:latest
          command:
            - "/bin/bash"
            - "-c"
            - |
              aws s3api list-objects-v2 \
                --bucket [BUCKET-NAME] \
                --prefix [PREFIX] \
                --endpoint-url http://cwlota.com \
                --query 'Contents[].Key' \
                --output text | tr '\t' '\n' | {
                while IFS= read -r key; do
                  aws s3api head-object \
                    --bucket [BUCKET-NAME] \
                    --key "$key" \
                    --endpoint-url http://cwlota.com \
                    --range 'bytes=0-0' > /dev/null &
                done
                # Wait inside the pipeline's subshell so the Job does not
                # exit before the background requests finish.
                wait
              }
              echo "Pre-staging complete."
          env:
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: storage-credentials
                  key: access-key-id
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: storage-credentials
                  key: secret-access-key
      restartPolicy: Never
  backoffLimit: 3

Schedule the pre-stage Job to run before your training pods launch. In an Argo Workflow or a Kubernetes-native pipeline, add a dependency so the training step waits for the pre-stage Job to succeed. Pre-stage data just before the compute step that needs it to balance warm cache benefits against cache capacity.

Related resources

About LOTA (Local Object Transport Accelerator)
Conditional writes: use the ETag returned by HeadObject with If-Match to perform atomic compare-and-swap updates
Get Started with AI Object Storage
Manage Buckets
Storage Pricing
CAIOS LOTA dashboard
