CoreWeave AI Object Storage now supports pre-staging objects into the LOTA cache. When you issue a HeadObject call with a Range: bytes=0-0 header against any object, LOTA fetches the complete object from the storage backend and places it into the distributed NVMe cache on your GPU Nodes, without transferring a response body to your client.
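As a minimal sketch of the call described above, the following uses boto3's `head_object` with its `Range` parameter. The bucket and key names are hypothetical, and the client setup (credentials taken from the environment, virtual-hosted-style addressing) is an assumption rather than something this note confirms; the Pre-stage the LOTA cache guide has the authoritative instructions.

```python
LOTA_ENDPOINT = "http://cwlota.com"  # LOTA endpoint named in this note


def make_lota_client():
    """Build an S3 client aimed at LOTA (assumed setup, not confirmed by this note)."""
    import boto3  # imported lazily so prestage_object works with any S3-style client
    from botocore.config import Config

    return boto3.client(
        "s3",
        endpoint_url=LOTA_ENDPOINT,
        # Assumption: virtual-hosted-style addressing; adjust if your setup differs.
        config=Config(s3={"addressing_style": "virtual"}),
    )


def prestage_object(client, bucket, key):
    """Send HeadObject with Range: bytes=0-0 so LOTA pulls the whole object into cache."""
    # Only headers come back; no object body is transferred to the client.
    return client.head_object(Bucket=bucket, Key=key, Range="bytes=0-0")


# Example (hypothetical bucket and key):
# prestage_object(make_lota_client(), "my-bucket", "models/weights.safetensors")
```

The helper takes the client as an argument so the same call can be reused with a client you have already configured elsewhere.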
Overview
Pre-staging eliminates the “cold start” latency penalty that occurs when objects are read for the first time. With a warm cache, training, inference, and checkpoint-restore workloads access data at full cache speed from the very first byte. Key benefits include:
- Faster first epoch: Training jobs that start with a warm cache can see a 30-50% faster first epoch compared to cold reads.
- Lower Time-To-First-Token: Inference services load models from cache instead of backend storage, reducing TTFT for large models to sub-second levels.
- Faster checkpoint restores: Resuming training from the LOTA cache rather than remote storage minimizes downtime after failures.
- Zero client bandwidth: HeadObject returns only headers. All cache-fill traffic stays inside the backend CoreWeave network.
- No SDK changes required: Works with any S3-compatible tool or SDK (AWS CLI, boto3, s3cmd).
Limitations
- Pre-staging caches whole objects only. Byte-range pre-fetching is not supported.
- Pre-staging only works through the LOTA endpoint (http://cwlota.com). HeadObject calls to the primary endpoint (https://cwobject.com) return metadata but do not trigger a cache fill.
- A HeadObject request that includes Range: bytes=0-0 counts as an object access, which triggers the transition of the object from the Cold or Warm storage tier to the Hot pricing tier. If bytes=0-0 is not included in the HeadObject request, no tier transition occurs.
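The limitations above can be turned into guard rails in a small batch helper. This is a sketch under stated assumptions: `is_lota_endpoint` and `prestage_keys` are hypothetical names, the client is assumed to be a boto3 S3 client (whose `meta.endpoint_url` attribute reports the configured endpoint), and the caller supplies the list of keys. Each call pre-stages one whole object and, as noted above, counts as an access that can move that object to the Hot pricing tier.

```python
from urllib.parse import urlparse

LOTA_HOST = "cwlota.com"  # host of the LOTA endpoint named in this note


def is_lota_endpoint(endpoint_url):
    """Return True if the URL points at cwlota.com or one of its subdomains."""
    host = urlparse(endpoint_url).hostname or ""
    return host == LOTA_HOST or host.endswith("." + LOTA_HOST)


def prestage_keys(client, bucket, keys):
    """Pre-stage each key with HeadObject + Range: bytes=0-0; return the keys staged."""
    endpoint = client.meta.endpoint_url
    if not is_lota_endpoint(endpoint):
        # The primary endpoint would return metadata without filling the cache.
        raise ValueError(f"{endpoint} is not the LOTA endpoint; no cache fill would occur")
    staged = []
    for key in keys:
        # Whole-object fill only; other byte ranges are not a pre-fetch mechanism.
        client.head_object(Bucket=bucket, Key=key, Range="bytes=0-0")
        staged.append(key)
    return staged
```

Failing fast on the wrong endpoint avoids a silent no-op, since a HeadObject against the primary endpoint succeeds but leaves the cache cold.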
Learn more
- Pre-stage the LOTA cache — full instructions with AWS CLI, boto3, and s3cmd examples
- About LOTA — how the Local Object Transport Accelerator works
- Storage Pricing — understand cost implications of tier transitions