About CoreWeave Storage
High-Performance Computing (HPC) workloads, especially AI and Machine Learning (ML) applications, typically rely on large training datasets, require data sharing between Nodes, and need fast temporary storage for caching and logging.
CoreWeave offers a variety of storage solutions designed to meet the demands of these intensive applications.
CoreWeave AI Object Storage
CoreWeave AI Object Storage is purpose-built to serve large training datasets, model weights, checkpoints, and other distributed bulk data through an S3-compatible API. It is platform-independent, enabling applications to connect from any Cloud provider, and it provides authentication, security policies, read-after-write consistency, versioning, bucket lifecycles, and other features essential for managing the large datasets that AI and ML workloads require. Data is served directly to GPU Nodes, or via model serializers such as CoreWeave Tensorizer.
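Because the API is S3-compatible, any standard S3 client works. The sketch below uses Python's boto3; the endpoint URL, credentials, bucket, and key names are hypothetical placeholders, not values defined by CoreWeave:

```python
import boto3

# Hypothetical endpoint and credentials for illustration only; substitute the
# endpoint URL and access keys issued for your object storage account.
s3 = boto3.client(
    "s3",
    endpoint_url="https://object.example.coreweave.com",  # assumed placeholder
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)

# Upload a model checkpoint, then read it back, exactly as with any
# S3-compatible store.
s3.upload_file("checkpoint.pt", "training-data", "checkpoints/checkpoint.pt")
obj = s3.get_object(Bucket="training-data", Key="checkpoints/checkpoint.pt")
data = obj["Body"].read()
```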
Performance is boosted further by LOTA (Local Object Transfer Accelerator), a first-of-its-kind intelligent proxy for CoreWeave AI Object Storage. LOTA is installed on every GPU Node, providing a highly efficient local gateway that caches fetched data on the Node. By keeping recently accessed data local, LOTA significantly reduces load times, making it ideal for AI and ML workloads that need fast access to large datasets.
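Because LOTA acts as a proxy, application code is unchanged: the client simply targets the node-local gateway instead of the remote endpoint. A minimal sketch, assuming a hypothetical local address (check your cluster's configuration for the actual LOTA endpoint):

```python
import boto3

# LOTA runs on the GPU Node itself, so the client talks to a node-local
# endpoint rather than the remote object storage URL. The address below is
# an assumed placeholder.
lota = boto3.client(
    "s3",
    endpoint_url="http://localhost:8080",  # assumed node-local LOTA gateway
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)

# Repeated reads of the same object are served from the Node's local cache,
# avoiding another round trip to the backing object store.
obj = lota.get_object(Bucket="training-data", Key="datasets/shard-0001.tar")
shard = obj["Body"].read()
```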
Distributed File Storage
Distributed File Storage is a POSIX-compliant shared filesystem that allows multiple GPU Nodes to access the same data simultaneously. It delivers high throughput and low latency, provides automatic data snapshots, and supports asynchronous file deletion.
Distributed File Storage is an ideal choice for applications that require synchronization between Pods, such as distributed training, because it minimizes performance bottlenecks caused by file contention.
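Because the filesystem is POSIX-compliant, Pods coordinate through ordinary file operations. The sketch below assumes a hypothetical mount path (/mnt/shared); in practice the volume would be mounted into each Pod. Writing one file per Pod is a common pattern for avoiding the file contention mentioned above:

```python
import os
import socket

# Hypothetical mount point for the shared Distributed File Storage volume.
SHARED_DIR = "/mnt/shared"

# Each Pod writes its own status file under the shared mount, so writers
# never contend for the same file while readers still see every Pod's output.
status_file = os.path.join(SHARED_DIR, f"status-{socket.gethostname()}.txt")
with open(status_file, "w") as f:
    f.write("checkpoint 42 complete\n")

# Any other Pod can immediately list and read what its peers have written.
for name in sorted(os.listdir(SHARED_DIR)):
    with open(os.path.join(SHARED_DIR, name)) as f:
        print(name, f.read().strip())
```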
GPU-Local Ephemeral Storage
GPU-Local ephemeral storage is available on all GPU Nodes for high-performance temporary storage, caching, and logging. This local storage is non-persistent and not intended for long-term data retention. The amount of ephemeral storage available varies by Node type.
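A typical use is staging remote data on fast local disk so that repeated reads (for example, across training epochs) avoid network round trips. A minimal sketch, assuming a hypothetical cache directory and source path:

```python
import os
import shutil
import tempfile

# Hypothetical cache location; ephemeral storage is typically exposed to a
# container as a writable local path. Falls back to the system temp dir here.
CACHE_DIR = os.environ.get("CACHE_DIR", tempfile.gettempdir())

def cached_copy(src: str, name: str) -> str:
    """Copy a dataset shard to node-local ephemeral storage once, then reuse it."""
    dst = os.path.join(CACHE_DIR, name)
    if not os.path.exists(dst):
        shutil.copyfile(src, dst)  # first access pays the remote read cost
    return dst  # subsequent epochs read from fast local storage

# Hypothetical shared-storage source path for illustration.
local_path = cached_copy("/mnt/shared/datasets/shard-0001.tar", "shard-0001.tar")
```

Since this storage is wiped when the Pod or Node is recycled, anything worth keeping (checkpoints, final model weights) should be written back to Object Storage or Distributed File Storage.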