Dedicated VAST provides access to the full VAST data services stack.
VAST Catalog
VAST Catalog is a built-in metadata index that automatically catalogs all files and objects on the cluster, enabling search and query across the entire filesystem without external indexing tools.
Key characteristics:
- Automatic indexing: Catalogs file and object metadata, including creation time, size, ownership, S3 tags, and custom metadata.
- SQL-queryable: Query the catalog through VAST DataBase for search, filtering, and aggregation across billions of files and objects.
- Always up to date: Refreshed on a configurable schedule, as frequently as every 15 seconds, using VAST’s snapshot engine.
- No external infrastructure: Runs entirely on the VAST cluster with no additional systems to deploy or manage.
Use cases include:
- Using S3 object tags as an AI/ML feature store, embedding attributes directly on objects for retrieval by training pipelines.
- Capacity reporting across users, projects, and file types.
- Finding and managing data at scale across petabytes of storage.
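The capacity-reporting use case amounts to a SQL aggregation over catalog metadata. The sketch below illustrates the shape of such a query using Python's built-in sqlite3 module as a local stand-in. The table name `catalog` and its columns (`path`, `owner`, `extension`, `size_bytes`) are hypothetical and do not reflect the actual VAST Catalog schema or the VAST DataBase client interface.

```python
import sqlite3

# Hypothetical stand-in for the catalog: the table name and columns are
# illustrative assumptions, not the actual VAST Catalog schema.
def build_demo_catalog():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE catalog ("
        "path TEXT, owner TEXT, extension TEXT, size_bytes INTEGER)"
    )
    rows = [
        ("/proj-a/ckpt/step100.pt", "alice", "pt", 4_000),
        ("/proj-a/ckpt/step200.pt", "alice", "pt", 6_000),
        ("/proj-a/data/train.parquet", "alice", "parquet", 10_000),
        ("/proj-b/data/eval.parquet", "bob", "parquet", 2_500),
    ]
    conn.executemany("INSERT INTO catalog VALUES (?, ?, ?, ?)", rows)
    return conn

def capacity_by_owner_and_type(conn):
    # Aggregate file count and total size per owner and file extension.
    query = """
        SELECT owner, extension, COUNT(*) AS files, SUM(size_bytes) AS total_bytes
        FROM catalog
        GROUP BY owner, extension
        ORDER BY total_bytes DESC, owner, extension
    """
    return conn.execute(query).fetchall()

if __name__ == "__main__":
    for row in capacity_by_owner_and_type(build_demo_catalog()):
        print(row)
```

Against a real cluster, the same GROUP BY pattern would be issued through whichever SQL interface VAST DataBase exposes, using the catalog's actual column names.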
DataEngine
VAST DataEngine is a compute orchestration framework that enables you to write, deploy, and manage execution pipelines directly on the VAST cluster. Pipelines run serverlessly on the cluster hardware, with no separate compute infrastructure to provision.
Key capabilities:
- Event-driven triggers: Pipelines execute automatically in response to data events, such as file creation or modification.
- Scheduled execution: Pipelines can run on configurable schedules for recurring batch operations.
- Serverless execution: Pipeline logic runs directly on the VAST cluster without managing additional infrastructure.
DataBase
VAST DataBase is an embedded columnar analytics database that allows you to run SQL queries directly against data stored on your VAST cluster. Queries execute on the VAST hardware itself, with no ETL pipeline, data movement, or separate analytics cluster required.
Use cases include:
- Running analytics over training datasets stored on VAST without egress.
- Querying checkpoint metadata or experiment logs directly from storage.
- Joining structured data from object storage with file-based datasets.
VAST Global Access and SyncEngine
Global Access
VAST Global Access enables cross-cluster data access between VAST clusters, presenting data on remote clusters as a unified namespace. This enables active-active configurations where workloads on one cluster can access data residing on another without explicit data movement.
Native asynchronous replication between VAST clusters is a separate capability from Global Access and SyncEngine. For replication policy configuration, see the VAST Administrator’s Guide.
SyncEngine
VAST SyncEngine is a universal data router and mobility platform. It discovers, catalogs, and moves data across hybrid storage environments.
Key capabilities:
- Data migration and synchronization: Move and synchronize data across storage systems with integrity verification.
- Deep metadata indexing: Catalog and index metadata across billions of unstructured files for discovery and search.
- AI data preparation: Prepare data for AI pipelines, including chunking, vectorization, and indexing for retrieval-augmented generation (RAG) workflows.
Snapshots
Dedicated VAST supports customer-configurable snapshot policies, managed directly in VMS. Snapshots are point-in-time consistent copies of a View’s filesystem state.
You can configure:
- Schedule: Snapshot frequency (for example, hourly, daily, weekly).
- Retention: How long snapshots are retained before automatic deletion.
- Scope: Snapshots are scoped to a View.
Snapshots are accessible through the .snapshot directory within a mounted View, consistent with the behavior on CoreWeave’s Distributed File Storage. Snapshots are read-only and consume additional capacity only for blocks changed since the previous snapshot.
Full snapshot policy management is available through VMS. For configuration details, see the VAST Administrator’s Guide.
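Because snapshots surface under the .snapshot directory of a mounted View, they can be browsed with ordinary filesystem calls. The following is a minimal sketch, assuming each snapshot appears as a subdirectory of `.snapshot` that mirrors the View's layout. That layout, the helper names, and the mount path in the usage note are assumptions for illustration, not confirmed behavior.

```python
from pathlib import Path

def list_snapshots(view_mount):
    """Return snapshot names found under <view_mount>/.snapshot, sorted by name."""
    snap_dir = Path(view_mount) / ".snapshot"
    if not snap_dir.is_dir():
        return []
    return sorted(p.name for p in snap_dir.iterdir() if p.is_dir())

def snapshot_path(view_mount, snapshot_name, relative_path):
    """Build the path of a file as it would appear inside a given snapshot.

    Assumes the snapshot directory mirrors the View's directory layout.
    """
    return Path(view_mount) / ".snapshot" / snapshot_name / relative_path
```

For example, with a View mounted at a hypothetical path such as `/mnt/my-view`, `list_snapshots("/mnt/my-view")` would return the available snapshot names, and `snapshot_path(...)` would give the read-only location from which an earlier version of a file could be copied back.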