LOTA (Local Object Transfer Accelerator)

Accelerates CoreWeave AI Object Storage

CoreWeave's Local Object Transfer Accelerator (LOTA) is an intelligent proxy installed on every GPU Node in a CKS cluster to accelerate data transfer. LOTA achieves this by providing a highly efficient, local gateway to CoreWeave AI Object Storage on each Node in the cluster for faster data transfer rates and decreased latency.

Overview

With LOTA, software clients can easily interact with CoreWeave AI Object Storage through a new API endpoint. Clients only need to point their requests to the LOTA endpoint instead of the primary endpoint, with no other changes required to S3-compatible clients.

Info
  • The primary endpoint is https://cwobject.com
  • The LOTA endpoint is http://cwlota.com
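Because no client changes are required beyond the endpoint URL, switching to LOTA is a one-line configuration change in most S3-compatible SDKs. A minimal sketch using boto3 is shown below; the credentials and bucket name are placeholders, and the `endpoint_url` parameter is the standard boto3 way to override the S3 endpoint.

```python
import boto3

# Point an S3-compatible client at the LOTA endpoint instead of the primary
# endpoint; no other client changes are needed. The credentials and bucket
# name ("my-bucket") are placeholders you must supply.
s3 = boto3.client(
    "s3",
    endpoint_url="http://cwlota.com",
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)

# Reads through LOTA are cached; uploads behave the same as via the primary endpoint.
# s3.download_file("my-bucket", "datasets/shard-0001", "/tmp/shard-0001")
```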

LOTA proxies all Object Storage requests to the Object Storage Gateway and storage backend. First, LOTA authenticates each request with the gateway and verifies proper authorization. Then, when possible, LOTA bypasses the gateway and accesses the storage backend directly to fetch objects with the greatest possible throughput. LOTA stores the fetched objects in a distributed cache to significantly boost data transfer rates, especially for repeated requests for the same data.

Data upload and retrieval

In Step 1 of the diagram below, training data is uploaded to the Object Storage Gateway via the LOTA endpoint for indexing. The Gateway then stores the data in the Object Repository. For data uploads, the LOTA endpoint is used the same way as the primary endpoint.

In Step 2, LOTA forwards the client's request to the Object Storage Gateway, including the local Node's location information. The Gateway processes the request and returns the data, which may include the direct path to the object. When LOTA uses the direct path to bypass the Gateway and access the object directly, data transfer rates improve significantly. By storing the data in a distributed cache, LOTA ensures that frequently accessed objects are readily available for quick retrieval.

Process flow

LOTA actively caches recently accessed objects on the local disks of GPU Nodes, significantly reducing latency and boosting read speeds for CoreWeave AI Object Storage. The following diagram illustrates the process flow when fetching an object using LOTA.

When a request is made to LOTA, it first checks if the object is available in the cache. If the object is found, it's fetched directly from the cache, ensuring minimal latency.

If the object is not in the cache, LOTA fetches it from the backend storage and forks it into two pathways:

  • Stream 1 sends the object to the client application.
  • Stream 2 stores the object in the cache, using local storage on one or more GPU Nodes.

This dual-pathway approach ensures that future requests for the same data are served quickly from the cache, enhancing overall performance. LOTA distributes the cache across all GPU Nodes in a CKS cluster, ensuring efficient data retrieval and management.
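The cache-check and dual-pathway flow above can be sketched in a few lines. This is illustrative only, with an in-memory dict standing in for the distributed cache and a stub standing in for the storage backend; the real proxy streams data over the network and caches on GPU Node local disks.

```python
CACHE = {}  # stands in for LOTA's distributed cache

def backend_get(key):
    # Placeholder for a direct fetch from the storage backend.
    return f"object-bytes-for-{key}"

def lota_get(key):
    # Cache hit: serve directly from the cache with minimal latency.
    if key in CACHE:
        return CACHE[key]
    # Cache miss: fetch once, then fork the result into two pathways.
    data = backend_get(key)
    CACHE[key] = data   # Stream 2: store the object in the cache
    return data         # Stream 1: send the object to the client

first = lota_get("datasets/shard-0001")   # miss: fetched from the backend
second = lota_get("datasets/shard-0001")  # hit: served from the cache
```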

How LOTA manages the cache

When storing an object, LOTA computes which Node should hold the object in its local cache, always considering the object as a whole to avoid the network overhead of accessing multiple Nodes. This computation produces a list of one or more compatible Nodes. LOTA then determines the optimal cache Node placement based on distribution load optimization. See the Best Practices guide to learn how different upload patterns affect cache placement.
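The placement computation can be pictured as a deterministic mapping from an object key to a small, stable set of candidate Nodes. The hash-ranking sketch below is a hypothetical illustration of that idea, not LOTA's documented algorithm; the Node names and replica count are invented for the example.

```python
import hashlib

# Hypothetical placement sketch: each object key maps deterministically to a
# stable list of candidate Nodes, and the whole object lives on one Node to
# avoid the network overhead of touching several Nodes per read.
NODES = ["gpu-node-a", "gpu-node-b", "gpu-node-c", "gpu-node-d"]

def candidate_nodes(object_key, replicas=2):
    # Rank Nodes by a per-(node, key) hash so every key gets its own stable ordering.
    ranked = sorted(
        NODES,
        key=lambda n: hashlib.sha256(f"{n}/{object_key}".encode()).hexdigest(),
    )
    return ranked[:replicas]

nodes = candidate_nodes("datasets/shard-0001")
```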

When a client application requests an object (or part of an object), LOTA determines which Node in the distributed cache should hold the data, then makes an HTTP request to that Node. The Node checks its local cache, and if the data is found, it is delivered to the client.

If the data is not found, the Node fetches the entire object from the storage backend and forks it into two pathways. The first path sends the object (or the requested part) to the client application, while the second path stores the entire object in the local cache of the Node that fetched it. LOTA always caches the entire object when a client application requests any part of it, so later requests for other parts of the object do not require additional GET operations against the backend.
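The whole-object caching behavior on partial reads can be sketched as follows. This is illustrative only: the byte string stands in for a real object, and the in-memory dict stands in for the Node's local cache.

```python
CACHE = {}  # stands in for a Node's local cache

def backend_get(key):
    # Placeholder for fetching the full object from the storage backend.
    return b"0123456789"

def lota_get_range(key, start, end):
    # A ranged read caches the *entire* object, not just the requested slice,
    # so later reads of other byte ranges are served from the cache.
    if key not in CACHE:
        CACHE[key] = backend_get(key)
    return CACHE[key][start:end]

part = lota_get_range("datasets/shard-0001", 0, 4)  # caches the whole object
```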

This distributed cache is managed by LOTA using the Least Recently Used (LRU) cache algorithm to ensure that recently accessed objects are available for quick retrieval. LOTA's read-after-write consistency guarantees that objects read immediately after writing are always up-to-date without requiring cache invalidation.
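The LRU eviction policy named above works as in the standard textbook scheme: every access moves an entry to the "most recently used" end, and inserts past capacity evict from the "least recently used" end. A minimal single-node sketch using `collections.OrderedDict`; LOTA's actual cache is distributed across Nodes and sized by local disk capacity.

```python
from collections import OrderedDict

class LRUCache:
    """Illustrative LRU cache: eviction order follows recency of access."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used entry
cache.put("c", 3)  # evicts "b", the least recently used entry
```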

Info

Currently, LOTA only accelerates HTTP GET requests. CoreWeave plans to support other requests in the future.

Control LOTA cache behavior

By default, all GET requests made to the LOTA endpoint are cached. For testing or analysis, it may be desirable to disable the cache for specific requests. To do so, set the Cache-Control header to either no-store or no-cache when performing GET operations. These two options have different effects on LOTA's behavior:

  • no-cache: LOTA queries the cache for the object, but the GET request is not cached. This simulates the worst-case performance by forcing LOTA to proxy the data, but not cache it.
  • no-store: LOTA fetches the object directly from the storage backend, bypassing the cache, and the object is not cached. This allows you to test the performance of the storage backend without caching.

Learn more in the S3 compatibility reference guide.

LOTA security

LOTA benefits from CoreWeave's implementation of industry best practices and standards for security. These practices and standards include:

  • Node isolation - every Node is single-tenant and operates within a securely isolated environment
  • Encryption at rest for data - strengthens data protection and security
  • VPC networking - ensures customer traffic stays private
  • Incident response and patching - reduces vulnerability attack exposure

CoreWeave has developed a Shared Responsibility Model (SRM) to outline the responsibilities of both CoreWeave and customers from a security perspective.

To further enhance LOTA security posture, customers should take the following additional measures.

Only run trusted workloads

LOTA runs on Pods in a CKS cluster. Compromised or malicious Kubernetes workloads within the same cluster could download data from CoreWeave AI Object Storage as it is pulled through the LOTA cache. CoreWeave recommends that LOTA only be used within clusters which meet the following criteria:

  • Run only trusted workloads
  • Do not grant Node access to users who are not trusted under CoreWeave AI Object Storage access policies

Limit access to Nodes running LOTA

LOTA writes data that is encrypted at rest, but decrypted data is accessible to administrators of Nodes. As such, customers should never expose administrative access to Nodes running LOTA to untrusted users or workloads (for example, via privileged workloads). Doing so may grant access to cached data within LOTA in violation of existing CoreWeave AI Object Storage access policies.

Limit network access to LOTA Pods

LOTA's cache is accessible to other Pods in the cluster. CoreWeave recommends that customers use a Kubernetes NetworkPolicy to prevent unnecessary exposure of internal cache components to the rest of the cluster. Failure to apply a NetworkPolicy may result in data leakage if your cluster runs untrusted workloads. An example NetworkPolicy is provided below:

Example
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-traffic
  namespace: cw-object-storage
spec:
  podSelector: {} # Apply to all Pods in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
      ports:
        - protocol: TCP
          port: 80
        - protocol: TCP
          port: 53
        - protocol: UDP
          port: 53
    # Restrict all other traffic to only come from the cw-object-storage namespace
    - from:
        - namespaceSelector:
            matchLabels:
              name: cw-object-storage
      ports:
        - protocol: TCP
          port: 1
          endPort: 65535

More information

To learn how to configure popular S3-compatible tools to use LOTA, see How To: Manage Buckets.