Local Object Transfer Accelerator (LOTA)

Decreasing latency and increasing speed in CoreWeave AI Object Storage

CoreWeave's Local Object Transfer Accelerator (LOTA) is a proxy that runs on every GPU Node in a client's cluster and accelerates Object Storage data transfers behind the scenes.

While conventional transfer accelerators speed up the data transfer rates of bucket contents over long distances, LOTA achieves the same goal by acting within a client's own environment, providing a highly efficient, Node-local connection to Object Storage.

Instead of sending requests directly to the Object Storage gateway, clients query their local LOTA instance, which proxies the request. LOTA can then interact more directly with the storage backend and cache objects locally, which increases data transfer rates to the GPU.

Info

Using LOTA does not require any changes to client software, other than pointing its requests to the LOTA endpoint. No changes to S3 commands are required.
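
As a minimal sketch of what "pointing requests to the LOTA endpoint" can look like, the example below builds the arguments for an S3 client where only the endpoint differs. Both endpoint URLs are hypothetical placeholders, not real CoreWeave addresses, and `s3_client_kwargs` is a helper invented for this illustration.

```python
# Hypothetical placeholder endpoints; substitute the values given in your
# CoreWeave AI Object Storage documentation.
GATEWAY_ENDPOINT = "https://object-storage.example.com"
LOTA_ENDPOINT = "http://lota.example.com"


def s3_client_kwargs(use_lota: bool) -> dict:
    """Return keyword arguments for an S3 client; only the endpoint differs."""
    return {
        "service_name": "s3",
        "endpoint_url": LOTA_ENDPOINT if use_lota else GATEWAY_ENDPOINT,
    }


# With boto3 installed, the client could be created like this:
#   import boto3
#   s3 = boto3.client(**s3_client_kwargs(use_lota=True))
#   s3.download_file("my-bucket", "my-key", "/tmp/my-key")  # unchanged S3 call
```

The S3 calls themselves stay the same; only the `endpoint_url` value changes.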

Important

Currently, LOTA accelerates only GET actions. CoreWeave expects to support WRITE and PUT requests to LOTA in the future, which would enable throughput acceleration for both WRITE and GET actions.

How LOTA works

Here's how LOTA works:

  1. A client submits a request to their Node's local LOTA instance.
  2. LOTA forwards the request to the CoreWeave AI Object Storage gateway and includes location data with it.
  3. The gateway has a set of rules that evaluate LOTA's request to determine the response it will provide. The response to LOTA can include the direct path of the object being requested, allowing LOTA faster access to the object.
  4. LOTA can then also cache the object on the Node's local storage for faster future access.

Requesting any part of a file prompts LOTA to cache the entire file, so users don't have to issue multiple GET range reads to cache each part. LOTA runs on every GPU Node in CoreWeave's fleet.
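
The whole-object caching behavior can be illustrated with a toy model. This is a sketch under stated assumptions, not LOTA's implementation: `ToyWholeObjectCache` and its in-memory dictionaries are hypothetical stand-ins for the storage backend and the Node's local storage.

```python
class ToyWholeObjectCache:
    """Toy model: a range read on a cache miss caches the full object."""

    def __init__(self, backend: dict):
        self.backend = backend   # stands in for the storage backend
        self.cache = {}          # stands in for Node-local storage
        self.backend_reads = 0   # counts trips to the backend

    def get_range(self, key: str, start: int, end: int) -> bytes:
        """Return bytes [start, end) of the object, caching it whole on a miss."""
        if key not in self.cache:
            self.backend_reads += 1
            self.cache[key] = self.backend[key]  # fetch the entire object
        return self.cache[key][start:end]


lota = ToyWholeObjectCache({"model.bin": b"0123456789"})
first = lota.get_range("model.bin", 0, 4)    # miss: whole object is cached
second = lota.get_range("model.bin", 6, 10)  # hit: served from the cache
```

In this model the second range read never reaches the backend, even though it asks for bytes the first read did not request.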

Learn more

For more specific information and an example use case for LOTA, see How-To: Connect an Application to LOTA.

LOTA and object caching

LOTA's object cache is a disk cache of CoreWeave AI Object Storage that keeps recently-accessed data locally on GPU Nodes, increasing AI Object Storage read speeds by significantly decreasing latency.

How LOTA's cache works

  1. LOTA computes the object cache placement for a given object. Placement is always computed for the object as a whole, in order to avoid the network requests required to access multiple Nodes.
  2. This operation produces a list of one or more compatible Nodes.
  3. LOTA then selects which of these Nodes to use as the cache Node, based on load distribution.
  4. LOTA makes an HTTP request to the cache Node. The cache Node checks whether it has the relevant pieces of the object in its local cache. If it does, the cache Node reads the data, then passes it through a decryption layer.
  5. The decrypted data is then pushed to the client.
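
The placement steps above can be sketched as follows. The source does not say how LOTA computes placement, so this example substitutes rendezvous hashing for the candidate list and a least-loaded choice for the final selection; both techniques, the function names, and the load numbers are assumptions made for illustration.

```python
import hashlib


def candidate_nodes(object_key: str, nodes: list, count: int = 2) -> list:
    """Rank Nodes by a per-object hash score and keep the top `count`.

    Hashing the whole object key (not individual parts) keeps every part of
    an object on the same candidates. Rendezvous hashing is an assumption
    made for this sketch, not LOTA's documented algorithm.
    """
    def score(node: str) -> int:
        digest = hashlib.sha256(f"{node}:{object_key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(nodes, key=score, reverse=True)[:count]


def pick_cache_node(candidates: list, load: dict) -> str:
    """Choose the least-loaded candidate (hypothetical load metric)."""
    return min(candidates, key=lambda node: load.get(node, 0))


nodes = ["node-a", "node-b", "node-c", "node-d"]
load = {"node-a": 7, "node-b": 2, "node-c": 5, "node-d": 1}
candidates = candidate_nodes("bucket/model.bin", nodes)
chosen = pick_cache_node(candidates, load)
```

Because the score depends only on the Node name and the object key, every client computes the same candidate list for a given object without coordinating with the others.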

If the required data isn't in the local cache, the cache Node requests it from the storage backend.

Once the Node gets the data, the cache then forks it into two pathways:

  • Stream 1 pushes the data to the client's designated location.
  • Stream 2 stores it in the GPU Node's local storage.
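
The two-stream fork can be pictured as a "tee" over the incoming data. The sketch below is a simplified illustration, not LOTA's code: `tee_stream` is a hypothetical helper, and in-memory buffers stand in for the backend response, the client connection, and the Node's local storage.

```python
import io


def tee_stream(source, client_sink, cache_sink, chunk_size: int = 64 * 1024) -> int:
    """Copy `source` to both sinks chunk by chunk; return the bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        client_sink.write(chunk)  # Stream 1: deliver to the client
        cache_sink.write(chunk)   # Stream 2: persist in the local cache
        total += len(chunk)
    return total


backend_response = io.BytesIO(b"x" * 200_000)  # stands in for backend data
to_client, to_cache = io.BytesIO(), io.BytesIO()
copied = tee_stream(backend_response, to_client, to_cache)
```

Writing each chunk to both destinations as it arrives means the client does not wait for the cache write of the whole object to finish before receiving data.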

Info

To use LOTA without caching an object, set the HTTP header Cache-Control to no-store or no-cache on the GET operation. Learn more about Cache-Control options in the API calls section of the S3 compatibility reference guide.
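
As a rough illustration, the snippet below attaches the header to a GET request using Python's standard library. The URL is a hypothetical placeholder, and the request is only constructed, not sent, because a real request also needs S3 authentication, which this sketch leaves out.

```python
import urllib.request

# Hypothetical object URL, for illustration only.
url = "http://lota.example.com/my-bucket/path/to/object.bin"

request = urllib.request.Request(url, method="GET")
request.add_header("Cache-Control", "no-store")  # ask LOTA not to cache

# Sending is omitted: a real request also needs S3 authentication
# (for example, request signing), which this sketch does not implement.
```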

Learn more

For more information on using LOTA, see How To: Connect an Application to LOTA.