Best Practices for LOTA

Recommendations for getting the best performance from LOTA.

Using Multipart Upload (MPU)

LOTA performs best with Multipart Uploads. Objects uploaded via S3 Multipart API actions are distributed across a GPU cluster, with each part of the object cached on a distinct GPU Node. In contrast, objects uploaded via s3:PutObject are stored on a single Node.

Info

For the LOTA cache to work most effectively, follow Amazon's multipart upload (MPU) best practices for uploading large objects.
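The multipart flow described above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes `s3_client` is a boto3-style S3 client already configured for your endpoint and credentials, and that `stream` is a buffered binary file object whose `read(n)` returns `n` bytes until end of file. The function name `multipart_upload` and its parameters are hypothetical.

```python
from typing import BinaryIO


def multipart_upload(s3_client, bucket: str, key: str,
                     stream: BinaryIO, part_size: int) -> int:
    """Upload `stream` in parts of `part_size` bytes; return the part count.

    Assumes `s3_client` exposes the boto3 S3 client methods
    create_multipart_upload, upload_part, complete_multipart_upload,
    and abort_multipart_upload.
    """
    upload = s3_client.create_multipart_upload(Bucket=bucket, Key=key)
    upload_id = upload["UploadId"]
    parts = []
    try:
        part_number = 1  # S3 part numbers start at 1
        while True:
            chunk = stream.read(part_size)
            if not chunk:
                break
            resp = s3_client.upload_part(
                Bucket=bucket, Key=key, UploadId=upload_id,
                PartNumber=part_number, Body=chunk,
            )
            # The ETag of every part is required to complete the upload.
            parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
            part_number += 1
        if not parts:
            raise ValueError("stream is empty; nothing to upload")
        s3_client.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so that already-uploaded parts are not left behind.
        s3_client.abort_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id)
        raise
    return len(parts)
```

In practice, a higher-level helper such as boto3's `upload_file` with a `TransferConfig` performs the same steps and uploads parts in parallel; the explicit loop is shown only to make the individual parts visible.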

Part size and count

To get the best performance from the LOTA cache when working with large objects, choose the part size and part count so that read load is distributed evenly across the cache pool. This avoids "hot spotting": long I/O wait times caused by heavy activity on the same Nodes.

However, splitting a large object into many smaller parts results in more HTTP requests, and therefore more total request overhead.

Learn more

As an example, a GPU cluster with 1,250 Nodes can evenly serve a very large volume of data by leveraging an MPU with 1,250 parts, if the parts are perfectly distributed between Nodes. To learn more about LOTA, see Local Object Transfer Accelerator (LOTA).
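The trade-off between part count and part size can be sketched as a small planning helper. This is an illustration under stated assumptions: the limits used (5 MiB minimum part size, 5 GiB maximum part size, 10,000 parts per upload) are Amazon S3's documented multipart limits and are assumed, not confirmed, to apply to your endpoint. The function name `plan_parts` and the "one part per Node" target are hypothetical choices for this example.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB

# Amazon S3 multipart limits; assumed to apply to the target endpoint.
MIN_PART_SIZE = 5 * MIB
MAX_PART_SIZE = 5 * GIB
MAX_PARTS = 10_000


def _ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, exact for arbitrarily large ints."""
    return -(-a // b)


def plan_parts(object_size: int, node_count: int) -> tuple[int, int]:
    """Return (part_size, part_count), aiming for about one part per Node."""
    if object_size <= 0 or node_count <= 0:
        raise ValueError("object_size and node_count must be positive")
    # Aim for one part per Node without exceeding the part-count limit.
    target_parts = min(node_count, MAX_PARTS)
    part_size = _ceil_div(object_size, target_parts)
    # Round up to a whole MiB, then clamp to the allowed part-size range.
    part_size = _ceil_div(part_size, MIB) * MIB
    part_size = max(MIN_PART_SIZE, min(part_size, MAX_PART_SIZE))
    part_count = _ceil_div(object_size, part_size)
    if part_count > MAX_PARTS:
        raise ValueError("object is too large for the multipart limits")
    return part_size, part_count


# A 1,250 GiB object on a 1,250-Node cluster: 1 GiB parts, one per Node.
print(plan_parts(1250 * GIB, 1250))
```

For small objects the minimum part size dominates, so the helper returns fewer parts than Nodes rather than creating many tiny requests. The resulting part size could be passed to an upload helper, for example as `multipart_chunksize` in boto3's `TransferConfig`.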

Important

Only objects with a total size of 4 MB or more can be cached with LOTA. Smaller objects are fetched directly from the backend, bypassing the cache.

Prepare the cache with HTTP `Range` requests

HTTP Range requests can be used to prepare the cache. A Range request for the first byte of an object (Range: bytes=0-0) returns only that byte, but metadata is fetched for the entire object, which is then used to prepare the cache for subsequent requests.
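A minimal sketch of such a one-byte Range request is shown below. It assumes `s3_client` is a boto3-style S3 client already pointed at the endpoint you read through; the function name `warm_object` is hypothetical. The `Range` parameter of `get_object` is part of the boto3 S3 client API.

```python
def warm_object(s3_client, bucket: str, key: str) -> str:
    """Request only the first byte of an object to prepare the cache.

    Returns the Content-Range value reported by the server (for example
    "bytes 0-0/<total size>"), or an empty string if none is present.
    """
    resp = s3_client.get_object(Bucket=bucket, Key=key, Range="bytes=0-0")
    # Drain the one-byte body so the underlying connection can be reused.
    resp["Body"].read()
    return resp.get("ContentRange", "")


def warm_objects(s3_client, bucket: str, keys: list[str]) -> dict[str, str]:
    """Issue the one-byte request for each key before the real reads start."""
    return {key: warm_object(s3_client, bucket, key) for key in keys}
```

Calling `warm_objects` over the keys a job is about to read, before the job starts, is one way to apply the technique; whether this is worthwhile depends on the workload.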
