Best Practices for LOTA
Best practices to achieve the best performance with LOTA
Using Multipart Upload (MPU)
LOTA performs best with Multipart Uploads. Objects uploaded via S3 Multipart API actions are distributed across a GPU cluster, with each part of the object cached on a distinct GPU Node. This is different to objects uploaded via s3:PutObject, which are stored on a single Node.
For the LOTA cache to work most effectively, follow Amazon's multipart upload (MPU) best practices when uploading large objects.
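The multipart flow described above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes a boto3-style S3 client is passed in as `s3_client` (how that client is pointed at your storage endpoint is not covered here), and the function name `multipart_upload` and its parameters are hypothetical.

```python
def multipart_upload(s3_client, bucket, key, fileobj, part_size):
    """Upload fileobj as a multipart object and return the number of parts.

    s3_client is assumed to expose the boto3 S3 client methods
    create_multipart_upload, upload_part, complete_multipart_upload
    and abort_multipart_upload.
    """
    upload = s3_client.create_multipart_upload(Bucket=bucket, Key=key)
    upload_id = upload["UploadId"]
    parts = []
    try:
        part_number = 1  # S3 part numbers start at 1
        while True:
            chunk = fileobj.read(part_size)
            if not chunk:
                break
            resp = s3_client.upload_part(
                Bucket=bucket,
                Key=key,
                PartNumber=part_number,
                UploadId=upload_id,
                Body=chunk,
            )
            # The ETag of every part is needed to complete the upload.
            parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
            part_number += 1
        s3_client.complete_multipart_upload(
            Bucket=bucket,
            Key=key,
            UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so that already-uploaded parts are not left behind.
        s3_client.abort_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id
        )
        raise
    return len(parts)
```

In practice, a higher-level helper such as boto3's managed transfer may be more convenient; the explicit loop is shown only to make the individual parts visible.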
Part size and count
To further optimize LOTA cache performance, choose the object part size and part count so that large objects avoid "hot spotting" (long I/O wait times caused by significant activity) and read load is distributed more evenly across the cache pool.
However, breaking a large object into many smaller parts results in more HTTP requests, which increases the total request overhead.
As an example, a GPU cluster with 1,250 Nodes can evenly serve a very large volume of data by leveraging an MPU with 1,250 parts, if the parts are perfectly distributed between Nodes. To learn more about LOTA, see Local Object Transfer Accelerator (LOTA).
Only objects whose complete size is >=4MB can be cached with LOTA: objects smaller than this are fetched directly from the backend, bypassing the cache.
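The trade-off between part count and request overhead can be explored with a small planning helper. The sketch below is illustrative only: `plan_parts` and `is_cacheable` are hypothetical names, the limits are Amazon S3's documented multipart limits (5 MiB minimum part size except for the last part, 5 GiB maximum part size, 10,000 parts) and are assumed to apply to your endpoint, and the 4MB threshold is interpreted as 4 MiB, which is also an assumption.

```python
MIB = 1024 * 1024

# Amazon S3 multipart limits; assumed to apply to the target endpoint.
S3_MIN_PART_SIZE = 5 * MIB
S3_MAX_PART_SIZE = 5 * 1024 * MIB
S3_MAX_PARTS = 10_000

# Cacheable-size threshold from this page, interpreted as 4 MiB (assumption).
LOTA_MIN_CACHEABLE = 4 * MIB


def is_cacheable(object_size):
    """Return True if an object of this size (in bytes) can be cached."""
    return object_size >= LOTA_MIN_CACHEABLE


def plan_parts(object_size, node_count):
    """Pick a part size aiming for roughly one part per Node.

    Returns (part_size, part_count) in bytes and parts.
    """
    if object_size <= 0 or node_count <= 0:
        raise ValueError("object_size and node_count must be positive")
    target_parts = min(node_count, S3_MAX_PARTS)
    # Ceiling division: the smallest part size that fits in target_parts.
    part_size = -(-object_size // target_parts)
    # Clamp to the allowed part size range.
    part_size = max(S3_MIN_PART_SIZE, min(part_size, S3_MAX_PART_SIZE))
    part_count = -(-object_size // part_size)
    if part_count > S3_MAX_PARTS:
        raise ValueError("object is too large for a single multipart upload")
    return part_size, part_count
```

For example, an 80,000 MiB object on a 1,250-Node cluster yields 1,250 parts of 64 MiB each, while a 100 MiB object is limited by the minimum part size and yields 20 parts of 5 MiB.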
Prepare the cache with HTTP Range requests
HTTP Range requests can be used to prepare the cache. A Range request for the first byte only (Range: bytes=0-0) returns a single byte, but metadata is fetched for the entire object, which in turn is used to prepare the cache for subsequent requests.
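A single-byte Range request might look like the following sketch. It assumes a boto3-style S3 client is passed in as `s3_client` (endpoint and credential setup are not shown), and `prime_cache` is a hypothetical helper name.

```python
def prime_cache(s3_client, bucket, key):
    """Request only the first byte of an object to prepare the cache.

    s3_client is assumed to expose the boto3 S3 client method get_object.
    Returns the number of bytes actually read (expected to be 1).
    """
    resp = s3_client.get_object(
        Bucket=bucket,
        Key=key,
        Range="bytes=0-0",  # first byte only
    )
    # Drain the body so the connection can be reused.
    body = resp["Body"].read()
    return len(body)
```

Calling this helper for each object that a job is about to read is one possible way to prepare the cache ahead of the first full read.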