- When to use the AWS CLI instead of rclone.
- How rclone handles versioned objects differently from the AWS CLI.
- Which command-line flags improve performance.
- When to contact CoreWeave Storage Engineering for help.
Create a versioned bucket
Rclone supports most operations on versioned buckets, but it cannot create them. Use the AWS CLI to create versioned buckets.
- Create a bucket. Replace [BUCKET-NAME] with your bucket name and [AVAILABILITY-ZONE] with your AZ in uppercase, exactly as shown in the AZ list. LocationConstraint is required and must match the --region AZ.
  The bucket name must be unique across all of CoreWeave, including all Availability Zones (AZs).
  Bucket naming rules
  Bucket names must be globally unique and adhere to the following rules:
  - Length: 3 to 63 characters.
  - Characters: Only lowercase letters (a-z), numbers (0-9), and hyphens (-). No dots, uppercase letters, underscores, spaces, or other special characters.
  - Start and end: Must begin and end with a letter or number. Cannot start or end with a hyphen (-).
  - Prohibited patterns: Cannot start with xn--.
  - Reserved: Must not begin with cw-, vip-, or log-stitcher-ch-. Must not be the exact name int. CoreWeave reserves these for internal use.
- Wait approximately one minute for the bucket’s DNS name to propagate. If you try to use the bucket immediately after creating it, you might encounter errors.
- Enable versioning on the bucket with the AWS CLI. Replace [BUCKET-NAME] with your bucket name.
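The steps above can be sketched as a pair of shell functions. This is a minimal sketch, not the official procedure: the function names and the CW_ENDPOINT variable are hypothetical, and passing --endpoint-url with the primary endpoint is an assumption that applies only if your AWS CLI profile does not already point at AI Object Storage.

```shell
# Assumption: CW_ENDPOINT is a hypothetical variable name. The default is the
# primary endpoint mentioned in this guide.
CW_ENDPOINT="${CW_ENDPOINT:-https://cwobject.com}"

# Returns 0 if the name satisfies the bucket naming rules listed above.
validate_bucket_name() {
  # Length 3-63; only a-z, 0-9, and hyphens; alphanumeric first and last.
  printf '%s\n' "$1" |
    LC_ALL=C grep -Eq '^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$' || return 1
  # Prohibited prefix, reserved prefixes, and the reserved exact name.
  case "$1" in
    xn--*|cw-*|vip-*|log-stitcher-ch-*|int) return 1 ;;
  esac
  return 0
}

# Usage: create_versioned_bucket BUCKET-NAME AVAILABILITY-ZONE
create_versioned_bucket() {
  bucket="$1"
  az="$2"
  validate_bucket_name "$bucket" || {
    echo "invalid bucket name: $bucket" >&2
    return 1
  }
  # LocationConstraint must match the --region AZ.
  aws s3api create-bucket \
    --endpoint-url "$CW_ENDPOINT" \
    --bucket "$bucket" \
    --region "$az" \
    --create-bucket-configuration "LocationConstraint=$az" || return 1
  # Give the bucket's DNS name about a minute to propagate.
  sleep 60
  aws s3api put-bucket-versioning \
    --endpoint-url "$CW_ENDPOINT" \
    --bucket "$bucket" \
    --region "$az" \
    --versioning-configuration Status=Enabled
}
```

Validating the name locally before calling the API avoids a round trip for names that the naming rules would reject anyway.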
Configure rclone for versioned buckets
This section covers the rclone configuration required for versioned buckets. For general rclone setup, credential configuration, and Workload Identity Federation, see Configure rclone for AI Object Storage.
To get started, create an rclone remote in ~/.config/rclone/rclone.conf. Replace [ACCESS-KEY-ID] and [SECRET-ACCESS-KEY] with your AI Object Storage access key credentials.
Rclone can use multiple remotes to define different storage providers, each with a unique name. This guide uses a remote named default. Learn more about rclone remotes in the rclone documentation.
For optimal performance, always use the LOTA endpoint (http://cwlota.com) when running inside a CoreWeave cluster. The primary endpoint (https://cwobject.com) is available when running outside a CoreWeave cluster.
These settings are critical when configuring rclone for AI Object Storage:
- type: Must be s3.
- provider: Must be Other.
- force_path_style: Must be false. This forces rclone to use virtual-hosted style URLs. Path-style addressing is not supported.
- no_check_bucket: Must be true. This prevents rclone from attempting to create or verify the bucket’s existence during operations.
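The settings above could be combined into a remote definition like the one written by the following sketch. The helper name write_rclone_remote is hypothetical. The access_key_id, secret_access_key, and endpoint keys are standard rclone S3 options, but their use here is an assumption, since this section does not list them. The endpoint defaults to the primary endpoint; pass the LOTA endpoint as the fourth argument when running inside a CoreWeave cluster.

```shell
# Hypothetical helper: writes a remote named "default" to the given config
# file. It refuses to overwrite an existing file.
# Usage: write_rclone_remote CONF-PATH ACCESS-KEY-ID SECRET-ACCESS-KEY [ENDPOINT]
write_rclone_remote() {
  conf_path="$1"
  access_key_id="$2"
  secret_access_key="$3"
  endpoint="${4:-https://cwobject.com}"

  if [ -e "$conf_path" ]; then
    echo "refusing to overwrite $conf_path" >&2
    return 1
  fi
  mkdir -p "$(dirname "$conf_path")"
  cat > "$conf_path" <<EOF
[default]
type = s3
provider = Other
access_key_id = $access_key_id
secret_access_key = $secret_access_key
endpoint = $endpoint
force_path_style = false
no_check_bucket = true
EOF
}
```

For example, `write_rclone_remote ~/.config/rclone/rclone.conf [ACCESS-KEY-ID] [SECRET-ACCESS-KEY]` would create the file only if it does not already exist.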
List objects
This section describes the two main rclone listing commands and when to use each.
rclone ls
Use rclone ls to get a quick overview of the contents of a bucket or prefix. It provides a simple listing of objects, but not directories. It shows only their size and path, without modification times. It’s recursive by default.
rclone lsl
rclone lsl provides a more detailed listing than rclone ls, including modification times. Like rclone ls, it is recursive by default and does not include directories. Use it when you need to see when files were last modified.
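As a rough illustration of the two commands, the following sketch wraps each in a small function and adds a size total computed from the rclone ls output. The function names are hypothetical, and the argument is an rclone path of the form REMOTE:BUCKET or REMOTE:BUCKET/PREFIX.

```shell
# Usage: list_sizes default:[BUCKET-NAME]
# Prints one line per object: size in bytes, then path.
list_sizes() {
  rclone ls "$1"
}

# Same listing, with a modification time column between size and path.
list_with_modtimes() {
  rclone lsl "$1"
}

# Sums the size column of the rclone ls output, in bytes.
total_bytes() {
  rclone ls "$1" | awk '{ total += $1 } END { print total + 0 }'
}
```

Because rclone ls prints the size as the first field of every line, a short awk filter is enough to aggregate it without requesting modification times.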
Performance tuning flags
Use the following flags to reduce API calls and speed up rclone operations against large versioned buckets.
--fast-list flag
For better performance with large buckets, use the --fast-list flag to reduce the number of API calls required. This flag fetches more data in each call at the cost of increased memory usage. It can speed up the overall listing process, but the initial directory scan might take longer before transfers begin, especially if the transfers themselves are fast compared to the listing.
- You can use this flag with both rclone ls and rclone lsl to speed up listings.
- You can combine it with --use-server-modtime to optimize performance.
--use-server-modtime flag
For better performance with the rclone lsl command on large buckets, use the --use-server-modtime flag. This flag tells rclone to get modification times from the server without incurring expensive per-object requests.
- You can combine this flag with --fast-list to optimize performance.
- It does not work in conjunction with the --s3-versions flag, which requires individual head requests for each object to retrieve version metadata.
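Combining the two flags might look like the following sketch. The function name fast_lsl is hypothetical. Per the note above, leave --s3-versions out of this invocation.

```shell
# Usage: fast_lsl default:[BUCKET-NAME]/[PREFIX]
# Detailed listing with fewer API calls (--fast-list) and without per-object
# requests for modification times (--use-server-modtime).
fast_lsl() {
  rclone lsl --fast-list --use-server-modtime "$1"
}
```

Expect higher memory usage on very large buckets, since --fast-list holds more of the listing in memory at once.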
Versions and delete markers
This section covers the rclone flags that expose object versions and delete markers, and how to clean them up.
--s3-versions flag
The rclone --s3-versions flag lists all versions of objects, but does not include delete markers.
With the --s3-versions flag, rclone shows each version of filename.ext as filename+v+DATE+TIME+SEQUENCE.ext. For example, myreport.txt becomes myreport-vYYYY-MM-DD-HHMMSS-000.txt. This naming pattern can lead to ambiguity and issues when copying files if your bucket contains objects that follow the same naming pattern.
For an authoritative object version listing, rely on the version IDs provided by the S3 API instead of rclone’s filename conventions. Use the AWS CLI for authoritative information.
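To check whether a bucket already contains real objects whose names resemble rclone’s version names, a filter such as the following sketch can help. The function name is hypothetical, and the regular expression is an assumption derived from the myreport-vYYYY-MM-DD-HHMMSS-000.txt example above rather than from rclone’s source.

```shell
# Reads listing lines on stdin and prints those whose path ends in a suffix
# shaped like -vYYYY-MM-DD-HHMMSS-000, optionally followed by an extension.
find_version_like_names() {
  LC_ALL=C grep -E -e '-v[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{6}-[0-9]{3}(\.[^/]*)?$'
}
```

For example, `rclone ls default:[BUCKET-NAME] | find_version_like_names`, run without --s3-versions, prints only current objects whose own names could be mistaken for version names. The function returns a non-zero status when nothing matches.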
--s3-version-deleted flag
When you use the --s3-version-deleted flag, rclone includes delete markers in the listing. You can combine this with the --s3-versions flag to see all versions along with delete markers.
With the --s3-version-deleted flag, rclone lists delete markers as zero-byte objects. These represent the delete markers but are not actual objects that can be retrieved. If you issue a standard GET request for that object without providing a specific version ID, Object Storage returns a 404 error, even though older versions still exist. See Understanding delete markers and soft deletes for more details.
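A combined listing, plus a filter for the zero-byte entries, could be sketched as follows. The function names are hypothetical. A zero-byte line is only a candidate delete marker, because a genuinely empty object looks the same in this output.

```shell
# Usage: list_all_versions default:[BUCKET-NAME]
# Lists current objects, older versions, and delete markers.
list_all_versions() {
  rclone ls --s3-versions --s3-version-deleted "$1"
}

# Keeps only lines whose size column is 0. These may be delete markers or
# real empty objects; confirm with the AWS CLI before acting on them.
list_zero_byte_entries() {
  list_all_versions "$1" | awk '$1 == 0'
}
```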
Remove versions and delete markers
For general cleanup of accumulated old versions and delete markers, use rclone backend cleanup-hidden REMOTE:. This removes all hidden versions and delete markers, leaving only the current versions intact.
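Because this cleanup is destructive, a cautious wrapper might preview it first. The sketch below uses a hypothetical function name and assumes that your rclone version honors --dry-run for this backend command, as rclone’s S3 backend documentation describes. Verify that on a test bucket before relying on it.

```shell
# Usage: cleanup_hidden default:[BUCKET-NAME]          (preview only)
#        cleanup_hidden default:[BUCKET-NAME] apply    (actually delete)
cleanup_hidden() {
  remote_path="$1"
  mode="${2:-preview}"
  if [ "$mode" = "apply" ]; then
    rclone backend cleanup-hidden "$remote_path"
  else
    # Assumption: --dry-run reports what would be removed without deleting.
    rclone backend cleanup-hidden --dry-run "$remote_path"
  fi
}
```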
Version concepts
The following sections explain how rclone interacts with versioned bucket metadata and delete markers, and when to use the AWS CLI for authoritative results.
Extra metadata requests
If you use rclone lsl without the --use-server-modtime flag, the S3 API does not return all possible metadata for every object in the list result. This forces rclone to make additional HEAD requests to retrieve the missing information, such as modification times. These extra HEAD requests can be slow and expensive, especially with a large number of files.
If you set the log level to NOTICE or DEBUG, or if you include the -vv (very verbose) flag, you’ll see these extra requests in the output of rclone lsl.
Options to reduce requests
To reduce the number of HEAD requests and improve performance, use one of these options:
- Use the --use-server-modtime flag to get the server’s modification time without a separate HEAD request for each object. Also consider combining --use-server-modtime with --fast-list to further improve performance.
- Use the ls command instead of lsl. The ls command doesn’t retrieve the modification time, which avoids the extra HEAD requests.
Neither option helps when you also use the --s3-versions flag, because rclone still needs to make individual HEAD requests to retrieve version metadata.
Delete markers and soft deletes
When you delete a file in a versioned S3 bucket, the bucket does not permanently remove it. Instead, the bucket creates a special object called a delete marker. This delete marker indicates that the object has been deleted, but the bucket still retains all previous versions of the object. This behavior is known as a “soft delete.” In contrast, a “hard delete” happens when you delete a specific version.
When you list a versioned bucket through rclone, use the --s3-version-deleted flag to include delete markers in the output. Rclone reports these delete markers as zero-byte objects. These zero-byte entries represent the delete markers but are not actual files that can be retrieved. If you issue a standard GET request for that object without providing a specific version ID, Object Storage returns a 404 error, even though the older versions are still there.
A delete marker and rclone’s zero-byte representation of it are easy to confuse. A zero-byte entry in the listing might suggest that the object is still accessible, but it is not retrievable unless you specify the version ID of a previous version.
Deleting a delete marker with rclone restores access to the most recent version of the object prior to the deletion. Removing the delete marker makes the latest non-deleted version the current version again.
When to use AWS CLI
The AWS CLI provides definitive information about delete markers and versioned objects. It directly exposes the S3 API’s structured output, which clearly distinguishes between actual object versions and delete markers. When you attempt to GET an object that has been soft deleted, the response includes the x-amz-delete-marker: true header, clearly indicating that the object is a delete marker. Additionally, if you attempt to GET the version of the delete marker itself, the Last-Modified: timestamp shows when the bucket created the delete marker. The header is a more reliable way to confirm the presence of a delete marker than relying on rclone’s 404 Not Found response.
The AWS CLI list-object-versions command returns structured output that shows Versions[] (real object versions) and DeleteMarkers[] (soft deletes). The current state of an object is the item with IsLatest=true. If the latest is a DeleteMarker, the object is soft-deleted.
To test a single object with the AWS CLI, use either head-object or get-object, specifying the version ID if needed.
- If used without a version ID, soft-deleted objects return 404 Not Found with an x-amz-delete-marker: true header.
- If you specify a version ID that is a delete marker, you get 200 OK with the x-amz-delete-marker: true header.
- If you specify a version ID that is a real version, you get 200 OK and the object content.
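The IsLatest check described above can be scripted with a JMESPath query. The following is a minimal sketch under stated assumptions: the function name and the CW_ENDPOINT variable are hypothetical, --endpoint-url is needed only if your AWS CLI profile does not already target AI Object Storage, and the key must not contain a single quote because it is embedded in the query string.

```shell
# Assumption: CW_ENDPOINT is a hypothetical variable name.
CW_ENDPOINT="${CW_ENDPOINT:-https://cwobject.com}"

# Usage: latest_delete_marker BUCKET-NAME KEY
# Prints the delete marker's version ID and returns 0 if the latest entry for
# KEY is a delete marker (the object is soft-deleted); returns 1 otherwise.
latest_delete_marker() {
  bucket="$1"
  key="$2"
  marker=$(aws s3api list-object-versions \
    --endpoint-url "$CW_ENDPOINT" \
    --bucket "$bucket" \
    --prefix "$key" \
    --query "DeleteMarkers[?Key=='$key' && IsLatest].VersionId | [0]" \
    --output text) || return 1
  # With --output text, the AWS CLI prints None when the query yields null.
  [ -n "$marker" ] && [ "$marker" != "None" ] || return 1
  printf '%s\n' "$marker"
}
```

The --prefix option narrows the server-side listing, and the Key comparison in the query excludes other keys that merely share the prefix.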
Listing versions and delete markers with rclone
Using rclone with the --s3-versions and --s3-version-deleted flags together shows all object versions (as filenames with v+DATE+TIME+SEQUENCE inserted before the extension) and delete markers (as zero-byte objects). However, in case of any doubt, rely on the AWS CLI as the authoritative source.
Recovery patterns
This section describes how to recover soft deleted objects and older versions, and when to escalate to the AWS CLI or storage engineering.
Rclone provides commands for recovering soft deleted objects and cleaning up old versions. However, for precise recovery, especially in intricate scenarios, direct AWS CLI commands are often necessary.
After you detect a soft deleted object or an older version through the --s3-versions flag, you can copy the old version by using the version file name (such as filename-vYYYY-MM-DD-HHMMSS-000.ext) that rclone showed you. Keep in mind that this version file name is an rclone-specific construct and not the actual S3 version ID.
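Copying an old version by its rclone version file name might look like the following sketch. The function name is hypothetical. The sketch downloads to a local path because rclone’s S3 documentation states that write operations to the remote are not permitted while --s3-versions is set. Treat that as an assumption to verify on your rclone version.

```shell
# Usage:
#   recover_version_with_rclone \
#     default:[BUCKET-NAME]/filename-vYYYY-MM-DD-HHMMSS-000.ext ./filename.ext
# Downloads one specific old version to a local file under its original name.
recover_version_with_rclone() {
  versioned_path="$1"
  local_dest="$2"
  rclone copyto --s3-versions "$versioned_path" "$local_dest"
}
```

After checking the recovered file, you could upload it again without the flag, for example with `rclone copyto ./filename.ext default:[BUCKET-NAME]/filename.ext`, which creates a new current version.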
For better control over recovery and cleanup, use the AWS CLI commands. They provide more transparency and precision, especially when dealing with compliance or audit requirements.
- Identify the latest non-deleted version with aws s3api list-object-versions.
- Restore it to a new location using aws s3api copy-object, or retrieve it with get-object then re-upload it.
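The two steps above could be sketched as follows. The function names and the CW_ENDPOINT variable are hypothetical, and --endpoint-url is an assumption that applies only if your AWS CLI profile does not already target AI Object Storage. Keys that contain a single quote, or characters that need URL encoding in --copy-source, are not handled here.

```shell
# Assumption: CW_ENDPOINT is a hypothetical variable name.
CW_ENDPOINT="${CW_ENDPOINT:-https://cwobject.com}"

# Usage: latest_real_version BUCKET-NAME KEY
# Prints the version ID of the most recently modified real version of KEY.
# The query fails if the listing contains no Versions entries at all.
latest_real_version() {
  bucket="$1"
  key="$2"
  aws s3api list-object-versions \
    --endpoint-url "$CW_ENDPOINT" \
    --bucket "$bucket" \
    --prefix "$key" \
    --query "sort_by(Versions[?Key=='$key'], &LastModified)[-1].VersionId" \
    --output text
}

# Usage: restore_version BUCKET-NAME KEY VERSION-ID DEST-KEY
# Server-side copy of one specific version to DEST-KEY in the same bucket.
restore_version() {
  bucket="$1"
  key="$2"
  version_id="$3"
  dest_key="$4"
  aws s3api copy-object \
    --endpoint-url "$CW_ENDPOINT" \
    --bucket "$bucket" \
    --key "$dest_key" \
    --copy-source "$bucket/$key?versionId=$version_id"
}
```

Copying to a new key leaves the original version history untouched, which is useful when the recovery itself needs to be auditable.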
Self-service recovery
For many common recovery scenarios and day-to-day operations, rclone offers a self-service interface that balances capability with ease of use. It’s a good choice when you need to get a file back quickly or reset the bucket to only its current versions, without older, potentially unnecessary versions accumulating. However, direct AWS CLI commands might be necessary when:
- You need to recover a specific version for compliance or audit purposes.
- You need precise control over which versions are restored.
- You need to handle scenarios where delete markers are involved.