Versioned Buckets

Manage versioned buckets with rclone and the AWS CLI.

Rclone is a command-line tool for managing objects in CoreWeave AI Object Storage. This guide shows you how to configure rclone for versioned buckets, list objects and their versions, understand delete markers, and recover objects.

You will also learn:

  • When to use the AWS CLI instead of rclone.
  • How rclone handles versioned objects differently from the AWS CLI.
  • Which command-line flags improve performance.
  • When to contact CoreWeave Storage Engineering for help.

It's important to understand the specific behaviors of rclone commands and how they relate to the AWS CLI and underlying S3 API. These concepts are particularly relevant when using rclone to migrate data between versioned buckets. See Migrate Data to AI Object Storage for more information.

Create a versioned bucket

Rclone is capable of most operations on versioned buckets, but it cannot create them. Use the AWS CLI to create versioned buckets.

  1. Create a bucket.

    $
    aws s3api create-bucket \
    --bucket BUCKET_NAME \
    --region AVAILABILITY_ZONE \
    --create-bucket-configuration LocationConstraint=AVAILABILITY_ZONE

    Replace BUCKET_NAME with your bucket name.

    • Bucket names cannot begin with cw- or vip-.
    • The bucket name must be unique across all of CoreWeave, including every Availability Zone (AZ).

    Replace AVAILABILITY_ZONE with your AZ in uppercase, exactly as shown in the AZ list.

    LocationConstraint is required and must match the --region AZ.

  2. Wait approximately one minute for the bucket's DNS name to propagate.

    If you try to use the bucket immediately after creating it, you may encounter errors.

  3. Enable versioning on the bucket with the AWS CLI. Replace BUCKET_NAME with your bucket name.

    $
    aws s3api put-bucket-versioning --bucket BUCKET_NAME --versioning-configuration Status=Enabled
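To confirm that versioning is enabled, you can read the setting back from the bucket. The following is a minimal sketch, assuming the AWS CLI is already configured with your CoreWeave credentials and endpoint. The function name versioning_status is hypothetical and used only for illustration.

```shell
# Hypothetical helper: print the versioning status of a bucket.
# Assumes the AWS CLI is already configured for AI Object Storage.
versioning_status() {
  local bucket="$1"
  # --query extracts the Status field; --output text prints it without JSON quoting.
  aws s3api get-bucket-versioning \
    --bucket "$bucket" \
    --query 'Status' \
    --output text
}

# Usage: versioning_status BUCKET_NAME
```

A bucket with versioning turned on is expected to report Enabled. With text output, the AWS CLI typically prints None when no versioning configuration is set.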

Configure rclone for versioned buckets

To get started, create a config file (usually at ~/.config/rclone/rclone.conf) with the following values, replacing YOUR_SECRET_KEY_ID with your access key ID and YOUR_SECRET_ACCESS_KEY with your secret access key.

~/.config/rclone/rclone.conf
[default]
type = s3
provider = Other
access_key_id = YOUR_SECRET_KEY_ID
secret_access_key = YOUR_SECRET_ACCESS_KEY
endpoint = https://cwobject.com
force_path_style = false
no_check_bucket = true

These settings are critical when working with AI Object Storage:

  • type: Must be s3.

  • provider: Must be Other.

  • endpoint: Must be the primary endpoint, https://cwobject.com, when working with versioned buckets.

    • Do not use the LOTA endpoint when working with versioned buckets.
    • When working with non-versioned buckets, you can use either the primary or LOTA endpoint.
  • force_path_style: Must be false.

    This forces rclone to use virtual-hosted addressing (https://mybucket.cwobject.com). Rclone and AWS CLI must be explicitly configured for virtual-hosted addressing when using CoreWeave AI Object Storage. Path-style addressing (https://cwobject.com/mybucket/file) is not supported and will cause errors.

  • no_check_bucket: Must be true.

    This prevents rclone from attempting to create the bucket or to verify that it exists during operations.
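Because the AWS CLI also needs virtual-hosted addressing, the equivalent AWS CLI settings can be sketched as follows. This is a minimal sketch, assuming you use the default AWS CLI profile and an AWS CLI version that reads endpoint_url from its config file. The function name configure_aws_cli is hypothetical.

```shell
# Hypothetical helper: point the default AWS CLI profile at AI Object Storage.
configure_aws_cli() {
  # Use virtual-hosted addressing (https://mybucket.cwobject.com).
  aws configure set default.s3.addressing_style virtual
  # Assumption: your AWS CLI version supports endpoint_url in the config file.
  # If it does not, pass --endpoint-url https://cwobject.com on each command instead.
  aws configure set default.endpoint_url https://cwobject.com
}

# Usage: configure_aws_cli
```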

About rclone remotes

Rclone defines remotes in its config file for different storage providers. You can create multiple remotes in a single file to manage different storage services. The config file above defines a single remote named default.

  • Replace YOUR_REMOTE and YOUR_BUCKET in the examples below with your remote and bucket, like default:mybucket.

Learn more about how rclone defines remotes in the rclone documentation.

Listing objects

rclone ls

rclone ls is useful to get a quick overview of the contents of a bucket or prefix. It provides a simple listing of objects, but not directories. It shows only their size and path, without modification times. It's recursive by default.

$
rclone ls YOUR_REMOTE:YOUR_BUCKET

rclone lsl

rclone lsl provides a more detailed listing than rclone ls, including modification times. Like rclone ls, it is recursive by default and does not include directories. It's useful when you need to see when files were last modified.

$
rclone lsl YOUR_REMOTE:YOUR_BUCKET

Performance tuning flags

--fast-list flag

For better performance with large buckets, consider using the --fast-list flag to reduce the number of API calls required. This flag fetches more data in each call at the cost of increased memory usage. While it can speed up the overall listing process, the initial directory scan might take longer before transfers begin, especially if the transfers themselves are very fast compared to the listing.

  • This can be used with both rclone ls and rclone lsl to speed up listings.
  • It can be combined with --use-server-modtime to optimize performance.

$
rclone ls YOUR_REMOTE:YOUR_BUCKET --fast-list

--use-server-modtime flag

For better performance with the rclone lsl command on large buckets, use the --use-server-modtime flag. This flag tells rclone to get modification times from the server without incurring expensive per-object requests.

$
rclone lsl YOUR_REMOTE:YOUR_BUCKET --use-server-modtime

  • This can be combined with --fast-list to optimize performance.
  • It does not work in conjunction with the --s3-versions flag, which requires individual HEAD requests for each object to retrieve version metadata.
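The two flags can be combined in one listing command. A minimal sketch, with a hypothetical function name list_with_modtime, that wraps the combined call:

```shell
# Hypothetical helper: detailed recursive listing with both tuning flags.
list_with_modtime() {
  local target="$1"   # for example, default:mybucket
  # --fast-list reduces the number of list calls at the cost of more memory.
  # --use-server-modtime avoids a HEAD request per object.
  rclone lsl "$target" --fast-list --use-server-modtime
}

# Usage: list_with_modtime YOUR_REMOTE:YOUR_BUCKET
```

With --use-server-modtime, the times shown are the server's last-modified times rather than the modification times rclone stores in object metadata, so they can differ from the original file times.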

Versions and delete markers

--s3-versions flag

The rclone --s3-versions flag lists all versions of objects, but does not include delete markers.

$
rclone lsl YOUR_REMOTE:YOUR_BUCKET --s3-versions

Rclone modifies the filenames to include version information. A versioned filename.ext is reported as filename+v+DATE+TIME+SEQUENCE.ext. For example, myreport.txt becomes myreport-vYYYY-MM-DD-HHMMSS-000.txt. This can cause ambiguity and problems when copying files if your bucket contains objects whose names already follow this pattern.

For an authoritative listing of object versions, rely on the version IDs returned by the S3 API, which you can retrieve with the AWS CLI, rather than on rclone's filename conventions.
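To see whether a bucket is exposed to this ambiguity, you can check object names against the suffix shape. The following is a minimal sketch. The pattern is inferred from the myreport-vYYYY-MM-DD-HHMMSS-000.txt example in this guide and is an assumption, and the function name looks_like_rclone_version is hypothetical.

```shell
# Hypothetical helper: succeed if a name ends in a suffix shaped like
# -vYYYY-MM-DD-HHMMSS-NNN, optionally followed by one file extension.
# The pattern is inferred from the example in this guide, not from rclone's source.
looks_like_rclone_version() {
  printf '%s\n' "$1" |
    grep -Eq -- '-v[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{6}-[0-9]{3}(\.[^./]*)?$'
}

# Usage: looks_like_rclone_version "myreport-v2025-09-24-104201-000.txt" && echo "ambiguous name"
```

Running this check against a listing made without --s3-versions shows which current object names could be mistaken for rclone version names.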

--s3-version-deleted flag

When using the --s3-version-deleted flag, rclone includes delete markers in the listing. This can be combined with the --s3-versions flag to see all versions along with delete markers.

$
rclone lsl YOUR_REMOTE:YOUR_BUCKET --s3-version-deleted

Rclone lists delete markers as zero-byte objects. These entries represent the delete markers and are not actual objects that can be retrieved. If you issue a standard GET request for such an object without providing a specific version ID, Object Storage returns a 404 error, even though older versions still exist. See Understanding delete markers and soft deletes for more details.

Removing versions and delete markers

For general cleanup of accumulated old versions and delete markers, use rclone backend cleanup-hidden YOUR_REMOTE:. This removes all hidden versions and delete markers, leaving only the current versions intact.

Warning

Use this command with caution. It permanently deletes data and cannot be undone.
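Given that warning, it can help to preview the cleanup before running it. A minimal sketch, assuming rclone's global --dry-run flag applies to this backend command as described in the rclone S3 documentation. The function name preview_cleanup is hypothetical.

```shell
# Hypothetical helper: preview a cleanup without deleting anything.
preview_cleanup() {
  local target="$1"   # for example, default:mybucket
  # --dry-run reports what would be removed but makes no changes.
  rclone backend cleanup-hidden --dry-run "$target"
}

# Usage: preview_cleanup YOUR_REMOTE:YOUR_BUCKET
```

Review the preview output, then run the command again without --dry-run only when the list of removals matches what you expect.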

Version concepts

Extra metadata requests

The S3 list API does not return all metadata for every object in the list result. When you run rclone lsl without the --use-server-modtime flag, rclone makes an additional HEAD request for each object to retrieve the missing information, such as the modification time. These extra HEAD requests can be slow and expensive, especially with a large number of files.

You'll see these extra requests in the output of rclone lsl if you set the log level to NOTICE or DEBUG, or if you include the -vv (very verbose) flag.

$
rclone lsl YOUR_REMOTE:YOUR_BUCKET/ -vv

Partial output:

...
2025/09/24 10:42:01 DEBUG : YOUR_REMOTE:YOUR_BUCKET/: Listing with bucket_name="YOUR_BUCKET" delimiter="/" prefix=""
2025/09/24 10:42:02 DEBUG : file1.txt: Sent HEAD request
2025/09/24 10:42:02 DEBUG : file1.txt: HEAD request succeeded with status code 200
2025/09/24 10:42:02 NOTICE: file1.txt: Retrieving metadata with an extra HEAD request
2025/09/24 10:42:02 DEBUG : file2.txt: Sent HEAD request
2025/09/24 10:42:02 DEBUG : file2.txt: HEAD request succeeded with status code 200
2025/09/24 10:42:02 NOTICE: file2.txt: Retrieving metadata with an extra HEAD request
...

Options to reduce requests

If you want to reduce the number of HEAD requests to improve performance, use one of these options:

  • Use the --use-server-modtime flag to get the server's modification time without a separate HEAD request for each object. Also consider combining --use-server-modtime with --fast-list to further improve performance.
  • Use the ls command instead of lsl. The ls command doesn't retrieve the modification time, avoiding the extra HEAD requests.

However, these rclone optimizations do not apply if you use the --s3-versions flag, because rclone still needs to make individual HEAD requests to retrieve version metadata.

Delete markers and soft deletes

When a file is deleted in a versioned S3 bucket, it is not permanently removed. Instead, a special object called a delete marker is created. This delete marker indicates that the object has been deleted, but all previous versions of the object are still retained in the bucket. This behavior is known as a "soft delete." In contrast, a "hard delete" happens when you delete a specific version.

When listing a versioned bucket with rclone, use the --s3-version-deleted flag to include delete markers in the output. Rclone reports these delete markers as zero-byte objects. Crucially, these zero-byte entries represent the delete markers but are not actual files that can be retrieved. If you issue a standard GET request for that object without providing a specific version ID, Object Storage will return a 404 error, even though the older versions are still there.

The difference between a delete marker and rclone's zero-byte representation of that marker is important to understand because it can lead to confusion. Seeing a zero-byte entry in the listing might suggest that the object is still accessible, but in reality, it is not retrievable without specifying a version ID of a previous version.

Deleting a delete marker with rclone restores access to the most recent version of the object prior to the deletion. This is because removing the delete marker makes the latest non-deleted version the current version again.

When to use AWS CLI

The AWS CLI provides definitive information about delete markers and versioned objects. It directly exposes the S3 API's structured output, which clearly distinguishes between actual object versions and delete markers.

When you attempt to GET an object that has been soft deleted, the response includes the x-amz-delete-marker: true header, which indicates that the current version is a delete marker. Additionally, if you GET the version ID of the delete marker itself, the Last-Modified timestamp shows when the delete marker was created. This is a more reliable way to confirm the presence of a delete marker than relying on rclone's 404 Not Found response.

The AWS CLI list-object-versions command returns structured output that shows Versions[] (real object versions) and DeleteMarkers[] (soft deletes). The current state of an object is the item with IsLatest=true. If the latest is a DeleteMarker, then the object is soft-deleted.
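The IsLatest check described above can be expressed as a JMESPath query. A minimal sketch, assuming the AWS CLI is already configured for AI Object Storage. The function name list_soft_deleted is hypothetical.

```shell
# Hypothetical helper: print the key and version ID of every delete marker
# that is the latest entry for its key (that is, every soft-deleted object).
list_soft_deleted() {
  local bucket="$1"
  # Backticks mark a JMESPath literal; single quotes stop the shell from expanding them.
  aws s3api list-object-versions \
    --bucket "$bucket" \
    --query 'DeleteMarkers[?IsLatest==`true`].[Key,VersionId]' \
    --output text
}

# Usage: list_soft_deleted BUCKET_NAME
```

Each output line pairs an object key with the version ID of its current delete marker, which is the ID you need if you later remove that marker.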

To test a single object with the AWS CLI, you can use either head-object or get-object, specifying the version ID if needed.

  • If you do not specify a version ID, a soft-deleted object returns 404 Not Found with an x-amz-delete-marker: true header.
  • If you specify a version ID that is a delete marker, you get 200 OK with the x-amz-delete-marker: true header.
  • If you specify a version ID that is a real version, you get 200 OK and the object content.

Listing versions and delete markers with rclone

Using rclone with the --s3-versions and --s3-version-deleted flags together will show all object versions (as filenames with v+DATE+TIME+SEQUENCE appended) and delete markers (as zero-byte objects).

However, in case of any doubt, you should rely on the AWS CLI as the authoritative source.

Recovery patterns

Rclone provides some convenient commands for recovering soft deleted objects and cleaning up old versions. However, for precise recovery, especially in complex scenarios, direct AWS CLI commands are often necessary.

Once you've detected a soft deleted object or an older version using the --s3-versions flag, you can copy the old version by using the version file name (such as filename-vYYYY-MM-DD-HHMMSS-000.ext) that rclone showed you. Keep in mind this version file name is an rclone-specific construct and not the actual S3 version ID.
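Copying an old version by its version file name can be sketched as follows. This is a minimal sketch based on the pattern shown in the rclone S3 documentation. The function name restore_version_locally is hypothetical.

```shell
# Hypothetical helper: copy one old version to a local directory.
restore_version_locally() {
  # Example source: default:mybucket/myreport-vYYYY-MM-DD-HHMMSS-000.txt
  local versioned_path="$1"
  local dest_dir="$2"
  # --s3-versions lets rclone resolve the version file name it listed earlier.
  rclone copy --s3-versions "$versioned_path" "$dest_dir"
}

# Usage: restore_version_locally YOUR_REMOTE:YOUR_BUCKET/filename-vYYYY-MM-DD-HHMMSS-000.ext ./restored
```

The local copy is expected to keep the version file name, so rename it afterwards if you need the original name.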

For better control over recovery and cleanup, use the AWS CLI commands. They provide more transparency and precision, especially when dealing with compliance or audit requirements.
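One AWS CLI recovery pattern is to remove the current delete marker so that the previous version becomes current again. The following is a minimal sketch, assuming the AWS CLI is already configured for AI Object Storage and that the key contains no single quotes. The function name undelete_object is hypothetical.

```shell
# Hypothetical helper: undo a soft delete by removing the latest delete marker.
undelete_object() {
  local bucket="$1"
  local key="$2"
  local marker_id
  # Find the version ID of the delete marker that is currently the latest entry.
  marker_id=$(aws s3api list-object-versions \
    --bucket "$bucket" --prefix "$key" \
    --query "DeleteMarkers[?Key=='$key' && IsLatest==\`true\`].VersionId | [0]" \
    --output text)
  # "None" is what the AWS CLI prints for an empty result in text mode.
  if [ -z "$marker_id" ] || [ "$marker_id" = "None" ]; then
    echo "no current delete marker for $key" >&2
    return 1
  fi
  # Deleting the marker by version ID makes the previous version current again.
  aws s3api delete-object --bucket "$bucket" --key "$key" --version-id "$marker_id"
}

# Usage: undelete_object BUCKET_NAME path/to/file.txt
```

Deleting by version ID is permanent for that entry, so confirm the marker ID against the list-object-versions output before you remove it.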

Self-service recovery

For many common recovery scenarios and day-to-day operations, rclone offers a convenient self-service interface. It's a good choice when you need to get a file back quickly, or when you want to reset the bucket so that it holds only current versions, without an accumulation of older, unneeded versions.

However, direct AWS CLI commands may be necessary when you need to:

  • Recover a specific version for compliance or audit purposes.
  • Control precisely which versions are restored.
  • Handle scenarios that involve delete markers.

For these situations, the AWS CLI should be your diagnostic tool of choice. It allows you to confidently discern between different versions, including those explicitly marked as deleted, and to execute precise, auditable recovery actions.

When to request storage engineering intervention

Removing policy-controlled delete markers may require assistance from CoreWeave's storage engineering team. If you encounter issues removing delete markers, you should contact CoreWeave support.