Versioned Buckets
Manage versioned buckets with rclone and the AWS CLI.
Rclone is a command-line tool for managing objects in CoreWeave AI Object Storage. This guide shows you how to configure rclone for versioned buckets, list objects and their versions, understand delete markers, and recover objects.
You will also learn:
- When to use the AWS CLI instead of rclone.
- How rclone handles versioned objects differently from the AWS CLI.
- Which command-line flags improve performance.
- When to contact CoreWeave Storage Engineering for help.
It's important to understand the specific behaviors of rclone commands and how they relate to the AWS CLI and underlying S3 API. These concepts are particularly relevant when using rclone to migrate data between versioned buckets. See Migrate Data to AI Object Storage for more information.
Create a versioned bucket
Rclone is capable of most operations on versioned buckets, but it cannot create them. Use the AWS CLI to create versioned buckets.
- Create a bucket.
  $ aws s3api create-bucket \
      --bucket BUCKET_NAME \
      --region AVAILABILITY_ZONE \
      --create-bucket-configuration LocationConstraint=AVAILABILITY_ZONE
  Replace BUCKET_NAME with your bucket name.
  - Bucket names cannot begin with cw- or vip-.
  - The bucket name must be unique across all of CoreWeave and all Availability Zones (AZs).
  Replace AVAILABILITY_ZONE with your AZ in uppercase, exactly as shown in the AZ list. LocationConstraint is required and must match the --region AZ.
- Wait approximately one minute for the bucket's DNS name to propagate.
  If you try to use the bucket immediately after creating it, you may encounter errors.
- Enable versioning on the bucket with the AWS CLI. Replace BUCKET_NAME with your bucket name.
  $ aws s3api put-bucket-versioning --bucket BUCKET_NAME --versioning-configuration Status=Enabled
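The steps in this section can be combined into one shell function. This is a minimal sketch, assuming the AWS CLI is already configured with credentials and the endpoint for AI Object Storage. The function name create_versioned_bucket and the BUCKET_DNS_WAIT variable are hypothetical names chosen for illustration, and the final get-bucket-versioning call is an added confirmation step, not something this guide requires.

```shell
# Hypothetical helper: create a bucket, wait for DNS, then enable versioning.
# Usage: create_versioned_bucket BUCKET_NAME AVAILABILITY_ZONE
create_versioned_bucket() {
  : "${1:?bucket name required}" "${2:?availability zone required}"

  # Step 1: create the bucket. LocationConstraint must match --region.
  aws s3api create-bucket \
    --bucket "$1" \
    --region "$2" \
    --create-bucket-configuration "LocationConstraint=$2" || return 1

  # Step 2: give the bucket's DNS name time to propagate (about one minute).
  # BUCKET_DNS_WAIT is an illustrative override, in seconds.
  sleep "${BUCKET_DNS_WAIT:-60}"

  # Step 3: enable versioning, then print the bucket's versioning status.
  aws s3api put-bucket-versioning \
    --bucket "$1" \
    --versioning-configuration Status=Enabled || return 1
  aws s3api get-bucket-versioning --bucket "$1"
}
```

Call it as create_versioned_bucket BUCKET_NAME AVAILABILITY_ZONE. The function stops at the first failing step, so versioning is never requested for a bucket that was not created.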
Configure rclone for versioned buckets
To get started, create a config file (usually at ~/.config/rclone/rclone.conf) with the following values, replacing YOUR_SECRET_KEY_ID and YOUR_SECRET_ACCESS_KEY with your actual values.
[default]
type = s3
provider = Other
access_key_id = YOUR_SECRET_KEY_ID
secret_access_key = YOUR_SECRET_ACCESS_KEY
endpoint = https://cwobject.com
force_path_style = false
no_check_bucket = true
These settings are critical when working with AI Object Storage:
- type: Must be s3.
- provider: Must be Other.
- endpoint: Must be the primary endpoint, https://cwobject.com, when working with versioned buckets.
  - Do not use the LOTA endpoint when working with versioned buckets.
  - When working with non-versioned buckets, you can use either the primary or LOTA endpoint.
- force_path_style: Must be false.
  This forces rclone to use virtual-hosted addressing (https://mybucket.cwobject.com). Rclone and AWS CLI must be explicitly configured for virtual-hosted addressing when using CoreWeave AI Object Storage. Path-style addressing (https://cwobject.com/mybucket/file) is not supported and will cause errors.
- no_check_bucket: Must be true.
  This prevents rclone from attempting to create or verify the bucket's existence during operations.
Rclone defines remotes in its config file for different storage providers. You can create multiple remotes in a single file to manage different storage services. The config file above defines a single remote named default.
- Replace YOUR_REMOTE and YOUR_BUCKET in the examples below with your remote and bucket, like default:mybucket.
Learn more about how rclone defines remotes in the rclone documentation.
Listing objects
rclone ls
rclone ls
is useful to get a quick overview of the contents of a bucket or prefix. It provides a simple listing of objects, but not directories. It shows only their size and path, without modification times. It's recursive by default.
$rclone ls YOUR_REMOTE:YOUR_BUCKET
rclone lsl
rclone lsl
provides a more detailed listing than rclone ls
, including modification times. Like rclone ls
, it is recursive by default and does not include directories. It's useful when you need to see when files were last modified.
$rclone lsl YOUR_REMOTE:YOUR_BUCKET
Performance tuning flags
--fast-list flag
For better performance with large buckets, consider using the --fast-list flag to reduce the number of API calls required. This flag fetches more data in each call at the cost of increased memory usage. While it can speed up the overall listing process, the initial directory scan might take longer before transfers begin, especially if the transfers themselves are very fast compared to the listing.
- This can be used with both rclone ls and rclone lsl to speed up listings.
- It can be combined with --use-server-modtime to optimize performance.
$ rclone ls YOUR_REMOTE:YOUR_BUCKET --fast-list
--use-server-modtime flag
For better performance with the rclone lsl command on large buckets, use the --use-server-modtime flag. This flag tells rclone to get modification times from the server without incurring expensive per-object requests.
$ rclone lsl YOUR_REMOTE:YOUR_BUCKET --use-server-modtime
- This can be combined with --fast-list to optimize performance.
- It does not work in conjunction with the --s3-versions flag, which requires individual HEAD requests for each object to retrieve version metadata.
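The two flags described above can be used in the same command. The following sketch wraps that combination in a function. The name list_bucket_fast is hypothetical, and the function only adds the two flags already covered in this section.

```shell
# Hypothetical helper: list a bucket with both performance flags enabled.
# Usage: list_bucket_fast YOUR_REMOTE:YOUR_BUCKET
list_bucket_fast() {
  : "${1:?remote:bucket required}"
  # --fast-list: fewer, larger listing calls at the cost of more memory.
  # --use-server-modtime: take modification times from the listing itself
  # instead of issuing a request per object.
  rclone lsl "$1" --fast-list --use-server-modtime
}
```

For example, list_bucket_fast default:mybucket lists the bucket with modification times. Leave --s3-versions out of this command, because --use-server-modtime does not work with it.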
Versions and delete markers
--s3-versions flag
The rclone --s3-versions flag lists all versions of objects, but does not include delete markers.
$ rclone lsl YOUR_REMOTE:YOUR_BUCKET --s3-versions
Rclone modifies the filenames to include version information. A versioned filename.ext is reported as filename+v+DATE+TIME+SEQUENCE.ext. For example, myreport.txt becomes myreport-vYYYY-MM-DD-HHMMSS-000.txt. This can lead to ambiguity and issues when copying files if your bucket contains objects that follow the same naming pattern.
For an authoritative listing of object versions, rely on the version IDs returned by the S3 API, for example through the AWS CLI, rather than on rclone's filename conventions.
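To check whether existing object names could be mistaken for rclone version names, you can test them against the naming pattern described above. This is a rough sketch. The function name looks_like_rclone_version is hypothetical, and the regular expression is an assumption derived from the myreport-vYYYY-MM-DD-HHMMSS-000.txt example, not taken from rclone's source.

```shell
# Hypothetical helper: succeed if a name ends with a suffix shaped like
# -vYYYY-MM-DD-HHMMSS-NNN, optionally followed by a file extension.
# Usage: looks_like_rclone_version OBJECT_NAME
looks_like_rclone_version() {
  printf '%s\n' "${1:?object name required}" |
    grep -Eq -- '-v[0-9]{4}-[0-9]{2}-[0-9]{2}-[0-9]{6}-[0-9]{3}(\.[^/]*)?$'
}
```

Running this over names from a plain listing (one made without --s3-versions) flags real objects whose names would be ambiguous in a versioned listing. The check only looks at the shape of the name, so it cannot tell you whether an entry is a genuine version.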
--s3-version-deleted flag
The --s3-version-deleted flag includes delete markers in the listing. Combine it with the --s3-versions flag to see all versions along with delete markers.
$ rclone lsl YOUR_REMOTE:YOUR_BUCKET --s3-version-deleted
Rclone lists delete markers as zero-byte objects. These entries represent the delete markers but are not actual objects that can be retrieved. If you issue a standard GET request for such an object without providing a specific version ID, Object Storage returns a 404 error, even though older versions still exist. See Understanding delete markers and soft deletes for more details.
Removing versions and delete markers
For general cleanup of accumulated old versions and delete markers, use rclone backend cleanup-hidden YOUR_REMOTE:. This removes all hidden versions and delete markers, leaving only the current versions intact.
Use this command with caution. It permanently deletes data and cannot be undone.
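Because the cleanup is permanent, it can help to preview it first. The sketch below runs the command with rclone's --dry-run flag unless you explicitly confirm. The function name cleanup_hidden_versions and its --confirm argument are hypothetical and belong to this wrapper only, not to rclone.

```shell
# Hypothetical helper: preview cleanup-hidden by default, delete only on request.
# Usage: cleanup_hidden_versions YOUR_REMOTE:YOUR_BUCKET [--confirm]
cleanup_hidden_versions() {
  : "${1:?remote:bucket required}"
  if [ "${2:-}" = "--confirm" ]; then
    # Permanently removes hidden versions and delete markers.
    rclone backend cleanup-hidden "$1"
  else
    # Preview only: report what would be removed without deleting anything.
    rclone backend cleanup-hidden "$1" --dry-run
  fi
}
```

Run it once without --confirm, review the output, and then repeat with --confirm when you are sure the listed versions are no longer needed.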
Version concepts
Extra metadata requests
If you use rclone lsl without the --use-server-modtime flag, the S3 API does not return all possible metadata for every object in the list result. This forces rclone to make additional HEAD requests to retrieve the missing information, such as modification times. These extra HEAD requests can be slow and expensive, especially with a large number of files.
You'll see these extra requests in the output of rclone lsl if you set the log level to NOTICE or DEBUG, or if you include the -vv (very verbose) flag.
$ rclone lsl YOUR_REMOTE:YOUR_BUCKET/ -vv
Partial output:
...
2025/09/24 10:42:01 DEBUG : YOUR_REMOTE:YOUR_BUCKET/: Listing with bucket_name="YOUR_BUCKET" delimiter="/" prefix=""
2025/09/24 10:42:02 DEBUG : file1.txt: Sent HEAD request
2025/09/24 10:42:02 DEBUG : file1.txt: HEAD request succeeded with status code 200
2025/09/24 10:42:02 NOTICE: file1.txt: Retrieving metadata with an extra HEAD request
2025/09/24 10:42:02 DEBUG : file2.txt: Sent HEAD request
2025/09/24 10:42:02 DEBUG : file2.txt: HEAD request succeeded with status code 200
2025/09/24 10:42:02 NOTICE: file2.txt: Retrieving metadata with an extra HEAD request
...
Options to reduce requests
If you want to reduce the number of HEAD requests to improve performance, use one of these options:
- Use the --use-server-modtime flag to get the server's modification time without a separate HEAD request for each object. Also consider combining --use-server-modtime with --fast-list to further improve performance.
- Use the ls command instead of lsl. The ls command doesn't retrieve the modification time, avoiding the extra HEAD requests.
However, these rclone optimizations do not apply if you use the --s3-versions flag, because rclone still needs to make individual HEAD requests to retrieve version metadata.
Delete markers and soft deletes
When a file is deleted in a versioned S3 bucket, it is not permanently removed. Instead, a special object called a delete marker is created. This delete marker indicates that the object has been deleted, but all previous versions of the object are still retained in the bucket. This behavior is known as a "soft delete." In contrast, a "hard delete" happens when you delete a specific version.
When listing a versioned bucket with rclone, use the --s3-version-deleted flag to include delete markers in the output. Rclone reports these delete markers as zero-byte objects. Crucially, these zero-byte entries represent the delete markers but are not actual files that can be retrieved. If you issue a standard GET request for that object without providing a specific version ID, Object Storage will return a 404 error, even though the older versions are still there.
The difference between a delete marker and rclone's zero-byte representation of that marker is important to understand because it can lead to confusion. Seeing a zero-byte entry in the listing might suggest that the object is still accessible, but in reality, it is not retrievable without specifying a version ID of a previous version.
Deleting a delete marker with rclone restores access to the most recent version of the object prior to the deletion. This is because removing the delete marker makes the latest non-deleted version the current version again.
When to use AWS CLI
The AWS CLI provides definitive information about delete markers and versioned objects. It directly exposes the S3 API's structured output, which clearly distinguishes between actual object versions and delete markers.
When you attempt to GET an object that has been soft deleted, the response includes the x-amz-delete-marker: true header, which indicates that the current version is a delete marker. Additionally, if you GET the version of the delete marker itself, the Last-Modified: timestamp header shows when the delete marker was created. This is a more reliable way to confirm the presence of a delete marker than relying on rclone's 404 Not Found response.
The AWS CLI list-object-versions command returns structured output that shows Versions[] (real object versions) and DeleteMarkers[] (soft deletes). The current state of an object is the item with IsLatest=true. If the latest item is a DeleteMarker, then the object is soft-deleted.
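The IsLatest check described above can be expressed with the AWS CLI's --query option. This is a minimal sketch, assuming the AWS CLI is already configured for AI Object Storage. The function name list_soft_deleted is hypothetical, and the output columns are an illustrative choice.

```shell
# Hypothetical helper: print keys whose latest entry is a delete marker.
# Usage: list_soft_deleted BUCKET_NAME [PREFIX]
list_soft_deleted() {
  : "${1:?bucket name required}"
  # ${2:+...} adds --prefix only when a second argument was given.
  # The JMESPath filter keeps delete markers where IsLatest is true.
  aws s3api list-object-versions \
    --bucket "$1" \
    ${2:+--prefix} ${2:+"$2"} \
    --query 'DeleteMarkers[?IsLatest].[Key, VersionId, LastModified]' \
    --output text
}
```

Each output line gives the key, the delete marker's version ID, and the time the marker was created. Pass a key as the prefix to narrow the check to one object and any other keys that start with the same string.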
To test a single object with the AWS CLI, you can use either head-object or get-object, specifying the version ID if needed.
- If used without a version ID, soft-deleted objects return 404 Not Found with an x-amz-delete-marker: true header.
- If you specify a version ID that is a delete marker, you get 200 OK with the x-amz-delete-marker: true header.
- If you specify a version ID that is a real version, you get 200 OK and the object content.
Using rclone with the --s3-versions and --s3-version-deleted flags together will show all object versions (as filenames with v+DATE+TIME+SEQUENCE appended) and delete markers (as zero-byte objects).
However, in case of any doubt, you should rely on the AWS CLI as the authoritative source.
Recovery patterns
Rclone provides some convenient commands for recovering soft deleted objects and cleaning up old versions. However, for precise recovery, especially in complex scenarios, direct AWS CLI commands are often necessary.
Once you've detected a soft deleted object or an older version using the --s3-versions flag, you can copy the old version by using the version file name (such as filename-vYYYY-MM-DD-HHMMSS-000.ext) that rclone showed you. Keep in mind this version file name is an rclone-specific construct and not the actual S3 version ID.
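Copying an old version by its rclone version file name might look like the following. This is a sketch only. The function name recover_rclone_version is hypothetical, and it assumes that rclone copyto accepts the version file name as a source when --s3-versions is set.

```shell
# Hypothetical helper: copy one old version out of a versioned bucket.
# Usage: recover_rclone_version YOUR_REMOTE:YOUR_BUCKET VERSION_FILE_NAME DESTINATION
recover_rclone_version() {
  : "${1:?remote:bucket required}" "${2:?version file name required}" "${3:?destination required}"
  # --s3-versions lets rclone resolve the version-suffixed name as a source.
  # The destination is written under the name you choose, without the suffix.
  rclone copyto "$1/$2" "$3" --s3-versions
}
```

For example, recover_rclone_version default:mybucket myreport-v2025-09-24-104201-000.txt ./myreport.txt would copy that version to a local file named myreport.txt. The timestamp in this example is made up.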
For better control over recovery and cleanup, use the AWS CLI commands. They provide more transparency and precision, especially when dealing with compliance or audit requirements.
- Identify the latest non-deleted version with aws s3api list-object-versions.
- Restore it to a new location using aws s3api copy-object, or retrieve it with get-object then re-upload it.
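The two AWS CLI steps above can be sketched as a pair of shell functions. The function names are hypothetical, the output columns are an illustrative choice, and the sketch assumes the AWS CLI is already configured for AI Object Storage.

```shell
# Hypothetical helper: list the real (non-delete-marker) versions under a key.
# Usage: list_key_versions BUCKET_NAME KEY
list_key_versions() {
  : "${1:?bucket name required}" "${2:?key required}"
  # --prefix also matches other keys that start with the same string,
  # so the Key column is printed to tell them apart.
  aws s3api list-object-versions \
    --bucket "$1" \
    --prefix "$2" \
    --query 'Versions[].[VersionId, LastModified, IsLatest, Key]' \
    --output text
}

# Hypothetical helper: copy one specific version to a new key in the same bucket.
# Usage: restore_object_version BUCKET_NAME KEY VERSION_ID DEST_KEY
restore_object_version() {
  : "${1:?bucket name required}" "${2:?key required}" "${3:?version ID required}" "${4:?destination key required}"
  # Keys that contain special characters must be URL-encoded in --copy-source.
  aws s3api copy-object \
    --bucket "$1" \
    --key "$4" \
    --copy-source "$1/$2?versionId=$3"
}
```

Pick the version ID from the first function's output, then pass it to the second. Because the restore names an exact version ID, the action is easy to record for audit purposes.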
Self-service recovery
For many common recovery scenarios and day-to-day operations, rclone offers a convenient self-service interface. It's a good choice when you need to get a file back quickly, or when you want to reset a bucket to only its current versions and clear out accumulated older versions.
However, direct AWS CLI commands may be necessary when you need to:
- Recover a specific version for compliance or audit purposes.
- Control precisely which versions are restored.
- Handle scenarios that involve delete markers.
For these situations, the AWS CLI should be your diagnostic tool of choice. It lets you distinguish between versions, including delete markers, and execute precise, auditable recovery actions.
When to request storage engineering intervention
Removing policy-controlled delete markers may require assistance from CoreWeave's storage engineering team. If you encounter issues removing delete markers, you should contact CoreWeave support.