Use rclone with CoreWeave Object Storage

Rclone is a command-line program that syncs files and directories with various Object Storage providers, including CoreWeave.

This guide describes how to configure rclone for CoreWeave Object Storage and use basic commands. It also explains common command-line options, and introduces advanced features like mounting volumes and the Web-based GUI. Finally, it outlines a realistic scenario of migrating data from AWS to CoreWeave.

Install rclone

Rclone is a single Go binary that runs on Linux, macOS, and Windows. You can install it with a scripted download, with a package manager such as Homebrew or Chocolatey, or by downloading the binary manually.

To install rclone, follow the installation instructions on the rclone website for your preferred method.

To verify that rclone is installed correctly, open your command-line interface and run:

$ rclone version

You should see the version information for rclone, such as:

rclone v1.63.1
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 6.2.0-26-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.20.6
- go/linking: dynamic
- go/tags: none
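
In a setup script, the same check can be automated. The following is a minimal sketch for a POSIX shell; the function name require_rclone is illustrative and not part of rclone.

```shell
#!/bin/sh
# Report whether rclone is on PATH and, if so, print its version line.
# require_rclone is an illustrative helper name, not an rclone command.
require_rclone() {
    if command -v rclone >/dev/null 2>&1; then
        # The first line of `rclone version` looks like "rclone v1.63.1".
        rclone version | head -n 1
    else
        echo "rclone not found on PATH; see the installation instructions" >&2
        return 1
    fi
}
```

Call require_rclone near the top of a script and stop if it returns a non-zero status.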

Configuration

To configure rclone for CoreWeave Object Storage, you need a CoreWeave Object Storage token with the Access Key and Secret Key. You also need to choose a desired Endpoint URL.

Region              Endpoint
New York - LGA1     https://object.lga1.coreweave.com/
Chicago - ORD1      https://object.ord1.coreweave.com/
Las Vegas - LAS1    https://object.las1.coreweave.com/

Using the configuration wizard

To begin, start the configuration wizard at the command line.

$ rclone config

Tip: Most of the rclone options below are selected by entering an option number. However, these option numbers vary depending on the rclone version. Find the option number that corresponds to the text shown in your version.

  1. Enter n ("New remote") to create a new remote connection.
  2. Enter a descriptive name for your remote (e.g., mycoreweave). This is used to identify the remote in future rclone commands.
  3. Storage: Choose "Amazon S3 Compliant Storage Providers including..."
  4. provider: Choose "Ceph Object Storage"
  5. env_auth: Choose "Enter AWS credentials in the next step."
  6. Enter the access_key_id.
  7. Enter the secret_access_key.
  8. region: Choose "Will use v4 signatures and an empty region."
  9. Enter the desired Object Storage Endpoint URL from the table above.
  10. location_constraint: Press enter to leave empty.
  11. acl: Choose the ACL option that best suits your needs. If unsure, choose the default option, "Owner gets FULL_CONTROL. No one else has access rights."
  12. server_side_encryption and sse_kms_key_id: For both of these, choose "None" unless you have specific encryption requirements.
  13. Choose no when asked to edit advanced config.
  14. Select yes to confirm the configuration.

Manual configuration

The rclone configuration file is stored in:

  • Linux & macOS: ~/.config/rclone/rclone.conf
  • Windows: C:\Users\<username>\.config\rclone\rclone.conf (recent rclone versions create new configurations in %APPDATA%\rclone\rclone.conf instead)

To manually configure rclone, edit rclone.conf in the location for your platform. To confirm which file rclone is actually using, run rclone config file. Here is a sample CoreWeave configuration:

rclone.conf
[mycoreweave]
type = s3
provider = Ceph
access_key_id = REDACTED
secret_access_key = REDACTED
endpoint = https://object.lga1.coreweave.com/
acl = private

CoreWeave Object Storage is fully compatible with rclone's Ceph configuration. Refer to the rclone Ceph configuration documentation for advanced options.
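
For scripted setups, a file like the sample above can be generated rather than edited by hand. This is a minimal sketch, not a recommended layout: the function name write_coreweave_conf and the variables CW_ACCESS_KEY and CW_SECRET_KEY are assumptions made for illustration. It writes to a path you choose so that the result can be inspected before it replaces a real configuration.

```shell
#!/bin/sh
# Write a [mycoreweave] remote definition to the file given as $1.
# CW_ACCESS_KEY and CW_SECRET_KEY are hypothetical variable names; set them
# in the environment rather than hard-coding credentials in a script.
write_coreweave_conf() {
    conf_path="$1"
    endpoint="${2:-https://object.lga1.coreweave.com/}"
    if [ -z "${CW_ACCESS_KEY:-}" ] || [ -z "${CW_SECRET_KEY:-}" ]; then
        echo "CW_ACCESS_KEY and CW_SECRET_KEY must be set" >&2
        return 1
    fi
    # umask 077 makes a newly created file readable by its owner only;
    # it does not change the permissions of a file that already exists.
    ( umask 077 && cat > "$conf_path" <<EOF
[mycoreweave]
type = s3
provider = Ceph
access_key_id = ${CW_ACCESS_KEY}
secret_access_key = ${CW_SECRET_KEY}
endpoint = ${endpoint}
acl = private
EOF
    )
}
```

To try the generated file without touching the default configuration, pass it with rclone's --config flag, for example rclone --config ./test.conf lsd mycoreweave:.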

Multiple configurations

Rclone can store multiple configurations to manage multiple endpoints. To create a second configuration, run the configuration wizard again.

$ rclone config

Choose n) New remote when the configuration wizard starts. Name it differently (e.g., mycoreweave2), and follow the same steps as before to add your new storage credentials.

The complete rclone configuration file with multiple CoreWeave endpoints looks similar to this:

[mycoreweave]
type = s3
provider = Ceph
access_key_id = REDACTED
secret_access_key = REDACTED
endpoint = https://object.lga1.coreweave.com/

[mycoreweave2]
type = s3
provider = Ceph
access_key_id = REDACTED
secret_access_key = REDACTED
endpoint = https://object.ord1.coreweave.com/

When using rclone, specify the desired endpoint by the name of its remote.

# List buckets for mycoreweave
$ rclone lsd mycoreweave:

# List buckets for mycoreweave2
$ rclone lsd mycoreweave2:
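
When several remotes are configured, a small loop avoids repeating the command. This is a sketch; list_buckets_for is an illustrative helper name, and the remote names are the ones used in this guide.

```shell
#!/bin/sh
# List the buckets of every remote passed as an argument.
# list_buckets_for is an illustrative helper name.
list_buckets_for() {
    for remote in "$@"; do
        echo "== ${remote} =="
        # The trailing colon tells rclone this is a remote, not a local path.
        rclone lsd "${remote}:"
    done
}
```

For example, list_buckets_for mycoreweave mycoreweave2 prints a header and the bucket list for each remote. The rclone listremotes command prints the names of all configured remotes.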

Encrypted configuration

When running rclone config, it's possible to encrypt the configuration file by choosing Set configuration password. This protects sensitive information like the secret key.

If the rclone configuration file is encrypted, the password must be provided to rclone through the RCLONE_CONFIG_PASS environment variable or on the command line.

To set the environment variable:

$ export RCLONE_CONFIG_PASS=mysecretpassword
$ rclone ls mycoreweave:bucket-name

If the environment variable isn't provided, rclone will prompt for the password at the command line.

$ rclone ls mycoreweave:bucket-name
Enter configuration password:
password:

Avoid hard-coding the configuration password in scripts or in shell profiles that export it for every session. Make sure that only authorized users have access to the password. Don't commit scripts containing sensitive passwords to public repositories.
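
One way to keep the password out of scripts is to read it from a file that only the owner can read. The sketch below assumes such a file exists; the helper name load_rclone_pass and the file location are illustrative.

```shell
#!/bin/sh
# Load the rclone configuration password from a file instead of a script.
# load_rclone_pass and the password-file path are illustrative assumptions.
load_rclone_pass() {
    pass_file="$1"
    if [ ! -r "$pass_file" ]; then
        echo "cannot read password file: $pass_file" >&2
        return 1
    fi
    # Read only the first line; `read -r` keeps backslashes literal.
    # The fallback test accepts a file with no trailing newline.
    IFS= read -r RCLONE_CONFIG_PASS < "$pass_file" \
        || [ -n "$RCLONE_CONFIG_PASS" ] || return 1
    export RCLONE_CONFIG_PASS
}
```

Keep the password file outside any repository and restrict it with chmod 600. Rclone also has a --password-command flag, which runs a command of your choice to obtain the password.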

Basic commands

Rclone has a rich command syntax, and these descriptions are not intended to be comprehensive. For complete documentation, refer to the rclone website or generate the documentation locally:

$ rclone gendocs [output_directory]

List available buckets

$ rclone lsd mycoreweave:

List files in a bucket

$ rclone ls mycoreweave:bucket-name

Sync files

Copies files from the source to the destination, and removes any extra files from the destination that are not present in the source.

After sync, the destination will mirror the source exactly.

Sync from a remote bucket to local:

$ rclone sync mycoreweave:bucket-name /local/directory

Sync from local to a remote bucket:

$ rclone sync /local/directory mycoreweave:bucket-name

Copy files

Copies files from the source to the destination, but does not remove any files from the destination. It's a safer option if you want to retain files in the destination.

Copy from a remote bucket to local:

$ rclone copy mycoreweave:bucket-name /local/directory

Copy from local to a remote bucket:

$ rclone copy /local/directory mycoreweave:bucket-name

Delete files

The delete command removes the files in the given path. Unlike purge, it obeys filters such as --include, --exclude, and --min-size, so it can be used to delete files selectively. Note that rclone delete only affects files; it does not remove any remote directories. To delete a directory and all of its contents, use the purge command.

For example, to preview the deletion of all files bigger than 100 MiB (remove --dry-run to perform the deletion):

$ rclone --dry-run --min-size 100M delete mycoreweave:bucket-name

Other options

Safe operation with --interactive and --dry-run

The --interactive and --dry-run flags can provide a safety net for potentially destructive operations. These flags are particularly useful in a production environment where accidental data loss could be catastrophic.

The --interactive or -i flag asks for confirmation before performing any deletions. This adds an extra layer of safety, especially when using destructive commands like sync or delete.

$ rclone sync /local/directory mycoreweave:bucket-name --interactive

The --dry-run flag simulates the operation without actually making any changes. This allows you to preview what will happen without committing to it.

$ rclone sync /local/directory mycoreweave:bucket-name --dry-run
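
The two flags can be combined into a preview-then-confirm workflow. The following is a minimal sketch; confirm_sync is an illustrative helper name, not an rclone command.

```shell
#!/bin/sh
# Preview a sync with --dry-run, then run it for real only after confirmation.
# confirm_sync is an illustrative helper name, not an rclone command.
confirm_sync() {
    src="$1"
    dst="$2"
    rclone sync "$src" "$dst" --dry-run || return 1
    printf 'Apply these changes to %s? [y/N] ' "$dst"
    # Treat end-of-input the same as answering "no".
    read -r answer || answer=""
    case "$answer" in
        y|Y) rclone sync "$src" "$dst" ;;
        *)   echo "Aborted; nothing was changed." ;;
    esac
}
```

For example, confirm_sync /local/directory mycoreweave:bucket-name shows the planned changes and applies them only if you answer y.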

Filters

The --include and --exclude flags filter the files in a sync or copy operation.

$ rclone sync /local/directory mycoreweave:bucket-name --include "*.txt"
$ rclone copy /local/directory mycoreweave:bucket-name --exclude "*.zip"

External filter file

An external file can manage what should be excluded during transfer. This is particularly helpful for complex filter rules or when sharing the rules between different commands.

Create a text file (e.g., exclude-file.txt) and list the patterns for files you want to exclude, one per line.

*.tmp
*.bak
logs/

Then, use the --exclude-from flag and specify the path to the exclude file.

$ rclone copy /local/directory mycoreweave:bucket-name --exclude-from exclude-file.txt
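
The exclude file can also be created from a script, which keeps the patterns next to the command that uses them. This is a sketch; write_exclude_file and copy_with_excludes are illustrative helper names, and --dry-run is included so the first run only previews the result.

```shell
#!/bin/sh
# Write the exclude patterns from this guide to the file given as $1.
# write_exclude_file is an illustrative helper name.
write_exclude_file() {
    # The quoted 'EOF' stops the shell from expanding the patterns.
    cat > "$1" <<'EOF'
*.tmp
*.bak
logs/
EOF
}

# Preview a copy that applies the exclude file ($3) to source $1 and
# destination $2. Remove --dry-run to perform the copy.
copy_with_excludes() {
    rclone copy "$1" "$2" --exclude-from "$3" --dry-run
}
```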

Rclone also allows filtering with --max-size and --min-size. Quote glob patterns so that the shell does not expand them before rclone sees them.

$ rclone copy /local/directory mycoreweave:bucket-name --max-size 10M --exclude "*.tmp"

Rclone filters are powerful and complex. Please see the documentation for more information.

Logging and progress

Use the --log-file and --progress flags to keep track of actions taken during the data transfer process.

$ rclone copy /local/directory mycoreweave:bucket-name --log-file=rclone.log --progress

Bandwidth limits

Limit the upload or download bandwidth with the --bwlimit flag. Rates are measured in bytes per second, not bits per second. A bare number is read as KiB/s, and suffixes such as K (KiB/s) and M (MiB/s) can be used.

$ rclone copy mycoreweave:bucket-name /local/directory --bwlimit 1M
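
Network links are usually rated in bits per second, so a link speed has to be divided by 8 before it is passed to --bwlimit. The helper below is a sketch of that arithmetic; mbit_to_bwlimit is an illustrative name, and the result is approximate because it ignores the small difference between MB and MiB.

```shell
#!/bin/sh
# Convert a rate in Mbit/s into a --bwlimit value with the "M" suffix.
# mbit_to_bwlimit is an illustrative helper name. It divides by 8
# (bits per byte) and ignores the difference between MB and MiB.
mbit_to_bwlimit() {
    # LC_ALL=C keeps the decimal separator a dot in every locale.
    LC_ALL=C awk -v mbit="$1" 'BEGIN { printf "%gM\n", mbit / 8 }'
}
```

For example, to use half of a 10 Mbit/s connection, mbit_to_bwlimit 5 prints 0.625M, which can be passed as --bwlimit 0.625M.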

Advanced features

Check

The rclone check command is an essential tool for ensuring data integrity when you're transferring files between your local machine and Object Storage, or between two remote providers. This command compares the contents of the source and the destination, verifying that they are identical.

When you run rclone check, the following occurs:

  1. File Comparison: rclone compares the list of files in the source and destination.
  2. Size and Hash: For each matched pair of files, rclone compares the file size and hash (usually MD5 or SHA-1, depending on the storage provider).
  3. Reporting: The command produces a summary report that lists mismatches, missing files, or errors encountered during the check.

To compare a local directory with a bucket in CoreWeave Object Storage:

$ rclone check /local/directory mycoreweave:bucket-name

The --one-way flag can be used to only report files that are present in the source but missing or different in the destination.

The output will display results in three categories:

  1. Matched: The files that are identical in both source and destination.
  2. Differences: The files that exist in both but are not identical.
  3. Missing: Files that are in the source but not in the destination (or vice versa if not using --one-way).

Here's a simplified output example:

2019/07/25 14:21:06 NOTICE: 23 files are matched
2019/07/25 14:21:06 NOTICE: 2 differences found
2019/07/25 14:21:06 NOTICE: 1 files missing

Running rclone check after a transfer is a best practice for confirming that the operation was carried out correctly.
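
For a per-file record, rclone check accepts a --combined flag that writes one line per file to a report, each prefixed with a symbol. The sketch below counts those symbols; summarize_check_report is an illustrative helper name, and the report is assumed to come from a command such as rclone check /local/directory mycoreweave:bucket-name --combined report.txt.

```shell
#!/bin/sh
# Summarize a report written by `rclone check SRC DST --combined FILE`.
# Each report line starts with a symbol: "=" identical, "*" different,
# "+" only in the source, "-" only in the destination, "!" error.
# summarize_check_report is an illustrative helper name.
summarize_check_report() {
    awk '
        { count[substr($0, 1, 1)]++ }
        END {
            printf "matched=%d differ=%d missing_on_dst=%d missing_on_src=%d errors=%d\n",
                count["="], count["*"], count["+"], count["-"], count["!"]
        }
    ' "$1"
}
```

A migration script could call summarize_check_report report.txt and stop if any count other than matched is non-zero.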

Mount

It's possible to mount Object Storage as a filesystem with rclone mount, making it available to browse like a local drive.

For Linux and macOS:

$ mkdir ~/my-mount
$ rclone mount mycoreweave:bucket-name ~/my-mount &

For Windows, install WinFsp from the official site, then to mount as drive X:, run:

C:\> rclone mount mycoreweave:bucket-name X: --vfs-cache-mode writes

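When rclone mount runs in the foreground, as in the Windows example, the terminal must stay open for the mount to remain active, and pressing Ctrl+C in that terminal unmounts it. If the mount was started in the background with &, as in the Linux and macOS example, unmount it with fusermount -u ~/my-mount on Linux or umount ~/my-mount on macOS.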
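
A mount that was started in the background cannot be stopped with Ctrl+C, so it needs an explicit unmount command that differs by platform. The following is a minimal sketch of that choice; unmount_bucket is an illustrative helper name.

```shell
#!/bin/sh
# Unmount a directory that was mounted with `rclone mount ... &`.
# unmount_bucket is an illustrative helper name.
unmount_bucket() {
    mount_point="$1"
    case "$(uname -s)" in
        Linux)  fusermount -u "$mount_point" ;;
        Darwin) umount "$mount_point" ;;
        *)      echo "unsupported platform: $(uname -s)" >&2
                return 1 ;;
    esac
}
```

For example, unmount_bucket ~/my-mount releases the mount created in the Linux and macOS example.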

Text-based browser

Rclone provides an interactive, text-based browser for rclone remotes, similar to the ncdu tool.

$ rclone ncdu mycoreweave:

Use the arrow keys to navigate through the directories and files. More options are displayed at the bottom of the screen, allowing various operations like sorting and deleting.

Web GUI

Rclone offers an experimental web-based GUI that provides an intuitive way to perform tasks. To start the Web GUI:

$ rclone rcd --rc-web-gui

Then navigate to the provided URL in your web browser.

[Figure: The rclone Web GUI]

From the Web GUI, it's possible to mount volumes, explore remotes, edit configurations, and perform other operations.

Migrate data between clouds

Migrating data between different cloud storage providers is a common use-case for rclone.

Here's a realistic scenario: A company uses Amazon S3 but decides to move to CoreWeave for cost efficiency. The challenge is to migrate terabytes of data without any loss or downtime. Here are the migration steps.

Configure rclone

If not already configured, set up rclone to interact with both Amazon S3 and CoreWeave Object Storage (e.g., create mys3 and mycoreweave configs).

$ rclone config

Test connectivity

Test that it's possible to list the buckets from both providers.

$ rclone lsd mys3:
$ rclone lsd mycoreweave:

Perform a dry run

Before the actual migration, perform a dry run to see what would be transferred.

$ rclone sync mys3:old-bucket mycoreweave:new-bucket --dry-run

Execute the migration

Once confident, execute the migration.

$ rclone sync mys3:old-bucket mycoreweave:new-bucket

Optimize performance

Rclone has many performance options. The ones shown here are only a few; see the documentation for a complete list.

For example, to use 32 simultaneous transfers and cap bandwidth at 50 MiB/s:

$ rclone sync mys3:old-bucket \
    mycoreweave:new-bucket \
    --transfers=32 --bwlimit=50M

Verify the data

Ensure all data has been transferred and is consistent.

$ rclone check mys3:old-bucket mycoreweave:new-bucket
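
The migration steps can be chained so that each one runs only if the previous one succeeded. This is a sketch under the assumptions of this scenario: migrate_bucket and the default log file name are illustrative, and the performance values are the ones from the example above. Remember that sync removes destination files that are not present in the source, so run the dry run first.

```shell
#!/bin/sh
# Run the migration steps in order and stop at the first failure.
# migrate_bucket and the log file name are illustrative assumptions.
migrate_bucket() {
    src="$1"
    dst="$2"
    log="${3:-migration.log}"

    # 1. Confirm that both remotes respond before moving any data.
    #    ${src%%:*} keeps the remote name in front of the first colon.
    rclone lsd "${src%%:*}:" >/dev/null || return 1
    rclone lsd "${dst%%:*}:" >/dev/null || return 1

    # 2. Transfer the data and keep a log of what was done.
    rclone sync "$src" "$dst" --transfers=32 --bwlimit=50M --log-file="$log" || return 1

    # 3. Verify that source and destination match.
    rclone check "$src" "$dst"
}
```

For example, migrate_bucket mys3:old-bucket mycoreweave:new-bucket returns a non-zero status if any step fails, including a failed verification.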

Summary

Migrating data between cloud providers can be challenging, but with proper planning, testing, and validation, rclone can make the process efficient and secure.

Best practices

  1. Security: Always keep your access and secret keys secure.
  2. Logging: Use the --log-file flag for logging.
  3. Error Handling: Use the --retries flag to set the number of retries for failed operations.
  4. Automation: Automate repetitive tasks using cron jobs (Linux) or Task Scheduler (Windows).
  5. Troubleshooting: If you run into issues, use the --verbose flag for detailed output.
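
Several of these practices can be combined in one wrapper intended for cron or Task Scheduler. The following is a minimal sketch; scheduled_copy, the lock directory name, and the default log file name are illustrative assumptions.

```shell
#!/bin/sh
# Wrapper intended for cron or another scheduler.
# scheduled_copy, the lock directory, and the log path are illustrative.
scheduled_copy() {
    src="$1"
    dst="$2"
    log="${3:-rclone-scheduled.log}"
    lock_dir="${TMPDIR:-/tmp}/rclone-scheduled-copy.lock"

    # mkdir is atomic, so it doubles as a simple lock against overlapping runs.
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "previous run still in progress; skipping" >&2
        return 0
    fi

    # Retry failed operations, log to a file, and log in detail.
    status=0
    rclone copy "$src" "$dst" --retries 5 --log-file "$log" --verbose || status=$?

    rmdir "$lock_dir"
    return "$status"
}
```

A script that calls scheduled_copy /local/directory mycoreweave:bucket-name can then be scheduled, for example with a crontab entry that runs it nightly.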

More information

For more information about rclone and its various features, please refer to the following resources: