File operations - CoreWeave Docs

This guide covers reading and writing files in sandboxes.

Basic operations

File operations return OperationRef objects. Call .result() on one to block until the operation completes and to get its return value.

Writing files

from cwsandbox import Sandbox

with Sandbox.run() as sandbox:
    # Write bytes to a file
    sandbox.write_file("/app/data.txt", b"Hello, World!").result()

    # Write JSON
    import json
    config = {"key": "value", "count": 42}
    sandbox.write_file(
        "/app/config.json",
        json.dumps(config).encode()
    ).result()

Reading files

# Read file contents as bytes
content = sandbox.read_file("/app/data.txt").result()
print(content.decode())  # "Hello, World!"

# Read JSON
config_bytes = sandbox.read_file("/app/config.json").result()
config = json.loads(config_bytes.decode())
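
The JSON round trip above can be wrapped in two small helpers. This is a minimal sketch, not part of the cwsandbox API: `write_json` and `read_json` are hypothetical names, and the code assumes only what this guide shows, namely that `write_file` and `read_file` return objects whose `.result()` blocks until completion.

```python
import json
from typing import Any


def write_json(sandbox: Any, path: str, obj: Any) -> None:
    """Serialize obj to UTF-8 JSON and write it to path in the sandbox."""
    payload = json.dumps(obj).encode("utf-8")
    # .result() blocks until the write has completed.
    sandbox.write_file(path, payload).result()


def read_json(sandbox: Any, path: str) -> Any:
    """Read path from the sandbox and parse it as UTF-8 JSON."""
    raw = sandbox.read_file(path).result()
    return json.loads(raw.decode("utf-8"))
```

With these in place, `write_json(sandbox, "/app/config.json", config)` followed by `read_json(sandbox, "/app/config.json")` should return an equal object.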

Parallel operations

File operations return an OperationRef immediately, without waiting for the transfer to finish, so several operations can be in flight at once.

Parallel uploads

from cwsandbox import results

# Start all uploads simultaneously
write_refs = [
    sandbox.write_file("/app/config.json", config_bytes),
    sandbox.write_file("/app/data.csv", data_bytes),
    sandbox.write_file("/app/model.pkl", model_bytes),
]

# Wait for all to complete
results(write_refs)

Parallel downloads

# Start all downloads simultaneously
read_refs = [
    sandbox.read_file("/app/output.json"),
    sandbox.read_file("/app/metrics.json"),
    sandbox.read_file("/app/logs.txt"),
]

# Get all results
output, metrics, logs = results(read_refs)
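
When the list of paths is only known at runtime, it can help to key the downloaded contents by path. The helper below is a sketch under stated assumptions: `download_all` is a hypothetical name, and it calls `.result()` on each reference instead of `results()` so that it relies only on the behavior described in this guide.

```python
from typing import Any, Dict, Iterable


def download_all(sandbox: Any, paths: Iterable[str]) -> Dict[str, bytes]:
    """Start a read for every path, then collect the contents by path."""
    # Start every read before waiting on any of them, so the transfers overlap.
    refs = {path: sandbox.read_file(path) for path in paths}
    # Calling .result() on each ref blocks until that read has finished.
    return {path: ref.result() for path, ref in refs.items()}
```

Starting all reads in the first step and waiting in the second is what keeps the downloads parallel. Calling `.result()` inside the first loop would serialize them.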

Upload-process-download pattern

A common workflow: upload input files, run processing, download results.

from cwsandbox import Sandbox, results

with Sandbox.run() as sandbox:
    # 1. Parallel uploads
    results([
        sandbox.write_file("/app/config.json", config_bytes),
        sandbox.write_file("/app/input.csv", input_bytes),
    ])

    # 2. Sequential processing
    # (assumes requirements.txt and /app/process.py already exist in the sandbox)
    sandbox.exec(["pip", "install", "-r", "requirements.txt"]).result()
    sandbox.exec(["python", "/app/process.py"]).result()

    # 3. Parallel downloads
    output, metrics = results([
        sandbox.read_file("/app/output.json"),
        sandbox.read_file("/app/metrics.json"),
    ])

Error handling

File not found

from cwsandbox import SandboxFileError

try:
    content = sandbox.read_file("/nonexistent/file.txt").result()
except SandboxFileError as e:
    print(f"File error: {e.filepath}")

Write errors

try:
    sandbox.write_file("/readonly/path.txt", b"data").result()
except SandboxFileError as e:
    print(f"Cannot write to: {e.filepath}")
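
For files that are optional, the try/except pattern above can be folded into a helper that returns a fallback value. This is a sketch, not library code: `read_or_default` is a hypothetical name, and the exception class is passed in as a parameter so the example stays self-contained. In real use the caller would presumably pass `SandboxFileError`. Note that this guide shows the same exception for write failures, so the helper may also hide errors other than a missing file.

```python
from typing import Any, Optional, Type


def read_or_default(
    sandbox: Any,
    path: str,
    error_type: Type[BaseException],
    default: Optional[bytes] = None,
) -> Optional[bytes]:
    """Return the contents of path, or default if the read raises error_type."""
    try:
        return sandbox.read_file(path).result()
    except error_type:
        # Only the given error type is swallowed; anything else propagates.
        return default
```

A call might look like `read_or_default(sandbox, "/app/optional.json", SandboxFileError, default=b"{}")`.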

Binary files

File operations work with any binary content:

# Images
with open("image.png", "rb") as f:
    sandbox.write_file("/app/image.png", f.read()).result()

# Pickle files (only unpickle data from trusted sources)
import pickle
model_bytes = pickle.dumps(my_model)
sandbox.write_file("/app/model.pkl", model_bytes).result()

# Download and unpickle
model_bytes = sandbox.read_file("/app/trained_model.pkl").result()
trained_model = pickle.loads(model_bytes)

Text encoding

Files are transferred as bytes. Handle encoding explicitly:

# Write text
text = "Hello, Unicode! ✓"
sandbox.write_file("/app/text.txt", text.encode("utf-8")).result()

# Read text
content = sandbox.read_file("/app/text.txt").result()
text = content.decode("utf-8")
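
To illustrate why the encoding should be explicit, the following standalone snippet uses only the Python standard library and makes no sandbox calls. The sample string is arbitrary. It shows that byte length and character count can differ, and what happens when the bytes are decoded with the wrong codec or are cut off mid-character.

```python
text = "café"
data = text.encode("utf-8")

# One character can take several bytes: "é" is two bytes in UTF-8.
print(len(text), len(data))

# Decoding with the codec used for encoding restores the original text.
assert data.decode("utf-8") == text

# Decoding with a different codec silently produces wrong characters.
mojibake = data.decode("latin-1")
print(mojibake)

# Invalid byte sequences raise UnicodeDecodeError by default;
# errors="replace" substitutes U+FFFD instead.
truncated = data[:-1]  # cuts the two-byte "é" in half
repaired = truncated.decode("utf-8", errors="replace")
print(repaired)
```

Using the same codec on both sides of the transfer avoids the second case. The `errors` argument is only a fallback for data that is already damaged.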

Large file considerations

For very large files:

Files are transferred through the API, so account for bandwidth and transfer time.
Use streaming for large datasets where possible.
Consider mounting S3 or other object storage for large data.

# For large data, use an S3 mount instead
with Sandbox.run(
    s3_mount={
        "bucket": "my-data-bucket",
        "mount_path": "/data",
        "read_only": False,
    }
) as sandbox:
    # Files in /data are backed by S3
    sandbox.exec(["ls", "/data"]).result()
