This guide shows you how to read, write, and manage files inside a CoreWeave sandbox instance using the CWSandbox client. It’s intended for developers who build workflows that move data into and out of sandboxes, including parallel uploads and downloads, error handling, and patterns for large files.
Basic operations
The examples in this section cover the most common file operations: writing data to a path in the sandbox and reading data back. Each file operation returns an OperationRef object. Call .result() on it to block until the operation completes and to get its return value.
Write files
import json

from cwsandbox import Sandbox

with Sandbox.run() as sandbox:
    # Write bytes to a file
    sandbox.write_file("/app/data.txt", b"Hello, World!").result()

    # Write JSON
    config = {"key": "value", "count": 42}
    sandbox.write_file(
        "/app/config.json",
        json.dumps(config).encode()
    ).result()
Read files
# Read file contents as bytes
content = sandbox.read_file("/app/data.txt").result()
print(content.decode()) # "Hello, World!"
# Read JSON
config_bytes = sandbox.read_file("/app/config.json").result()
config = json.loads(config_bytes.decode())
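The JSON round trip above can be wrapped in two small helpers. This is a minimal sketch: `write_json` and `read_json` are hypothetical names, not part of the cwsandbox API. They assume only what this guide shows, namely that `write_file` and `read_file` return objects with a blocking `.result()` method.

```python
import json
from typing import Any


def write_json(sandbox: Any, path: str, obj: Any) -> None:
    """Serialize obj to UTF-8 JSON and write it to path in the sandbox."""
    payload = json.dumps(obj).encode("utf-8")
    # .result() blocks until the write has completed.
    sandbox.write_file(path, payload).result()


def read_json(sandbox: Any, path: str) -> Any:
    """Read path from the sandbox and parse it as UTF-8 JSON."""
    raw = sandbox.read_file(path).result()
    return json.loads(raw.decode("utf-8"))
```

With these in place, `write_json(sandbox, "/app/config.json", config)` followed by `read_json(sandbox, "/app/config.json")` should return an equal dictionary.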
Parallel operations
File operations return immediately, enabling natural parallelism. Running uploads or downloads in parallel can reduce total transfer time when you have multiple independent files.
Parallel uploads
from cwsandbox import results
# Start all uploads simultaneously
write_refs = [
    sandbox.write_file("/app/config.json", config_bytes),
    sandbox.write_file("/app/data.csv", data_bytes),
    sandbox.write_file("/app/model.pkl", model_bytes),
]
# Wait for all to complete
results(write_refs)
Parallel downloads
# Start all downloads simultaneously
read_refs = [
    sandbox.read_file("/app/output.json"),
    sandbox.read_file("/app/metrics.json"),
    sandbox.read_file("/app/logs.txt"),
]
# Get all results
output, metrics, logs = results(read_refs)
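When the list of paths is built at runtime, it can be convenient to get the contents back keyed by path instead of unpacking them positionally. The sketch below uses a hypothetical helper, `read_many`. It calls `.result()` on each reference in turn rather than `results()`, only to keep the example free of imports; it assumes that a started operation keeps running in the background, so waiting in order does not serialize the transfers.

```python
from typing import Any, Dict, Iterable


def read_many(sandbox: Any, paths: Iterable[str]) -> Dict[str, bytes]:
    """Start one read per path, then collect the contents keyed by path."""
    # Start every download first so the transfers can overlap.
    refs = {path: sandbox.read_file(path) for path in paths}
    # Only then block on each one; the others keep running in the meantime.
    return {path: ref.result() for path, ref in refs.items()}
```

The key design point is the two separate passes: calling `.result()` inside the first loop would wait for each file before starting the next one.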
Upload-process-download pattern
A common workflow is to upload input files, run a processing step, and download the results.
from cwsandbox import Sandbox, results
with Sandbox.run() as sandbox:
    # 1. Parallel uploads
    results([
        sandbox.write_file("/app/config.json", config_bytes),
        sandbox.write_file("/app/input.csv", input_bytes),
    ])

    # 2. Sequential processing
    sandbox.exec(["pip", "install", "-r", "requirements.txt"]).result()
    sandbox.exec(["python", "/app/process.py"]).result()

    # 3. Parallel downloads
    output, metrics = results([
        sandbox.read_file("/app/output.json"),
        sandbox.read_file("/app/metrics.json"),
    ])
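If you repeat this pattern with different inputs, the three phases can be folded into one function. The following is a sketch under stated assumptions, not a cwsandbox API: `run_job` and its parameters are hypothetical, and it relies only on `write_file`, `exec`, and `read_file` returning objects with `.result()`, as in the example above.

```python
from typing import Any, Dict, Mapping, Sequence


def run_job(
    sandbox: Any,
    inputs: Mapping[str, bytes],
    command: Sequence[str],
    outputs: Sequence[str],
) -> Dict[str, bytes]:
    """Upload inputs, run one command, and download the named outputs."""
    # 1. Start every upload, then wait for all of them.
    write_refs = [sandbox.write_file(path, data) for path, data in inputs.items()]
    for ref in write_refs:
        ref.result()

    # 2. Run the command only after the inputs are in place.
    sandbox.exec(list(command)).result()

    # 3. Start every download, then collect the contents by path.
    read_refs = {path: sandbox.read_file(path) for path in outputs}
    return {path: ref.result() for path, ref in read_refs.items()}
```

A call might look like `run_job(sandbox, {"/app/input.csv": input_bytes}, ["python", "/app/process.py"], ["/app/output.json"])`.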
Error handling
File operations can fail if a path doesn’t exist or isn’t writable. Catch SandboxFileError to handle these cases and inspect the affected path.
File not found
from cwsandbox import SandboxFileError
try:
    content = sandbox.read_file("/nonexistent/file.txt").result()
except SandboxFileError as e:
    print(f"File error: {e.filepath}")
Write errors
try:
    sandbox.write_file("/readonly/path.txt", b"data").result()
except SandboxFileError as e:
    print(f"Cannot write to: {e.filepath}")
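For optional files, a fallback value is often more useful than an exception. The sketch below shows one way to do that with a hypothetical helper, `read_file_or_default`. To stay runnable without the library installed, it takes the exception class as a parameter; in real use you would pass `error_type=SandboxFileError`.

```python
from typing import Any, Optional, Type


def read_file_or_default(
    sandbox: Any,
    path: str,
    default: Optional[bytes] = None,
    error_type: Type[BaseException] = Exception,
) -> Optional[bytes]:
    """Return the file contents, or default if the read raises error_type."""
    try:
        return sandbox.read_file(path).result()
    except error_type:
        # Pass error_type=SandboxFileError in real use so that unrelated
        # failures still propagate to the caller.
        return default
```

Keeping `error_type` narrow matters: the broad `Exception` default would also hide failures that have nothing to do with the file itself.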
Binary files
File operations work with any binary content, including images and serialized Python objects:
# Images
with open("image.png", "rb") as f:
    sandbox.write_file("/app/image.png", f.read()).result()
# Pickle files (only unpickle data from trusted sources)
import pickle
model_bytes = pickle.dumps(my_model)
sandbox.write_file("/app/model.pkl", model_bytes).result()
# Download and unpickle
model_bytes = sandbox.read_file("/app/trained_model.pkl").result()
trained_model = pickle.loads(model_bytes)
Only unpickle data from sources you trust. Loading a pickle file can execute arbitrary code.
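The same approach extends to a whole local directory of binary files. The helper below is a hypothetical sketch, not part of cwsandbox. It reads each file fully into memory, so it only suits small files, and it assumes that the target directories already exist in the sandbox or that `write_file` creates them, which this guide does not state.

```python
import posixpath
from pathlib import Path
from typing import Any, List


def upload_directory(sandbox: Any, local_dir: str, remote_dir: str) -> List[str]:
    """Upload every file under local_dir and return the remote paths written."""
    root = Path(local_dir)
    refs = []
    remote_paths = []
    for local_path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Build the sandbox path with forward slashes, whatever the local OS.
        relative = local_path.relative_to(root).as_posix()
        remote_path = posixpath.join(remote_dir, relative)
        # read_bytes() returns the content unchanged, so binary data survives.
        refs.append(sandbox.write_file(remote_path, local_path.read_bytes()))
        remote_paths.append(remote_path)
    # All uploads are started above; wait for them here.
    for ref in refs:
        ref.result()
    return remote_paths
```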
Text encoding
Files are transferred as bytes, so you must encode and decode text explicitly to avoid corruption with non-ASCII characters.
# Write text
text = "Hello, Unicode! café"
sandbox.write_file("/app/text.txt", text.encode("utf-8")).result()
# Read text
content = sandbox.read_file("/app/text.txt").result()
text = content.decode("utf-8")
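To see why the encoding has to match on both sides, the following standalone snippet uses only the Python standard library and no sandbox. It shows a correct UTF-8 round trip, the silent corruption caused by decoding with a different encoding, and the explicit error raised by a codec that cannot represent the character.

```python
text = "café"  # contains one non-ASCII character

# UTF-8 can represent every Unicode character; "é" becomes two bytes.
encoded = text.encode("utf-8")
assert encoded == b"caf\xc3\xa9"

# Decoding with the same encoding restores the original string.
assert encoded.decode("utf-8") == text

# Decoding with a different encoding silently corrupts the text.
mojibake = encoded.decode("latin-1")
assert mojibake != text

# A codec that cannot represent the character fails loudly instead.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
assert ascii_failed
```

The silent case is the dangerous one, which is why passing the encoding name explicitly to both `encode` and `decode` is worth the extra characters.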
Large file considerations
The read_file and write_file APIs are best suited for small to moderately sized files. For large files, consider the following:
- Account for bandwidth, because files are transferred through the API.
- Use streaming for large datasets when possible.
- Mount S3 or object storage for large data.
# For large data, use S3 mount instead
with Sandbox.run(
    s3_mount={
        "bucket": "my-data-bucket",
        "mount_path": "/data",
        "read_only": False,
    }
) as sandbox:
    # Files in /data are backed by S3
    sandbox.exec(["ls", "/data"]).result()
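To make the bandwidth consideration concrete, a rough estimate of transfer time can help you decide between `write_file` and a mount. The sketch below is plain arithmetic with hypothetical helper names. The bandwidth figure and the time budget are inputs you have to supply; neither comes from cwsandbox, and the estimate is a lower bound that ignores API overhead.

```python
def estimate_transfer_seconds(size_bytes: int, bandwidth_mbps: float) -> float:
    """Lower bound on transfer time: payload bits divided by link speed."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    bits = size_bytes * 8
    # bandwidth_mbps is in megabits per second (1 Mbit = 1,000,000 bits).
    return bits / (bandwidth_mbps * 1_000_000)


def should_use_mount(size_bytes: int, bandwidth_mbps: float, budget_seconds: float) -> bool:
    """True when the estimated transfer would exceed the time budget."""
    return estimate_transfer_seconds(size_bytes, bandwidth_mbps) > budget_seconds
```

For example, a 125 MB file over a 100 Mbit/s link needs at least 10 seconds, so `should_use_mount(125_000_000, 100, 5)` returns `True`.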