CGI Rendering with Argo Workflows
Create your own Blender render farm with thousands of GPUs using Argo Workflows
In this example, a complete cloud rendering solution is deployed onto CoreWeave Cloud. This tutorial uses Argo Workflows to run GPU rendering jobs with Blender.
By following along with this example, at the end you will have:
- a Web-based file management solution for uploading assets and downloading render output, and
- a highly parallel workflow template with which to launch your render jobs.
In this tutorial, all resources are deployed using the Kubernetes command line (kubectl).
Prerequisites
This guide assumes that the Argo CLI tools have already been configured in your CoreWeave Cloud namespace.
Overview
This tutorial covers the following procedure:
Create a storage volume
Render assets and outputs must be stored somewhere that is accessible to multiple workers and to other services in our namespace. For this purpose, a shared filesystem is created by way of a Persistent Volume Claim.
In the example manifest below, the resource storage request is set to create a storage volume with a size of 100Gi.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data-pvc
spec:
  storageClassName: shared-hdd-ord1
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi # 100Gi total volume size
```
Once the PVC manifest is saved in a clearly-named file, such as pvc.yaml, run kubectl apply to deploy it:

```console
$ kubectl apply -f pvc.yaml
persistentvolumeclaim/shared-data-pvc created
```
A shared filesystem of 100Gi is now deployed in the namespace with the name shared-data-pvc. This volume is used throughout this rendering example.
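If you would like to verify the claim from the command line before moving on, a standard kubectl query should show it as Bound. The output below is only a sketch; the generated volume name and ages will differ in your namespace:

```console
$ kubectl get pvc shared-data-pvc
NAME              STATUS   VOLUME        CAPACITY   ACCESS MODES   STORAGECLASS      AGE
shared-data-pvc   Bound    pvc-0f3e...   100Gi      RWX            shared-hdd-ord1   15s
```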
Install FileBrowser
The end goal of this tutorial is to create an easy-to-use service for rendering a Blender animation. The simplest solution for accessing output assets is to use the open-source utility FileBrowser, which is available through the Applications Catalog by searching for filebrowser.
For more information on installing and using FileBrowser, see the FileBrowser installation guide.
It is recommended that the name of the FileBrowser application be kept short. Names that are too long may run into SSL CNAME issues.
The newly created storage volume (PVC) is used as the backend for FileBrowser. Under the "Attach existing volumes to your FileBrowser" section, select the new volume (here, shared-data-pvc) by clicking the small blue plus sign to the right of the Volume name.
Configure how you'd like the volume to appear once mounted, then click the Deploy button.
During the deployment of the application, you'll be redirected to the application's status page. This status page also provides the default login credentials for the FileBrowser application.
It is strongly recommended to change the default login credentials for FileBrowser.
Navigate to the Access URLs box on the status page to find the Ingress URL (for example, https://filebrowser-name.tenant-coreweave-clientname.ord1.ingress.coreweave.cloud/).
This Ingress URL is used to access the FileBrowser application from a Web browser.
In this example, one of the typical Blender benchmarks, BMW_27, is uploaded as the unpacked file bmw27_gpu.blend to the root path of the FileBrowser application. Once logged in to FileBrowser, the file may be uploaded using its Web UI.
Create the render workflow
The Argo workflow file provided below does a number of things. First, it defines the parameters of the overall job, including the name of the file to render, the frame range, how many frames to render per Pod, and the maximum number of parallel Pods.
Next, it auto-generates "slices" to render in parallel on each Pod, defines the type of hardware on which we would like our job to be executed, supplies Blender commands, and finally passes in a custom Python script to ensure we render on GPU.
Some of the workflow steps included in this file may be considered advanced. Comments are included where possible to clarify their purpose.
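To make the slicing step concrete, here is the same logic used by the gen-slices script, run standalone in plain Python with the template variables replaced by this example's values (start=1, stop=10, sliceSize=1). It prints the JSON list of {start, stop} pairs that Argo fans out to the render step via withParam:

```python
import json
import sys

# Values substituted for {{workflow.parameters.*}} in this example
start, stop, slice_size = 1, 10, 1

frames = range(start, stop + 1)
n = slice_size
# Split the frame range into chunks of n frames each
slices = [frames[i * n:(i + 1) * n] for i in range((len(frames) + n - 1) // n)]
# Each chunk becomes a {start, stop} pair consumed by the render step
intervals = [{'start': min(s), 'stop': max(s)} for s in slices]
json.dump(intervals, sys.stdout)
# Output: [{"start": 1, "stop": 1}, {"start": 2, "stop": 2}, ..., {"start": 10, "stop": 10}]
```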
The complete Workflow file, saved as blender-gpu-render.yaml, is set up to render in parallel across 10 Pods, with four GPUs allocated to each Pod (the GPU model is selected via node affinity in the manifest).
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: render-
spec:
  entrypoint: main
  parallelism: 10 # Maximum number of parallel pods to run (pods x gpu limit = total GPUs)
  activeDeadlineSeconds: 864000 # Cancel operation if not finished in 10 days
  ttlSecondsAfterFinished: 86400
  arguments:
    parameters: # These parameters are available as variables throughout our template.
      - name: filename # The location of our blend file, /data/ is the root directory of our Filebrowser app
        value: '/data/bmw27_gpu.blend'
      - name: sliceSize # How many frames to render per pod, let's set it to 1
        value: 1
      - name: start # Start frame of entire sequence to render
        value: 1
      - name: stop # Stop frame of entire sequence to render, let's render 10
        value: 10
      - name: outputLocation # Location to write the output to
        value: "/data/output/bmw27_gpu/"

  volumes:
    - name: data-storage
      persistentVolumeClaim:
        claimName: shared-data-pvc # Mounting in our shared data PVC

  tolerations: # This is here so that our generate slices script only runs on a CPU node.
    - key: is_cpu_compute
      operator: Exists

  templates: # This defines the steps in our workflow.
    - name: main
      steps:
        - - name: slice # Step to generate frame ranges "slices" to run on each pod.
            template: gen-slices
        - - name: render
            template: render-blender
            arguments:
              parameters:
                - name: start
                  value: "{{item.start}}"
                - name: stop
                  value: "{{item.stop}}"
            withParam: "{{steps.slice.outputs.result}}"

    - name: gen-slices # This is our custom slicing script that runs as bare code in a python container.
      script:
        image: python:alpine3.6
        command: [python]
        source: |
          import json
          import sys
          frames = range({{workflow.parameters.start}}, {{workflow.parameters.stop}}+1)
          n = {{workflow.parameters.sliceSize}}
          slices = [frames[i * n:(i + 1) * n] for i in range((len(frames) + n - 1) // n)]
          intervals = map(lambda x: {'start': min(x), 'stop': max(x)}, slices)
          json.dump(list(intervals), sys.stdout)

    - name: render-blender
      metadata:
        labels:
          coreweave.com/role: render
      inputs:
        parameters:
          - name: start
          - name: stop
        artifacts: # Artifacts are directly mounted inside the container for use by our program.
          - name: blender_gpu # We are mounting a python script that ensures all GPUs are used for our render.
            path: /blender_gpu.py # The python script will be mounted at /blender_gpu.py and accessible by Blender.
            raw:
              data: |
                import bpy
                # Set GPU rendering
                bpy.context.scene.cycles.device = 'GPU'
                bpy.context.preferences.addons['cycles'].preferences.compute_device_type = 'CUDA'
                # Force turn off progressive refine, since we are not in viewport
                bpy.context.scene.cycles.use_progressive_refine = False
                # Enable all available GPUs
                for devices in bpy.context.preferences.addons['cycles'].preferences.get_devices():
                    for d in devices:
                        d.use = True
                        if d.type == 'CPU':
                            d.use = False
                # Disable placeholder frame files
                bpy.context.scene.render.use_placeholder = False
                # Force process to over-write existing files
                bpy.context.scene.render.use_overwrite = True
      retryStrategy: # It is important that we define retry logic, in case Blender fails. It fails sometimes, out of nowhere.
        limit: 1
      container:
        image: nytimes/blender:2.82-gpu-ubuntu18.04 # We are using the Docker container graciously provided by NYT.
        command: ["blender"]
        workingDir: /
        # These are the command line arguments that will be supplied to our Blender process, including the python script above.
        args: ["-b",
               "{{workflow.parameters.filename}}",
               "--engine", "CYCLES",
               "--factory-startup", "-noaudio",
               "--use-extension", "1",
               "-o", "{{workflow.parameters.outputLocation}}",
               "--python", "blender_gpu.py",
               "-s", "{{inputs.parameters.start}}",
               "-e", "{{inputs.parameters.stop}}",
               "-a"]
        resources: # This is where we request our pod resources.
          requests:
            memory: 8Gi # Requesting a minimum of 8GB system ram
            cpu: 1 # Requesting a minimum of 1 vCPU
          limits:
            cpu: 2 # Requesting a maximum of 2 vCPU
            nvidia.com/gpu: 4 # Requesting 4 GPUs
        volumeMounts:
          - name: data-storage # Mounting in our PVC as /data so it's accessible to our pod.
            mountPath: /data
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: gpu.nvidia.com/model
                    operator: In
                    values: # This is where we identify what GPU type we want to run on.
                      - Quadro_RTX_4000
```
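If the GPUs available in your namespace differ, only the nodeAffinity values need to change. The fragment below is a sketch: "A40" is a placeholder, not a guaranteed label value; substitute whatever value the gpu.nvidia.com/model label carries on the nodes you want to target (if your access allows it, kubectl get nodes --show-labels reveals the available values).

```yaml
# Sketch: point the render-blender template at a different GPU class.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: gpu.nvidia.com/model
              operator: In
              values:
                - A40   # placeholder; use a label value present in your region
```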
Argo's retry logic is considered a best practice when rendering in parallel. Because CGI rendering platforms and GPU compute stacks evolve constantly, renders occasionally break "for no reason." The retries defined in the Argo Workflow template ensure that no frames are lost to these transient failures.
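If transient failures are frequent, the retryStrategy can be made more forgiving. The fragment below is a sketch using standard Argo retryStrategy fields (limit, retryPolicy, and an exponential backoff); availability of retryPolicy depends on your Argo version, and the numbers here are illustrative:

```yaml
# Sketch: a more forgiving retry policy for the render-blender template.
retryStrategy:
  limit: 3               # Retry a failed slice up to three times
  retryPolicy: OnFailure # Only retry when the container itself fails
  backoff:
    duration: "1m"       # Wait 1 minute before the first retry
    factor: 2            # Double the wait on each subsequent retry
    maxDuration: "10m"   # Never wait longer than 10 minutes between retries
```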
To begin rendering, pass this Workflow file to argo submit:
```console
$ argo submit --watch blender-gpu-render.yaml
```
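The workflow parameters can also be overridden at submit time with the Argo CLI's -p flag, so the same template can be reused for other scenes without editing the YAML. The scene path and frame values below are purely illustrative:

```console
$ argo submit --watch blender-gpu-render.yaml \
    -p filename='/data/my_scene.blend' \
    -p stop=250 \
    -p sliceSize=5 \
    -p outputLocation='/data/output/my_scene/'
```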
Immediately after the command is invoked, the Argo command line will begin processing, while displaying the inputs and the status of the Workflow.
You may see some Unschedulable warnings at first; this is due to Kubernetes evicting idle containers in order to prepare the requested nodes to run the Workflow.
After about a minute, output similar to the following should be generated:
```console
Name:                render-sjf6t
Namespace:           tenant-test
ServiceAccount:      default
Status:              Succeeded
Created:             Fri May 29 22:26:01 -0400 (2 minutes ago)
Started:             Fri May 29 22:26:01 -0400 (2 minutes ago)
Finished:            Fri May 29 22:28:12 -0400 (now)
Duration:            2 minutes 11 seconds
Parameters:
  filename:          /data/bmw27_gpu.blend
  sliceSize:         1
  start:             1
  stop:              10
  outputLocation:    /data/output/bmw27_gpu/

STEP                                                  PODNAME                  DURATION  MESSAGE
 ✔ render-sjf6t (main)
 ├---✔ slice (gen-slices)                             render-sjf6t-2863198607  3s
 └-·-✔ render(0:start:1,stop:1)(0) (render-blender)   render-sjf6t-1206241518  1m
   ├-✔ render(1:start:2,stop:2)(0) (render-blender)   render-sjf6t-2071804633  1m
   ├-✔ render(2:start:3,stop:3)(0) (render-blender)   render-sjf6t-2756225068  1m
   ├-✔ render(3:start:4,stop:4)(0) (render-blender)   render-sjf6t-2726811839  1m
   ├-✔ render(4:start:5,stop:5)(0) (render-blender)   render-sjf6t-3220888738  1m
   ├-✔ render(5:start:6,stop:6)(0) (render-blender)   render-sjf6t-3319286957  1m
   ├-✔ render(6:start:7,stop:7)(0) (render-blender)   render-sjf6t-577269840   1m
   ├-✔ render(7:start:8,stop:8)(0) (render-blender)   render-sjf6t-3336690355  1m
   ├-✔ render(8:start:9,stop:9)(0) (render-blender)   render-sjf6t-3980468470  2m
   └-✔ render(9:start:10,stop:10)(0) (render-blender) render-sjf6t-2756728893  1m
```
This output shows the status of the 10 specified frames being rendered in parallel across 10 Pods, each running Blender with four GPUs.
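To dig into any individual slice, for example when a frame fails and retries, the Argo CLI can pull the logs of the Workflow's Pods. A sketch using the Workflow and Pod names from the output above (yours will differ):

```console
$ argo logs render-sjf6t                          # logs from every Pod in the Workflow
$ argo logs render-sjf6t render-sjf6t-1206241518  # logs from a single render Pod
```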
Now, using FileBrowser via the provided Ingress URL, a new folder named output has been generated, with a subdirectory inside of it that contains the newly rendered frames. In this example, the subdirectory is named bmw27_gpu, matching the outputLocation parameter /data/output/bmw27_gpu/.
Using this Argo Workflow as a template or starting point, it is easy to run Blender GPU rendering on thousands of GPUs simultaneously!