Use the power of the command line to control Argo Workflows
The Argo Command Line Interface (CLI) is a powerful tool used to create, submit, manage, and monitor workflows directly from the command line. The CLI also allows creating reusable templates, making it easier to define and share common workflow patterns. Many of CoreWeave's example projects use the Argo CLI as the primary way to demonstrate different techniques.
Next, create a working folder and a file named workflow.yaml.
$ mkdir example
$ cd example
$ nano workflow.yaml
Copy and paste the contents below into the file.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: gpu-say
spec:
  entrypoint: main
  activeDeadlineSeconds: 300 # Cancel operation if not finished in 5 minutes
  ttlStrategy:
    secondsAfterCompletion: 86400 # Clean out old workflows after a day
  # Parameters can be passed/overridden via the argo CLI.
  # To override the printed message, run `argo submit` with the -p option:
  # $ argo submit examples/arguments-parameters.yaml -p messages='["CoreWeave", "Is", "Fun"]'
  arguments:
    parameters:
      - name: messages
        value: '["Argo", "Is", "Awesome"]'
      - name: foo
        value: "bar"
  templates:
    - name: main
      steps:
        - - name: echo
            template: gpu-echo
            arguments:
              parameters:
                - name: message
                  value: "{{item}}"
            withParam: "{{workflow.parameters.messages}}"
    - name: gpu-echo
      inputs:
        parameters:
          - name: message
      retryStrategy:
        limit: 1
      script:
        image: nvidia/cuda:11.4.1-runtime-ubuntu20.04
        command: [bash]
        source: |
          nvidia-smi
          echo "Input was: {{inputs.parameters.message}}"
        resources:
          requests:
            memory: 128Mi
            cpu: 500m # Half a core
          limits:
            nvidia.com/gpu: 1 # Allocate one GPU
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            # This will REQUIRE the Pod to be run on a system with a GPU with 8 or 16GB VRAM
            nodeSelectorTerms:
              - matchExpressions:
                  - key: gpu.nvidia.com/vram
                    operator: In
                    values:
                      - "8"
                      - "16"
This workflow iterates over the messages JSON array to spin up three Pods in parallel, with one GPU allocated to each.
There are two parameters in this file:
messages, which is an array of strings
foo, which has the value bar
How to override parameters
One technique used often in CoreWeave examples is setting default values in the workflow YAML, and then overriding a few of them on the command line or with a parameters file.
Override parameters on the command line
Override workflow parameters on the command line with the -p option:
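For example, a submission that overrides both parameters might look like this (a sketch; the values match the output shown below):

```shell
argo submit workflow.yaml \
  -p messages='["Use","Anything","Here"]' \
  -p foo="Another new string"
```
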
Name: gpu-saygppcq
Namespace: tenant-96362f-dev
ServiceAccount: unset (will run with the default ServiceAccount)
Status: Pending
Created: Thu May 04 17:43:57 -0400 (now)
Progress:
Parameters:
foo: Another new string
messages: ["Use","Anything","Here"]
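Overrides can also be supplied from a parameters file with the --parameter-file option. A minimal sketch, assuming a file named params.yaml (the file name is illustrative):

```yaml
# params.yaml - parameter names must match those declared in the workflow
messages: '["Use","Anything","Here"]'
foo: "Another new string"
```

Submit it with `argo submit workflow.yaml --parameter-file params.yaml`. This is convenient when the same overrides are reused across many submissions.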
List workflows
After submitting a job, use the argo command to interact with it.
To find a job's name, use the list command.
argo list
The output shows the name, gpu-sayn5p6w in this case.
NAME STATUS AGE DURATION PRIORITY MESSAGE
gpu-sayn5p6w Succeeded 41m 37s 0
View Argo logs
To see the logs for a workflow job, use the logs command with the workflow name fetched earlier. It looks like this:
argo logs gpu-sayn5p6w
The log output shows the result:
gpu-sayn5p6w-gpu-echo-1273146457: Wed May 3 21:30:56 2023
gpu-sayn5p6w-gpu-echo-1273146457: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-1273146457: | NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |
gpu-sayn5p6w-gpu-echo-1273146457: |-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-1273146457: | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
gpu-sayn5p6w-gpu-echo-1273146457: | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
gpu-sayn5p6w-gpu-echo-1273146457: | | | MIG M. |
gpu-sayn5p6w-gpu-echo-1273146457: |===============================+======================+======================|
gpu-sayn5p6w-gpu-echo-1273146457: | 0 NVIDIA RTX A4000 On | 00000000:02:00.0 Off | Off |
gpu-sayn5p6w-gpu-echo-1273146457: | 46% 33C P8 17W / 140W | 65MiB / 16376MiB | 0% Default |
gpu-sayn5p6w-gpu-echo-1273146457: | | | N/A |
gpu-sayn5p6w-gpu-echo-1273146457: +-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-1273146457:
gpu-sayn5p6w-gpu-echo-1273146457: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-1273146457: | Processes: |
gpu-sayn5p6w-gpu-echo-1273146457: | GPU GI CI PID Type Process name GPU Memory |
gpu-sayn5p6w-gpu-echo-1273146457: | ID ID Usage |
gpu-sayn5p6w-gpu-echo-1273146457: |=============================================================================|
gpu-sayn5p6w-gpu-echo-1273146457: | 0 N/A N/A 2181 G 63MiB |
gpu-sayn5p6w-gpu-echo-1273146457: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-1273146457: Input was: Is
gpu-sayn5p6w-gpu-echo-1273146457: time="2023-05-03T21:30:57.804Z" level=info msg="sub-process exited" argo=true error="<nil>"
gpu-sayn5p6w-gpu-echo-3614462693: Wed May 3 21:31:16 2023
gpu-sayn5p6w-gpu-echo-3614462693: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-3614462693: | NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |
gpu-sayn5p6w-gpu-echo-3614462693: |-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-3614462693: | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
gpu-sayn5p6w-gpu-echo-3614462693: | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
gpu-sayn5p6w-gpu-echo-3614462693: | | | MIG M. |
gpu-sayn5p6w-gpu-echo-3614462693: |===============================+======================+======================|
gpu-sayn5p6w-gpu-echo-3614462693: | 0 NVIDIA RTX A4000 On | 00000000:01:00.0 Off | Off |
gpu-sayn5p6w-gpu-echo-3614462693: | 47% 33C P8 15W / 140W | 65MiB / 16376MiB | 0% Default |
gpu-sayn5p6w-gpu-echo-3614462693: | | | N/A |
gpu-sayn5p6w-gpu-echo-3614462693: +-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-3614462693:
gpu-sayn5p6w-gpu-echo-3614462693: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-3614462693: | Processes: |
gpu-sayn5p6w-gpu-echo-3614462693: | GPU GI CI PID Type Process name GPU Memory |
gpu-sayn5p6w-gpu-echo-3614462693: | ID ID Usage |
gpu-sayn5p6w-gpu-echo-3614462693: |=============================================================================|
gpu-sayn5p6w-gpu-echo-3614462693: | 0 N/A N/A 2290 G 63MiB |
gpu-sayn5p6w-gpu-echo-3614462693: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-3614462693: Input was: Awesome
gpu-sayn5p6w-gpu-echo-3614462693: time="2023-05-03T21:31:17.624Z" level=info msg="sub-process exited" argo=true error="<nil>"
gpu-sayn5p6w-gpu-echo-2418828045: Wed May 3 21:31:22 2023
gpu-sayn5p6w-gpu-echo-2418828045: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-2418828045: | NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |
gpu-sayn5p6w-gpu-echo-2418828045: |-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-2418828045: | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
gpu-sayn5p6w-gpu-echo-2418828045: | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
gpu-sayn5p6w-gpu-echo-2418828045: | | | MIG M. |
gpu-sayn5p6w-gpu-echo-2418828045: |===============================+======================+======================|
gpu-sayn5p6w-gpu-echo-2418828045: | 0 NVIDIA RTX A4000 On | 00000000:C1:00.0 Off | Off |
gpu-sayn5p6w-gpu-echo-2418828045: | 42% 31C P8 13W / 140W | 94MiB / 16376MiB | 0% Default |
gpu-sayn5p6w-gpu-echo-2418828045: | | | N/A |
gpu-sayn5p6w-gpu-echo-2418828045: +-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-2418828045:
gpu-sayn5p6w-gpu-echo-2418828045: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-2418828045: | Processes: |
gpu-sayn5p6w-gpu-echo-2418828045: | GPU GI CI PID Type Process name GPU Memory |
gpu-sayn5p6w-gpu-echo-2418828045: | ID ID Usage |
gpu-sayn5p6w-gpu-echo-2418828045: |=============================================================================|
gpu-sayn5p6w-gpu-echo-2418828045: | 0 N/A N/A 2218 G 92MiB |
gpu-sayn5p6w-gpu-echo-2418828045: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-2418828045: Input was: Argo
gpu-sayn5p6w-gpu-echo-2418828045: time="2023-05-03T21:31:23.231Z" level=info msg="sub-process exited" argo=true error="<nil>"
Watch the submission
Use the --watch option to observe the workflow in progress.
argo submit workflow.yaml --watch
If the --watch option wasn't used when submitting the workflow, use the watch command with the workflow name to observe it.
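For example, to watch the workflow submitted earlier (using the name returned by argo list):

```shell
argo watch gpu-sayn5p6w
```
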
In the previous example, notice that the third line says:
ServiceAccount: unset (will run with the default ServiceAccount)
If no specific Kubernetes ServiceAccount is declared, the workflow runs with the namespace's default ServiceAccount. To use a specific ServiceAccount, create one, then pass it with the --serviceaccount option as described below. The security best practices guide has more details.
First, define a ServiceAccount and assign it some defined Roles. Then, define a RoleBinding to bind them together.
This can be done in a single YAML file. Create my-example-sa.yaml and paste the contents below.
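A minimal sketch of such a file is shown below. The ServiceAccount, Role, and RoleBinding names are illustrative, and the Role rules shown are the minimum that Argo Workflows Pods typically need to report task results; adjust them to match your workflow's actual requirements.

```yaml
# my-example-sa.yaml - illustrative names; adjust rules as needed
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-example-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: my-example-role
rules:
  # Workflow Pods report their status via workflowtaskresults
  - apiGroups: ["argoproj.io"]
    resources: ["workflowtaskresults"]
    verbs: ["create", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-example-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: my-example-role
subjects:
  - kind: ServiceAccount
    name: my-example-sa
```

Apply it with `kubectl apply -f my-example-sa.yaml`, then submit the workflow with `argo submit workflow.yaml --serviceaccount my-example-sa`.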
Notice that the third line now shows the new ServiceAccount name. Using a specific ServiceAccount with limited permissions is a security best practice.
More information
These are only a few of the most common Argo CLI commands used in CoreWeave's examples and demonstrations. For a full list, please see the Argo documentation or see these Argo Workflows resources: