Use the Argo Workflows CLI
Use the power of the command line to control Argo Workflows
The Argo Command Line Interface (CLI) is a powerful tool used to create, submit, manage, and monitor workflows directly from the command line. The CLI also allows creating reusable templates, making it easier to define and share common workflow patterns. Many of CoreWeave's example projects use Argo CLI as the primary way to demonstrate different techniques.
To get started, download the latest Argo CLI from the GitHub releases page and follow the project's Quick Start installation instructions.
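For reference, a typical Linux install looks like the sketch below. The release tag (v3.4.7 here) and platform in the URL are assumptions; substitute the asset that matches the target system from the releases page.
# Assumes v3.4.7 on Linux amd64 — pick the correct asset for your platform.
curl -sLO https://github.com/argoproj/argo-workflows/releases/download/v3.4.7/argo-linux-amd64.gz
gunzip argo-linux-amd64.gz                       # Decompress the CLI binary
chmod +x argo-linux-amd64                        # Make it executable
sudo mv argo-linux-amd64 /usr/local/bin/argo     # Move it onto the PATH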
To test the installation, run:
argo version
The result should be similar to:
argo: v3.4.7
BuildDate: 2023-04-11T17:19:48Z
GitCommit: f2292647c5a6be2f888447a1fef71445cc05b8fd
GitTreeState: clean
GitTag: v3.4.7
GoVersion: go1.19.7
Compiler: gc
Platform: linux/amd64
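If only the version string is needed, for example in a script, the CLI also accepts a --short flag; run argo version --help to confirm it exists on the installed release.
argo version --short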
Next, create a working folder and a file named workflow.yaml.
$ mkdir example
$ cd example
$ nano workflow.yaml
Copy and paste the contents below into the file.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: gpu-say
spec:
  entrypoint: main
  activeDeadlineSeconds: 300 # Cancel operation if not finished in 5 minutes
  ttlStrategy:
    secondsAfterCompletion: 86400 # Clean out old workflows after a day
  # Parameters can be passed/overridden via the argo CLI.
  # To override the printed message, run `argo submit` with the -p option:
  # $ argo submit examples/arguments-parameters.yaml -p messages='["CoreWeave", "Is", "Fun"]'
  arguments:
    parameters:
      - name: messages
        value: '["Argo", "Is", "Awesome"]'
      - name: foo
        value: "bar"

  templates:
    - name: main
      steps:
        - - name: echo
            template: gpu-echo
            arguments:
              parameters:
                - name: message
                  value: "{{item}}"
            withParam: "{{workflow.parameters.messages}}"

    - name: gpu-echo
      inputs:
        parameters:
          - name: message
      retryStrategy:
        limit: 1
      script:
        image: nvidia/cuda:11.4.1-runtime-ubuntu20.04
        command: [bash]
        source: |
          nvidia-smi
          echo "Input was: {{inputs.parameters.message}}"
        resources:
          requests:
            memory: 128Mi
            cpu: 500m # Half a core
          limits:
            nvidia.com/gpu: 1 # Allocate one GPU
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            # This REQUIRES the Pod to run on a node with a GPU that has 8 or 16GB VRAM
            nodeSelectorTerms:
              - matchExpressions:
                  - key: gpu.nvidia.com/vram
                    operator: In
                    values:
                      - "8"
                      - "16"
Then, submit the workflow file.
argo submit workflow.yaml
Here's the output.
Name: gpu-say64d9m
Namespace: tenant-96362f-dev
ServiceAccount: unset (will run with the default ServiceAccount)
Status: Pending
Created: Thu May 04 17:38:04 -0400 (now)
Progress:
Parameters:
messages: ["Argo", "Is", "Awesome"]
foo: bar
This workflow uses the JSON array parameter to spin up three Pods in parallel, each with one GPU allocated.
There are two parameters in this file:
- messages, which is an array of strings
- foo, which has the value bar
One technique used often in CoreWeave examples is setting default values in the workflow YAML, and then overriding a few of them on the command line or with a parameters file.
Override workflow parameters on the command line with the -p option:
argo submit workflow.yaml \
  -p messages='["CoreWeave", "Is", "Fun"]' \
  -p foo='Something Else'
In this case, the output shows the overridden parameters:
Name: gpu-sayxgsrv
Namespace: tenant-96362f-dev
ServiceAccount: unset (will run with the default ServiceAccount)
Status: Pending
Created: Thu May 04 17:40:30 -0400 (now)
Progress:
Parameters:
messages: ["CoreWeave", "Is", "Fun"]
foo: Something Else
Another technique is using a YAML file to override parameters. To do that, create params.yaml in the example folder and paste this into it:
messages: ["Use", "Anything", "Here"]
foo: "Another new string"
Then, use the --parameter-file option to pass params.yaml to the workflow.
argo submit workflow.yaml --parameter-file params.yaml
The output shows the new parameters:
Name: gpu-saygppcq
Namespace: tenant-96362f-dev
ServiceAccount: unset (will run with the default ServiceAccount)
Status: Pending
Created: Thu May 04 17:43:57 -0400 (now)
Progress:
Parameters:
foo: Another new string
messages: ["Use","Anything","Here"]
After submitting a job, use the argo command to interact with it. To find a job's name, use the list command.
argo list
The output shows the name, gpu-sayn5p6w in this case.
NAME           STATUS      AGE   DURATION   PRIORITY   MESSAGE
gpu-sayn5p6w   Succeeded   41m   37s        0
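On a busy cluster, the listing can be narrowed down. The flags below appear in the Argo CLI help; confirm them with argo list --help on the installed version.
argo list --running           # Only workflows still in progress
argo list --completed         # Only finished workflows
argo list --prefix gpu-say    # Only names starting with gpu-say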
To see the logs for a workflow job, use the logs command with the workflow name fetched earlier. It looks like this:
argo logs gpu-sayn5p6w
The log output looks like this:
gpu-sayn5p6w-gpu-echo-1273146457: Wed May 3 21:30:56 2023
gpu-sayn5p6w-gpu-echo-1273146457: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-1273146457: | NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |
gpu-sayn5p6w-gpu-echo-1273146457: |-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-1273146457: | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
gpu-sayn5p6w-gpu-echo-1273146457: | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
gpu-sayn5p6w-gpu-echo-1273146457: | | | MIG M. |
gpu-sayn5p6w-gpu-echo-1273146457: |===============================+======================+======================|
gpu-sayn5p6w-gpu-echo-1273146457: | 0 NVIDIA RTX A4000 On | 00000000:02:00.0 Off | Off |
gpu-sayn5p6w-gpu-echo-1273146457: | 46% 33C P8 17W / 140W | 65MiB / 16376MiB | 0% Default |
gpu-sayn5p6w-gpu-echo-1273146457: | | | N/A |
gpu-sayn5p6w-gpu-echo-1273146457: +-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-1273146457:
gpu-sayn5p6w-gpu-echo-1273146457: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-1273146457: | Processes: |
gpu-sayn5p6w-gpu-echo-1273146457: | GPU GI CI PID Type Process name GPU Memory |
gpu-sayn5p6w-gpu-echo-1273146457: | ID ID Usage |
gpu-sayn5p6w-gpu-echo-1273146457: |=============================================================================|
gpu-sayn5p6w-gpu-echo-1273146457: | 0 N/A N/A 2181 G 63MiB |
gpu-sayn5p6w-gpu-echo-1273146457: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-1273146457: Input was: Is
gpu-sayn5p6w-gpu-echo-1273146457: time="2023-05-03T21:30:57.804Z" level=info msg="sub-process exited" argo=true error="<nil>"
gpu-sayn5p6w-gpu-echo-3614462693: Wed May 3 21:31:16 2023
gpu-sayn5p6w-gpu-echo-3614462693: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-3614462693: | NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |
gpu-sayn5p6w-gpu-echo-3614462693: |-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-3614462693: | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
gpu-sayn5p6w-gpu-echo-3614462693: | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
gpu-sayn5p6w-gpu-echo-3614462693: | | | MIG M. |
gpu-sayn5p6w-gpu-echo-3614462693: |===============================+======================+======================|
gpu-sayn5p6w-gpu-echo-3614462693: | 0 NVIDIA RTX A4000 On | 00000000:01:00.0 Off | Off |
gpu-sayn5p6w-gpu-echo-3614462693: | 47% 33C P8 15W / 140W | 65MiB / 16376MiB | 0% Default |
gpu-sayn5p6w-gpu-echo-3614462693: | | | N/A |
gpu-sayn5p6w-gpu-echo-3614462693: +-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-3614462693:
gpu-sayn5p6w-gpu-echo-3614462693: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-3614462693: | Processes: |
gpu-sayn5p6w-gpu-echo-3614462693: | GPU GI CI PID Type Process name GPU Memory |
gpu-sayn5p6w-gpu-echo-3614462693: | ID ID Usage |
gpu-sayn5p6w-gpu-echo-3614462693: |=============================================================================|
gpu-sayn5p6w-gpu-echo-3614462693: | 0 N/A N/A 2290 G 63MiB |
gpu-sayn5p6w-gpu-echo-3614462693: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-3614462693: Input was: Awesome
gpu-sayn5p6w-gpu-echo-3614462693: time="2023-05-03T21:31:17.624Z" level=info msg="sub-process exited" argo=true error="<nil>"
gpu-sayn5p6w-gpu-echo-2418828045: Wed May 3 21:31:22 2023
gpu-sayn5p6w-gpu-echo-2418828045: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-2418828045: | NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |
gpu-sayn5p6w-gpu-echo-2418828045: |-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-2418828045: | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
gpu-sayn5p6w-gpu-echo-2418828045: | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
gpu-sayn5p6w-gpu-echo-2418828045: | | | MIG M. |
gpu-sayn5p6w-gpu-echo-2418828045: |===============================+======================+======================|
gpu-sayn5p6w-gpu-echo-2418828045: | 0 NVIDIA RTX A4000 On | 00000000:C1:00.0 Off | Off |
gpu-sayn5p6w-gpu-echo-2418828045: | 42% 31C P8 13W / 140W | 94MiB / 16376MiB | 0% Default |
gpu-sayn5p6w-gpu-echo-2418828045: | | | N/A |
gpu-sayn5p6w-gpu-echo-2418828045: +-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-2418828045:
gpu-sayn5p6w-gpu-echo-2418828045: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-2418828045: | Processes: |
gpu-sayn5p6w-gpu-echo-2418828045: | GPU GI CI PID Type Process name GPU Memory |
gpu-sayn5p6w-gpu-echo-2418828045: | ID ID Usage |
gpu-sayn5p6w-gpu-echo-2418828045: |=============================================================================|
gpu-sayn5p6w-gpu-echo-2418828045: | 0 N/A N/A 2218 G 92MiB |
gpu-sayn5p6w-gpu-echo-2418828045: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-2418828045: Input was: Argo
gpu-sayn5p6w-gpu-echo-2418828045: time="2023-05-03T21:31:23.231Z" level=info msg="sub-process exited" argo=true error="<nil>"
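To stream logs while a workflow is still running, add the -f (follow) flag. Recent CLI versions also accept @latest as shorthand for the most recently submitted workflow:
argo logs -f gpu-sayn5p6w    # Stream logs until the workflow finishes
argo logs @latest            # Logs from the most recent submission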
Use the --watch option to observe the workflow in progress.
argo submit workflow.yaml --watch
If the --watch option wasn't used when submitting the workflow, use the watch command with the workflow name to observe it.
argo watch gpu-sayn5p6w
In either case, the output looks the same.
Name: gpu-sayn5p6w
Namespace: tenant-d0b59f-dfinst
ServiceAccount: unset (will run with the default ServiceAccount)
Status: Running
Conditions:
PodRunning False
Created: Wed May 03 17:30:50 -0400 (26 seconds ago)
Started: Wed May 03 17:30:50 -0400 (26 seconds ago)
Duration: 26 seconds
Progress: 1/3
ResourcesDuration: 3s*(100Mi memory),1s*(1 nvidia.com/gpu),3s*(1 cpu)
Parameters:
messages: ["Argo", "Is", "Awesome"]
STEP TEMPLATE PODNAME DURATION MESSAGE
● gpu-sayn5p6w main
└─┬─◷ echo(0:Argo)(0) gpu-echo gpu-sayn5p6w-gpu-echo-2418828045 26s PodInitializing
├─✔ echo(1:Is)(0) gpu-echo gpu-sayn5p6w-gpu-echo-1273146457 6s
└─◷ echo(2:Awesome)(0) gpu-echo gpu-sayn5p6w-gpu-echo-3614462693 25s PodInitializing
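After completion, the same summary and step tree can be retrieved at any time with the get command; the -o yaml option dumps the full Workflow object, which is useful for debugging.
argo get gpu-sayn5p6w            # Human-readable summary and step tree
argo get gpu-sayn5p6w -o yaml    # Full Workflow resource as YAML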
In the watch output above, notice that the third line says:
ServiceAccount: unset (will run with the default ServiceAccount)
If no specific Kubernetes ServiceAccount was declared, the workflow uses the default one for the namespace. To use a specific ServiceAccount, create one, then pass it with the --serviceaccount option as shown in the steps below. The security best practices guide has more details.
First, define a ServiceAccount and assign it some Roles. Then, define a RoleBinding to bind them together.
This can be done in a single YAML file. Create my-example-sa.yaml and paste in the contents below.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-example
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: my-example-role
rules:
  - apiGroups:
      - ""
    resources:
      - pods
    verbs:
      - 'patch'
  - apiGroups:
      - serving.kubeflow.org
    resources:
      - inferenceservices
    verbs:
      - '*'
  - apiGroups:
      - serving.knative.dev
    resources:
      - services
      - revisions
    verbs:
      - '*'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-example-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: my-example-role
subjects:
  - kind: ServiceAccount
    name: my-example
There are three sections in this file, separated with ---.
- First, it creates a new ServiceAccount named my-example.
- Next, it creates a Role named my-example-role with several permissions.
- Finally, it creates a RoleBinding named my-example-rolebinding, which binds the ServiceAccount to the Role.
Apply the YAML to the cluster with kubectl.
kubectl apply -f my-example-sa.yaml
The output looks like this:
serviceaccount/my-example created
role.rbac.authorization.k8s.io/my-example-role created
rolebinding.rbac.authorization.k8s.io/my-example-rolebinding created
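As a quick sanity check, kubectl can evaluate the new ServiceAccount's permissions before running anything. The namespace below matches the examples in this guide; substitute the namespace actually in use.
# Should print "yes", since the Role grants patch on pods.
kubectl auth can-i patch pods \
  --as=system:serviceaccount:tenant-96362f-dev:my-example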
Now, use the new ServiceAccount with Argo, like this:
argo submit workflow.yaml --serviceaccount my-example
The output looks like this:
Name: gpu-say72dm5
Namespace: tenant-96362f-dev
ServiceAccount: my-example
Status: Pending
Created: Thu May 04 18:24:58 -0400 (now)
Progress:
Parameters:
messages: ["Argo", "Is", "Awesome"]
foo: bar
Notice that the third line now shows the new ServiceAccount name. Using a specific ServiceAccount with limited permissions is a security best practice.
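Completed workflows are cleaned up automatically when their ttlStrategy expires (one day in this example), but they can also be deleted on demand. The --older flag appears in the CLI help; confirm it with argo delete --help on the installed version.
argo delete gpu-say72dm5    # Delete a specific workflow
argo delete --older 7d      # Delete completed workflows older than 7 days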
These are only a few of the most common Argo CLI commands used in CoreWeave's examples and demonstrations. For a full list, see the Argo documentation.