Skip to main content

Use the Argo Workflows CLI

Use the power of the command line to control Argo Workflows

The Argo Command Line Interface (CLI) is a powerful tool used to create, submit, manage, and monitor workflows directly from the command line. The CLI also allows creating reusable templates, making it easier to define and share common workflow patterns. Many of CoreWeave's example projects use Argo CLI as the primary way to demonstrate different techniques.

Use Argo CLI

To submit a job with the CLI, first verify the CoreWeave Kubernetes environment is set up.

Next, download the latest Argo CLI from the GitHub releases page and follow their Quick Start installation instructions.

To test the installation, run:

argo version

The result should be similar to:

argo: v3.4.7
BuildDate: 2023-04-11T17:19:48Z
GitCommit: f2292647c5a6be2f888447a1fef71445cc05b8fd
GitTreeState: clean
GitTag: v3.4.7
GoVersion: go1.19.7
Compiler: gc
Platform: linux/amd64

Next, create a working folder and a file named workflow.yaml.

$ mkdir example
$ cd example
$ nano workflow.yaml

Expand the section below, and copy/paste the contents into the file.

Click to expand - workflow.yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
generateName: gpu-say
spec:
entrypoint: main
activeDeadlineSeconds: 300 # Cancel operation if not finished in 5 minutes
ttlStrategy:
secondsAfterCompletion: 86400 # Clean out old workflows after a day
# Parameters can be passed/overridden via the argo CLI.
# To override the printed message, run `argo submit` with the -p option:
# $ argo submit examples/arguments-parameters.yaml -p messages='["CoreWeave", "Is", "Fun"]'
arguments:
parameters:
- name: messages
value: '["Argo", "Is", "Awesome"]'
- name: foo
value: "bar"

templates:
- name: main
steps:
- - name: echo
template: gpu-echo
arguments:
parameters:
- name: message
value: "{{item}}"
withParam: "{{workflow.parameters.messages}}"

- name: gpu-echo
inputs:
parameters:
- name: message
retryStrategy:
limit: 1
script:
image: nvidia/cuda:11.4.1-runtime-ubuntu20.04
command: [bash]
source: |
nvidia-smi
echo "Input was: {{inputs.parameters.message}}"

resources:
requests:
memory: 128Mi
cpu: 500m # Half a core
limits:
nvidia.com/gpu: 1 # Allocate one GPU
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
# This will REQUIRE the Pod to be run on a system with a GPU with 8 or 16GB VRAM
nodeSelectorTerms:
- matchExpressions:
- key: gpu.nvidia.com/vram
operator: In
values:
- "8"
- "16"

Then, submit the workflow file.

argo submit workflow.yaml

Here's the output.

ame:                gpu-say64d9m
Namespace: tenant-96362f-dev
ServiceAccount: unset (will run with the default ServiceAccount)
Status: Pending
Created: Thu May 04 17:38:04 -0400 (now)
Progress:
Parameters:
messages: ["Argo", "Is", "Awesome"]
foo: bar

This workflow uses the JSON array as parameters to spin up three Pods, with one GPU allocated for each, in parallel.

There are two parameters in this file:

  • messages, which is an array of strings
  • foo, which has the value bar

How to override parameters

One technique used often in CoreWeave examples is setting default values in the workflow YAML, and then overriding a few of them on the command line or with a parameters file.

Override parameters on the command line

Override workflow parameters on the command line with the -p option:

argo submit workflow.yaml \
-p messages='["CoreWeave", "Is", "Fun"]' \
-p foo='Something Else'

In this case, the output shows the overridden parameters:

Name:                gpu-sayxgsrv
Namespace: tenant-96362f-dev
ServiceAccount: unset (will run with the default ServiceAccount)
Status: Pending
Created: Thu May 04 17:40:30 -0400 (now)
Progress:
Parameters:
messages: ["CoreWeave", "Is", "Fun"]
foo: Something Else

Override parameters with a YAML file

Another technique is using a YAML file to override parameters. To do that, create params.yaml in the example folder and paste this into it.

messages: ["Use", "Anything", "Here"]
foo: "Another new string"

Then, use the --parameter-file option to pass params.yaml to the workflow.

argo submit workflow.yaml --parameter-file params.yaml

The output shows the new parameters:

Name:                gpu-saygppcq
Namespace: tenant-96362f-dev
ServiceAccount: unset (will run with the default ServiceAccount)
Status: Pending
Created: Thu May 04 17:43:57 -0400 (now)
Progress:
Parameters:
foo: Another new string
messages: ["Use","Anything","Here"]

List workflows

After submitting a job, use the argo command to interact with it.

To find a job's name, use the list command.

argo list

The output shows the name, gpu-sayn5p6w in this case.

NAME             STATUS      AGE   DURATION   PRIORITY   MESSAGE
gpu-sayn5p6w Succeeded 41m 37s 0

View Argo logs

To see the logs for a workflow job, use the logs command with workflow name we fetched earlier. It looks like this:

argo logs gpu-sayn5p6w

Expand the log output below to see the result:

Click to expand - Argo log output
gpu-sayn5p6w-gpu-echo-1273146457: Wed May  3 21:30:56 2023
gpu-sayn5p6w-gpu-echo-1273146457: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-1273146457: | NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |
gpu-sayn5p6w-gpu-echo-1273146457: |-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-1273146457: | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
gpu-sayn5p6w-gpu-echo-1273146457: | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
gpu-sayn5p6w-gpu-echo-1273146457: | | | MIG M. |
gpu-sayn5p6w-gpu-echo-1273146457: |===============================+======================+======================|
gpu-sayn5p6w-gpu-echo-1273146457: | 0 NVIDIA RTX A4000 On | 00000000:02:00.0 Off | Off |
gpu-sayn5p6w-gpu-echo-1273146457: | 46% 33C P8 17W / 140W | 65MiB / 16376MiB | 0% Default |
gpu-sayn5p6w-gpu-echo-1273146457: | | | N/A |
gpu-sayn5p6w-gpu-echo-1273146457: +-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-1273146457:
gpu-sayn5p6w-gpu-echo-1273146457: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-1273146457: | Processes: |
gpu-sayn5p6w-gpu-echo-1273146457: | GPU GI CI PID Type Process name GPU Memory |
gpu-sayn5p6w-gpu-echo-1273146457: | ID ID Usage |
gpu-sayn5p6w-gpu-echo-1273146457: |=============================================================================|
gpu-sayn5p6w-gpu-echo-1273146457: | 0 N/A N/A 2181 G 63MiB |
gpu-sayn5p6w-gpu-echo-1273146457: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-1273146457: Input was: Is
gpu-sayn5p6w-gpu-echo-1273146457: time="2023-05-03T21:30:57.804Z" level=info msg="sub-process exited" argo=true error="<nil>"
gpu-sayn5p6w-gpu-echo-3614462693: Wed May 3 21:31:16 2023
gpu-sayn5p6w-gpu-echo-3614462693: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-3614462693: | NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |
gpu-sayn5p6w-gpu-echo-3614462693: |-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-3614462693: | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
gpu-sayn5p6w-gpu-echo-3614462693: | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
gpu-sayn5p6w-gpu-echo-3614462693: | | | MIG M. |
gpu-sayn5p6w-gpu-echo-3614462693: |===============================+======================+======================|
gpu-sayn5p6w-gpu-echo-3614462693: | 0 NVIDIA RTX A4000 On | 00000000:01:00.0 Off | Off |
gpu-sayn5p6w-gpu-echo-3614462693: | 47% 33C P8 15W / 140W | 65MiB / 16376MiB | 0% Default |
gpu-sayn5p6w-gpu-echo-3614462693: | | | N/A |
gpu-sayn5p6w-gpu-echo-3614462693: +-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-3614462693:
gpu-sayn5p6w-gpu-echo-3614462693: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-3614462693: | Processes: |
gpu-sayn5p6w-gpu-echo-3614462693: | GPU GI CI PID Type Process name GPU Memory |
gpu-sayn5p6w-gpu-echo-3614462693: | ID ID Usage |
gpu-sayn5p6w-gpu-echo-3614462693: |=============================================================================|
gpu-sayn5p6w-gpu-echo-3614462693: | 0 N/A N/A 2290 G 63MiB |
gpu-sayn5p6w-gpu-echo-3614462693: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-3614462693: Input was: Awesome
gpu-sayn5p6w-gpu-echo-3614462693: time="2023-05-03T21:31:17.624Z" level=info msg="sub-process exited" argo=true error="<nil>"
gpu-sayn5p6w-gpu-echo-2418828045: Wed May 3 21:31:22 2023
gpu-sayn5p6w-gpu-echo-2418828045: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-2418828045: | NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6 |
gpu-sayn5p6w-gpu-echo-2418828045: |-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-2418828045: | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
gpu-sayn5p6w-gpu-echo-2418828045: | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
gpu-sayn5p6w-gpu-echo-2418828045: | | | MIG M. |
gpu-sayn5p6w-gpu-echo-2418828045: |===============================+======================+======================|
gpu-sayn5p6w-gpu-echo-2418828045: | 0 NVIDIA RTX A4000 On | 00000000:C1:00.0 Off | Off |
gpu-sayn5p6w-gpu-echo-2418828045: | 42% 31C P8 13W / 140W | 94MiB / 16376MiB | 0% Default |
gpu-sayn5p6w-gpu-echo-2418828045: | | | N/A |
gpu-sayn5p6w-gpu-echo-2418828045: +-------------------------------+----------------------+----------------------+
gpu-sayn5p6w-gpu-echo-2418828045:
gpu-sayn5p6w-gpu-echo-2418828045: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-2418828045: | Processes: |
gpu-sayn5p6w-gpu-echo-2418828045: | GPU GI CI PID Type Process name GPU Memory |
gpu-sayn5p6w-gpu-echo-2418828045: | ID ID Usage |
gpu-sayn5p6w-gpu-echo-2418828045: |=============================================================================|
gpu-sayn5p6w-gpu-echo-2418828045: | 0 N/A N/A 2218 G 92MiB |
gpu-sayn5p6w-gpu-echo-2418828045: +-----------------------------------------------------------------------------+
gpu-sayn5p6w-gpu-echo-2418828045: Input was: Argo
gpu-sayn5p6w-gpu-echo-2418828045: time="2023-05-03T21:31:23.231Z" level=info msg="sub-process exited" argo=true error="<nil>"

Watch the submission

Use the --watch option to observe the workflow in progress.

argo submit workflow.yaml --watch

If the --watch option wasn't used when submitting the workflow, the watch command is used with the workflow name to observe it.

argo watch gpu-sayn5p6w

In either case, the output looks the same.

Name:                gpu-sayn5p6w
Namespace: tenant-d0b59f-dfinst
ServiceAccount: unset (will run with the default ServiceAccount)
Status: Running
Conditions:
PodRunning False
Created: Wed May 03 17:30:50 -0400 (26 seconds ago)
Started: Wed May 03 17:30:50 -0400 (26 seconds ago)
Duration: 26 seconds
Progress: 1/3
ResourcesDuration: 3s*(100Mi memory),1s*(1 nvidia.com/gpu),3s*(1 cpu)
Parameters:
messages: ["Argo", "Is", "Awesome"]

STEP TEMPLATE PODNAME DURATION MESSAGE
● gpu-sayn5p6w main
└─┬─◷ echo(0:Argo)(0) gpu-echo gpu-sayn5p6w-gpu-echo-2418828045 26s PodInitializing
├─✔ echo(1:Is)(0) gpu-echo gpu-sayn5p6w-gpu-echo-1273146457 6s
└─◷ echo(2:Awesome)(0) gpu-echo gpu-sayn5p6w-gpu-echo-3614462693 25s PodInitializing

Use a specific ServiceAccount

In the previous example, notice that the third line says:

ServiceAccount:      unset (will run with the default ServiceAccount)

If no specific Kubernetes ServiceAccount was declared, the workflow uses the default one for the namespace. To use a specific ServiceAccount, create one and then use the --serviceaccount option by following the steps that follow. The security best practices guide has more details.

First, define a ServiceAccount and assign it some defined Roles. Then, define a RoleBinding to bind them together.

This can be done in a single YAML file. Create my-example-sa.yaml and paste the contents below.

Click to expand - my-example-sa.yaml
my-example-sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: my-example
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: my-example-role
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- 'patch'
- apiGroups:
- serving.kubeflow.org
resources:
- inferenceservices
verbs:
- '*'
- apiGroups:
- serving.knative.dev
resources:
- services
- revisions
verbs:
- '*'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: my-example-rolebinding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: my-example-role
subjects:
- kind: ServiceAccount
name: my-example

There are three sections in this file, separated with ---.

  • First, it creates a new ServiceAccount named my-example.
  • Next, it creates a Role named my-example-role with several permissions.
  • Finally, it creates a RoleBinding named my-example-rolebinding which binds the ServiceAccount to the Role.

Create the ServiceAccount

Apply the YAML to the cluster with kubectl.

kubectl apply -f my-example-sa.yaml

The output looks like this:

serviceaccount/my-example created
role.rbac.authorization.k8s.io/my-example-role created
rolebinding.rbac.authorization.k8s.io/my-example-rolebinding created

Use the ServiceAccount

Now, use the new ServiceAccount with Argo, like this:

argo submit workflow.yaml --serviceaccount my-example

The output looks like this:

Name:                gpu-say72dm5
Namespace: tenant-96362f-dev
ServiceAccount: my-example
Status: Pending
Created: Thu May 04 18:24:58 -0400 (now)
Progress:
Parameters:
messages: ["Argo", "Is", "Awesome"]
foo: bar

Notice that the third line now shows the new ServiceAccount name. Using a specific ServiceAccount with limited permissions is a security best practice.

More information

These are only a few of the most common Argo CLI commands used in CoreWeave's examples and demonstrations. For a full list, please see the Argo documentation or see these Argo Workflows resources: