Run notebooks on SUNK
Learn how to run interactive notebooks on SUNK
Interactive notebooks provide a powerful environment for data exploration, model development, and visualization on SUNK. This guide covers running notebooks on SUNK's Slurm-managed clusters for both interactive development and batch job execution.
Consider using notebooks on SUNK if:
- You need an interactive environment for data exploration or model prototyping.
- You want to iterate quickly on code while leveraging GPU resources.
- You need to visualize results during development before running full batch jobs.
This guide uses marimo as the primary example. marimo notebooks are pure Python scripts that can run both interactively and as batch jobs. The same port forwarding and container techniques also work for Jupyter notebooks.
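For a sense of that duality (notebook.py below is just a placeholder path), the same file can be opened in marimo's browser-based editor or executed top to bottom like any other Python script:

    # Open the notebook interactively in the browser-based editor
    $ marimo edit notebook.py

    # Run the same file non-interactively, as a plain Python script
    $ python notebook.py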
Prerequisites
Before completing the steps in this guide, be sure you have the following:
- Access to a SUNK cluster with permissions to submit jobs.
- Familiarity with Slurm commands, such as srun, sbatch, and squeue. You can verify cluster access with the quick check after this list.
- A shared directory for storing notebooks, such as /mnt/data or your home directory.
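To confirm the first two prerequisites, you can run a couple of read-only Slurm commands from the login node; the output varies by cluster.

    # List the partitions and nodes visible to you
    $ sinfo
    # List your own jobs in the queue (empty output is fine if nothing is running)
    $ squeue -u $USER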
Install marimo
You can install marimo within a container or conda environment. The recommended approach is to use containers for reproducibility.
Using containers
To create a container with marimo installed, pull a base Python container and save it as a squash file.
1. From the login node, pull a Python container and install marimo:

       $ srun --container-image=python:3.11-slim \
           --container-remap-root --container-mounts=/mnt/home:/mnt/home \
           --container-save ${HOME}/marimo.sqsh --pty bash -i

2. Within the container, install marimo:

       $ pip install marimo  # ... and other packages like torch, jax, etc.

3. Exit the container to save it. The container is now available at ${HOME}/marimo.sqsh.
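As an optional sanity check, you can start the saved image again and confirm marimo is installed; the flags simply mirror the steps above.

    # Launch the saved image and print the installed marimo version
    $ srun --container-image=${HOME}/marimo.sqsh \
        --container-mounts=/mnt/home:/mnt/home \
        bash -c "python -m marimo --version"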
For more information about working with containers on SUNK, see the SUNK Training guide.
Using conda
Alternatively, you can create a conda environment with marimo:
$ conda create --name marimo-env python=3.11 pip
$ conda activate marimo-env
$ pip install marimo
Interactive development
For interactive notebook development, you can run marimo in headless mode and connect via port forwarding.
Submit an interactive job
Create a script named run_marimo.sh:
#!/bin/bash
#SBATCH --job-name=marimo
#SBATCH --output=marimo-%j.out
#SBATCH --cpus-per-task=4
#SBATCH --mem=16GB
#SBATCH --time=4:00:00

# Activate your environment (choose one)
# Option 1: Using conda
# eval "$(conda shell.bash hook)"
# conda activate marimo-env
# Option 2: Using a container (a container-based variant is sketched after the submit step below)

# Start marimo in headless mode
python -m marimo edit /mnt/home/${USER}/notebook.py --headless --host 0.0.0.0 --port 3000
Submit the job:
$ sbatch run_marimo.sh
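If you built the squash file earlier, a container-based variant of the same script (Option 2 above) might look like the following sketch; the srun flags mirror the interactive container example later in this guide.

    #!/bin/bash
    #SBATCH --job-name=marimo
    #SBATCH --output=marimo-%j.out
    #SBATCH --cpus-per-task=4
    #SBATCH --mem=16GB
    #SBATCH --time=4:00:00

    # Run marimo inside the saved container image instead of a conda environment
    srun --container-image=${HOME}/marimo.sqsh \
        --container-mounts=/mnt/home:/mnt/home \
        python -m marimo edit /mnt/home/${USER}/notebook.py --headless --host 0.0.0.0 --port 3000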
Connect via port forwarding
1. Find your compute node using squeue:

       $ squeue -u $USER

   You should see output similar to:

       JOBID  PARTITION  NAME    USER       ST  TIME  NODES  NODELIST(REASON)
       1234   h100       marimo  user_name  R   0:30  1      slurm-h100-231-147

2. Establish an SSH tunnel from your local machine to the compute node through the login node:

       $ ssh -L 3000:slurm-h100-231-147:3000 username@login-node

   Replace slurm-h100-231-147 with your actual compute node name.

3. Access the marimo interface at http://localhost:3000 in your web browser. An optional connectivity check follows these steps.
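To confirm the tunnel is live before switching to the browser, you can request the page from your local machine; any HTTP response indicates that both the tunnel and the marimo server are reachable.

    # Request the marimo page headers through the tunnel
    $ curl -sI http://localhost:3000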
Using containers for interactive sessions
To run marimo interactively within a container:
$ srun -p h100 --exclusive \
    --container-image=${HOME}/marimo.sqsh \
    --container-mounts=/mnt/home:/mnt/home \
    --pty bash -c "python -m marimo edit /mnt/home/${USER}/notebook.py --headless --host 0.0.0.0 --port 3000"
Batch job execution
marimo notebooks can run as batch jobs and accept command-line arguments through mo.cli_args().
Create a notebook for batch execution
Create a marimo notebook that accepts command-line arguments:
import marimo

app = marimo.App()

with app.setup:
    import marimo as mo

@app.cell
def _():
    # Access command-line arguments
    args = mo.cli_args()
    learning_rate = float(args.get("learning-rate", 0.01))
    epochs = int(args.get("epochs", 100))
    print(f"Training with learning_rate={learning_rate}, epochs={epochs}")
    # Your training code here
    ...

if __name__ == "__main__":
    app.run()
Submit as a batch job
Create a batch script batch_notebook.sh:
#!/bin/bash
#SBATCH --job-name=marimo-batch
#SBATCH --output=marimo-batch-%j.out
#SBATCH --cpus-per-task=8
#SBATCH --mem=32GB
#SBATCH --time=2:00:00

eval "$(conda shell.bash hook)"
conda activate marimo-env

python /mnt/home/${USER}/notebook.py -- --learning-rate 0.01 --epochs 100
Submit the job:
$ sbatch batch_notebook.sh
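Once submitted, you can watch the job and its output from the login node. The output file name follows the --output pattern in the script, with %j replaced by the job ID (1234 below is a placeholder).

    # Check the job's state in the queue
    $ squeue -u $USER
    # Follow the notebook's printed output as the job runs
    $ tail -f marimo-batch-1234.out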
GPU configuration
To run notebooks with GPU access, add GPU resources to your SBATCH directives:
#!/bin/bash
#SBATCH --job-name=marimo-gpu
#SBATCH --output=marimo-gpu-%j.out
#SBATCH --partition=h100
#SBATCH --gpus-per-task=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=64GB
#SBATCH --time=4:00:00

eval "$(conda shell.bash hook)"
conda activate marimo-env

python -m marimo edit /mnt/home/${USER}/gpu_notebook.py --headless --host 0.0.0.0 --port 3000
For multi-GPU workloads:
#SBATCH --gpus-per-task=8
#SBATCH --exclusive
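Before starting the notebook, it can help to confirm the allocation actually exposes the expected GPUs, for example by adding a quick nvidia-smi check near the top of the script (nvidia-smi is assumed to be available on GPU nodes).

    # List the GPUs visible to this job; expect one line per allocated GPU
    nvidia-smi -L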
VS Code integration
You can use notebooks with VS Code tunnels on SUNK for a more integrated development experience. After setting up a VS Code tunnel to your compute node, install the marimo VS Code extension or use Jupyter notebooks directly within VS Code.
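As one possible pattern (assuming the VS Code CLI, code, is installed on the compute node), you could start the tunnel from inside an interactive allocation and then attach to it from your local VS Code or vscode.dev:

    # Request an interactive shell on a compute node
    $ srun -p h100 --gpus-per-task=1 --pty bash -i

    # From the compute node, start a named tunnel and follow the sign-in prompts (the tunnel name is arbitrary)
    $ code tunnel --name sunk-marimo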