How to bash into my workloads

This section explains how to bash into your workloads.

NVIDIA run:ai is a Kubernetes-based platform: every workload you submit ultimately runs as a set of containers inside Kubernetes pods. You can bash into your workloads / pods using the run:ai CLI. You can find instructions for installing and configuring the run:ai CLI here > Installing run:ai.

List your workloads:

runai workload list
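If the cluster runs many workloads, the list can be narrowed with ordinary shell tools; for example (the substring `train` here is just an assumed workload-name fragment):

```shell
# List all workloads, then keep only rows whose name contains "train".
runai workload list | grep train
```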

Bash into a workload:

runai <workload_type> bash <workload_name>
runai workspace bash jupyter-lab-1
runai training bash train-workload-1
runai inference bash inference-1
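Because each workload is backed by a Kubernetes pod, you can also open a shell directly with kubectl if you have cluster access. A minimal sketch, assuming your project maps to a namespace named `runai-myproject` (run:ai typically prefixes project namespaces with `runai-`) and that you look up the pod name first:

```shell
# Find the pod backing the workload (namespace name is an assumption
# for illustration; adjust it to your project's namespace).
kubectl get pods -n runai-myproject

# Open an interactive bash shell in that pod.
kubectl exec -it <pod-name> -n runai-myproject -- /bin/bash
```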

Bash into your distributed training workload (use --pod to pick a specific pod):

runai training pytorch bash training-cluster --pod training-cluster-master-0

Important: Containers in Kubernetes are ephemeral. Any changes made inside a running container (such as modifying files or installing software) are lost when the pod is deleted or the container is recreated, because each new container starts from the original image. To retain data across restarts, use external storage such as Persistent Volumes; configuration and credentials can be injected with ConfigMaps and Secrets.
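To illustrate the persistence point, here is a minimal PersistentVolumeClaim sketch that a workload could mount; the name, size, and reliance on the cluster's default StorageClass are assumptions for illustration:

```yaml
# Minimal PVC sketch: requests 10Gi of ReadWriteOnce storage
# from the cluster's default StorageClass.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-workload-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

Data written to the volume's mount path survives container restarts, unlike changes made to the container's own filesystem.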