Guide to Using roar from Start to Finish
Installing roar
See Installation for the platform support matrix, prerequisites, tracer-backend requirements, macOS SIP notes, and build-from-source instructions.
Configuring roar (roar init)
In your project repo:
cd my-ml-project
roar init
This:
- creates .roar/ (local DB + cache)
- offers to add .roar/ to .gitignore
- configures defaults (including the default GLaaS server, if set)
You can inspect and modify configuration via:
roar config list
roar config get <key>
roar config set <key> <value>
Building (roar build)
Use roar build for setup steps you want tracked separately from your "main" DAG steps—e.g., compiling extensions or installing local packages.
roar build pip install -e .
roar build make -j4
Build steps can be replayed as part of reproduction when needed.
Running (roar run)
roar run python train.py --epochs 10 --lr 1e-3
roar run ./scripts/preprocess.sh
roar run torchrun --nproc_per_node=4 train.py
roar observes file I/O across the full process tree, then updates jobs, artifacts, and the inferred DAG.
Getting and Putting Data (roar get and roar put)
While roar run is used for standard processing tasks, you can explicitly track data movement into and out of your workspace:
roar get s3://my-bucket/dataset ./data/
roar put ./model/ s3://my-bucket/models/
- roar get: Records a Get job. This is a retrieval operation that explicitly tracks data being pulled into your workspace as an input artifact.
- roar put: Records a Put job. This is a storage operation that explicitly tracks data being pushed out as an output artifact.
These commands are especially useful when working with Composite Artifacts (Datasets)—directories or collections of files treated as a single tracked unit. By using roar get or roar put, an entire dataset directory maintains its lineage and content hash identity as it moves into or out of your DAG.
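The get, run, and put commands above can be chained into one tracked pipeline. The following is a minimal sketch, not a roar feature: the `step` and `pipeline` functions and the `DRY_RUN` variable are conventions of this example only, and it assumes that train.py reads from ./data/ and writes to ./model/. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: fetch a dataset, train, and publish the model, each step tracked by roar.
# DRY_RUN and step() are conventions of this example, not part of roar.
# DRY_RUN=1 (the default here) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

step() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    # Stop at the first failing step so later steps never see partial data.
    "$@" || exit 1
  fi
}

pipeline() {
  step roar get s3://my-bucket/dataset ./data/
  # Assumption: train.py reads ./data/ and writes ./model/.
  step roar run python train.py --epochs 10 --lr 1e-3
  step roar put ./model/ s3://my-bucket/models/
}

pipeline
```

Run the script with DRY_RUN=0 to execute the steps for real once the printed commands look right.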
The current DAG (roar dag)
As you iterate, you may re-run commands, overwrite outputs, or change downstream results. roar retains history, but roar dag shows what is true now.
roar dag
The current DAG:
- collapses re-runs
- keeps only the most recent equivalent job
- hides downstream work that depends on overwritten inputs
This is a projection of history, not a deletion of it.
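To make "collapses re-runs" concrete, here is a toy model of the projection. It is not roar's implementation: the history format is hypothetical, "equivalent" is assumed to mean an identical command line, and hiding downstream work that depends on overwritten inputs is not modeled at all.

```shell
#!/bin/sh
# Hypothetical job history, one job per line: "<sequence> <command>".
# This is NOT roar's storage format; it only illustrates the idea.
history='1 python preprocess.py
2 python train.py --epochs 10
3 python train.py --epochs 10
4 python evaluate.py'

# Keep only the most recent job for each identical command line.
current_jobs() {
  awk '{ seq = $1; $1 = ""; sub(/^ /, ""); last[$0] = seq }
       END { for (cmd in last) print last[cmd], cmd }' | sort -n
}

printf '%s\n' "$history" | current_jobs
```

Job 2 drops out of the result because job 3 re-ran the same command; the full history variable is left untouched, mirroring the point that the projection does not delete anything.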
Setting up authentication with glaas.ai (roar login)
To register artifacts under your GLaaS identity (rather than anonymously), log in once:
roar login
This opens a device-code flow in your browser: sign in with GitHub, approve the device, done. The auth state is stored under ~/.config/roar/ so all your roar workspaces share it.
After roar login the workspace's scope auto-flips from anonymous to private if it was still at the init default — so the next roar register is access-controlled rather than public-by-default.
To check what's stored:
roar whoami
To clear the auth state:
roar logout
Legacy: SSH-key auth (roar auth). An older auth path lets you pair roar with a GitHub SSH key registered on glaas.ai (roar auth key to print the public key, roar auth test to verify). It's still functional and useful for non-interactive environments where the device-code flow is awkward. For interactive setups, roar login is the recommended path.
Registering DAGs with GLaaS (roar register)
When you run:
roar register <path to artifacts>
roar registers a Registered DAG, which:
- includes the selected artifact(s), which can be single files or Composite Artifacts (Datasets)
- includes upstream jobs and artifacts required to explain them
- forms a self-contained recipe for reproduction
This is what GLaaS stores and makes searchable.
Note: roar register ... can be performed without setting up an account. New workspaces default to the anonymous scope, which publishes publicly without attribution and does not guarantee persistence. For attributed registration and a real personal namespace, run roar login once; this is the recommended workflow. See Scopes.
After registration, use glaas.ai to visualize and navigate the Registered DAG by clicking between artifacts, jobs, and sessions.
Reproduction (roar reproduce)
When you run:
roar reproduce <artifact-hash>
roar reconstructs a recipe DAG, which:
- describes steps required to recreate an artifact
- contains planned steps, not completed jobs
- may reference artifacts that do not yet exist
As you execute steps:
- planned steps become real jobs
- artifacts are created
- the recipe merges into your active session
On glaas.ai, artifact pages provide a "Reproduce with roar" action to generate the reproduce command.