## Installing roar

See [Installation](/docs/installation) for the platform support matrix, prerequisites, tracer-backend requirements, macOS SIP notes, and build-from-source instructions.

## Configuring roar (`roar init`)

In your project repo:

```bash
cd my-ml-project
roar init
```

This:

- creates `.roar/` (local DB + cache)
- offers to add `.roar/` to `.gitignore`
- configures defaults (including the default GLaaS server, if set)
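
If you decline the `.gitignore` offer and want to add the entry yourself later, the following is a minimal sketch of an idempotent append. The helper name `ignore_roar_dir` and the throwaway directory are hypothetical, used only for illustration; this is not how `roar init` itself is implemented.

```bash
# Throwaway directory standing in for your project repo (illustration only).
repo=$(mktemp -d)
printf '__pycache__/\n' > "$repo/.gitignore"

# Hypothetical helper: append ".roar/" unless an identical line already exists.
ignore_roar_dir() {
  grep -qxF '.roar/' "$1/.gitignore" 2>/dev/null || printf '.roar/\n' >> "$1/.gitignore"
}

ignore_roar_dir "$repo"
ignore_roar_dir "$repo"   # second call changes nothing
cat "$repo/.gitignore"
```

Matching the whole line with `grep -x` keeps the helper from adding a duplicate entry when it is run more than once.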

You can inspect and modify configuration via:

```bash
roar config list
roar config get <key>
roar config set <key> <value>
```

## Building (`roar build`)

Use `roar build` for setup steps you want tracked separately from your "main" DAG steps—e.g., compiling extensions or installing local packages.

```bash
roar build pip install -e .
roar build make -j4
```

Build steps can be replayed as part of reproduction when needed.

## Running (`roar run`)

```bash
roar run python train.py --epochs 10 --lr 1e-3
roar run ./scripts/preprocess.sh
roar run torchrun --nproc_per_node=4 train.py
```

`roar` observes file I/O across the full process tree, then updates jobs, artifacts, and the inferred DAG.

## Getting and putting data (`roar get` and `roar put`)

While `roar run` is used for standard processing tasks, you can explicitly track data movement into and out of your workspace:

```bash
roar get s3://my-bucket/dataset ./data/
roar put ./model/ s3://my-bucket/models/
```

- `roar get` records a **Get** job: a retrieval operation that explicitly tracks data pulled into your workspace as an input artifact.
- `roar put` records a **Put** job: a storage operation that explicitly tracks data pushed out of your workspace as an output artifact.

These commands are especially useful when working with **Composite Artifacts (Datasets)**: directories or collections of files treated as a single tracked unit. When you use `roar get` or `roar put`, an entire dataset directory keeps its lineage and content-hash identity as it moves into or out of your DAG.
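
To make the idea of a content-hash identity for a whole directory concrete, here is a minimal sketch. It is not `roar`'s actual hashing scheme, which this page does not specify. The function name `dataset_hash` is hypothetical, and the sketch assumes GNU coreutils (`sha256sum`, `sort -z`).

```bash
# Hypothetical helper: hash every file (relative path + content) in a stable
# order, then hash that listing to get one identity for the directory.
dataset_hash() {
  ( cd "$1" && find . -type f -print0 | LC_ALL=C sort -z \
      | xargs -0 sha256sum | sha256sum | cut -d' ' -f1 )
}

workdir=$(mktemp -d)
mkdir -p "$workdir/a" "$workdir/b"
printf 'x,y\n1,2\n' > "$workdir/a/train.csv"
printf 'x,y\n1,2\n' > "$workdir/b/train.csv"

hash_a=$(dataset_hash "$workdir/a")
hash_b=$(dataset_hash "$workdir/b")
[ "$hash_a" = "$hash_b" ] && echo "same content, same identity"

# Changing any file changes the identity of the whole dataset.
printf '3,4\n' >> "$workdir/b/train.csv"
hash_b_changed=$(dataset_hash "$workdir/b")
[ "$hash_a" != "$hash_b_changed" ] && echo "content changed, identity changed"
```

Because the listing uses paths relative to the dataset root, two copies of the same dataset in different locations hash identically in this sketch.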

## The current DAG (`roar dag`)

As you iterate, you may re-run commands, overwrite outputs, or change downstream results. `roar` retains history, but `roar dag` shows **what is true now**.

```bash
roar dag
```

The current DAG:

- collapses re-runs
- keeps only the most recent equivalent job
- hides downstream work that depends on overwritten inputs

This is a projection of history, not a deletion of it.
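
The three rules above can be illustrated with a toy model. This is a sketch under stated assumptions, not `roar`'s implementation: the job history format, the function name `current_dag`, and the choice to treat two jobs as "equivalent" when they share a command name are all hypothetical.

```bash
# Toy history, oldest first: job id, command, input, output.
# "-" means the job has no tracked input.
history='1 preprocess.sh - data.csv
2 train.py data.csv model.pt
3 eval.py model.pt report.txt
4 preprocess.sh - data.csv
5 train.py data.csv model.pt'

# Hypothetical projection: print only the jobs that are still current.
current_dag() {
  awk '
    {
      id[NR] = $1; cmd[NR] = $2; inp[NR] = $3
      src[NR] = writer[$3]     # job whose output this job read (empty if external)
      latest[$2] = $1          # newest job per command
      writer[$4] = $1          # newest job per output path
    }
    END {
      for (i = 1; i <= NR; i++) {
        stale = (latest[cmd[i]] != id[i])                         # superseded re-run
        if (inp[i] != "-" && writer[inp[i]] != src[i]) stale = 1  # input overwritten since
        if (src[i] != "" && hidden[src[i]]) stale = 1             # upstream job is hidden
        if (stale) hidden[id[i]] = 1; else print id[i], cmd[i]
      }
    }'
}

projection=$(printf '%s\n' "$history" | current_dag)
printf '%s\n' "$projection"
```

In this toy history, jobs 1 and 2 are collapsed into their re-runs (4 and 5), and job 3 is hidden because the `model.pt` it read has since been overwritten. The full history string is untouched; only the view changes.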

## Setting up authentication with glaas.ai (`roar login`)

To register artifacts under your GLaaS identity (rather than anonymously), log in once:

```bash
roar login
```

This opens a device-code flow in your browser: sign in with GitHub, approve the device, done. The auth state is stored under `~/.config/roar/` so all your `roar` workspaces share it.

After `roar login`, the workspace's [scope](/docs/scopes) automatically flips from `anonymous` to `private` if it was still at the init default, so the next `roar register` is access-controlled rather than public by default.

To check what's stored:

```bash
roar whoami
```

To clear the auth state:

```bash
roar logout
```

> **Legacy: SSH-key auth (`roar auth`).** An older auth path lets you pair `roar` with a GitHub SSH key registered on glaas.ai (`roar auth key` to print the public key, `roar auth test` to verify). It's still functional and useful for non-interactive environments where the device-code flow is awkward. For interactive setups, `roar login` is the recommended path.

## Registering DAGs with GLaaS (`roar register`)

When you run:

```bash
roar register <path to artifacts>
```

`roar` creates a **Registered DAG**, which:

- includes the selected artifact(s), which can be single files or Composite Artifacts (Datasets)
- includes upstream jobs and artifacts required to explain them
- forms a self-contained recipe for reproduction

This is what GLaaS stores and makes searchable.

> [!NOTE]
> `roar register ...` works without an account. New workspaces default to the `anonymous` scope, which publishes publicly, without attribution, and with no guarantee of persistence. For attributed registration and a real personal namespace, run `roar login` once; this is the recommended workflow. See [Scopes](/docs/scopes).

After registration, use **glaas.ai** to visualize and navigate the Registered DAG by clicking between artifacts, jobs, and sessions.

## Reproduction (`roar reproduce`)

When you run:

```bash
roar reproduce <artifact-hash>
```

`roar` reconstructs a **recipe DAG**, which:

- describes steps required to recreate an artifact
- contains planned steps, not completed jobs
- may reference artifacts that do not yet exist

As you execute steps:

- planned steps become real jobs
- artifacts are created
- the recipe merges into your active session

On **glaas.ai**, artifact pages provide a "Reproduce with roar" action to generate the reproduce command.