The structural building blocks of `roar` and GLaaS. Read top-to-bottom if you're new; every section is also independently scannable.

## Jobs

A **job** is one recorded command execution. Each `roar run`, `roar build`, `roar get`, and `roar put` produces a job. Jobs are the *active* nodes in the DAG — they consume input artifacts and produce output artifacts.

Three job types are distinguished in the record:

- **Run / build jobs** — the standard processing tasks, produced by `roar run` and `roar build`. The bulk of any DAG.
- **Get jobs** — retrieval operations (`roar get`), recording input data that came from outside the workspace.
- **Put jobs** — upload operations (`roar put`), recording outputs that were pushed somewhere external.

The split matters because Get and Put participate in the lineage as ordinary jobs but represent boundary crossings — data entering or leaving the workspace.

Each job record captures:

- A unique ID, plus the command and its arguments
- Files read (input artifacts), files written (output artifacts)
- Execution time and exit status
- OS, hardware (CPU/GPU when available), Python and system packages
- Environment variables that were read
- Git repository, branch, commit, and dirty/clean state
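As a rough mental model, a job record can be pictured as a plain data structure. The sketch below is illustrative only: the class, field names, and `crosses_boundary` helper are hypothetical and are not `roar`'s actual schema or on-disk format.

```python
from dataclasses import dataclass, field
from enum import Enum


class JobKind(Enum):
    RUN = "run"   # produced by `roar run` / `roar build`
    GET = "get"   # data entering the workspace
    PUT = "put"   # data leaving the workspace


@dataclass
class JobRecord:
    # Hypothetical field names that mirror the list above.
    job_id: str
    kind: JobKind
    command: list[str]
    inputs: list[str] = field(default_factory=list)    # hashes of files read
    outputs: list[str] = field(default_factory=list)   # hashes of files written
    duration_s: float = 0.0
    exit_status: int = 0
    env_read: dict[str, str] = field(default_factory=dict)
    git_commit: str = ""
    git_dirty: bool = False

    def crosses_boundary(self) -> bool:
        """True for Get and Put jobs, which move data across the workspace edge."""
        return self.kind in (JobKind.GET, JobKind.PUT)


# Arguments to `roar get` are omitted here on purpose.
fetch = JobRecord(job_id="job-0", kind=JobKind.GET,
                  command=["roar", "get"], outputs=["hash-of-data"])
train = JobRecord(job_id="job-1", kind=JobKind.RUN,
                  command=["python", "train.py"],
                  inputs=["hash-of-data"], outputs=["hash-of-model"])
```

The shared `hash-of-data` value is what would link the Get job to the run job in the DAG.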

Inspect a job:

```bash
roar show <job-id>
# or:  roar show @<step>          e.g. roar show @5
```

See [Tracers](/docs/tracers) for what the observation layer actually does.

## Artifacts

**Artifacts** are the files (or [composite artifacts](/docs/composite-artifacts)) that connect jobs: one job writes them and another reads them. They're the *passive* nodes in the DAG.

Artifacts are identified by their **content hash**, not by:

- filename
- file path
- storage location

If two files have the same bytes, they are the same artifact — even if they live in different places or have different names. The whole DAG model rests on this.
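A minimal sketch of that identity rule, assuming SHA-256 as a stand-in (the algorithm `roar` actually defaults to is covered in the Hashes page, and `artifact_id` is a hypothetical helper, not part of `roar`):

```python
import hashlib
import tempfile
from pathlib import Path


def artifact_id(path: Path) -> str:
    """Identify a file by its bytes alone; name and location play no part."""
    digest = hashlib.sha256()  # assumed stand-in algorithm
    with path.open("rb") as fh:
        # Read in 1 MiB chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    original = root / "model.bin"
    copy = root / "backup" / "renamed-copy.bin"
    other = root / "other.bin"
    copy.parent.mkdir()
    original.write_bytes(b"same bytes")
    copy.write_bytes(b"same bytes")
    other.write_bytes(b"different bytes")

    # Same bytes in a different place under a different name: one artifact.
    same_artifact = artifact_id(original) == artifact_id(copy)
    # Different bytes: a different artifact.
    different_artifact = artifact_id(original) != artifact_id(other)
```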

Inspect an artifact:

```bash
roar show <hash>
# or:  roar show <path>
```

See [Hashes](/docs/hashes) for the default algorithm choice and the multi-algorithm storage model, and [Composite Artifacts](/docs/composite-artifacts) for how `roar` represents directory-shaped artifacts (datasets) as single nodes.

## Sessions

On the command line you are always working inside a **session**. A session is the sequence of jobs you've recorded since the last reset, and it's what `roar` derives the local DAG from. Each `roar run` adds another job to the active session.

> **On the CLI, the session and the DAG are effectively the same thing.**
> `roar status` shows the session at a glance; `roar dag` is how you inspect its structure.

You don't "start" a session explicitly — one begins naturally as soon as you run your first command with `roar`. To wipe state and begin a new line of work (without affecting previously registered artifacts):

```bash
roar reset
```

Think of it as: **"start fresh from here."**
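One way to picture a session is as an ordered list of recorded jobs from which the DAG is derived. The `Session` class below is a hypothetical model, not `roar`'s implementation, and the 1-based step numbering is an assumption made for illustration.

```python
class Session:
    """Hypothetical model: the jobs recorded since the last reset."""

    def __init__(self) -> None:
        self.jobs: list[tuple[str, list[str], list[str]]] = []

    def record(self, command: str, inputs: list[str], outputs: list[str]) -> int:
        """Append a job and return its step number (assumed 1-based)."""
        self.jobs.append((command, list(inputs), list(outputs)))
        return len(self.jobs)

    def dag_edges(self) -> list[tuple[int, int]]:
        """Step j depends on step i when j reads an artifact that i wrote."""
        last_writer: dict[str, int] = {}
        edges: list[tuple[int, int]] = []
        for step, (_, inputs, outputs) in enumerate(self.jobs, start=1):
            for art in inputs:
                if art in last_writer:
                    edges.append((last_writer[art], step))
            for art in outputs:
                last_writer[art] = step
        return edges

    def reset(self) -> None:
        """Start fresh from here; anything registered elsewhere is untouched."""
        self.jobs = []


session = Session()
first = session.record("prepare", [], ["clean-data"])
second = session.record("train", ["clean-data"], ["model"])
edges_before_reset = session.dag_edges()
session.reset()
edges_after_reset = session.dag_edges()
```

No job declares its dependencies up front; the edges fall out of which artifacts each job read and wrote.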

## Lineage and DAGs

`roar` tracks the **lineage** of each artifact in enough detail to recreate it. Because the term is shorter, we usually just call the lineage the **DAG**.

An artifact's lineage is the graph of jobs and artifacts that produced it. Because job records are rich and artifacts link each job to the jobs it depends on, the lineage DAG captures a **recipe** for recreating that artifact.

In many pipeline tools, jobs are the only nodes. In `roar`, **artifacts are also nodes.** That's essential because, while a run is happening, `roar` cannot know which output files might later become inputs to other jobs, so it records every output as an artifact regardless.

```bash
roar dag                       # the inferred DAG of the active session
roar show <artifact-hash>      # the lineage behind a specific artifact
```
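To make the "recipe" idea concrete, here is a small sketch that walks backward from an artifact and returns the jobs needed to recreate it, upstream jobs first. The dictionary layout and the `recipe` function are hypothetical; they model the concept rather than how `roar` stores or traverses its DAG.

```python
# Hypothetical in-memory model: job id -> (input hashes, output hashes).
jobs = {
    "get-data":  ([], ["raw"]),
    "clean":     (["raw"], ["clean"]),
    "train":     (["clean"], ["model", "log"]),   # "log" is recorded even if unused
    "unrelated": ([], ["other"]),
}

# Artifacts are nodes too: index which job produced each one.
producer = {out: job for job, (_, outs) in jobs.items() for out in outs}


def recipe(artifact: str) -> list[str]:
    """Return the jobs needed to recreate `artifact`, dependencies first."""
    ordered: list[str] = []
    seen: set[str] = set()

    def visit(art: str) -> None:
        job = producer.get(art)
        if job is None or job in seen:   # no recorded producer, or already planned
            return
        seen.add(job)
        for upstream in jobs[job][0]:
            visit(upstream)
        ordered.append(job)              # post-order: upstream jobs come first

    visit(artifact)
    return ordered


print(recipe("model"))  # ['get-data', 'clean', 'train']
```

Jobs that did not contribute to the artifact (`unrelated` here) never appear in its recipe.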

## Labels

**Labels** are user-attached metadata: key/value annotations on a session, job, or artifact for things `roar` cannot observe, such as a metric, a comment, an experiment ID, or a human judgment. Where lineage answers *what happened*, labels capture *how it performed* or *what it's for*.

For the full label model, configuration, and how labels become searchable on glaas.ai: [Labels](/docs/labels).

## Registration and GLaaS

When you use `roar` locally, jobs, artifacts, and DAG state live in your workspace's `.roar/` directory. To make an artifact's lineage resolvable globally, and reproducible by anyone with the hash, register it to GLaaS:

```bash
roar register <artifact>
```

This publishes the artifact, the jobs that produced it, and the chain of upstream inputs as a **registered DAG**. Visibility is governed by the workspace's [scope](/docs/scopes); the artifact's *existence* is global by design, while the metadata, jobs, DAGs, and labels follow the scope rule.

At the end of `roar register`, `roar` prints a `roar reproduce …` command you can use later.

## Global dereferenceability

Once a DAG is registered, you can start from any hash — artifact, job, or session — and navigate **backward** or **forward**:

- **artifact → job(s) → session → other artifacts**
- **job → inputs → upstream jobs**
- **artifact → "Reproduce with roar"**

This works in the CLI (`roar show`, `roar reproduce`) and on glaas.ai via clickable links between artifacts, jobs, and sessions.
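The navigation paths above amount to keeping an index in both directions. The following sketch assumes a hypothetical list of registered records and made-up identifiers; it is not the GLaaS data model or API.

```python
from collections import defaultdict

# Hypothetical registered records: (session, job, input hashes, output hashes).
records = [
    ("s1", "j1", [], ["a"]),
    ("s1", "j2", ["a"], ["b"]),
    ("s2", "j3", ["b"], ["c"]),
]

produced_by: dict[str, str] = {}                           # artifact -> job
consumed_by: dict[str, list[str]] = defaultdict(list)      # artifact -> jobs
job_info: dict[str, tuple[str, list[str], list[str]]] = {}

for session, job, inputs, outputs in records:
    job_info[job] = (session, inputs, outputs)
    for art in outputs:
        produced_by[art] = job
    for art in inputs:
        consumed_by[art].append(job)


def backward(artifact: str):
    """artifact -> producing job -> that job's inputs."""
    job = produced_by.get(artifact)
    return (job, job_info[job][1]) if job else (None, [])


def forward(artifact: str):
    """artifact -> consuming jobs -> the artifacts they produced."""
    return [(job, job_info[job][2]) for job in consumed_by.get(artifact, [])]


def session_of(job: str) -> str:
    """job -> the session it was recorded in."""
    return job_info[job][0]
```

Starting from artifact `b`, `backward` leads to `j2` and its input `a`, while `forward` leads to `j3` in a different session and its output `c`.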

---

> For a practical walkthrough, see the [End-to-End Example](/examples/end-to-end-intro).
