## Lineage without orchestration

Modern ML teams move fast — often directly from the command line. That speed is powerful, but it comes with a cost: **work becomes hard to trace**, results are difficult to reproduce, and context is hard to share with others.

`roar` and **GLaaS** exist to help without asking you to slow down, change how you work, or write more YAML. They do this by making **context automatic**.

The promise is simple:

- Install `roar`
- Put `roar run` in front of a command
- Keep working as you normally do

Everything else is derived from what actually happened. See [Quick Start](/docs/) for the install and first-run walkthrough.

## What roar is

`roar` is a CLI that adds **implicit observation** to your existing workflows. You run the same commands you already run — training scripts, preprocessing jobs, shell scripts — but prefixed with `roar run`. From that point on, `roar` quietly records:

- The command and its parameters
- Which files were read and written
- Which artifacts were created
- The git repository, branch, and commit (and whether the working tree was dirty)
- Package dependencies and environment variables
- Execution time, exit status, and errors

You don't declare inputs. You don't define pipelines. You don't write config files.

`roar` **infers inputs, outputs, and the entire DAG from observing runs.**

## The core idea: two "what ifs"

### What if observation were implicit?

What if, when you ran a command, you could automatically capture what it read and what it wrote, then use that to infer how multiple commands depended on each other, all without declaring any of it?

That's what `roar run` does. It *observes* data as it flows through a pipeline. No declarations. No orchestration.
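To make the inference step concrete, here is a minimal sketch of how dependency edges could be derived from observed reads and writes alone. It is an illustration of the idea, not `roar`'s implementation: the `ObservedRun` record, the `infer_edges` function, and the example commands and filenames are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ObservedRun:
    """Hypothetical record of one observed command (not roar's actual schema)."""
    command: str
    reads: set[str] = field(default_factory=set)
    writes: set[str] = field(default_factory=set)


def infer_edges(runs: list[ObservedRun]) -> list[tuple[str, str]]:
    """Link run A -> run B when B read a file that A most recently wrote.

    Assumes `runs` is given in execution order.
    """
    last_writer: dict[str, str] = {}
    edges: list[tuple[str, str]] = []
    for run in runs:
        # Sort reads so the edge order is deterministic.
        for path in sorted(run.reads):
            producer = last_writer.get(path)
            if producer is None or producer == run.command:
                continue  # file came from outside the observed runs
            edge = (producer, run.command)
            if edge not in edges:
                edges.append(edge)
        for path in run.writes:
            last_writer[path] = run.command
    return edges


# Hypothetical pipeline: nothing here declares a dependency explicitly.
runs = [
    ObservedRun("python preprocess.py", reads={"raw.csv"}, writes={"clean.csv"}),
    ObservedRun("python train.py", reads={"clean.csv"}, writes={"model.pt"}),
    ObservedRun("python evaluate.py", reads={"model.pt", "clean.csv"},
                writes={"metrics.json"}),
]

for producer, consumer in infer_edges(runs):
    print(f"{producer} -> {consumer}")
```

The three edges fall out of the file accesses themselves: `evaluate.py` depends on both earlier steps only because it read files they wrote.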

### What if artifacts were dereferenceable?

What if you had a great big lookup table from content hashes to their lineage? Then you could answer:

- Where did this come from? (Even if the chain of custody is broken.)
- What data did it depend on?
- Which code and parameters produced it?

You'd be able to trust models even when the original pipeline, logs, or orchestration context are gone — because the model itself carries a dereferenceable link back to how it was produced.

This becomes possible when artifacts are tracked by **content hashes**, not filenames. That's where GLaaS comes in.
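The difference between a filename and a content hash can be shown in a few lines. This is a sketch under assumptions: SHA-256 is used purely for illustration (the hash function GLaaS actually uses is not stated here), and `content_hash` and the filenames are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def content_hash(path: Path) -> str:
    """Hash a file's bytes in chunks; its name and location play no part."""
    digest = hashlib.sha256()  # assumption: SHA-256 chosen only for illustration
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "model.pt"
    original.write_bytes(b"fake model weights")

    # Same bytes, different name: the identity is unchanged.
    renamed = Path(tmp) / "final_v2.bin"
    renamed.write_bytes(original.read_bytes())

    # One changed byte: a different artifact entirely.
    modified = Path(tmp) / "model_modified.pt"
    modified.write_bytes(b"fake model weightz")

    hash_original = content_hash(original)
    hash_renamed = content_hash(renamed)
    hash_modified = content_hash(modified)

print(hash_original == hash_renamed)   # True
print(hash_original == hash_modified)  # False
```

Because the identifier is derived from the bytes, renaming or moving a file cannot detach it from its history, while any change to the contents yields a new identity.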

## What GLaaS is

**GLaaS (Global Lineage-as-a-Service)** is a content-addressable **lineage registry**.

When you run `roar register`, GLaaS records:

- The artifact's hash
- The job(s) that produced it
- The inputs it depended on
- The surrounding context (commit, environment, packages)

> GLaaS never stores your artifacts.
> It only stores **how they were created**.

This means:

- No storage lock-in
- No copying your data
- No replacing your existing object stores

GLaaS gives you the ability to **look up provenance globally by hash**, including for models and data produced long ago. You don't need a chain-of-custody guarantee on a binary to know where it came from. Put it anywhere, name it anything; as long as the bytes don't change, the lineage is reachable.
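As a rough mental model, a lineage registry behaves like a lookup table keyed by hash whose values describe how each artifact was made. The sketch below is an in-memory toy, not the GLaaS API or data model: `LineageRecord`, `trace`, the placeholder hashes, and the commit value are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageRecord:
    """Hypothetical record: how an artifact was made, never the artifact itself."""
    job: str
    input_hashes: tuple[str, ...]
    commit: str


# Placeholder strings stand in for real content hashes.
registry: dict[str, LineageRecord] = {
    "hash-model": LineageRecord("python train.py", ("hash-clean",), "abc123"),
    "hash-clean": LineageRecord("python preprocess.py", ("hash-raw",), "abc123"),
}


def trace(artifact_hash: str, registry: dict[str, LineageRecord]) -> list[str]:
    """Walk upstream from an artifact hash, returning jobs nearest first."""
    jobs: list[str] = []
    pending = deque([artifact_hash])
    seen: set[str] = set()
    while pending:
        current = pending.popleft()
        if current in seen:
            continue
        seen.add(current)
        record = registry.get(current)
        if record is None:
            continue  # no registered producer, e.g. raw source data
        jobs.append(record.job)
        pending.extend(record.input_hashes)
    return jobs


print(trace("hash-model", registry))
```

Note that the table holds only metadata. The artifacts can live in any storage system, under any name, and the lookup still works as long as their bytes hash to the same key.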

The CLI suits day-to-day work; the GLaaS website suits navigation and visualization. Paste a hash (artifact, job, or session) into the search bar at **glaas.ai** and click through artifact → job → DAG → other artifacts. See the [roar Guide](/docs/roar-guide) for the registration walkthrough.

## Safety without friction

`roar` is designed to support **self-governance**, not enforcement. It records git state and warns about uncommitted changes, with optional hygiene rules (e.g., committed-code-only) you can opt into. The goal: make the *happy path* the *compliant path* — without slowing builders down.

## What roar and GLaaS are not

- Not an experiment tracker
- Not a workflow orchestrator
- Not a model registry that stores your artifacts
- Not a replacement for git, cloud storage, or training frameworks
- Not a system that requires declarations to be useful

They are tools for carrying context forward, built on **implicit observation**, not ceremony.

## Why this matters

Together, `roar` and GLaaS build good practices into your AI development workflow:

- **Traceability** — follow models and data across their full lifecycle
- **Reproducibility** — recreate results later without guesswork
- **Attributability** — understand what inputs and decisions drove outcomes
- **Collaboration** — reason together using shared context instead of private state

Or more simply: **they make audits trivial, recovery fast, and coordination cheap.**

## Where to go next

- [Quick Start](/docs/) — install and run your first commands
- [Core Concepts](/docs/core-concepts) — jobs, artifacts, sessions, DAGs
- [Common Use Cases](/docs/use-cases) — workflows for real problems
- [FAQ](/docs/faq) — implementation details and edge cases