```bash
roar telemetry --status      # show what's set on this machine
roar telemetry --print       # show the exact JSON that would be uploaded
roar telemetry --disable     # turn it off, persistently
```

`DO_NOT_TRACK=1` or `ROAR_NO_TELEMETRY=1` in your shell environment disables it too. Once off, no foreground command queues anything; the background uploader exits early.

## What product telemetry is

Anonymous, bounded, milestone-based stats about how `roar` is being used — coarse usage counters, the tracer your install resolves to, basic capability flags, version, and platform class. Local-first: foreground commands never block on network calls. A detached background uploader handles transmission. Designed to be inspectable end-to-end and easy to turn off.

## What's collected

The payload is built by a strict allowlist in `roar/telemetry/payload.py`. The top-level fields are:

| Field | What it carries |
|---|---|
| `payload_schema_version` | Integer; bumps when the payload shape changes. |
| `stats_schema_version` | Integer; same idea for the on-disk stats file. |
| `event_id` | Random UUID per event — for the uploader's dedup, not identity. |
| `install_id` | Random UUID generated once, the first time `roar` is installed on this machine. Not derived from anything stable about you. |
| `sequence` | Monotonic counter for events from this install. |
| `first_seen` | Timestamp of the first event from this install. |
| `trigger` | Which milestone fired this event (see the next section). |
| `trigger_ts` | Timestamp the trigger fired. |
| `counters_as_of` | Timestamp of the most recent counter update. |
| `version` | `roar` version string (e.g. `0.2.12`). |
| `platform` | OS class — `linux` / `macos` / `windows` — plus architecture (`x86_64`, `arm64`). Not hostname, not distribution name. |
| `subcommand_counts` | Counters for which subcommands you've run. Allowlisted set: `run`, `build`, `init`, `reset`, `register`, `show`, `dag`, `log`, `status`, `get`, `put`, `inputs`, `diff`, `lineage`, plus per-outcome counters for `run` (`run_success`, `run_failed_dirty`, `run_failed_tracer_setup`, …). |
| `tracer_selections` | Counters for which tracer backend ended up active. Allowlisted set: `preload`, `ptrace`, `ebpf`, `native`, `auto`, `disabled`, `unknown`. |
| `feature_capabilities` | Boolean flags about what's available on this install: blake3 installed, eBPF kernel support, proxy enabled, Ray enabled, that sort of thing. |

Run `roar telemetry --print` to see the exact JSON for *your* machine before any upload happens.
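To make the allowlist idea concrete, here is a minimal sketch of an allowlist-based builder. It is not the code in `roar/telemetry/payload.py`: the names `ALLOWED_TOP_LEVEL`, `ALLOWED_TRACERS`, and `build_payload` are hypothetical, and the sets are simply copied from the table above.

```python
from typing import Any, Dict

# Hypothetical allowlists, copied from the table above. The real ones live in
# roar/telemetry/payload.py and may be structured differently.
ALLOWED_TOP_LEVEL = frozenset({
    "payload_schema_version", "stats_schema_version", "event_id", "install_id",
    "sequence", "first_seen", "trigger", "trigger_ts", "counters_as_of",
    "version", "platform", "subcommand_counts", "tracer_selections",
    "feature_capabilities",
})
ALLOWED_TRACERS = frozenset(
    {"preload", "ptrace", "ebpf", "native", "auto", "disabled", "unknown"}
)


def build_payload(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Copy only allowlisted keys; anything else is dropped, never forwarded."""
    payload = {key: value for key, value in raw.items() if key in ALLOWED_TOP_LEVEL}
    tracers = payload.get("tracer_selections")
    if isinstance(tracers, dict):
        # Nested counters are filtered against their own allowlist as well.
        payload["tracer_selections"] = {
            name: count for name, count in tracers.items() if name in ALLOWED_TRACERS
        }
    return payload
```

The point of the pattern is that a new field has to be added to the allowlist before it can appear in a payload; an unexpected key such as `hostname` is silently dropped rather than uploaded.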

## What's *not* collected

- No commands you ran (`python train.py …` etc.)
- No command arguments or flag values
- No file paths, file names, or file contents
- No environment variables
- No repository names, remotes, or branch names
- No hostnames or usernames
- No IP addresses (beyond what TCP itself reveals to the endpoint)
- No lineage payloads, artifact hashes, DAG structures, or job IDs
- No GLaaS auth tokens or any account identifiers
- No stdout or stderr from your runs
- No experiment-tracker metadata (wandb / mlflow / etc.)

This is the full negative list as enforced by the allowlist in `payload.py` — there is no exception path that bypasses it.

## When telemetry uploads

Telemetry events are emitted on a bounded set of milestones, not on every command:

- **First time `roar` runs after install** (`first_seen_for_install_id`).
- **First time `roar` runs after upgrading to a new version** (`first_seen_for_version`).
- **First, second, and third successful `roar run`** ever, per install.
- **Successful `roar init`**.
- **Successful `roar reset`**.
- **Non-dry-run `roar register`**.

After the per-version and first-three-runs milestones have fired, the only remaining triggers for that version are successful `roar init`, successful `roar reset`, and non-dry-run `roar register`. There is no per-command upload, no heartbeat, no periodic sync.
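The bounded nature of the install, version, and run milestones can be sketched as a small state machine. This is an illustration under assumptions, not roar's implementation: `fired_milestones` and the `state` dictionary are hypothetical, the `run_success_N` trigger names are invented for the example (the source only names `first_seen_for_install_id` and `first_seen_for_version`), and whether a fresh install fires both of those at once is also an assumption.

```python
from typing import List

RUN_MILESTONE_LIMIT = 3  # first, second, and third successful `roar run`


def fired_milestones(state: dict, version: str, run_succeeded: bool = False) -> List[str]:
    """Return the triggers that fire for this invocation and update `state`."""
    fired = []
    if not state.get("install_seen"):
        state["install_seen"] = True
        fired.append("first_seen_for_install_id")
    seen_versions = state.setdefault("seen_versions", [])
    if version not in seen_versions:
        seen_versions.append(version)
        fired.append("first_seen_for_version")
    if run_succeeded:
        state["successful_runs"] = state.get("successful_runs", 0) + 1
        if state["successful_runs"] <= RUN_MILESTONE_LIMIT:
            # Hypothetical trigger name; the source does not name this one.
            fired.append(f"run_success_{state['successful_runs']}")
    return fired
```

In this sketch a fourth successful run on the same version returns an empty list, which is the behaviour the milestone list describes: the counters keep updating locally, but no new event is produced.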

## No foreground network

Foreground command paths never block on telemetry network I/O. The sequence is:

1. A milestone fires.
2. The CLI writes an event to a local queue directory.
3. The CLI exits.

The background uploader is a separately spawned, detached process. It reads the queue, posts events to the endpoint, and **only deletes a queued event after a `204 No Content` response**. If the endpoint is unreachable or returns any other status, the event stays on disk until the next attempt.

As a result, a slow or unreachable telemetry endpoint cannot slow down your `roar` commands.
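Step 2 of the sequence above, writing an event to the local queue, involves no network at all. A minimal sketch, assuming one JSON file per event named after its `event_id` (the function name `enqueue_event` and the file-naming scheme are hypothetical):

```python
import json
import os
import tempfile
import uuid
from pathlib import Path


def enqueue_event(queue_dir: Path, payload: dict) -> Path:
    """Write one event file into the queue directory and return its path."""
    queue_dir.mkdir(parents=True, exist_ok=True)
    event_id = payload.get("event_id") or str(uuid.uuid4())
    final_path = queue_dir / f"{event_id}.json"
    # Write to a temp file in the same directory, then rename, so a reader
    # never observes a half-written event file.
    fd, tmp_name = tempfile.mkstemp(dir=queue_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    os.replace(tmp_name, final_path)
    return final_path
```

The write-then-rename step is a design choice of this sketch rather than something the source states; it keeps a concurrently running uploader from picking up a partial file.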

## Where state lives on disk

XDG-aware. The paths (see `roar/telemetry/paths.py`):

| Location | What it holds |
|---|---|
| `$XDG_CONFIG_HOME/roar/global.toml` | The global config file — where `--enable`/`--disable`/`--endpoint` persist. |
| `$XDG_CACHE_HOME/roar/telemetry/stats.json` | Stats counters (the source of `subcommand_counts`, `tracer_selections`, etc.). |
| `$XDG_CACHE_HOME/roar/telemetry/queue/` | Per-event JSON files awaiting upload. |
| `$XDG_CACHE_HOME/roar/telemetry/.lock` | Lock file to serialize uploader runs. |

You can delete any of these by hand. `roar telemetry --disable` wipes the queue and stops further events from being written.
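If you want to locate these files from a script, the paths in the table can be resolved roughly as follows. This is a sketch, not the contents of `roar/telemetry/paths.py`: `xdg_dir` and `telemetry_paths` are hypothetical helpers, and the fallback to `~/.config` and `~/.cache` when the variables are unset follows the XDG Base Directory defaults, which roar may or may not apply on every platform.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def xdg_dir(env_var: str, fallback: str, env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve an XDG base directory, using the spec default when unset."""
    env = os.environ if env is None else env
    value = env.get(env_var, "")
    # The XDG spec treats relative paths as invalid, so they are ignored here.
    if value and os.path.isabs(value):
        return Path(value)
    return Path.home() / fallback


def telemetry_paths(env: Optional[Mapping[str, str]] = None) -> Dict[str, Path]:
    """Map the four locations from the table above to concrete paths."""
    config = xdg_dir("XDG_CONFIG_HOME", ".config", env) / "roar"
    cache = xdg_dir("XDG_CACHE_HOME", ".cache", env) / "roar" / "telemetry"
    return {
        "global_config": config / "global.toml",
        "stats": cache / "stats.json",
        "queue": cache / "queue",
        "lock": cache / ".lock",
    }
```

Calling `telemetry_paths()` with no arguments reads the current process environment.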

## Inspecting before uploading

`roar telemetry --print` runs the same payload builder the uploader uses and writes the resulting JSON to stdout. It's the trust-but-verify path: the printed JSON is *exactly* what would be uploaded for a `preview` trigger, byte-for-byte, no summarization.

```bash
roar telemetry --print
```

The `preview` trigger is explicitly excluded from `UPLOAD_TRIGGERS` — nothing is queued or sent by running this command.

## Opt-out controls

Three paths, all equivalent in effect:

| Method | Scope | Notes |
|---|---|---|
| `roar telemetry --disable` | Persistent, per-user | Writes to the global config file above. Survives reboots and upgrades. |
| `DO_NOT_TRACK=1` | Per-shell | Respects the [Console Do Not Track](https://consoledonottrack.com/) convention. Useful as a session-wide opt-out. |
| `ROAR_NO_TELEMETRY=1` | Per-shell | Explicit roar-specific opt-out. Same effect as `DO_NOT_TRACK=1`. |

When any of these is in effect, the foreground CLI stops queuing new events and the background uploader exits without contacting the endpoint.
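The three controls combine as a simple "any of these disables it" check. A minimal sketch, assuming the persisted setting has already been read from the global config into a boolean (the function name and the `config_disabled` parameter are hypothetical, and only the literal value `1` is treated as an opt-out here because that is the form documented above):

```python
from typing import Mapping


def telemetry_disabled(env: Mapping[str, str], config_disabled: bool = False) -> bool:
    """True if any documented opt-out is in effect."""
    if config_disabled:  # persisted by `roar telemetry --disable`
        return True
    # Either environment variable is sufficient on its own.
    return env.get("DO_NOT_TRACK") == "1" or env.get("ROAR_NO_TELEMETRY") == "1"
```

How roar treats other values, such as `DO_NOT_TRACK=true`, is not specified on this page, so the sketch deliberately does not guess.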

## Auto-suppression

Telemetry never fires in these environments — no explicit opt-out needed:

- **CI environments**, detected via standard env vars (`CI`, `GITHUB_ACTIONS`, `GITLAB_CI`, `BUILDKITE`, `CIRCLECI`, `TRAVIS`, `APPVEYOR`, `TEAMCITY_VERSION`, `JENKINS_URL`, `TF_BUILD`).
- **pytest test runs**.
- **Roar-instrumented jobs** — the subprocess that `roar run` wraps, not the wrapping `roar run` itself. (Otherwise a single user invocation would double-count.)
- **Ray, OSMO, and registered-backend worker environments**.

`ROAR_NO_TELEMETRY=1` is propagated into Ray and OSMO worker/task environments automatically. Disabling on the driver disables on workers without per-worker config.
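CI detection of the kind listed above usually amounts to checking whether any of the known variables is set. A sketch under that assumption (the names `CI_ENV_VARS` and `in_ci` are hypothetical, and treating any non-empty value as "set" is a choice made for this example):

```python
from typing import Mapping

# Variable names copied from the list above.
CI_ENV_VARS = (
    "CI", "GITHUB_ACTIONS", "GITLAB_CI", "BUILDKITE", "CIRCLECI",
    "TRAVIS", "APPVEYOR", "TEAMCITY_VERSION", "JENKINS_URL", "TF_BUILD",
)


def in_ci(env: Mapping[str, str]) -> bool:
    """True if any known CI variable is present with a non-empty value."""
    return any(env.get(name) for name in CI_ENV_VARS)
```

Passing `os.environ` as `env` checks the current process; passing a plain dictionary makes the function easy to test.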

## Why opt-out, not opt-in?

Honestly: feedback is expensive. Most people don't file issues, fill out forms, or reply to surveys — and that share has only fallen now that agents evaluate tools by trying them and moving on. Anonymous, milestone-bounded telemetry is what lets us see which install paths actually get exercised, which tracers preflight clean, where users bail. That's where engineering time gets pointed.

If feedback were free, opt-in would be the obvious default. It isn't, so we made the opt-out fast, persistent, account-free, and never blocking on a network call. `roar telemetry --print` shows you what would be sent; the negative list above tells you what won't be. We think that's a fair-enough trade. If you don't, the eject button is `roar telemetry --disable`, shown at the top of this page.

## Upload mechanics

The background uploader is a separately spawned, detached process, so it keeps running after your foreground command exits. It walks `$XDG_CACHE_HOME/roar/telemetry/queue/`, posts each event as JSON to the configured endpoint, and deletes only on `204 No Content`. Anything else (network error, 4xx, 5xx) leaves the event in place for the next attempt.

Default endpoint: `https://api.glaas.ai/api/v1/telemetry/roar`. Override with `roar telemetry --endpoint <url>` (writes to the global config). Set to an empty string to forbid uploads outright while still keeping local stats — equivalent to `--disable` for transmission purposes.
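The delete-only-on-204 rule and the empty-endpoint behaviour can be sketched as below. This is an illustration using only the standard library, not roar's uploader: `post_json` and `drain_queue` are hypothetical names, the `Content-Type` header and timeout are assumptions, and locking via the `.lock` file is omitted for brevity.

```python
import urllib.error
import urllib.request
from pathlib import Path
from typing import Callable


def post_json(endpoint: str, body: bytes, timeout: float = 10.0) -> int:
    """POST a JSON body and return the HTTP status, or 0 on a network error."""
    request = urllib.request.Request(
        endpoint, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # 4xx / 5xx responses
    except OSError:
        return 0  # URLError, timeouts, refused connections


def drain_queue(
    queue_dir: Path, endpoint: str, post: Callable[[str, bytes], int] = post_json
) -> int:
    """Upload queued events; delete each file only after a 204. Returns deletions."""
    if not endpoint:  # an empty endpoint forbids uploads outright
        return 0
    deleted = 0
    for event_file in sorted(queue_dir.glob("*.json")):
        if post(endpoint, event_file.read_bytes()) == 204:
            event_file.unlink()
            deleted += 1
        # Any other status leaves the file in place for the next attempt.
    return deleted
```

The `post` parameter exists so the loop can be exercised without touching the network, for example with a stub that returns a fixed status code.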

## Schema evolution

`payload_schema_version` is included on every event so a server can route old and new payloads to the right handler. New fields land in new schema versions; existing fields don't silently change shape. The set of allowed top-level fields, allowed subcommand counters, and allowed tracer selections lives in `roar/telemetry/payload.py` and `roar/telemetry/stats.py` — auditable in one place.

Disabling stays disabled across upgrades. Telemetry config is in the per-user global config file, not in the `roar` package, so reinstalling won't re-enable it.

## Telemetry vs. lineage

Telemetry and lineage are separate channels with different rules:

- **Telemetry** is anonymous, opt-out, milestone-bounded, and carries the allowlisted product-usage fields above.
- **Lineage** is what `roar register` publishes to GLaaS — artifacts, jobs, sessions, labels — under whatever [scope](/docs/scopes) you've set for the workspace.

Disabling telemetry has no effect on lineage publishing. Changing your scope (e.g. `roar scope use private`) has no effect on telemetry. The two are controlled independently.

## Where to look next

- [Scopes](/docs/scopes) — the separate visibility model for lineage data.
- [roar Guide](/docs/roar-guide) — full CLI reference, including the `roar telemetry` command surface.
- The implementation: [PR #101 on `treqs/roar`](https://github.com/treqs/roar/pull/101).
