This page is symptom-indexed: scan for what you're seeing in the terminal, then jump to the matching entry. For deeper context on individual subsystems, follow the cross-references to [Tracers](/docs/tracers), [Scopes](/docs/scopes), [Ray](/docs/ray), or the [roar Guide](/docs/roar-guide).

## First-time setup

### `Error: roar is not initialized in this directory.`

What it means: you're running `roar` (or `roar run`, `roar dag`, etc.) in a directory that doesn't have a `.roar/` subdirectory.

```bash
roar init
```

Run once per project. `.roar/` lives next to your `.git/`. The init step is idempotent: re-running on an already-initialized workspace just reports it's done.

### `Error: Not in a git repository.`

`roar` requires the working tree to be inside a git repo so it can tag every recorded job with the exact commit it ran from. Initialize git first, then `roar`:

```bash
git init && git add -A && git commit -m "initial"
roar init
```

### `roar: command not found` after `pip install roar-cli`

Your shell can't see the `roar` binary on `PATH`. Three cases:

- **Installed via `uv tool install roar-cli`** (the recommended path): `uv` puts shims under `~/.local/bin` by default. Add that to your `PATH`, or run `uv tool update-shell` once to wire it up.
- **Installed via `pipx install roar-cli`**: pipx similarly uses `~/.local/bin`. Run `pipx ensurepath` once.
- **Installed via `pip install roar-cli` into a project venv**: `roar` only exists when that venv is activated. The `uv` and `pipx` installs are how you keep `roar` always available.

See [Installation](/docs/installation) for the recommended setup.

## Running commands

### `Git repository has uncommitted changes`

By design. `roar run` refuses to start when the working tree is dirty so every recorded job carries an honest commit SHA; otherwise re-running later wouldn't actually re-run the same code.
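Before committing, it can help to see what git itself considers dirty. The sketch below uses only plain `git status`; `tree_is_dirty` is a hypothetical helper name, and whether `roar` counts untracked files as dirty the way `--porcelain` lists them is an assumption, not something this page confirms.

```bash
# Sketch only: plain git, not roar's own dirty-tree check (which may differ).
# tree_is_dirty is a hypothetical helper name, not a roar command.
tree_is_dirty() {
  # --porcelain prints one line per changed or untracked path, nothing when clean.
  [ -n "$(git status --porcelain 2>/dev/null)" ]
}

if tree_is_dirty; then
  echo "working tree has uncommitted changes:"
  git status --short
else
  echo "working tree looks clean (or this is not a git repo)"
fi
```

Once you've reviewed that list, committing clears it: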
```bash
git add -A && git commit -m "wip"
roar run python train.py …
```

If you need to iterate fast without committing each loop, see the `--allow-dirty` discussion in the [roar Guide](/docs/roar-guide).

### `Tracer preflight failed for '': …`

The selected tracer can't initialize. Start with the deep preflight to see exactly why:

```bash
roar tracer check # ebpf | preload | ptrace
```

Common causes by backend:

- **eBPF**: kernel < 5.8, missing BTF, or `CAP_BPF` not set. Run `roar tracer enable ebpf` once to apply the capability, or pick a different backend with `roar tracer use preload`.
- **preload**: the shared library isn't built. `cd rust && cargo build --release` in the `roar` repo.
- **ptrace**: `kernel.yama.ptrace_scope` is blocking us. Check `sysctl kernel.yama.ptrace_scope`; values `0` or `1` typically work, `2` requires `CAP_SYS_PTRACE`.

`roar tracer` (no subcommand) shows the readiness table for all three backends, which is useful when you don't yet know which one is failing.

### `Operation not permitted creating BPF map`

You're seeing the raw kernel error from the eBPF backend trying to allocate a BPF map without sufficient privilege. Two paths:

```bash
roar tracer check ebpf    # confirms the specific cause (CAP_BPF / kernel / BTF)
roar tracer enable ebpf   # one-shot: applies CAP_BPF to the eBPF binary
```

If the host doesn't allow `CAP_BPF` (managed container runtimes, locked-down CI), eBPF won't work here regardless. Fall back to preload:

```bash
roar tracer use preload
```

See [Tracers → Cloud and managed GPU platforms](/docs/tracers#cloud-and-managed-gpu-platforms) for which environments support eBPF.
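If you want to cross-check the host by hand, the backend causes listed in this section (kernel older than 5.8, missing BTF, a restrictive `ptrace_scope`) map onto a few standard Linux locations. This is a minimal sketch, not what `roar tracer check` actually runs: `kernel_at_least` is a hypothetical helper, and the sketch assumes BTF is exposed at `/sys/kernel/btf/vmlinux` and the Yama setting at `/proc/sys/kernel/yama/ptrace_scope`.

```bash
# Manual host checks for the causes listed above. Not roar's own preflight logic.

# kernel_at_least WANT HAVE: succeeds when release HAVE (e.g. "5.15.0-91-generic")
# is at least major.minor WANT (e.g. "5.8"). Hypothetical helper; runs in a subshell.
kernel_at_least() (
  want_major=${1%%.*}
  want_minor=${1#*.}
  have=${2%%-*}                     # drop a "-91-generic" style suffix
  have_major=${have%%.*}
  rest=${have#*.}
  have_minor=${rest%%[!0-9]*}       # keep only the leading digits of the minor
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
)

release=$(uname -r)
if kernel_at_least 5.8 "$release"; then
  echo "kernel $release: new enough for eBPF"
else
  echo "kernel $release: older than 5.8"
fi

# Kernels built with BTF expose it here.
if [ -r /sys/kernel/btf/vmlinux ]; then
  echo "BTF: present"
else
  echo "BTF: missing"
fi

# Same value that sysctl kernel.yama.ptrace_scope reports.
scope_file=/proc/sys/kernel/yama/ptrace_scope
if [ -r "$scope_file" ]; then
  echo "ptrace_scope: $(cat "$scope_file")"
else
  echo "ptrace_scope: Yama not enabled on this kernel"
fi
```

These checks say nothing about `CAP_BPF`; for that, `roar tracer check ebpf` remains the authoritative answer.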
### `roar run` completes but the DAG is empty (no outputs reported)

The first diagnostic is which backend actually ran, and whether it could see your workload at all:

```bash
roar tracer         # which backend is active + the readiness table
roar tracer check   # deep preflight if the readiness table is unclear
```

The usual causes once you know the backend:

- **Static binaries via preload**: preload only sees dynamically-linked libc calls. Switch to eBPF or ptrace for static binaries.
- **Setuid binaries**: the dynamic linker scrubs `LD_PRELOAD`. Same fix.
- **Container without privileges**: eBPF requires kernel access; preload usually works. Most CI runners need preload.
- **Auto-fallback chose a less-capable backend silently**: `roar tracer` will tell you which backend was active for the run; if it wasn't the one you expected, run preflight on the one you want and fix the underlying issue.

The [Tracers cloud-platform table](/docs/tracers#cloud-and-managed-gpu-platforms) lists what's available where.

### The wrapped command's exit code seems wrong

`roar run` forwards the wrapped command's exit code verbatim. If you're seeing the wrong one, check whether you have a shell pipeline rewriting it: `roar run cmd | tee file` will exit with `tee`'s status, not `cmd`'s. Either use `set -o pipefail`, or run `roar run` without the pipeline and have it write the output file directly.

## Inspect and show

### `No artifact found for path: `

`roar show ` couldn't find a tracked artifact at that path. Three common reasons:

- **The file was never recorded**: only files the tracer saw being read or written get tracked. A file you copied into the directory by hand isn't an artifact until a tracked command touches it.
- **Filtered as noise**: temp files in `/tmp/`, package files in `site-packages/`, and `.roar/`/`.git/` internals are filtered out by default. Set `filters.ignore_tmp_files = false` in `.roar/config.toml` to recover temp-file tracking.
- **Path mismatch**: `roar show` resolves relative paths against your current `cwd`. `roar show ./model.pkl` and `roar show $(pwd)/model.pkl` should both work; `roar show model.pkl` when you're in a different directory won't.

`roar dag --show-artifacts` lists every artifact in the active session; scan there to find the right path or hash.

### `roar show ` reports `(missing)` next to a tracked file

The artifact is registered in `roar`'s DB but the file no longer exists on disk. This is *signal*, not an error: you can still call `roar reproduce ` to recreate it. To clear missing entries entirely:

```bash
roar reset
```

…which wipes the active session. If you only want to drop one missing artifact, do it in a fresh session and don't rerun the producing step.

### A file shows as both input and output of the same job

In current `roar` builds this shouldn't happen: the cross-tracer normalization in `tracer-fd` ensures `O_RDWR|O_TRUNC` opens (the `numpy.savez` / `zipfile.ZipFile("w")` pattern) classify as write-only. If you're seeing it, confirm you're on a recent build. Historical DB rows from older roar versions can carry the bug; new jobs recorded with current roar are clean. `roar reset` wipes the historical state.

## Registration / scope / auth

### `No GLaaS repo binding found for this publish. Link the repo to a TReqs owner/project first, or rerun with --public to publish publicly.`

You ran `roar register` (or `roar put`) from a workspace whose scope isn't set, and you're either logged out or your scope expects a project binding. Three resolutions, depending on intent:

```bash
roar scope use anonymous   # publish publicly without an account
roar scope use private     # personal scope, needs roar login
roar scope use /           # team / org project scope
```

See [Scopes](/docs/scopes) for the full picture.
### Every `roar register` prompts for confirmation, even after the first

By design under `anonymous` scope: every publication is irreversible and worth a conscious confirmation. Bypass for one invocation:

```bash
roar register @5 -y
```

Or switch to a scope that doesn't prompt (`roar login` then any of `private` / `public` / `/`).

### Auth state is rejected (`Stored auth state at is not valid JSON` / similar)

Your local auth state was corrupted or partially written. Wipe it and log back in:

```bash
roar logout
roar login
```

### `Registration aborted.` after answering "no" to the anonymous prompt

That's the expected exit when you decline the prompt. The registration didn't happen; the rest of your local state is unchanged. Either pass `-y` next time, or pick a different scope (`roar scope use …`).

## Reproduce

### `roar reproduce ` fails with "GLaaS server not configured"

The artifact isn't in your local `.roar/roar.db` so `roar` tried to fetch its lineage from GLaaS, but the workspace doesn't have a GLaaS URL configured. Two options:

- Reproduce against your local DB (the artifact must already be tracked locally), or
- Set the GLaaS URL: `roar config set glaas.url https://glaas.ai`, then retry.

### `roar reproduce` partial replay: some steps succeed, others fail

The reproduce engine runs every recorded job in topological order. If a step references a tool or path that doesn't exist in the new environment, that step fails and the chain stops.

- Check what's missing: the failing step's stderr usually tells you (missing python package, missing input file, etc.).
- For environment differences: `roar show @` shows the captured environment (packages, env vars) so you can recreate it.

## Tracers

See [Tracers > Debugging](/docs/tracers#debugging) for the common tracer-side failure modes.
The most frequent: preflight failures (caught above under "Running commands"), platform mismatches (see the [comparison table](/docs/tracers#comparison)), and edge cases like setuid/static binaries.

## Ray

See [Ray > Limitations](/docs/ray#limitations) for the cluster-side gotchas. Common ones:

- **Partial worker lineage**: likely a `runtime_env` policy blocking `worker_process_setup_hook`. Check your cluster's policy.
- **S3 ETag mismatch**: multipart uploads produce hash-of-part-hashes, not a content digest. `roar` records both ETag and a content blake3 from local `open` captures.
- **Cluster URL mismatch**: driver uses `localhost` but workers need a routable address. Set `ROAR_CLUSTER_GLAAS_URL` and/or `ROAR_CLUSTER_AWS_ENDPOINT_URL` to the worker-visible URLs.

The Ray integration is currently beta; see the callout at the top of the [Ray page](/docs/ray).

## Performance

### `roar run` adds noticeable latency on small commands

The overhead breakdown:

- **Startup**: 50–200 ms for backend selection + tracer init.
- **Per-syscall**: very low under eBPF/preload, higher under ptrace (two context switches per syscall).
- **Post-run hashing**: content hashes are computed at write time (Python `open`) or post-run (rename/replace). Big files dominate.

For sub-second commands, the overhead is a meaningful fraction of total runtime. If you need lean overhead for many tiny commands, batch them under a single `roar run` rather than wrapping each.

### Hashing throughput looks slow on large files

`roar` uses blake3 by default with sha256 as a fallback. If blake3 isn't installed in your environment (`pip install blake3`), you'll get sha256, which is roughly 5× slower on modern CPUs. Check with `roar show `: the `Hash (blake3): …` line tells you which algorithm ran.

## Where to look next

- [Tracers](/docs/tracers): backend behavior, capabilities, the full debug surface.
- [Scopes](/docs/scopes): privacy and registration visibility.
- [Ray](/docs/ray): Ray cluster setup and limitations.
- [roar Guide](/docs/roar-guide): full CLI reference for the commands referenced above.
- [FAQ](/docs/faq): implementation details and the "why does it work this way" questions.