
Tracers


What is a tracer?

A tracer is the component that observes what your command does as it runs. When you type roar run python train.py, the tracer is what records the read/write/mmap/rename/unlink events that flow through your process — and through any subprocess it spawns. From those events roar derives the inputs, outputs, and DAG.

roar ships three backend tracers and picks one automatically. Each makes different tradeoffs around platform support, privileges, and overhead.

Privacy. Tracers observe metadata only — file paths, sizes, syscall events, content hashes. File contents are never captured by the tracer.

Quick use

roar tracer                        # show the active backend + per-backend status
roar tracer use <mode>             # pick auto | ebpf | preload | ptrace
roar tracer check <mode>           # deep preflight for one backend (CAP_BPF, BTF, etc.)
roar tracer enable ebpf            # configure eBPF capabilities / sysctls

The rest of this page explains why you'd pick each, what they observe, and how to debug them when something goes sideways.

The three backends

eBPF

A small kernel-side probe attaches to syscall tracepoints (openat, read, write, mmap, rename, unlink, sendfile, copy_file_range, clone/fork/exec). The probe is registered system-wide — it fires on those syscalls from every process on the host — but the in-kernel program's first action is to check the calling PID against a BPF map of roar-tracked PIDs. On a miss it returns immediately; on a hit it ships a compact event to a userspace daemon over a perf ring buffer.
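To make the pattern concrete, here is a minimal sketch in the style of the aya-ebpf crate. The map, event, and program names (TRACKED_PIDS, EVENTS, OpenEvent) are invented for illustration, and roar's actual probe covers many more tracepoints:

// Illustrative sketch only, written against the aya-ebpf crate.
// Built with the eBPF toolchain (no_std, no_main); names are hypothetical.
#![no_std]
#![no_main]

use aya_ebpf::{
    helpers::bpf_get_current_pid_tgid,
    macros::{map, tracepoint},
    maps::{HashMap, PerfEventArray},
    programs::TracePointContext,
};

#[repr(C)]
pub struct OpenEvent {
    pub pid: u32,
}

// Userspace inserts roar-tracked PIDs here (and children, on fork).
#[map]
static TRACKED_PIDS: HashMap<u32, u8> = HashMap::with_max_entries(4096, 0);

// Perf ring buffer carrying compact events up to the daemon.
#[map]
static EVENTS: PerfEventArray<OpenEvent> = PerfEventArray::new(0);

#[tracepoint]
pub fn sys_enter_openat(ctx: TracePointContext) -> u32 {
    let pid = (bpf_get_current_pid_tgid() >> 32) as u32;
    // First action: PID-map lookup. A miss returns immediately, so the
    // cost for every untracked process on the host stays tiny.
    if unsafe { TRACKED_PIDS.get(&pid) }.is_none() {
        return 0;
    }
    // Hit: ship a compact event to the userspace daemon.
    EVENTS.output(&ctx, &OpenEvent { pid }, 0);
    0
}

#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! {
    loop {}
}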

Strengths. Lowest per-syscall overhead of the three backends — lower even than in-process tracers like the preload hooks. Sees every tracked process regardless of how it was built: statically-linked binaries, setuid binaries, and processes that bypass libc are all covered.

Requirements. Linux kernel ≥ 5.8 with BPF Type Format (BTF) available. CAP_BPF (or root) on the calling user.

Footprint on the host. The BPF program executes briefly in kernel context for unrelated processes' syscalls too — the PID-map lookup is the first thing it does, and a miss is a couple of instructions. No userspace events are emitted for those processes, and no path or content data leaves the kernel. The probe is open source and the audit surface is small.

preload

A cdylib shared library that wraps libc's I/O functions via LD_PRELOAD on Linux and DYLD_INSERT_LIBRARIES on macOS (the dynamic-linker mechanisms differ in name and a few rules, but one library, built per platform, covers both). Each hooked function emits a structured event to a daemon over a Unix socket before delegating to the real libc symbol.
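The interposition pattern, reduced to one function, looks roughly like this — a hedged Rust sketch of a cdylib hook for write(2). emit_event is a placeholder, and a real interposer must also guard against re-entrancy (its own socket writes go through write too):

// Sketch of a single LD_PRELOAD hook; not roar's actual code.
use std::ffi::c_void;
use std::sync::OnceLock;

type WriteFn = unsafe extern "C" fn(i32, *const c_void, usize) -> isize;

static REAL_WRITE: OnceLock<WriteFn> = OnceLock::new();

fn emit_event(_fd: i32, _count: usize) {
    // Placeholder: the real library sends a structured event over a
    // thread-local Unix socket (and must avoid recursing into write).
}

#[no_mangle]
pub unsafe extern "C" fn write(fd: i32, buf: *const c_void, count: usize) -> isize {
    let real = *REAL_WRITE.get_or_init(|| {
        // Resolve the next "write" in link order — the real libc symbol.
        // Real code would check for a null result before transmuting.
        let sym = unsafe { libc::dlsym(libc::RTLD_NEXT, c"write".as_ptr()) };
        unsafe { std::mem::transmute::<*mut c_void, WriteFn>(sym) }
    });
    emit_event(fd, count);
    unsafe { real(fd, buf, count) }
}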

Strengths. Very low overhead. Works on macOS where eBPF isn't available. No kernel privileges required. Crystal-clear semantics: an event fires when libc's open/read/write/etc. are called.

Fundamental constraint. Only sees calls that go through the dynamic libc. Statically-linked binaries and processes that issue raw syscalls are invisible. The dynamic linker also scrubs the preload variable from privileged binaries — LD_PRELOAD from setuid binaries on Linux, DYLD_INSERT_LIBRARIES from any SIP-protected or library-validated process on macOS — so those are invisible too.

ptrace

Runs the target under ptrace in syscall-stop mode, with PTRACE_O_TRACESYSGOOD set so syscall stops are distinguishable from signal stops. The tracee stops on every syscall entry and exit; the tracer reads its registers to identify the syscall and arguments, classifies the event, then resumes the process.
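In outline, the syscall-stop loop looks like this — a single-tracee sketch using the nix crate; roar's real tracer also sets the fork-follow options and tracks many PIDs at once:

// Sketch of a syscall-stop loop; assumes the child already called
// ptrace::traceme() before exec. Linux x86_64 only (getregs).
use nix::sys::ptrace::{self, Options};
use nix::sys::wait::{waitpid, WaitStatus};
use nix::unistd::Pid;

fn trace_loop(child: Pid) -> nix::Result<()> {
    waitpid(child, None)?; // initial stop after exec
    ptrace::setoptions(child, Options::PTRACE_O_TRACESYSGOOD)?;
    let mut in_syscall = false;
    loop {
        ptrace::syscall(child, None)?; // resume until the next syscall boundary
        match waitpid(child, None)? {
            // With TRACESYSGOOD, syscall stops arrive as PtraceSyscall.
            WaitStatus::PtraceSyscall(pid) => {
                let regs = ptrace::getregs(pid)?;
                if !in_syscall {
                    // Entry stop: orig_rax is the syscall number and
                    // rdi/rsi/rdx/... hold the arguments — classify here.
                    let _nr = regs.orig_rax;
                }
                in_syscall = !in_syscall; // entry and exit stops alternate
            }
            WaitStatus::Exited(..) => return Ok(()),
            _ => {} // signal-delivery stops etc. elided
        }
    }
}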

Strengths. Works without privileges on most Linux systems and sees every syscall regardless of how the process was built (static, raw syscalls, etc.). The fallback when neither eBPF nor preload is usable.

Cost. Each syscall incurs two stops (entry and exit), each a context-switch round trip into the tracer process. On heavy I/O workloads this is noticeable.

What gets captured

Across all three backends, roar's tracer records:

  • Reads — read, pread, preadv, readv.
  • Writes — write, pwrite, pwritev, writev.
  • Opens — open, openat, creat, fopen (preload only).
  • Path publication — rename, renameat, link, linkat, unlink, unlinkat, truncate, ftruncate. Each marks the destination path as written even when no write() ever fires (e.g. bash's > x truncation).
  • mmap — both PROT_READ and PROT_WRITE mappings, classified appropriately. MAP_PRIVATE writes do not count as output (they're copy-on-write).
  • Cross-fd I/O — sendfile and copy_file_range are recorded as a read on the input fd and a write on the output fd.
  • Subprocesses — clone, fork, vfork, exec. The tracer follows every child automatically; see How forks are followed below.

Backends agree. Cross-backend classification (read vs. write, what counts as a publication, how mmap is recorded) lives in the shared tracer-fd crate, so the three backends produce the same DAG for the same syscall stream.
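As a rough illustration of what such shared rules look like — simplified stand-in types, not the real tracer-fd API:

// Illustrative classification, mirroring the rules described on this page.
enum Event {
    Read,
    Write,
    OpenTrunc,                    // open with O_TRUNC
    Rename { dst: bool },         // dst = true for the destination path
    Mmap { shared: bool, write: bool },
}

enum Effect {
    Input,  // counts toward the file's inputs
    Output, // counts toward outputs / publications
    None,   // observed, but not an artifact effect
}

fn classify(e: Event) -> Effect {
    match e {
        Event::Read => Effect::Input,
        Event::Write => Effect::Output,
        // O_TRUNC publishes the path even if no write() ever fires.
        Event::OpenTrunc => Effect::Output,
        Event::Rename { dst: true } => Effect::Output,
        Event::Rename { dst: false } => Effect::Input, // source consumed (illustrative choice)
        // MAP_SHARED + PROT_WRITE is a real write; MAP_PRIVATE is copy-on-write.
        Event::Mmap { shared: true, write: true } => Effect::Output,
        Event::Mmap { write: false, .. } => Effect::Input,
        Event::Mmap { shared: false, write: true } => Effect::None,
    }
}

Because all three backends feed their raw events through one function like this, a policy change lands in one place and every backend inherits it.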

Not captured. File contents (only hashes and sizes). GPU compute issued via driver IOCTLs. Operations on anonymous fds (memfd, pipes, sockets) — those don't correspond to artifacts. Filesystem activity in nested mount namespaces the daemon can't see.

How to choose

Comparison

                      eBPF                 preload    ptrace
Linux                 ✓ (≥ 5.8)            ✓          ✓
macOS                 ✗                    ✓          ✗
Required privileges   CAP_BPF or root      none       none (unless YAMA blocks)
Static binaries       ✓                    ✗          ✓
Setuid binaries       ✓                    ✗          ✗
Subprocess follow     ✓                    ✓          ✓
Containers            host-kernel access   ✓          ✓
Per-syscall overhead  very low             very low   high
mmap capture          ✓                    ✓          ✓

Quick guide

  • Linux ≥ 5.8 with root or CAP_BPF → eBPF. Lowest overhead, most coverage.
  • macOS → preload. The only viable option.
  • No privileges, no kernel-version control → preload if your workload only uses dynamically-linked tools (the common case), ptrace otherwise.
  • Static binaries or setuid-sensitive workflows on Linux → eBPF or ptrace.
  • CI → preload. No kernel deps, no privileges needed, scales fine.

auto mode

The default. roar runs a per-backend preflight at startup, picks the most capable backend that passes, and falls back to the next one if the chosen backend errors out during the run (configurable — tracer.fallback_enabled).

The fallback order is: eBPF → preload → ptrace. The first one whose preflight succeeds becomes the active backend. roar tracer shows what auto resolved to and why the others were rejected.
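Conceptually, auto resolution is just the following (an illustrative sketch, not roar's internals):

// Illustrative only. Most capable backend first; the first one whose
// preflight passes becomes the active backend.
#[derive(Clone, Copy, Debug)]
enum Backend {
    Ebpf,
    Preload,
    Ptrace,
}

fn resolve_auto(preflight: impl Fn(Backend) -> Result<(), String>) -> Option<Backend> {
    [Backend::Ebpf, Backend::Preload, Backend::Ptrace]
        .into_iter()
        .find(|b| preflight(*b).is_ok())
}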

Limitations

  • Containers and namespaces. eBPF requires host-kernel access from wherever the daemon runs — nested namespaces and locked-down container runtimes can interfere. Preload and ptrace work without that constraint.
  • Setuid binaries. Preload is blocked by the dynamic linker; ptrace fails the privilege check. eBPF observes them transparently.
  • Statically-linked binaries that issue raw syscalls without going through libc are invisible to preload. eBPF and ptrace see them.
  • GPU compute issued via driver IOCTLs is recorded as opaque syscall activity — the tracer can't peer into the framebuffer or compute kernel to identify artifacts.
  • Browser-launched / detached processes that don't inherit the tracer's environment (preload) or attachment (ptrace) escape observation. The eBPF backend sees them because it's system-wide.

Platform notes

Linux

  • eBPF requires kernel ≥ 5.8 with BTF. Recent Ubuntu / Debian / RHEL ship this by default.
  • kernel.yama.ptrace_scope controls who can ptrace whom: 0 allows classic same-user ptrace; the default of 1 restricts attaching to non-descendant processes (roar's ptrace backend forks its target as a child, so it usually still works); 2 requires CAP_SYS_PTRACE. This knob affects only the ptrace backend — eBPF is gated by CAP_BPF, which roar tracer enable ebpf configures.
  • CAP_BPF (kernel ≥ 5.8) is the minimal capability for eBPF; roar tracer enable ebpf configures it as a one-time setup.

macOS

Preload is the only backend. eBPF doesn't exist on macOS, and ptrace is blocked for hardened (SIP-protected, library-validated) processes — which on a stock system is almost every binary you'd want to trace.

The same hardening also strips DYLD_INSERT_LIBRARIES — so Apple-signed binaries (the system /usr/bin/python3, /bin/zsh, etc.) escape the preload tracer too. In practice this isn't a problem: ML workflows almost always run an unhardened Python (uv, Homebrew, MacPorts, Conda, a project venv), which preload covers fine.

Containers / CI

  • Preload is the safest default for CI. No kernel deps, no privileges, no host-side configuration.
  • eBPF in containers works only if the kernel is host-visible (e.g., docker with --privileged or specific capabilities). Most managed CI runners don't allow this.
  • ptrace in containers works in most setups but is slower; double-check that kernel.yama.ptrace_scope isn't blocking it.

Cloud and managed GPU platforms

eBPF needs a real Linux kernel with CAP_BPF (or root). Whether you've got that depends almost entirely on how the platform virtualizes you.

Platform                                      eBPF?                            Notes
AWS EC2 (incl. GPU instances)                 ✓                                Standard Linux VMs; root by default.
GCP Compute Engine GPU VMs                    ✓                                KVM VMs with root; Google itself ships eBPF/Cilium on GKE.
Azure GPU VMs                                 ✓                                Recent Azure kernels ship with CONFIG_BPF=y.
Lambda Cloud                                  ✓                                Ubuntu 22.04 LTS VMs with sudo.
CoreWeave                                     ✓                                Bare-metal Linux; privileged pods can load BPF.
Latitude.sh                                   ✓                                Bare-metal Linux, full kernel and root.
DigitalOcean GPU Droplets / Bare Metal GPUs   ✓                                Root VMs; bare metal allows kernel upgrades.
Crusoe Cloud                                  likely ✓                         KVM VMs with sudo; not explicitly verified.
Together AI Instant Clusters                  likely ✓                         Bare-metal nodes with user-installed K8s/Slurm; not explicitly verified.
Anyscale                                      ✓ if pods are privileged         Needs securityContext.privileged: true on the K8s cluster.
Vast.ai                                       depends on host Docker config    Containers on third-party hosts; privileged mode varies per template.
RunPod                                        ✗                                Unprivileged containers; no CAP_BPF.
Modal                                         ✗                                Workloads run under gVisor; the bpf() syscall isn't forwarded to the host kernel.
Replicate                                     ✗                                Managed container runtime; no documented privileged escape hatch.
Paperspace Gradient Notebooks                 ✗                                Managed notebook containers without host-kernel access. (DigitalOcean Paperspace GPU Droplets are different — those work.)

Rule of thumb. Full VMs and bare metal → eBPF works. Container-as-a-service → either an opt-in privileged mode or nothing. Sandboxed serverless runtimes (Modal's gVisor, etc.) → blocked entirely; roar's auto-fallback picks preload or ptrace.

If your platform isn't listed, the empirical test is sudo bpftool feature — one command that tells you whether the kernel supports the BPF features roar needs.

Distributed runners (Ray, …)

Ray jobs run in worker processes that may be on remote nodes; each worker needs its own tracer attached. roar handles this via its Ray backend (roar.backends.ray.*), which wraps Ray worker startup so the tracer is present from the first task. See Ray for the full integration story, including fragment-store outputs and host-submit vs in-cluster modes.

Setup

Building the binaries

The tracers live in the roar repo under rust/tracers/. Build all three with:

cd rust && cargo build --release

roar looks for the built binaries in rust/target/release/. A from-scratch build takes about a minute on a modern laptop.

eBPF privileges

roar tracer enable ebpf

This is a one-time setup. It applies the CAP_BPF capability to the eBPF binary and ensures the kernel exposes the tracepoints we need. Re-run if the binary moves or the kernel changes.

Configuration

The key knobs in ~/.roar/config.toml (or per-repo .roar/config.toml):

Key                          Default  Effect
tracer.mode                  auto     Override the default backend selection.
tracer.fallback_enabled      true     If false, a backend failure during the run aborts instead of falling back.
tracer.preflight_timeout_ms  2000     How long preflight waits for each backend to respond before rejecting it.
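For example, to pin the preload backend and make backend failures fatal, a per-repo .roar/config.toml could contain:

[tracer]
mode = "preload"
fallback_enabled = false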

CLI flags take precedence: roar run --tracer preload --no-tracer-fallback python train.py.

Debugging

Preflight

roar tracer

Shows the currently-active backend, the preflight result for each backend, and (when auto is in play) why the others were rejected. First place to look when "it's not tracing" — usually preflight has a story.

Common failure modes

  • tracer preflight failed for 'ebpf' → check the kernel version, BTF availability, and CAP_BPF. Re-run roar tracer enable ebpf if needed.
  • Permission denied on a setuid binary under preload → expected. Either rebuild with capabilities you control, switch to eBPF, or wrap the setuid step in a different command.
  • Some outputs are missing → likely a static binary plus preload. Switch to eBPF or ptrace.
  • DAG shows a file as both input and output → the cross-tracer normalization should prevent this, but see the recent O_TRUNC fix. If it still happens, file an issue with the tracer report.

Verbose logs

roar run -vv ... enables debug-level tracer logging to .roar/tracer.log. Useful for issue reports; not for daily use.

How they're built (and why they're fast)

All three backends share a small Rust workspace under rust/. The split is:

  • crates/tracer-schema — wire-format types (TraceEvent, FileRecord, TracerReport).
  • crates/tracer-fd — per-fd state aggregator. Receives raw Read/Write/OpenRead/OpenWrite events, applies the cross-backend classification rules (O_TRUNC = write-only, MAP_SHARED+PROT_WRITE = real write, etc.), and emits a canonical FileSummary regardless of which backend produced the events.
  • tracers/ebpf — the kernel-side BPF program (written in Rust via aya) plus a userspace daemon. The probe filters in-kernel so the hot path doesn't enter userspace.
  • tracers/preload — a cdylib shared library. Each interposed libc function dispatches through dlsym(RTLD_NEXT) to call the real symbol, with the event emission on the side via a thread-local Unix socket.
  • tracers/ptrace — a standalone binary that forks the target, attaches with PTRACE_O_TRACESYSGOOD | TRACEFORK | TRACEVFORK | TRACECLONE, and runs a syscall-stop loop.

Three deliberate choices that keep the hot paths cheap:

  • No allocations on the syscall path. Event structs are fixed-size and pre-allocated (see the sketch after this list); reports are batched and serialized via rmp-serde after the run.
  • Cross-backend code lives in one crate. Adding a new policy (like the O_TRUNC fix) means changing one place — tracer-fd — and all three backends inherit it.
  • Trace transport is dumb. preload uses a raw Unix socket; eBPF uses a perf ring buffer; ptrace just collects state in the parent process. No serialization until the run completes.
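For a sense of what a fixed-size event record looks like (field names are assumptions, not roar's actual tracer-schema types):

// Illustrative fixed-size record; field names are hypothetical.
#[repr(C)]
#[derive(Clone, Copy)]
struct RawEvent {
    pid: u32,        // process that issued the syscall
    fd: i32,         // file descriptor involved
    op: u8,          // e.g. 0 = read, 1 = write, 2 = open
    _pad: [u8; 7],   // explicit padding: no hidden holes, stable layout
    byte_count: u64, // bytes moved, when applicable
}

// A fixed size means events can live in a pre-allocated ring —
// no allocations on the syscall path.
const _: () = assert!(core::mem::size_of::<RawEvent>() == 24);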

How forks are followed

All three backends follow fork/vfork/clone/exec automatically. The mechanism is different per backend:

  • eBPF. The kernel probe attaches to the sched_process_fork and sched_process_exec tracepoints. When a tracked PID forks, the in-kernel program adds the child PID to the tracked-PID BPF map so its syscalls are observed from the next instruction onward, then emits a Clone event to userspace with parent + child PIDs. The userspace daemon inherits per-pid state. Coverage is automatic.
  • preload. LD_PRELOAD is part of the environment, so children inherit it automatically. A pthread_atfork handler in the library increments a per-process generation counter on fork(), so each child opens its own trace socket without confusing the parent (sketched after this list).
  • ptrace. Attaching with PTRACE_O_TRACEFORK | TRACEVFORK | TRACECLONE tells the kernel to auto-attach us to every child the traced process spawns. We get a stop on each new PID and add it to our table.
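The preload side of this, sketched in Rust against the libc crate (the generation counter and handler names are illustrative, not roar's actual code):

// Hypothetical sketch of the preload fork handler.
use std::sync::atomic::{AtomicU64, Ordering};

static GENERATION: AtomicU64 = AtomicU64::new(0);

unsafe extern "C" fn on_child() {
    // Runs in the child immediately after fork(): bump the generation so
    // the next event emission lazily opens a fresh socket instead of
    // sharing the parent's fd.
    GENERATION.fetch_add(1, Ordering::SeqCst);
}

pub fn install_fork_handler() {
    // Register with libc; only the child-side hook is needed here.
    unsafe {
        libc::pthread_atfork(None, None, Some(on_child));
    }
}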

In practice: roar run python train.py with 100 worker processes Just Works on all three backends, with no configuration.

For perf numbers across backends and workloads, see Benchmarks.