Getting Started with roar and GLaaS

roar automatically tracks the reproducible lineage of your work by observing your commands as they run — without requiring you to define a pipeline or write extra configuration.

When paired with GLaaS (Global Lineage-as-a-Service), this observation approach scales to a global registry: search for any artifact by its hash, navigate the connections between jobs, and reproduce results on other machines.
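To make "search for any artifact by its hash" concrete, here is a minimal sketch of content-addressed lookup. It is not roar's or GLaaS's implementation: the `registry` dict, the `register` helper, and the file names are hypothetical stand-ins, and it uses SHA-256 from the Python standard library purely for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def content_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical in-memory stand-in for a registry keyed by content hash.
registry: dict[str, dict[str, str]] = {}


def register(path: Path, produced_by: str) -> str:
    """Record which job produced an artifact, keyed by the artifact's hash."""
    key = content_hash(path)
    registry.setdefault(key, {"produced_by": produced_by})
    return key


with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "model.bin"
    copy = Path(tmp) / "renamed_copy.bin"
    original.write_bytes(b"example artifact bytes")
    copy.write_bytes(b"example artifact bytes")

    # Register the artifact once, under the hash of its contents.
    key = register(original, produced_by="train.py")

    # A byte-identical file at a different path resolves to the same record.
    same_key = content_hash(copy)
    found = registry.get(same_key)
```

Because the key is derived from the bytes alone, the lookup does not depend on where the file lives or what it is called, which is the property that lets a hash identify an artifact across machines.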

Get Started

The on-ramp for new readers — install roar, run your first traced command, understand the pitch.

Guides

Hands-on walkthroughs and the conceptual model. Read these in order if you're new; come back when working through new patterns.

Reference

Deep dives and lookup material.

  • Tracers — eBPF / preload / ptrace, comparison, cloud-platform compatibility.
  • Hashes — content-addressable identity, blake3 vs sha256, multi-algo storage.
  • Composite Artifacts — datasets as single artifacts; Bloom-filter membership for large composites.
  • Authentication — access control and token management.
  • Scopes — privacy and visibility model for registered lineage.
  • Labels — metadata attached to sessions, jobs, and artifacts; search on glaas.ai.
  • Glossary — definitions of key terms.

Integrations

  • Cloud Storage Proxy — the S3 proxy that captures cloud-storage lineage.
  • Ray — multi-node Ray integration; per-task lineage and fragment streaming.
  • Experiment Tracking — W&B / MLflow / Neptune integration, not replacement.

Appendix

  • Troubleshooting — symptom-indexed fixes for common errors.
  • Telemetry — anonymous product telemetry; opt-out controls and full payload allowlist.
  • FAQ — implementation details and "why does it work this way" questions.

Examples

  • End-to-End Introduction — a six-step walkthrough that generates and combines small binary artifacts with roar run, inspects the inferred DAG, registers the final artifact with GLaaS, and reproduces it in a fresh directory from its hash.