Labels: roar and GLaaS

What labels are

Labels are metadata: key/value annotations you attach to a session, job, or artifact when the thing you want to record isn't something roar can observe automatically, such as a metric, a comment, a tag, an experiment ID, or a human decision.

roar observes structure automatically: which file was read, which command ran, what its output was. It can't observe meaning: that this artifact is your mnist-v3-baseline, that the run reached 0.94 accuracy, that you tried lr=1e-4 and it didn't help. Labels are how that information becomes part of the record.

Naming note. roar label is the verb: the CLI surface for attaching, updating, and inspecting metadata. The concept is metadata. We didn't call the command roar metadata because that is longer and more awkward to type, but everywhere this page says "labels" it means "user-attached metadata."

Quick use

roar label set job @2 lr=0.001 accuracy=0.94      # attach to a job
roar label set artifact model.pkl stage=baseline  # attach to an artifact
roar label set dag current project=mnist          # attach to the active session
roar label show job @2                            # show current labels
roar label history artifact model.pkl             # show every revision
roar label rm job @2 lr                           # remove a key

Labels persist locally and travel with the artifact, job, or session record when you roar register it to GLaaS.

What you can label

Three things, in roar terms:

Sessions / DAGs — roar label set dag current ... or roar label set dag <hash> .... Useful for tagging the whole pipeline run (experiment name, hypothesis, owner).
Jobs — roar label set job @<step> ... or roar label set job <uid> .... Useful for per-step metadata (training metrics, hyperparameters, stage names).
Artifacts — roar label set artifact <path-or-hash> .... Useful for content-level annotations (dataset version, model class, evaluation score).

Each has its own label record in .roar/roar.db; updating one doesn't touch the others.

Label structure

Labels are key/value pairs. The value side is typed loosely — strings, numbers, and booleans are all stored as their natural types where possible. Nested keys use dot notation:

roar label set artifact model.pkl \
  model.name=resnet50 \
  model.version=3 \
  metrics.accuracy=0.94 \
  metrics.loss=0.18 \
  trained=true

This produces a structured document with nested model and metrics sub-objects. roar label show renders it back flat for readability; the underlying storage preserves the structure.
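To illustrate the nesting described above, here is a minimal Python sketch of how dotted keys could expand into a nested document and flatten back again. It is not roar's implementation: the helper names `parse_value`, `nest`, and `flatten`, and the exact type-coercion rules, are assumptions made for the example.

```python
def parse_value(raw):
    """Coerce a CLI string to bool, int, or float where possible (assumed rules)."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw  # fall back to the plain string


def nest(pairs):
    """Expand ["a.b=1", ...] into a nested dict keyed by the dotted path."""
    doc = {}
    for pair in pairs:
        key, _, raw = pair.partition("=")
        *parents, leaf = key.split(".")
        node = doc
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = parse_value(raw)
    return doc


def flatten(doc, prefix=""):
    """Render a nested dict back to flat dotted keys."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat


doc = nest([
    "model.name=resnet50",
    "model.version=3",
    "metrics.accuracy=0.94",
    "metrics.loss=0.18",
    "trained=true",
])
flat = flatten(doc)
```

Here `doc` holds nested `model` and `metrics` sub-objects, while `flat` has the dotted keys again, mirroring the difference between storage and the roar label show rendering. The sketch does not handle a key that is used both as a scalar and as a parent.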

There's no fixed schema — you can name keys whatever you want — but consistent naming across your team is what makes search on glaas.ai useful (next section).

Search by label value on glaas.ai

Once a record with labels is registered to GLaaS, you can search for it by label value on glaas.ai. Paste a hash to find an item; from there, filter by labels to find related items. The search box accepts label predicates:

metrics.accuracy>0.9
model.name=resnet50
stage=baseline AND trained=true

This is what makes labels worth attaching in the first place. The lineage tells you what happened; labels tell you which ones to compare. Without labels, "find me every run where accuracy beat 0.92 on the v2 dataset" is a SQL query against a private DB; with labels, it's a search bar.

Label views on glaas.ai also support leaderboard-style aggregation — sort and filter across many sessions by a label key, surface the top N, see the lineage behind each.
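To make the predicate syntax concrete, the following Python sketch evaluates predicates of the form shown above against flat label dictionaries, then ranks the records by one key. It is a local illustration only: the real glaas.ai query grammar is not specified on this page, so the operator set (`=`, `>`, `<`), the `AND` handling, and the names `matches` and `coerce` are assumptions.

```python
import operator
import re

# Assumed operator set; this page only shows "=" and ">".
OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}
TERM = re.compile(r"^\s*([\w.]+)\s*(=|>|<)\s*(.+?)\s*$")


def coerce(raw):
    """Turn the right-hand side of a predicate into bool, float, or str."""
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"
    try:
        return float(raw)
    except ValueError:
        return raw


def matches(labels, query):
    """Return True if every AND-joined term in `query` holds for `labels`."""
    for term in query.split(" AND "):
        m = TERM.match(term)
        if m is None:
            raise ValueError(f"cannot parse predicate: {term!r}")
        key, op, raw = m.groups()
        if key not in labels:
            return False
        try:
            if not OPS[op](labels[key], coerce(raw)):
                return False
        except TypeError:
            # e.g. comparing a string label against a number
            return False
    return True


runs = [
    {"model.name": "resnet50", "metrics.accuracy": 0.94,
     "stage": "baseline", "trained": True},
    {"model.name": "vgg16", "metrics.accuracy": 0.88,
     "stage": "baseline", "trained": False},
]

hits = [r["model.name"] for r in runs if matches(r, "metrics.accuracy>0.9")]

# Leaderboard-style view: top N records sorted by one label key.
top = sorted(runs, key=lambda r: r["metrics.accuracy"], reverse=True)[:1]
```

The example also shows why consistent key names matter: a run labeled `accuracy` instead of `metrics.accuracy` would simply never match the query.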

Visibility

Labels follow the scope of the artifact, job, or session they're attached to:

Scope of the parent	Label readable to …
`anonymous`	public
`public`	public
`private`	you only
`<owner>/<project>`	project members only

Practical consequence: a label set on a private artifact is invisible to anyone outside that scope. There's no way to "make this artifact private but this label public" — the label inherits the parent's visibility.

This is why labels are an attractive place for things like experiment metrics on private-scope work: the record is access-controlled, the labels are too, and you can still search across them within your scope.
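The inheritance rule in the table can be sketched as a single function of the parent's scope. This is a hypothetical model for illustration, not GLaaS's access-control code: `can_read_labels`, its parameters, and the explicit membership set are assumptions.

```python
def can_read_labels(parent_scope, viewer, owner, project_members=frozenset()):
    """Return True if `viewer` may read labels on a record with `parent_scope`.

    Labels carry no visibility of their own; only the parent's scope is consulted.
    """
    if parent_scope in ("anonymous", "public"):
        return True
    if parent_scope == "private":
        return viewer == owner
    # Anything else is treated as an "<owner>/<project>" scope.
    return viewer in project_members


members = {"alice", "bob"}
public_ok = can_read_labels("public", "carol", "alice")
private_denied = can_read_labels("private", "bob", "alice")
project_ok = can_read_labels("alice/mnist", "bob", "alice", members)
```

Note that the function takes no per-label argument at all, which is the point: there is nothing to override at the label level.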

Versioning and history

Every roar label set creates a new version. The old version isn't replaced; it's archived. roar label history shows the chronological list, so you can compare or roll back without losing prior state.

roar label history job @2
# Version 1: lr=0.001
# Version 2: lr=0.001, accuracy=0.94
# Version 3: lr=0.001, accuracy=0.94, comments="overfit on val"

This matters for accountability: you can see when a label changed, which is often useful when reviewing experiment results months later.
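The append-only behavior above can be modeled in a few lines of Python. This is a sketch under stated assumptions, not how roar stores history in .roar/roar.db: the `LabelHistory` class is hypothetical, each version is assumed to be a full snapshot (as the sample output suggests), and rollback is assumed to append the old snapshot as a new version rather than delete anything.

```python
class LabelHistory:
    """Append-only label versions for one record (illustrative only)."""

    def __init__(self):
        self._versions = []  # oldest first; each entry is a full snapshot

    @property
    def current(self):
        """Copy of the latest snapshot, or an empty dict if none exists."""
        return dict(self._versions[-1]) if self._versions else {}

    def set(self, **labels):
        """Merge new keys into the latest snapshot and archive the result."""
        snapshot = self.current
        snapshot.update(labels)
        self._versions.append(snapshot)

    def rollback(self, version):
        """Re-append an earlier snapshot (1-indexed) as a new version."""
        self._versions.append(dict(self._versions[version - 1]))

    def history(self):
        """All snapshots in chronological order."""
        return [dict(v) for v in self._versions]


h = LabelHistory()
h.set(lr=0.001)
h.set(accuracy=0.94)
h.set(comments="overfit on val")
h.rollback(2)  # restore the state of version 2 as version 4
```

After the rollback, the history has four entries and the first three are untouched, which is what makes the record auditable.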

Auto-labels

Some labels are attached by roar automatically:

Dataset auto-labels: when roar detects an output directory that looks like a dataset (multiple files of the same shape, partitioned naming, etc.), it adds dataset.id, dataset.modality, and dataset.type labels to the resulting composite artifact. For instance, search by dataset.modality=tabular to find every traced dataset of that modality.
System job labels: internal labels such as roar.tracer.backend and roar.exit_status are attached to jobs for filtering on the dashboard side.

Auto-labels are written with write_origin=system so they're distinguishable from human-set ones. They never overwrite an existing user-set label at the same key.
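The no-overwrite rule can be sketched as a merge that skips any key a user has already set. This is an illustration, not roar's code: the function name `apply_auto_labels`, the per-key record layout, and the `"user"` origin value are assumptions (only `write_origin=system` appears on this page).

```python
def apply_auto_labels(existing, auto):
    """Merge system labels into `existing` without clobbering user-set keys.

    `existing` maps key -> {"value": ..., "write_origin": "user" | "system"}.
    `auto` maps key -> value for the labels the system wants to attach.
    """
    merged = dict(existing)
    for key, value in auto.items():
        current = merged.get(key)
        if current is not None and current["write_origin"] == "user":
            continue  # a human already set this key; leave it alone
        merged[key] = {"value": value, "write_origin": "system"}
    return merged


existing = {"dataset.type": {"value": "curated", "write_origin": "user"}}
auto = {"dataset.type": "partitioned", "dataset.modality": "tabular"}
merged = apply_auto_labels(existing, auto)
```

In this run the user's `dataset.type` survives, while `dataset.modality` is added with a system origin so the two kinds of label stay distinguishable.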

Configuration

There's not much to configure for labels — they're additive metadata. The relevant config knobs:

Key	Default	Effect
`labels.auto_dataset_detect`	`true`	Detects dataset-shaped outputs and attaches `dataset.*` labels.
`labels.history_retention`	(unlimited)	How many versions of a record's labels to retain. Default keeps everything.

Where to look next

Core Concepts: Labels — the conceptual placement of labels in the broader DAG model.
Scopes — how label visibility follows the parent's scope.
roar Guide — the full roar label command reference.