GLaaS: global lineage as a service

Every model becomes explainable on demand.

GLaaS is the registry for what roar captures. Every job, every input, every output — stored as a queryable graph. Resolve any hash back to the code, data, and environment that produced it.

inside GLaaS
ARTIFACT part-000000.parquet (ad9c125) → JOB train.py @ 726f617, pytorch-2.11.0 → ARTIFACT model.pt (7c3a8de)
How it works

Three steps. That's the mental model.

roar observes a run. GLaaS records it. Anyone on the team can query and reproduce — by hash, by tag, by time.

01 — observe

roar captures the run.

At runtime, roar run records every input, output, arg, and dependency — content-hashed at the source.

02 — register

GLaaS stores the recipe.

The job and its DAG get committed to your registry. Inputs and outputs become addressable by hash.

03 — resolve

Anyone can query the graph.

Given a model hash, walk back to the code, data, and environment. Given a dataset hash, walk forward to every model it produced.
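The recipe that gets registered in step 02 can be sketched as a small content-addressed record. A minimal illustration in Python, assuming hypothetical field names and a truncated SHA-256 key (not GLaaS's actual schema or hashing scheme):

```python
import hashlib
import json

def content_hash(data: bytes) -> str:
    """Content-address an artifact: same bytes, same key, wherever it lives."""
    return hashlib.sha256(data).hexdigest()[:7]

# Hypothetical job record -- field names are illustrative, not the real schema.
job = {
    "command": "python train.py --lr 0.01",
    "commit": "726f617",
    "env": ["pytorch-2.11.0", "cuda-12.2"],
    "inputs": {"part-000000.parquet": content_hash(b"training rows")},
    "outputs": {"model.pt": content_hash(b"model weights")},
}
print(json.dumps(job, indent=2))
```

Because inputs and outputs are keyed by hash rather than by path, the same record answers lookups from either side of the run.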

Dereference anything

Point at a hash. Get the full recipe.

A mystery checkpoint in S3 is no longer mysterious. Ask GLaaS how it got made, and it'll tell you.

glaas lookup
% roar show model.pt
  artifact         model.pt   7c3a8de
  produced by job  a4f1092
  command          python train.py --lr 0.01
  commit           726f617
  inputs           part-000000.parquet   ad9c125
                   params.yaml           f0e2b41
  env              pytorch-2.11.0, cuda-12.2
  run at           2026-04-14 14:22:03 UTC

Every answer is a traversal.

GLaaS stores a graph, not a log. Every artifact points to the job that produced it; every job points to the artifacts it consumed. The result is a web of provenance you can walk in either direction.

Want to know if this model saw a particular dataset version? That's one hop. Want to know every model that ever trained on customer_data_v2? Also one hop.
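Those one-hop questions are plain edge lookups once the graph is in hand. A toy sketch of the two traversal directions, using the hashes from the example above (the dict layout is an illustration, not GLaaS internals):

```python
# Toy provenance graph: artifacts point to the job that produced them;
# jobs point to the artifacts they consumed. Hashes are illustrative.
produced_by = {"7c3a8de": "job-a4f1092"}               # artifact -> job
consumed = {"job-a4f1092": ["ad9c125", "f0e2b41"]}     # job -> input artifacts

def upstream(artifact_hash):
    """Walk back: which inputs fed the job that made this artifact?"""
    job = produced_by.get(artifact_hash)
    return consumed.get(job, [])

def downstream(artifact_hash):
    """Walk forward: which artifacts came out of jobs that consumed this one?"""
    jobs = [j for j, inputs in consumed.items() if artifact_hash in inputs]
    return [a for a, j in produced_by.items() if j in jobs]

print(upstream("7c3a8de"))    # ['ad9c125', 'f0e2b41']
print(downstream("ad9c125"))  # ['7c3a8de']
```

The same two functions answer both audit questions: `upstream` tells you what a model saw; `downstream` tells you every model a dataset touched.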

Why this design

Four properties that matter.

A registry is easy to build and hard to build correctly. Here's what makes GLaaS different from a metadata database or a logs backend.

content-addressable

Hashes, not names.

Every artifact and every job is keyed by a content hash. Rename a file, move it, copy it — the hash doesn't care. Identity is structural, not nominal.
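A toy demonstration of that property, using standard SHA-256 (not necessarily the hash GLaaS uses): the filename never enters the key, so renaming or copying a file leaves its identity untouched.

```python
import hashlib
import os
import shutil
import tempfile

def artifact_key(path: str) -> str:
    """Key an artifact by its bytes; the path and name never enter the hash."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

tmp = tempfile.mkdtemp()
original = os.path.join(tmp, "part-000000.parquet")
moved = os.path.join(tmp, "renamed-and-moved.bin")

with open(original, "wb") as f:
    f.write(b"col_a,col_b\n1,2\n")
shutil.copy(original, moved)

# Rename it, move it, copy it -- the key doesn't care.
assert artifact_key(original) == artifact_key(moved)
```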

graph, not log

A DAG of what produced what.

Provenance queries are edge traversals, not full-text searches. "What data made this model?" is one hop. "What models were trained on this dataset?" is another.

append-only

Immutable history.

Past runs don't get rewritten. Audits are meaningful because the record is trustworthy — and reproduction works because nothing moved underneath you.

pluggable storage

Your artifacts stay where they are.

GLaaS records how artifacts were made, not the artifacts themselves. Your S3 / GCS / on-prem blob store remains the home for your data and models.


No storage lock-in.

GLaaS never stores your artifacts. It stores a record of how they were created. If you ever leave, you keep your data, your models, and every recipe — exported as an open JSON graph.

GLaaS is one piece

Pair it with the rest of TReqs.

The registry gets populated by roar and read by TReqs. Each piece is useful alone; together they're a source of truth for AI-native teams.

roar

What writes to the registry.

roar is the CLI that observes your training and pushes lineage to GLaaS. Without roar, the graph is empty.

Read about roar →
TReqs

What reads from the registry.

Training requests pull context from GLaaS — what data, what recipe, what changed — so reviewers don't have to ask.

Why TReqs →
A source of truth your team can query

Stop guessing how your models got made.

GLaaS is included with every TReqs team plan. Start with the free tier, point roar at your workflows, and watch the graph populate itself.