Lineage & Metadata
On this page
Every run records column-level lineage, schema changes, dependencies, and git context in DuckLake. Query it all with SQL.
CLI
ondatrasql lineage overview # All models with dependencies
ondatrasql lineage staging.orders # Column lineage for one model
ondatrasql lineage staging.orders.total # Trace one column
Output is rendered as ASCII art with box-drawing characters.
Transformation Types
Each column is classified by how it was derived:
| Type | Meaning | Example |
|---|---|---|
IDENTITY |
Direct copy | SELECT name |
AGGREGATION |
Aggregated value | SUM(amount) |
ARITHMETIC |
Computed | price * quantity |
CONDITIONAL |
Logic applied | CASE WHEN ... |
CAST |
Type conversion | CAST(id AS VARCHAR) |
FUNCTION |
Function call | UPPER(name) |
Lineage is extracted from the SQL AST, across CTEs, joins, and subqueries.
Commit Metadata
Every run stores metadata in commit_extra_info on the DuckLake snapshot. All fields are JSON.
| Field | Description |
|---|---|
model |
Target table name |
kind |
table, append, merge, scd2, partition, tracked, events |
run_type |
backfill, incremental, full, skip, flush |
rows_affected |
Rows written |
start_time |
Run start (ISO 8601) |
end_time |
Run end (ISO 8601) |
duration_ms |
Execution time in milliseconds |
steps |
Sub-step breakdown (array of {name, duration_ms, status}) |
column_lineage |
Source columns with transformation types |
depends |
Upstream table dependencies |
columns |
Output column definitions |
schema_hash |
Detects schema evolution |
sql_hash |
Triggers backfill on change |
dag_run_id |
Run identifier |
source_file |
Model source path |
duckdb_version |
DuckDB version used |
git_commit |
Commit SHA |
git_branch |
Branch name |
git_repo_url |
Repository URL |
error |
Error message (on failure) |
Querying Metadata
Metadata is stored in DuckLake snapshots. Query with SQL:
-- Recent runs
SELECT
commit_extra_info->>'model' AS model,
commit_extra_info->>'run_type' AS run_type,
commit_extra_info->>'rows_affected' AS rows,
commit_extra_info->>'duration_ms' AS ms
FROM lake.snapshots()
ORDER BY snapshot_id DESC
LIMIT 10;
-- Column lineage for a model
SELECT
commit_extra_info->>'model' AS model,
commit_extra_info->>'column_lineage' AS lineage
FROM lake.snapshots()
WHERE LOWER(commit_extra_info->>'model') = 'mart.revenue'
ORDER BY snapshot_id DESC
LIMIT 1;
-- Models that changed schema
SELECT
commit_extra_info->>'model' AS model,
commit_extra_info->>'schema_hash' AS hash
FROM lake.snapshots()
WHERE commit_extra_info->>'run_type' = 'backfill';
OndatraSQL