Concepts

How OndatraSQL works — the mental model

OndatraSQL runs your data pipeline.

You write SQL files. Run one command. Your data is materialized.

The Mental Model

Think of OndatraSQL as a program that runs your data.

Files = models
ondatrasql run = execution
Output = tables in DuckLake

You don’t configure pipelines. You run them.

What Happens When You Run

ondatrasql run

OndatraSQL:

Finds all models
Builds a dependency graph
Detects what changed
Runs only what’s needed
Writes results to DuckLake

Done.

Example

raw.events → staging.sessions → mart.daily_traffic

You don’t define this graph. It’s extracted from your SQL.
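As a rough illustration of how a graph can be recovered from SQL alone, here is a minimal Python sketch. It is not OndatraSQL's implementation: the `MODELS` dictionary, the regex-based reference finder, and the function names are assumptions made for the example, and a real tool would use a proper SQL parser rather than a regular expression.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical model definitions, keyed by the table each one produces.
MODELS = {
    "mart.daily_traffic": "SELECT day, count(*) AS sessions FROM staging.sessions GROUP BY day",
    "staging.sessions": "SELECT session_id, min(ts) AS day FROM raw.events GROUP BY session_id",
    "raw.events": "SELECT * FROM read_json('events.json')",
}

# Naive reference finder: schema.table names that follow FROM or JOIN.
TABLE_REF = re.compile(r"\b(?:FROM|JOIN)\s+([a-z_]\w*\.[a-z_]\w*)", re.IGNORECASE)


def dependencies(sql, known):
    """Return the known models that this SQL text reads from."""
    return {ref for ref in TABLE_REF.findall(sql) if ref in known}


def run_order(models):
    """Order models so that every model runs after its upstream models."""
    graph = {name: dependencies(sql, models) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


print(run_order(MODELS))
# ['raw.events', 'staging.sessions', 'mart.daily_traffic']
```

The order of the dictionary does not matter. The execution order falls out of which tables each query reads.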

The Runtime Model

Everything is built into one system:

Dependency graph → automatic
Change detection → automatic
Incremental loading → automatic
Schema evolution → automatic
Validation → automatic

No config files. No orchestration layer.

Data Flow

Data is collected or loaded
Stored durably on disk
Loaded into DuckDB
Transformed via SQL
Written to DuckLake

If a later step fails, the data already stored on disk is not lost.

Change Detection

Every model gets a run type:

| Run Type | Meaning |
|---|---|
| `skip` | Nothing changed |
| `incremental` | New data in source |
| `full` | Upstream changed |
| `backfill` | First run or definition changed |

You don’t write incremental logic. It’s built in.
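To make the four run types concrete, the following Python sketch shows one way such a decision could be made. It is an assumption-laden illustration, not how OndatraSQL actually decides: the `ModelState` record, the use of a SHA-256 hash of the SQL text, the boolean inputs, and the order in which the conditions are checked are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelState:
    """What a hypothetical runner remembers about a model's last run."""
    sql_hash: str


def sql_hash(sql: str) -> str:
    """Fingerprint a model definition so edits can be detected."""
    return hashlib.sha256(sql.strip().encode()).hexdigest()


def run_type(sql: str, previous: Optional[ModelState],
             upstream_changed: bool, source_has_new_data: bool) -> str:
    """Pick a run type. The precedence used here is an assumption."""
    if previous is None or previous.sql_hash != sql_hash(sql):
        return "backfill"      # first run or definition changed
    if upstream_changed:
        return "full"          # upstream changed
    if source_has_new_data:
        return "incremental"   # new data in source
    return "skip"              # nothing changed


query = "SELECT * FROM raw.events"
seen = ModelState(sql_hash=sql_hash(query))
print(run_type(query, None, False, False))   # backfill
print(run_type(query, seen, False, True))    # incremental
print(run_type(query, seen, False, False))   # skip
```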

Safety

Constraints block bad data before it’s written
Audits catch regressions after
Failures roll back automatically via time-travel

Every run is reproducible.
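The three safety steps can be pictured with a small in-memory sketch in Python. Everything here is a stand-in: `SnapshotTable` imitates snapshot-based storage with a plain list, and `guarded_write`, the constraint, and the audit are hypothetical names chosen for the example. OndatraSQL's own constraint and audit syntax is not shown.

```python
class SnapshotTable:
    """Toy table that keeps every committed version, oldest first."""

    def __init__(self):
        self.snapshots = [[]]  # snapshot 0 is the empty table

    @property
    def rows(self):
        return self.snapshots[-1]

    def commit(self, rows):
        self.snapshots.append(list(rows))

    def rollback(self):
        self.snapshots.pop()  # return to the previous snapshot


def guarded_write(table, new_rows, constraint, audit):
    """Check rows before writing, audit after, and undo on failure."""
    if any(not constraint(row) for row in new_rows):
        return "blocked"  # bad data never reaches the table
    before = table.rows
    table.commit(before + new_rows)
    if not audit(before, table.rows):
        table.rollback()
        return "rolled_back"
    return "committed"


def has_id(row):
    return row["id"] is not None


def ids_unique(before, after):
    return len({row["id"] for row in after}) == len(after)


table = SnapshotTable()
print(guarded_write(table, [{"id": 1}], has_id, ids_unique))     # committed
print(guarded_write(table, [{"id": None}], has_id, ids_unique))  # blocked
print(guarded_write(table, [{"id": 1}], has_id, ids_unique))     # rolled_back
print(table.rows)                                                # [{'id': 1}]
```

The constraint runs before anything is written, the audit compares the table before and after the write, and a failed audit restores the previous snapshot.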

What OndatraSQL Is Not

OndatraSQL runs on a single machine. It is not:

A distributed system
A streaming platform
A cloud warehouse
An orchestrator

You don’t orchestrate data pipelines. You run them.

Models

Everything is a model — ingestion, transformation, and events in one system

Directives

Directives turn SQL into a data pipeline

Model Kinds

Choose data behavior with one directive — not pipeline logic

Smart CDC

Write once. Run incrementally forever.

Schema Evolution

No migrations. Ever.

Incremental Models

Incremental is not a feature — it's the default

Dependency Graph

Your queries define the graph

Sandbox Mode

See the result before you commit it

Maintenance

Keep your data fast and your storage under control

Event Collection

Built-in event ingestion — no Kafka, no message broker