Concepts
How OndatraSQL works — the mental model
OndatraSQL runs your data pipeline.
You write SQL files. Run one command. Your data is materialized.
The Mental Model
Think of OndatraSQL as a program that runs your data.
- Files = models
- ondatrasql run = execution
- Output = tables in DuckLake
You don’t configure pipelines. You run them.
What Happens When You Run
ondatrasql run
OndatraSQL:
- Finds all models
- Builds a dependency graph
- Detects what changed
- Runs only what’s needed
- Writes results to DuckLake
Done.
Example
raw.events → staging.sessions → mart.daily_traffic
You don’t define this graph. It’s extracted from your SQL.
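To illustrate the idea of extracting a graph from SQL, here is a minimal Python sketch. It is not OndatraSQL's parser: the model bodies, the `dependencies` and `run_order` helpers, and the regex-based reference matching are all assumptions made for illustration. A real implementation would need a proper SQL parser.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical model files (name -> SQL body), invented for this sketch.
MODELS = {
    "raw.events": "SELECT * FROM read_csv('events.csv')",
    "staging.sessions": (
        "SELECT user_id, min(ts) AS started_at "
        "FROM raw.events GROUP BY user_id"
    ),
    "mart.daily_traffic": (
        "SELECT CAST(started_at AS DATE) AS day, count(*) AS sessions "
        "FROM staging.sessions GROUP BY 1"
    ),
}

# Naive assumption: a dependency is any schema.table name after FROM or JOIN.
REF = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_]\w*\.[A-Za-z_]\w*)", re.IGNORECASE)


def dependencies(sql, known):
    """Return the known models that this SQL text reads from."""
    return {ref for ref in REF.findall(sql) if ref in known}


def run_order(models):
    """Order models so that every model runs after the models it reads."""
    graph = {name: dependencies(sql, models) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())


order = run_order(MODELS)
print(" -> ".join(order))
```

The point of the sketch is that the ordering comes from the table names referenced inside each query, so no separate graph definition is needed.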
The Runtime Model
Everything is built into one system:
- Dependency graph → automatic
- Change detection → automatic
- Incremental loading → automatic
- Schema evolution → automatic
- Validation → automatic
No config files. No orchestration layer.
Data Flow
- Data is collected or loaded
- Stored durably on disk
- Loaded into DuckDB
- Transformed via SQL
- Written to DuckLake
Because collected data is stored durably on disk before it is transformed, a failure in a later step does not lose it.
Change Detection
Every model gets a run type:
| Run Type | Meaning |
|---|---|
| skip | Nothing changed |
| incremental | New data in source |
| full | Upstream changed |
| backfill | First run or definition changed |
You don’t write incremental logic. It’s built in.
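To make the run types concrete, here is a minimal Python sketch of how such a decision could be expressed. It is not OndatraSQL's implementation: the `ModelState` fields, the `choose_run_type` function, and the priority order among the run types are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelState:
    """Hypothetical facts a runtime might know about one model before a run."""
    first_run: bool = False
    definition_changed: bool = False
    upstream_changed: bool = False
    new_source_data: bool = False


def choose_run_type(state: ModelState) -> str:
    """Map a model's state to one of the four run types in the table above.

    Assumed priority: backfill, then full, then incremental, then skip.
    """
    if state.first_run or state.definition_changed:
        return "backfill"
    if state.upstream_changed:
        return "full"
    if state.new_source_data:
        return "incremental"
    return "skip"


print(choose_run_type(ModelState()))
print(choose_run_type(ModelState(new_source_data=True)))
print(choose_run_type(ModelState(upstream_changed=True, new_source_data=True)))
print(choose_run_type(ModelState(first_run=True)))
```

In this sketch the most expensive applicable run type wins, so a model whose definition changed is rebuilt even if new source data also arrived. Whether OndatraSQL resolves overlapping conditions this way is not stated in this page.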
Safety
- Constraints block bad data before it’s written
- Audits catch regressions after the write
- Failures roll back automatically via time-travel
Every run is reproducible.
What OndatraSQL Is Not
OndatraSQL runs on a single machine. It is not:
- A distributed system
- A streaming platform
- A cloud warehouse
- An orchestrator
You don’t orchestrate data pipelines. You run them.
Models
Everything is a model — ingestion, transformation, and events in one system
Directives
Directives turn SQL into a data pipeline
Model Kinds
Choose data behavior with one directive — not pipeline logic
Smart CDC
Write once. Run incrementally forever.
Schema Evolution
No migrations. Ever.
Incremental Models
Incremental is not a feature — it's the default
Dependency Graph
Your queries define the graph
Sandbox Mode
See the result before you commit it
Maintenance
Keep your data fast and your storage under control
Event Collection
Built-in event ingestion — no Kafka, no message broker