
What OndatraSQL is, why it exists, and the design decisions behind it.
OndatraSQL is a data runtime that handles ingestion, transformation, validation, and scheduling in a single binary. It is built on DuckDB for query execution and DuckLake for catalog management, snapshots, and time-travel.
One executable with no external dependencies. Install it, create a project, run it.
Pipelines execute inside the binary itself. No separate scheduler service, no separate query engine, no separate metadata database. The same process runs on a laptop, in CI, or on a server.
Transformations are SQL files. The runtime handles materialization, change detection, schema evolution, and dependency ordering.
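To make "dependency ordering" concrete, here is a minimal sketch in Python of the general technique: find which models each SQL file reads from, then sort topologically. It is a generic illustration, not OndatraSQL's implementation. The model names, the SQL text, and the helpers `build_graph` and `execution_order` are all hypothetical, and the regex stands in for a real SQL parser.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical models: name -> SQL text. These names and queries are
# illustrative only and do not come from OndatraSQL.
MODELS = {
    "staging.orders": "SELECT * FROM raw.orders WHERE amount > 0",
    "staging.customers": "SELECT id, name FROM raw.customers",
    "mart.revenue": """
        SELECT c.name, SUM(o.amount) AS revenue
        FROM staging.orders AS o
        JOIN staging.customers AS c ON c.id = o.customer_id
        GROUP BY c.name
    """,
}

# Naive reference finder: any schema.table name that follows FROM or JOIN.
# A real runtime would parse the SQL rather than rely on a regex.
REF_PATTERN = re.compile(
    r"\b(?:FROM|JOIN)\s+([A-Za-z_]\w*\.[A-Za-z_]\w*)", re.IGNORECASE
)


def build_graph(models):
    """Map each model to the set of other managed models it reads from."""
    graph = {}
    for name, sql in models.items():
        refs = set(REF_PATTERN.findall(sql))
        # Keep only references to models we manage; raw.* tables are
        # treated as external sources with no upstream work to schedule.
        graph[name] = {ref for ref in refs if ref in models and ref != name}
    return graph


def execution_order(models):
    """Return model names so that every model follows its dependencies."""
    return list(TopologicalSorter(build_graph(models)).static_order())


if __name__ == "__main__":
    for name in execution_order(MODELS):
        print(name)
```

In this sketch both staging models come out before `mart.revenue`, because the graph records them as its predecessors. A circular reference between models would make `static_order()` raise `graphlib.CycleError`.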
Constraints, audits, and warnings run as part of the pipeline, not as a separate step.
Every run creates a DuckLake snapshot. Previous states are queryable via time-travel. Failed runs leave no trace.
Only changed data is processed. The runtime uses DuckLake's table_changes() to detect which rows changed between snapshots.
Run the full pipeline against a temporary copy of the catalog. See row diffs, schema changes, and downstream impact before committing.

OndatraSQL is released under the GNU AGPL v3. Source code is available on GitHub.
Ondatra is the genus name for the muskrat (Ondatra zibethicus), a semi-aquatic rodent that builds tunnels and channels through lakes and wetlands. It lives in the same lakes as ducks, and it builds pipes through them. DuckDB. DuckLake. Pipelines.
OndatraSQL is created and maintained by Marcus Hernandez, who spent nine years working on publisher revenue in ad tech (ad servers, SSPs, header bidding, and reporting). The constraint was always the same: data had to be correct before the morning standup, and the pipeline had to run on whatever infrastructure was available.
That experience shaped OndatraSQL's design: a single system that collects, transforms, and validates data without requiring separate infrastructure for each concern.