Every data project starts the same way.
You sit down to build a pipeline, and before you write a single line of SQL, you have to make a decision:
Which warehouse are we using?
Snowflake? BigQuery? Redshift? Postgres?
That decision pulls in everything else:
- How you ingest data
- How you transform it
- How you query it
- How you pay for it
All of that before you’ve even started solving the actual problem.
What if that decision didn’t exist?
That’s the idea behind OndatraSQL.
There is no warehouse to provision. No service to spin up. No infrastructure to manage.
Instead, there’s just a catalog.
So what is the catalog?
The simplest way to think about it:
The catalog is where your data lives.
Not behind an API. Not inside a managed service.
Just:
- Metadata
- Files
That’s it.
Two parts, one system
The catalog is made up of two things:
1. Metadata
This is the “brain”:
- Table definitions
- Schemas
- Snapshots
- Lineage
- Execution history
2. Data files
This is the “body”:
- Stored as Parquet
- Versioned over time
- Append-only
Together, they behave like a warehouse.
But they’re not one.
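To make the brain-and-body split concrete, here is a minimal sketch in Python. It is a toy model, not OndatraSQL's actual implementation: `ToyCatalog`, `Snapshot`, the `orders` table, and the file names are all hypothetical, and nothing is written to disk. The point is only that metadata can describe a table as an ordered list of immutable files, one list per version.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """One committed version of a table: its schema and the files visible in it."""
    version: int
    schema: tuple[str, ...]
    files: tuple[str, ...]


@dataclass
class ToyCatalog:
    """Hypothetical in-memory stand-in for the metadata half of a catalog."""
    snapshots: dict[str, list[Snapshot]] = field(default_factory=dict)

    def commit(self, table: str, schema: tuple[str, ...], new_files: list[str]) -> Snapshot:
        history = self.snapshots.setdefault(table, [])
        previous_files = history[-1].files if history else ()
        # Append-only: earlier files are never rewritten, new ones are added.
        snap = Snapshot(len(history) + 1, schema, previous_files + tuple(new_files))
        history.append(snap)
        return snap

    def files_at(self, table: str, version: int) -> tuple[str, ...]:
        """Return the data files that make up `table` as of `version`."""
        return self.snapshots[table][version - 1].files


catalog = ToyCatalog()
catalog.commit("orders", ("id", "amount"), ["orders/part-0001.parquet"])
catalog.commit("orders", ("id", "amount"), ["orders/part-0002.parquet"])
```

Asking for `catalog.files_at("orders", 1)` still returns only the first file, which is what "versioned over time" means in this sketch: old versions stay readable because nothing they point to is ever changed.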
No warehouse, no problem
In a traditional setup, you provision a warehouse:
- Create a database
- Allocate compute
- Configure storage
- Manage access
- Keep it running
In OndatraSQL:
You point the runtime at a location.
That location can be:
- A local file
- A folder
- An S3 bucket
- A server-backed catalog
And that’s enough.
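One way to picture "point the runtime at a location" is a function that looks at a single string and decides what kind of catalog it is. The sketch below is an assumption for illustration only: `describe_location` is a hypothetical helper, and the rules it uses (an `s3://` scheme for buckets, a `postgres://` scheme for a server-backed catalog, a file extension to tell a file from a folder) are guesses, not OndatraSQL's documented behavior.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def describe_location(location: str) -> str:
    """Classify a catalog location string (hypothetical rules, for illustration)."""
    parsed = urlparse(location)
    if parsed.scheme == "s3":
        return f"S3 bucket '{parsed.netloc}'"
    if parsed.scheme in ("postgres", "postgresql"):
        # Assumption: a server-backed catalog is addressed by a database URL.
        return f"server-backed catalog on '{parsed.hostname}'"
    if PurePosixPath(location).suffix:
        # Assumption: a path with an extension is a single local file.
        return "local file"
    return "folder"


# Example locations; every name here is made up.
for loc in ("catalog.db", "./data/catalog", "s3://my-bucket/catalog", "postgres://db.internal/catalog"):
    print(loc, "->", describe_location(loc))
```

Whatever the real rules are, the idea is the same: the only setup step is a string that says where the catalog lives.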
What the runtime does
You don’t interact with the catalog directly.
OndatraSQL takes care of everything:
- Creating tables
- Evolving schemas
- Detecting changes
- Managing snapshots
- Ensuring consistency
Every time you run your pipeline, it produces a new version of your data.
No migrations. No manual state tracking. No orchestration glue.
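Here is a rough sketch of what "every run produces a new version" could look like, again in Python and again as a toy. `run_model`, the `history` list, and the `raw_orders` table are hypothetical. Change detection is reduced to hashing the SQL text, and schema evolution is reduced to appending new columns. A real runtime would do far more.

```python
from __future__ import annotations

import hashlib


def fingerprint(sql: str) -> str:
    """Hash the model's SQL so a changed definition can be detected."""
    return hashlib.sha256(sql.strip().encode("utf-8")).hexdigest()


def run_model(history: list[dict], sql: str, columns: list[str]) -> dict:
    """Record one pipeline run as a new version appended to `history`."""
    previous = history[-1] if history else None
    digest = fingerprint(sql)
    old_columns = previous["columns"] if previous else []
    # Schema evolution, simplified: keep existing columns, append new ones.
    merged = old_columns + [c for c in columns if c not in old_columns]
    version = {
        "version": len(history) + 1,
        "fingerprint": digest,
        "definition_changed": previous is None or previous["fingerprint"] != digest,
        "columns": merged,
    }
    history.append(version)
    return version


history: list[dict] = []
run_model(history, "SELECT id FROM raw_orders", ["id"])
run_model(history, "SELECT id FROM raw_orders", ["id"])
run_model(history, "SELECT id, amount FROM raw_orders", ["id", "amount"])
```

After these three runs, `history` holds three versions. The second is flagged as an unchanged definition, and the third picks up the new `amount` column without anyone writing a migration.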
Why this matters
This changes how you think about data systems.
You don’t start with infrastructure anymore.
You start with:
SELECT ...
FROM ...
And the rest follows.
Compare that to the modern data stack
A typical setup looks like this:
- Airbyte for ingestion
- dbt for transformations
- Airflow for orchestration
- Kafka for events
- Snowflake for storage
Each tool solves one piece.
But each one assumes the rest already exists.
The catalog flips that model
With OndatraSQL:
- Storage is just files
- Metadata is just a catalog
- Execution happens locally
There’s no central service holding everything together.
The runtime does that instead.
One sentence
The catalog is your warehouse — but without the warehouse.
And that’s the point
You don’t need to spend a week setting up infrastructure just to answer a question.
You don’t need a stack just to run SQL.
You don’t need a warehouse to have one.
Ondatra Labs