You don't need a data warehouse — you need a catalog

Every data project starts with a warehouse decision. What if that decision didn't exist?

Every data project starts the same way.

You sit down to build a pipeline, and before you write a single line of SQL, you have to make a decision:

Which warehouse are we using?

Snowflake? BigQuery? Redshift? Postgres?

That decision pulls in everything else:

  • How you ingest data
  • How you transform it
  • How you query it
  • How you pay for it

And all of that gets decided before you’ve even started solving the actual problem.

What if that decision didn’t exist?

That’s the idea behind OndatraSQL.

There is no warehouse to provision. No service to spin up. No infrastructure to manage.

Instead, there’s just a catalog.

So what is the catalog?

The simplest way to think about it:

The catalog is where your data lives.

Not behind an API. Not inside a managed service.

Just:

  • Metadata
  • Files

That’s it.

Two parts, one system

The catalog is made up of two things:

1. Metadata

This is the “brain”:

  • Table definitions
  • Schemas
  • Snapshots
  • Lineage
  • Execution history

2. Data files

This is the “body”:

  • Stored as Parquet
  • Versioned over time
  • Append-only

Together, they behave like a warehouse.

But they’re not one.
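
To make the two parts concrete, here is a toy sketch in Python. It is not how OndatraSQL is implemented. The ToyCatalog class, the metadata.json layout, and the file naming are all made up for illustration, and it writes JSON files instead of Parquet so it runs with only the standard library. What it shows is the shape: a small metadata file that lists snapshots, and data files that are only ever added.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


class ToyCatalog:
    """Hypothetical toy catalog: one metadata file (the "brain") plus immutable data files (the "body")."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.data_dir = root / "data"
        self.data_dir.mkdir(parents=True, exist_ok=True)
        self.meta_path = root / "metadata.json"
        if not self.meta_path.exists():
            self._save({"tables": {}})

    def _load(self) -> dict:
        return json.loads(self.meta_path.read_text())

    def _save(self, meta: dict) -> None:
        self.meta_path.write_text(json.dumps(meta, indent=2))

    def append(self, table: str, rows: list) -> int:
        """Write rows to a new data file, record a new snapshot, and return its id."""
        meta = self._load()
        entry = meta["tables"].setdefault(table, {"snapshots": []})
        snapshot_id = len(entry["snapshots"]) + 1
        # Append-only: existing files are never rewritten, each append adds a file.
        # Assumption: JSON stands in for Parquet to keep the sketch dependency-free.
        file_name = f"{table}-{snapshot_id:05d}.json"
        (self.data_dir / file_name).write_text(json.dumps(rows))
        previous = entry["snapshots"][-1]["files"] if entry["snapshots"] else []
        entry["snapshots"].append({"id": snapshot_id, "files": previous + [file_name]})
        self._save(meta)
        return snapshot_id

    def read(self, table: str, snapshot_id: Optional[int] = None) -> list:
        """Read the table as of a snapshot (the latest one if none is given)."""
        snapshots = self._load()["tables"][table]["snapshots"]
        snap = snapshots[-1] if snapshot_id is None else snapshots[snapshot_id - 1]
        rows = []
        for file_name in snap["files"]:
            rows.extend(json.loads((self.data_dir / file_name).read_text()))
        return rows


catalog = ToyCatalog(Path(tempfile.mkdtemp()))
first = catalog.append("orders", [{"id": 1, "amount": 10}])
second = catalog.append("orders", [{"id": 2, "amount": 25}])
print(len(catalog.read("orders")))         # row count in the latest snapshot
print(len(catalog.read("orders", first)))  # row count as of the first snapshot
```

Because old files are never rewritten, reading an earlier version is just a matter of following an older list of files.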

No warehouse, no problem

In a traditional setup, you provision a warehouse:

  • Create a database
  • Allocate compute
  • Configure storage
  • Manage access
  • Keep it running

In OndatraSQL:

You point the runtime at a location.

That location can be:

  • A local file
  • A folder
  • An S3 bucket
  • A server-backed catalog

And that’s enough.
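
As a rough illustration of what "a location" could mean, the sketch below sorts a location string into the four kinds listed above. The describe_location function, the URI schemes, and the example paths are assumptions made for this example. They are not OndatraSQL's configuration syntax.

```python
from urllib.parse import urlparse


def describe_location(location: str) -> str:
    """Classify a catalog location string (hypothetical helper, illustrative schemes)."""
    scheme = urlparse(location).scheme
    if scheme == "s3":
        return "S3 bucket"
    # Assumption: a database-style URL stands for a server-backed catalog.
    if scheme in ("postgres", "postgresql"):
        return "server-backed catalog"
    # Assumption: a trailing slash marks a folder, anything else is a single file.
    if location.endswith("/"):
        return "local folder"
    return "local file"


for example in ("./catalog.db", "./lake/", "s3://my-bucket/lake/", "postgres://host/catalog"):
    print(example, "->", describe_location(example))
```

The point is only that the choice collapses into one string, instead of a provisioning project.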

What the runtime does

You don’t interact with the catalog directly.

OndatraSQL manages it for you:

  • Creating tables
  • Evolving schemas
  • Detecting changes
  • Managing snapshots
  • Ensuring consistency

Every time you run your pipeline, it produces a new version of your data.

No migrations. No manual state tracking. No orchestration glue.
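
Here is one way the "detecting changes" and "evolving schemas" steps could work in principle. This is a sketch under assumptions, not OndatraSQL's actual logic: the fingerprint and plan_run functions, the stored state layout, and the action names are all hypothetical. The idea is to compare what was recorded on the last run with what the model looks like now, and derive the next step from the difference.

```python
import hashlib


def fingerprint(sql: str) -> str:
    """Hash the whitespace-normalized query text so edits to a model are detectable."""
    normalized = " ".join(sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def plan_run(previous: dict, sql: str, columns: dict) -> dict:
    """Compare the last recorded state with the current model and pick an action."""
    added = {name: typ for name, typ in columns.items() if name not in previous["columns"]}
    removed = sorted(name for name in previous["columns"] if name not in columns)
    changed = fingerprint(sql) != previous["fingerprint"]
    # Assumed policy: dropped columns force a rebuild, new columns are added in place.
    if not changed and not added and not removed:
        action = "skip"
    elif removed:
        action = "rebuild"
    elif added:
        action = "add_columns"
    else:
        action = "rerun"
    return {"action": action, "added": added, "removed": removed}


previous_state = {
    "fingerprint": fingerprint("SELECT id, amount FROM raw_orders"),
    "columns": {"id": "BIGINT", "amount": "DOUBLE"},
}
plan = plan_run(
    previous_state,
    "SELECT id, amount, currency FROM raw_orders",
    {"id": "BIGINT", "amount": "DOUBLE", "currency": "VARCHAR"},
)
print(plan)
```

Nothing here needs a migration file. The recorded state plays that role, and the runtime updates it after each run.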

Why this matters

This changes how you think about data systems.

You don’t start with infrastructure anymore.

You start with:

SELECT ...
FROM ...

And the rest follows.

Compare that to the modern data stack

A typical setup looks like this:

  • Airbyte for ingestion
  • dbt for transformations
  • Airflow for orchestration
  • Kafka for events
  • Snowflake for storage

Each tool solves one piece.

But each one assumes the rest already exists.

The catalog flips that model

With OndatraSQL:

  • Storage is just files
  • Metadata is just a catalog
  • Execution happens locally

There’s no central service holding everything together.

The runtime does that instead.

One sentence

The catalog is your warehouse — but without the warehouse.

And that’s the point

You don’t need to spend a week setting up infrastructure just to answer a question.

You don’t need a stack just to run SQL.

You don’t need a warehouse to have one.