The modern data stack didn’t evolve to make data work.
It evolved to make data scale.
Those are not the same thing.
The Assumption Everything Is Built On
Most tools assume:
- You already have a warehouse
- You already have pipelines
- You already have orchestration
And your problem is: how do we make this scale?
But for a lot of teams, the problem is simpler:
How do we just get the data to work?
What You Actually Need to Do
In most real-world cases, the job is:
- Collect some data
- Clean it
- Combine it
- Run a few aggregations
- Get an answer
That’s it.
Not: manage a streaming system, orchestrate DAGs, maintain connectors, debug infrastructure.
But Look What the Stack Became
To do something simple, you’re told to use:
- dbt
- Airbyte
- Apache Airflow
- Snowflake
- Apache Kafka
Each tool is reasonable. Together, they’re absurd.
The Hidden Cost
The problem isn’t just complexity. It’s where your time goes.
Instead of writing queries, understanding data, and delivering results — you’re wiring systems together, debugging pipelines, and managing state.
You’re not doing data work anymore. You’re doing infrastructure work.
The Inversion
What if we flipped the assumption?
Instead of: build a system, then run data through it
You start with: run data, and the system emerges from it.
A simpler model:
- SQL defines transformations
- Queries define dependencies
- Changes define execution
No DAGs. No orchestration layer. No incremental logic.
Just: data in → data out.
The Real Optimization
The industry optimized for horizontal scale.
But most teams need vertical simplicity.
Something that runs on one machine, requires no setup, and works immediately.
Why This Matters
Because the cost of complexity isn’t just technical. It’s cognitive.
Every extra tool adds friction, adds failure modes, and slows iteration.
And most of the time, it’s unnecessary.
The Question Worth Asking
Before you add another tool to your stack:
What problem am I actually solving?
And: Could I solve it with less?
Final Thought
The best data system isn’t the one that scales the most.
It’s the one that lets you get the answer, on time, every time.
Ondatra Labs