Model Kinds
Choose data behavior with one directive — not pipeline logic
Model kinds define how your data behaves. With one directive, you choose how data is stored, how changes are handled, and how updates are applied.
Mental Model
Same SQL — different behavior:
SELECT * FROM raw.orders
-- @kind: append → only adds new rows
-- @kind: merge → updates existing, inserts new
-- @kind: scd2 → keeps full history of every change
-- @kind: table → rebuilds when something changed
Which Kind Should You Use?
| Kind | Use for |
|---|---|
table | Aggregates, marts, lookup tables |
view | Lightweight transforms, staging |
append | Event logs, fact tables, audit trails |
merge | User/customer tables (upsert) |
scd2 | History tracking, dimension tables |
partition | Large partitioned fact tables |
events | HTTP event ingestion |
view
No storage. Always up-to-date. Resolves at query time.
-- @kind: view
SELECT order_id, customer_id, total
FROM raw.orders WHERE total > 0
Recreated if definition changes. Skipped otherwise.
table (default)
Rebuilds only when something changed. Tracks upstream dependencies — if nothing changed upstream, nothing runs.
-- @kind: table
SELECT * FROM raw.products WHERE is_active = true
| Condition | What happens |
|---|---|
| First run | Full build |
| SQL changed | Rebuild |
| Upstream changed | Rebuild |
| Nothing changed | Skip — no query, no I/O |
append
Add new data. Never updates existing rows. Smart CDC on incremental runs.
-- @kind: append
SELECT * FROM raw.events
Source tables are rewritten to change-only queries via DuckLake time-travel.
merge
Like an UPSERT. Updates existing rows, inserts new ones. Smart CDC on incremental runs.
-- @kind: merge
-- @unique_key: customer_id
SELECT customer_id, name, email, tier
FROM raw.customers
Requires @unique_key.
scd2
Keeps every version of a row. Full history, permanent. Uses full-state comparison — not Smart CDC.
-- @kind: scd2
-- @unique_key: product_id
SELECT product_id, name, price, category
FROM raw.products
OndatraSQL adds valid_from_snapshot, valid_to_snapshot, and is_current automatically. Changed rows get closed and a new version inserted.
Requires @unique_key. History survives snapshot expiration.
partition
Replaces only affected partitions. Smart CDC on incremental runs.
-- @kind: partition
-- @unique_key: year, month
SELECT order_id, total,
EXTRACT(YEAR FROM order_date) AS year,
EXTRACT(MONTH FROM order_date) AS month
FROM raw.orders
Requires @unique_key (supports multiple columns).
events
Send events via HTTP. No Kafka.
-- @kind: events
event_name VARCHAR NOT NULL,
page_url VARCHAR,
user_id VARCHAR,
received_at TIMESTAMPTZ
Defines a column schema, not a SELECT. See Event Collection.
How Changes Are Detected
| Kind | Strategy |
|---|---|
view | Definition hash |
table | Dependency tracking — skip if unchanged |
append / merge / partition | Smart CDC via time-travel |
scd2 | Full-state comparison |
events | Always flush |
Storage Hints
DuckLake optimizations for any materializing kind:
-- @sorted_by: updated_at → better statistics, faster queries
-- @partitioned_by: region → automatic file organization
Why Model Kinds Exist
In most tools, materialization logic is spread across SQL, orchestration config, and custom scripts. Model kinds encode it directly in the model — one line replaces pipeline logic.
| Need | Traditional stack | OndatraSQL |
|---|---|---|
| Append data | Custom SQL + scheduler | @kind: append |
| Upsert rows | MERGE + orchestration | @kind: merge |
| Track history | Complex SCD logic | @kind: scd2 |
| Collect events | Kafka + consumer | @kind: events |
Ondatra Labs