Blueprints Blog Contact About

Model Kinds

Choose data behavior with one directive — not pipeline logic

Model kinds define how your data behaves. With one directive, you choose how data is stored, how changes are handled, and how updates are applied.

Mental Model

Same SQL — different behavior:

SELECT * FROM raw.orders

-- @kind: append    → only adds new rows
-- @kind: merge     → updates existing, inserts new
-- @kind: scd2      → keeps full history of every change
-- @kind: table     → rebuilds when something changed

Which Kind Should You Use?

KindUse for
tableAggregates, marts, lookup tables
viewLightweight transforms, staging
appendEvent logs, fact tables, audit trails
mergeUser/customer tables (upsert)
scd2History tracking, dimension tables
partitionLarge partitioned fact tables
eventsHTTP event ingestion

view

No storage. Always up-to-date. Resolves at query time.

-- @kind: view

SELECT order_id, customer_id, total
FROM raw.orders WHERE total > 0

Recreated if definition changes. Skipped otherwise.

table (default)

Rebuilds only when something changed. Tracks upstream dependencies — if nothing changed upstream, nothing runs.

-- @kind: table

SELECT * FROM raw.products WHERE is_active = true
ConditionWhat happens
First runFull build
SQL changedRebuild
Upstream changedRebuild
Nothing changedSkip — no query, no I/O

append

Add new data. Never updates existing rows. Smart CDC on incremental runs.

-- @kind: append

SELECT * FROM raw.events

Source tables are rewritten to change-only queries via DuckLake time-travel.

merge

Like an UPSERT. Updates existing rows, inserts new ones. Smart CDC on incremental runs.

-- @kind: merge
-- @unique_key: customer_id

SELECT customer_id, name, email, tier
FROM raw.customers

Requires @unique_key.

scd2

Keeps every version of a row. Full history, permanent. Uses full-state comparison — not Smart CDC.

-- @kind: scd2
-- @unique_key: product_id

SELECT product_id, name, price, category
FROM raw.products

OndatraSQL adds valid_from_snapshot, valid_to_snapshot, and is_current automatically. Changed rows get closed and a new version inserted.

Requires @unique_key. History survives snapshot expiration.

partition

Replaces only affected partitions. Smart CDC on incremental runs.

-- @kind: partition
-- @unique_key: year, month

SELECT order_id, total,
    EXTRACT(YEAR FROM order_date) AS year,
    EXTRACT(MONTH FROM order_date) AS month
FROM raw.orders

Requires @unique_key (supports multiple columns).

events

Send events via HTTP. No Kafka.

-- @kind: events

event_name VARCHAR NOT NULL,
page_url VARCHAR,
user_id VARCHAR,
received_at TIMESTAMPTZ

Defines a column schema, not a SELECT. See Event Collection.

How Changes Are Detected

KindStrategy
viewDefinition hash
tableDependency tracking — skip if unchanged
append / merge / partitionSmart CDC via time-travel
scd2Full-state comparison
eventsAlways flush

Storage Hints

DuckLake optimizations for any materializing kind:

-- @sorted_by: updated_at       → better statistics, faster queries
-- @partitioned_by: region       → automatic file organization

Why Model Kinds Exist

In most tools, materialization logic is spread across SQL, orchestration config, and custom scripts. Model kinds encode it directly in the model — one line replaces pipeline logic.

NeedTraditional stackOndatraSQL
Append dataCustom SQL + scheduler@kind: append
Upsert rowsMERGE + orchestration@kind: merge
Track historyComplex SCD logic@kind: scd2
Collect eventsKafka + consumer@kind: events