Directives
Directives turn SQL into a data pipeline
Directives turn SQL into a data pipeline. They are comments that control how data is materialized, how changes are detected, and how data is validated.
You write SQL. Directives define everything else.
Mental Model
- SQL → what the data is
- Directives → how it behaves
Together, they define a complete pipeline step.
Before vs After
Without directives:
SELECT * FROM raw.orders
Just a query.
With directives:
-- @kind: merge
-- @unique_key: order_id
-- @incremental: updated_at
-- @constraint: total >= 0
-- @audit: row_count > 0
SELECT * FROM raw.orders
A fully managed pipeline — updates existing rows, blocks invalid data, verifies the result, tracks changes automatically.
Grouped Reference
Core
Control how data is materialized:
| Directive | What it does |
|---|---|
@kind | Materialization strategy — table, merge, append, scd2, partition, view, events |
@unique_key | Row matching key (required for merge, scd2, partition) |
@incremental | Cursor column for incremental state |
@incremental_initial | Starting cursor value (default: 1970-01-01T00:00:00Z) |
Storage & Performance
| Directive | What it does |
|---|---|
@partitioned_by | DuckLake file partitioning — organizes data into separate files per key |
@sorted_by | DuckLake sorted table — better statistics for faster queries |
Validation
| Directive | What it does |
|---|---|
@constraint | Block bad data before insert (26 patterns) |
@audit | Verify results after insert, rollback on failure (17 patterns) |
@warning | Log issues without failing |
Metadata
| Directive | What it does |
|---|---|
@description | Table comment in DuckLake catalog |
@column | Column comment + masking tags |
Runtime
| Directive | What it does |
|---|---|
@extension | Load a DuckDB extension before execution |
Syntax by File Type
Same directives, all formats:
-- @kind: merge
-- @unique_key: order_id
-- @constraint: order_id NOT NULL
SELECT * FROM raw.orders
# @kind: append
# @incremental: updated_at
resp = http.get("https://api.example.com/orders")
for order in resp.json:
save.row(order)
kind: merge
unique_key: order_id
constraints:
- order_id NOT NULL
Column Masking
Tags in @column directives trigger automatic masking during materialization:
-- @column: email = Contact email | mask_email | PII
-- @column: ssn = SSN | mask_ssn | sensitive
Define masking macros in config/macros.sql:
CREATE OR REPLACE MACRO mask_email(val) AS
regexp_replace(val::VARCHAR, '(.).*@', '\1***@');
CREATE OR REPLACE MACRO mask_ssn(val) AS
'***-**-' || val[-4:];
Tags with mask, hash, or redact prefixes are applied as macro calls. Other tags (PII, sensitive) are metadata-only — stored for governance, no execution effect.
Directive Compatibility
| Directive | table | view | append | merge | scd2 | partition | events |
|---|---|---|---|---|---|---|---|
@description | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
@unique_key | — | — | — | Required | Required | Required | — |
@partitioned_by | Hint | — | Hint | Hint | Hint | — | — |
@sorted_by | Yes | — | Yes | Yes | Yes | Yes | — |
@incremental | — | — | Yes | Yes | Yes | Yes | — |
@column | Yes | — | Yes | Yes | Yes | Yes | — |
@constraint | Yes | — | Yes | Yes | Yes | Yes | — |
@audit | Yes | — | Yes | Yes | Yes | Yes | — |
@warning | Yes | — | Yes | Yes | Yes | Yes | — |
@extension | Yes | Yes | Yes | Yes | Yes | Yes | — |
Why Directives Exist
Most tools split pipeline configuration across multiple systems — SQL for transformations, config files for behavior, separate tools for validation.
Directives unify everything inside the model. One file defines what the data is and how it behaves.
Ondatra Labs