Blueprints Blog Contact About

Directives

Directives turn SQL into a data pipeline

Directives turn SQL into a data pipeline. They are comments that control how data is materialized, how changes are detected, and how data is validated.

You write SQL. Directives define everything else.

Mental Model

  • SQL → what the data is
  • Directives → how it behaves

Together, they define a complete pipeline step.

Before vs After

Without directives:

SELECT * FROM raw.orders

Just a query.

With directives:

-- @kind: merge
-- @unique_key: order_id
-- @incremental: updated_at
-- @constraint: total >= 0
-- @audit: row_count > 0

SELECT * FROM raw.orders

A fully managed pipeline — updates existing rows, blocks invalid data, verifies the result, tracks changes automatically.

Grouped Reference

Core

Control how data is materialized:

DirectiveWhat it does
@kindMaterialization strategy — table, merge, append, scd2, partition, view, events
@unique_keyRow matching key (required for merge, scd2, partition)
@incrementalCursor column for incremental state
@incremental_initialStarting cursor value (default: 1970-01-01T00:00:00Z)

Storage & Performance

DirectiveWhat it does
@partitioned_byDuckLake file partitioning — organizes data into separate files per key
@sorted_byDuckLake sorted table — better statistics for faster queries

Validation

DirectiveWhat it does
@constraintBlock bad data before insert (26 patterns)
@auditVerify results after insert, rollback on failure (17 patterns)
@warningLog issues without failing

Metadata

DirectiveWhat it does
@descriptionTable comment in DuckLake catalog
@columnColumn comment + masking tags

Runtime

DirectiveWhat it does
@extensionLoad a DuckDB extension before execution

Syntax by File Type

Same directives, all formats:

-- @kind: merge
-- @unique_key: order_id
-- @constraint: order_id NOT NULL

SELECT * FROM raw.orders
# @kind: append
# @incremental: updated_at

resp = http.get("https://api.example.com/orders")
for order in resp.json:
    save.row(order)
kind: merge
unique_key: order_id
constraints:
  - order_id NOT NULL

Column Masking

Tags in @column directives trigger automatic masking during materialization:

-- @column: email = Contact email | mask_email | PII
-- @column: ssn = SSN | mask_ssn | sensitive

Define masking macros in config/macros.sql:

CREATE OR REPLACE MACRO mask_email(val) AS
    regexp_replace(val::VARCHAR, '(.).*@', '\1***@');

CREATE OR REPLACE MACRO mask_ssn(val) AS
    '***-**-' || val[-4:];

Tags with mask, hash, or redact prefixes are applied as macro calls. Other tags (PII, sensitive) are metadata-only — stored for governance, no execution effect.

Directive Compatibility

Directivetableviewappendmergescd2partitionevents
@descriptionYesYesYesYesYesYesYes
@unique_keyRequiredRequiredRequired
@partitioned_byHintHintHintHint
@sorted_byYesYesYesYesYes
@incrementalYesYesYesYes
@columnYesYesYesYesYes
@constraintYesYesYesYesYes
@auditYesYesYesYesYes
@warningYesYesYesYesYes
@extensionYesYesYesYesYesYes

Why Directives Exist

Most tools split pipeline configuration across multiple systems — SQL for transformations, config files for behavior, separate tools for validation.

Directives unify everything inside the model. One file defines what the data is and how it behaves.