Directives

On this page

Directives are SQL comments that control materialization, change detection, and validation.

-- @kind: merge
-- @unique_key: order_id
-- @incremental: updated_at
-- @constraint: compare(total, >=, 0)
-- @audit: row_count(>, 0)

Core

DirectiveWhat it does
@kindMaterialization strategy: table, merge, append, scd2, tracked, events
@unique_keyRow matching key (required for merge, scd2)
@group_keyGroup key for content-hash detection (required for tracked)
@incrementalCursor column for incremental state
@incremental_initialStarting cursor value (default: 1970-01-01T00:00:00Z)

@unique_key vs @group_key

@unique_key identifies individual rows for merge and scd2 kinds. It determines which rows to update, insert, or delete.

@group_key groups rows for the tracked kind. Tracked computes an MD5 content hash per group — if any row in the group changes, the entire group is replaced. Use @group_key when the identity is a grouping concept (e.g. source_file for a file-based pipeline) rather than a row-level primary key.

-- merge: row-level identity
-- @kind: merge
-- @unique_key: order_id

-- tracked: group-level identity
-- @kind: tracked
-- @group_key: source_file

Outbound Sync

DirectiveWhat it does
@sinkPush function name in lib/ for outbound sync. See Sync Data to External APIs

@sink works with table, append, merge, and tracked kinds. Not supported with scd2 (use @kind: table with WHERE is_current = true instead) or events. The runtime exposes raw DuckLake change_type values (insert, update_preimage, update_postimage, delete) to your push function via __ondatra_change_type. Your Starlark code decides how to handle each change type.

Storage and Performance

DirectiveWhat it does
@partitioned_byFile partitioning. Supports column names and transforms: year(col), month(col), day(col), hour(col), bucket(N, col). Applied on new writes.
@sorted_bySorted table hint. Improves query performance via min/max statistics. Applied during compaction (ondatrasql merge).

Validation

DirectiveWhat it does
@constraintBlock bad data before insert. See Constraints
@auditVerify results after insert, rollback on failure. See Audits
@warningLog issues without failing. See Warnings

Metadata

DirectiveWhat it does
@descriptionTable comment in DuckLake catalog
@columnColumn comment + masking tags. See Mask Sensitive Columns
@expose <column>Serve via OData v4. Key column required. See Serve Data via OData
@extensionLoad a DuckDB extension before execution

Example

-- @kind: merge
-- @unique_key: customer_id
-- @sink: hubspot_push
-- @constraint: not_null(customer_id)
-- @audit: row_count(>, 0)

SELECT customer_id, email, plan
FROM mart.customers
WHERE active = true

Compatibility Matrix

Directivetableappendmergescd2trackedevents
@unique_keyRequiredRequired
@group_keyRequired
@incrementalYesYesYesYesYes
@sinkYesYesYesYes
@partitioned_byHintHintHintHintHint
@sorted_byYesYesYesYesYes
@columnYesYesYesYesYes
@constraintYesYesYesYesYes
@auditYesYesYesYesYes
@warningYesYesYesYesYes
@extensionYesYesYesYesYes
@exposeYesYesYesYesYes
@descriptionYesYesYesYesYesYes