Group key for content-hash detection (required for tracked)
@incremental
Cursor column for incremental state
@incremental_initial
Starting cursor value (default: 1970-01-01T00:00:00Z)
@unique_key vs @group_key
@unique_key identifies individual rows for merge and scd2 kinds. It determines which rows to update, insert, or delete.
@group_key groups rows for the tracked kind. Tracked computes an MD5 content hash per group; if any row in the group changes, the entire group is replaced. Use @group_key when the identity is a grouping concept (e.g. source_file for a file-based pipeline) rather than a row-level primary key.
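The group-level comparison described above can be pictured with a small Python sketch. It is only a conceptual illustration: the JSON serialization, the row ordering, and the helper names `group_hashes` and `groups_to_replace` are assumptions made for this example, not how the runtime actually computes or stores its hashes.

```python
import hashlib
import json
from collections import defaultdict


def group_hashes(rows, group_key):
    """Return {group value: MD5 hex digest of that group's rows}."""
    groups = defaultdict(list)
    for row in rows:
        # Serialize each row deterministically so equal rows hash equally.
        groups[row[group_key]].append(json.dumps(row, sort_keys=True))
    return {
        key: hashlib.md5("\n".join(sorted(serialized)).encode("utf-8")).hexdigest()
        for key, serialized in groups.items()
    }


def groups_to_replace(previous_hashes, rows, group_key):
    """Return the sorted group values that are new or whose content changed."""
    current = group_hashes(rows, group_key)
    return sorted(k for k, h in current.items() if previous_hashes.get(k) != h)


old_rows = [
    {"source_file": "a.csv", "line": 1, "amount": 10},
    {"source_file": "a.csv", "line": 2, "amount": 20},
    {"source_file": "b.csv", "line": 1, "amount": 5},
]
new_rows = [
    {"source_file": "a.csv", "line": 1, "amount": 10},
    {"source_file": "a.csv", "line": 2, "amount": 25},  # one changed row
    {"source_file": "b.csv", "line": 1, "amount": 5},
]

changed = groups_to_replace(group_hashes(old_rows, "source_file"), new_rows, "source_file")
print(changed)  # prints ['a.csv']
```

Only one row in a.csv changed, yet the whole a.csv group is flagged for replacement, while b.csv is left alone.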
@sink works with the table, append, merge, and tracked kinds. It is not supported with scd2 (use @kind: table with WHERE is_current = true instead) or with events. The runtime exposes the raw DuckLake change_type values (insert, update_preimage, update_postimage, delete) to your push function via __ondatra_change_type, and your Starlark code decides how to handle each change type.
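One way to handle the four change types is sketched below in plain Python (not Starlark). The function name `plan_push`, the assumption that `__ondatra_change_type` arrives as a field on each row dict, and the upsert/delete split are all hypothetical choices for illustration; the actual push function signature is defined by the runtime.

```python
CHANGE_TYPE_COLUMN = "__ondatra_change_type"


def plan_push(rows):
    """Split changed rows into upserts and deletes for a downstream system."""
    upserts, deletes = [], []
    for row in rows:
        change_type = row[CHANGE_TYPE_COLUMN]
        # Strip the bookkeeping field before sending the row anywhere.
        payload = {k: v for k, v in row.items() if k != CHANGE_TYPE_COLUMN}
        if change_type in ("insert", "update_postimage"):
            upserts.append(payload)
        elif change_type == "delete":
            deletes.append(payload)
        elif change_type == "update_preimage":
            # The old version of an updated row; the postimage supersedes it.
            continue
        else:
            raise ValueError("unexpected change type: %s" % change_type)
    return upserts, deletes


rows = [
    {"id": 1, "name": "new", CHANGE_TYPE_COLUMN: "insert"},
    {"id": 2, "name": "before", CHANGE_TYPE_COLUMN: "update_preimage"},
    {"id": 2, "name": "after", CHANGE_TYPE_COLUMN: "update_postimage"},
    {"id": 3, "name": "gone", CHANGE_TYPE_COLUMN: "delete"},
]
upserts, deletes = plan_push(rows)
```

Here the preimage is dropped because the target is assumed to need only the latest state; a sink that audits changes might keep it instead.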
Storage and Performance
| Directive | What it does |
| --- | --- |
| @partitioned_by | File partitioning. Supports column names and transforms: year(col), month(col), day(col), hour(col), bucket(N, col). Applied on new writes. |
| @sorted_by | Sorted table hint. Improves query performance via min/max statistics. Applied during compaction (ondatrasql merge). |
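To see why sorting helps min/max statistics, the following Python sketch models data files as fixed-size chunks and counts how many a range filter must read. It is a generic illustration of min/max pruning, not the storage engine's actual logic; `file_stats`, `files_to_scan`, and the chunk size are made up for the example.

```python
def file_stats(values, rows_per_file):
    """Chunk values into 'files' and record each file's (min, max)."""
    chunks = (values[i:i + rows_per_file] for i in range(0, len(values), rows_per_file))
    return [(min(chunk), max(chunk)) for chunk in chunks]


def files_to_scan(stats, lo, hi):
    """Count files whose [min, max] range overlaps the filter lo <= x <= hi."""
    return sum(1 for fmin, fmax in stats if fmax >= lo and fmin <= hi)


values = [7, 42, 3, 18, 35, 1, 29, 11, 40, 22, 5, 33]

# Filter: 10 <= x <= 20, with 4 rows per file.
unsorted_scan = files_to_scan(file_stats(values, 4), 10, 20)
sorted_scan = files_to_scan(file_stats(sorted(values), 4), 10, 20)
print(unsorted_scan, sorted_scan)  # prints 3 1
```

With unsorted data every file's range spans the filter, so all three must be read; after sorting, two of the three files can be skipped from their statistics alone.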