How the tracked model kind handles outbound data sync
OndatraSQL's tracked model kind uses content hashing to detect row changes and sync only modified data to external APIs for outbound sync.
OndatraSQL’s tracked model kind enables outbound data sync with automatic change detection. This post describes how it works and how it differs from merge for outbound use cases.
How tracked models work
A tracked model groups rows by @group_key and computes a content hash per group. On each run, OndatraSQL compares the current hashes against the previous run. Only groups with changed hashes are written.
-- models/sync/invoices.sql
-- @kind: tracked
-- @group_key: invoice_id
-- @sink: erp_push
SELECT invoice_id, line_item, quantity, price
FROM mart.invoice_lines
On the first run, all rows are materialized and pushed. On subsequent runs, only groups where any row changed are pushed to the API.
Content hashing
OndatraSQL adds a _content_hash column to the target table automatically. The hash is computed from all column values in the row group (excluding the hash column itself).
| Run | Group invoice_id=INV-001 hash | Action |
|---|---|---|
| First | a1b2c3... | Push all rows for INV-001 |
| Second | a1b2c3... | Skip (unchanged) |
| Third | d4e5f6... | Push all rows for INV-001 (hash changed) |
Raw change types
Since v0.18.0, the runtime exposes raw DuckLake change types to your push function. Tracked materializes via DELETE+INSERT, so table_changes() produces delete and insert events. Your push function receives each row with __ondatra_change_type set to one of these values.
To distinguish updates from real deletes, group by @group_key:
- Key has both
deleteandinsert→ the entity was updated (push the inserts) - Key has only
delete→ the entity was removed from source - Key has only
insert→ new entity
This is the same logic that was previously built into the runtime, but now lives in your Starlark code where you can customize it.
tracked vs merge for outbound sync
| merge | tracked | |
|---|---|---|
| Rows per key | Exactly one | Multiple |
| Change detection | Per row (MERGE INTO) | Per group (content hash) |
| Change types | insert, update_postimage | delete, insert |
| Delete detection | Not automatic (no DELETE in DuckLake) | Automatic (groups that disappear) |
| Use case | CRM contacts, user profiles | Invoices, orders with line items |
Use merge when one row = one API object (HubSpot contact, Salesforce account). Use tracked when multiple rows = one API object (invoice with line items, order with products), or when you need delete detection.
Push function
The push function in lib/ receives batches of rows with __ondatra_change_type and __ondatra_rowid. For tracked, you group by @group_key to build complete API payloads:
# lib/erp_push.star
API = {
"base_url": "https://erp.example.com",
"auth": {"env": "ERP_API_KEY"},
"retry": 3,
"backoff": 2,
"push": {
"batch_size": 100,
"batch_mode": "sync",
},
}
def push(rows=[], batch_number=1, kind="tracked", key_columns=[]):
results = {}
# Group rows by invoice_id
by_key = {}
for row in rows:
iid = row.get("invoice_id", "")
if iid not in by_key:
by_key[iid] = {"inserts": [], "deletes": []}
ct = row["__ondatra_change_type"]
if ct == "insert":
by_key[iid]["inserts"].append(row)
elif ct == "delete":
by_key[iid]["deletes"].append(row)
for iid, group in by_key.items():
if group["inserts"]:
# Updated or new — push the current rows
payload = {
"invoice_id": iid,
"lines": [{k: v for k, v in l.items() if not k.startswith("_")} for l in group["inserts"]],
}
resp = http.put("/api/invoices/" + iid, json=payload)
status = "ok" if resp.ok else "error: " + resp.text
else:
# Only deletes — entity removed from source
resp = http.delete("/api/invoices/" + iid)
status = "ok" if resp.ok else "error: " + resp.text
for row in group["inserts"] + group["deletes"]:
key = str(row["__ondatra_rowid"]) + ":" + row["__ondatra_change_type"]
results[key] = status
return results
Note how the API dict handles auth and retry — the push function focuses on the API-specific logic.
DAG integration
Tracked models participate in the same dependency graph as all other models:
ondatrasql run executes all three in order. If mart.invoice_lines didn’t change, the sync model receives no changed rows and does nothing.
When to use tracked for outbound sync
Use tracked when:
- The destination expects complete objects (all line items for an invoice)
- Multiple source rows map to one API object
- You need delete detection (entities that disappear from source)
- You need group-level change detection (any row change triggers full group push)
For single-row-per-key sync without delete detection, use merge + @sink instead.
OndatraSQL