← Blog

How the tracked model kind handles outbound data sync

OndatraSQL's tracked model kind uses content hashing to detect row changes and sync only modified data to external APIs for outbound sync.

OndatraSQL’s tracked model kind enables outbound data sync with automatic change detection. This post describes how it works and how it differs from merge for outbound use cases.

How tracked models work

A tracked model groups rows by @group_key and computes a content hash per group. On each run, OndatraSQL compares the current hashes against the previous run. Only groups with changed hashes are written.

-- models/sync/invoices.sql
-- @kind: tracked
-- @group_key: invoice_id
-- @sink: erp_push

SELECT invoice_id, line_item, quantity, price
FROM mart.invoice_lines

On the first run, all rows are materialized and pushed. On subsequent runs, only groups where any row changed are pushed to the API.

Content hashing

OndatraSQL adds a _content_hash column to the target table automatically. The hash is computed from all column values in the row group (excluding the hash column itself).

RunGroup invoice_id=INV-001 hashAction
Firsta1b2c3...Push all rows for INV-001
Seconda1b2c3...Skip (unchanged)
Thirdd4e5f6...Push all rows for INV-001 (hash changed)

Raw change types

Since v0.18.0, the runtime exposes raw DuckLake change types to your push function. Tracked materializes via DELETE+INSERT, so table_changes() produces delete and insert events. Your push function receives each row with __ondatra_change_type set to one of these values.

To distinguish updates from real deletes, group by @group_key:

  • Key has both delete and insert → the entity was updated (push the inserts)
  • Key has only delete → the entity was removed from source
  • Key has only insert → new entity

This is the same logic that was previously built into the runtime, but now lives in your Starlark code where you can customize it.

tracked vs merge for outbound sync

mergetracked
Rows per keyExactly oneMultiple
Change detectionPer row (MERGE INTO)Per group (content hash)
Change typesinsert, update_postimagedelete, insert
Delete detectionNot automatic (no DELETE in DuckLake)Automatic (groups that disappear)
Use caseCRM contacts, user profilesInvoices, orders with line items

Use merge when one row = one API object (HubSpot contact, Salesforce account). Use tracked when multiple rows = one API object (invoice with line items, order with products), or when you need delete detection.

Push function

The push function in lib/ receives batches of rows with __ondatra_change_type and __ondatra_rowid. For tracked, you group by @group_key to build complete API payloads:

# lib/erp_push.star

API = {
    "base_url": "https://erp.example.com",
    "auth": {"env": "ERP_API_KEY"},
    "retry": 3,
    "backoff": 2,

    "push": {
        "batch_size": 100,
        "batch_mode": "sync",
    },
}

def push(rows=[], batch_number=1, kind="tracked", key_columns=[]):
    results = {}

    # Group rows by invoice_id
    by_key = {}
    for row in rows:
        iid = row.get("invoice_id", "")
        if iid not in by_key:
            by_key[iid] = {"inserts": [], "deletes": []}
        ct = row["__ondatra_change_type"]
        if ct == "insert":
            by_key[iid]["inserts"].append(row)
        elif ct == "delete":
            by_key[iid]["deletes"].append(row)

    for iid, group in by_key.items():
        if group["inserts"]:
            # Updated or new — push the current rows
            payload = {
                "invoice_id": iid,
                "lines": [{k: v for k, v in l.items() if not k.startswith("_")} for l in group["inserts"]],
            }
            resp = http.put("/api/invoices/" + iid, json=payload)
            status = "ok" if resp.ok else "error: " + resp.text
        else:
            # Only deletes — entity removed from source
            resp = http.delete("/api/invoices/" + iid)
            status = "ok" if resp.ok else "error: " + resp.text

        for row in group["inserts"] + group["deletes"]:
            key = str(row["__ondatra_rowid"]) + ":" + row["__ondatra_change_type"]
            results[key] = status

    return results

Note how the API dict handles auth and retry — the push function focuses on the API-specific logic.

DAG integration

Tracked models participate in the same dependency graph as all other models:

staging/orders.sql transform
mart/invoice_lines.sql aggregate
sync/invoices.sql outbound sync

ondatrasql run executes all three in order. If mart.invoice_lines didn’t change, the sync model receives no changed rows and does nothing.

When to use tracked for outbound sync

Use tracked when:

  • The destination expects complete objects (all line items for an invoice)
  • Multiple source rows map to one API object
  • You need delete detection (entities that disappear from source)
  • You need group-level change detection (any row change triggers full group push)

For single-row-per-key sync without delete detection, use merge + @sink instead.