Blueprints Blog Contact About
← Back to blog

How the tracked model kind handles outbound data sync

OndatraSQL's tracked model kind uses content hashing to detect row changes and sync only modified data to external APIs for reverse ETL.

OndatraSQL’s tracked model kind enables outbound data sync — sending pipeline results to external APIs with automatic change detection. This post describes how the tracked kind works and how it applies to outbound use cases.

How tracked models work

A tracked model groups rows by @unique_key and computes a content hash per group. On each run, OndatraSQL compares the current hashes against the previous run. Only groups with changed hashes are processed.

# models/sync/users.star
# @kind: tracked
# @unique_key: id

rows = query("SELECT id, email, plan FROM mart.customers")
for row in rows:
    http.post("https://api.example.com/users", json=row)
    save.row(row)

On the first run, all rows are processed. On subsequent runs, only rows where email, plan, or any other column value changed are sent to the API.

Content hashing

OndatraSQL adds a _content_hash column to the target table automatically. The hash is computed from all column values in the row group (excluding the hash column itself).

RunRow id=42 hashAction
Firsta1b2c3...Process (new)
Seconda1b2c3...Skip (unchanged)
Thirdd4e5f6...Process (hash changed)

This is the same mechanism used for file ingestion with @kind: tracked — one row per source file, content hash per file. The outbound sync case inverts the direction: instead of reading from an external source, it writes to one.

DAG integration

Tracked models participate in the same dependency graph as all other models. In this example, the sync model depends on mart.customers:

raw/orders.star       (ingest)
    ↓
staging/orders.sql    (transform)
    ↓
mart/customers.sql    (aggregate)
    ↓
sync/hubspot.star     (outbound sync)

ondatrasql run executes all four in order. If mart.customers didn’t change (because its upstream sources were unchanged), the sync model receives no changed rows and does nothing.

Starlark modules for outbound sync

The outbound model uses the same built-in modules as ingestion models:

ModuleRole in outbound sync
query()Select rows from pipeline tables
http.post()Send data to external API
http.oauth()Authenticate with OAuth2
save.row()Record the synced row (for hash tracking)

save.row() is required — it writes the row to the tracked table so the content hash is stored for the next run.

Full example

# models/sync/hubspot_contacts.star
# @kind: tracked
# @unique_key: customer_id

rows = query("""
    SELECT customer_id, email, name, plan
    FROM mart.customers
    WHERE active = true
""")

for row in rows:
    http.post("https://api.hubapi.com/contacts/v1/contact",
        headers={"Authorization": "Bearer " + env("HUBSPOT_TOKEN")},
        json={"email": row["email"], "properties": [
            {"property": "name", "value": row["name"]},
            {"property": "plan", "value": row["plan"]}
        ]})
    save.row(row)

On each ondatrasql run, only customers whose email, name, or plan changed since the last sync are sent to HubSpot.

When to use tracked for outbound sync

The tracked kind is designed for batch sync patterns where:

  • The destination accepts individual row updates via API
  • Change detection per row is needed
  • The sync should run as part of the regular pipeline schedule

For high-throughput streaming or real-time sync requirements, a dedicated integration tool may be more appropriate.