Incremental Models
Incremental is not a feature — it's the default
Incremental processing is automatic. Write a normal query. OndatraSQL runs it on new data only.
Mental Model
- SQL models → incremental by default
- Scripts → use a cursor
You don’t write incremental logic — you just define the model.
Before and After
Traditional approach:
SELECT * FROM raw.events
WHERE event_time > (SELECT MAX(event_time) FROM target)
Manual. Fragile.
OndatraSQL:
-- @kind: append
SELECT * FROM raw.events
Done. Smart CDC handles the rest.
SQL Models
-- @kind: append
-- @incremental: event_time
SELECT * FROM raw.events
- First run → all data
- Next runs → only new data
Automatic via Smart CDC. No manual filters.
Custom Start Value
-- @incremental: updated_at
-- @incremental_initial: "2024-01-01"
Default: 1970-01-01T00:00:00Z.
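The first-run and next-run behavior, together with the start value, can be pictured with a small plain-Python simulation. This is a conceptual sketch only, not how Smart CDC is implemented: the `select_new_rows` helper, the sample rows, and the strict greater-than comparison are all assumptions made for illustration.

```python
# Conceptual stand-in for cursor-based incremental reads; not OndatraSQL internals.
DEFAULT_INITIAL = "1970-01-01T00:00:00Z"  # default start value stated above


def select_new_rows(rows, cursor_column, last_value=None, initial_value=DEFAULT_INITIAL):
    """Return the rows past the starting point, plus the cursor for the next run."""
    # First run: no stored cursor yet, so start from the initial value.
    start = last_value if last_value is not None else initial_value
    # ISO 8601 timestamps sort lexicographically, so string comparison is enough here.
    # Assumption: strictly greater-than, so the row at the cursor is not re-read.
    new_rows = [r for r in rows if r[cursor_column] > start]
    # The next run resumes from the highest cursor value seen so far.
    new_cursor = max((r[cursor_column] for r in new_rows), default=start)
    return new_rows, new_cursor


events = [
    {"id": 1, "updated_at": "2023-12-31T23:00:00Z"},
    {"id": 2, "updated_at": "2024-01-05T10:00:00Z"},
    {"id": 3, "updated_at": "2024-02-01T08:30:00Z"},
]

# First run with a custom start value: the 2023 row is skipped.
first_batch, cursor = select_new_rows(events, "updated_at", initial_value="2024-01-01")

# A new row arrives; the next run picks up only that row.
events.append({"id": 4, "updated_at": "2024-03-01T00:00:00Z"})
next_batch, cursor = select_new_rows(events, "updated_at", last_value=cursor)
```

With the start value set to 2024-01-01, the first run returns rows 2 and 3, and the second run returns only row 4.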
Starlark Scripts
Scripts don’t have SQL to analyze — you control the cursor.
# @kind: append
# @incremental: updated_at
url = "https://api.example.com/events"
if not incremental.is_backfill:
    url += "?since=" + incremental.last_value
resp = http.get(url)
for item in resp.json:
    save.row(item)
| Property | Description |
|---|---|
| incremental.is_backfill | True on first run |
| incremental.last_value | MAX(cursor) from previous run |
| incremental.initial_value | Starting value |
| incremental.last_run | Last successful run timestamp |
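Cursor values such as timestamps often contain characters (`:` or `+`) that are not safe to concatenate directly into a query string. The sketch below shows the same branching on `is_backfill` and `last_value` in plain Python with URL encoding added. The `build_events_url` helper is hypothetical, and the source does not say whether the Starlark runtime offers an equivalent encoding helper, so treat this as an illustration of the logic rather than a drop-in script.

```python
from urllib.parse import urlencode


def build_events_url(base_url, is_backfill, last_value):
    """Build the request URL, adding a 'since' filter on incremental runs."""
    if is_backfill:
        # Backfill: fetch everything, no filter.
        return base_url
    # urlencode escapes characters such as ':' and '+' in the cursor value.
    return base_url + "?" + urlencode({"since": last_value})


backfill_url = build_events_url("https://api.example.com/events", True, None)
incremental_url = build_events_url(
    "https://api.example.com/events", False, "2024-02-01T08:30:00+00:00"
)
```

Keeping the URL construction in one function makes the backfill and incremental paths easy to check separately.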
SQL vs Scripts
| Model type | Incremental behavior |
|---|---|
| SQL | Automatic (Smart CDC) |
| Scripts | Manual cursor |
| YAML | Uses script logic |
When Backfill Happens
- Model SQL or directives changed
- Target table doesn’t exist
- @kind, @unique_key, or @incremental changed
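One way to picture these rules is as a comparison between the model definition seen on the previous run and the current one. The sketch below is an assumption about how such a check could work, not a description of OndatraSQL's actual mechanism: the fingerprint scheme and both function names are hypothetical.

```python
import hashlib


def model_fingerprint(sql, directives):
    """Hash the model SQL together with its directives (hypothetical scheme)."""
    # Sort directive names so the fingerprint does not depend on their order.
    parts = [sql.strip()] + [f"{name}={directives[name]}" for name in sorted(directives)]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def needs_backfill(target_exists, previous_fingerprint, current_fingerprint):
    """Backfill when the target is missing or the model definition changed."""
    if not target_exists:
        return True
    return previous_fingerprint != current_fingerprint


sql = "SELECT * FROM raw.events"
before = model_fingerprint(sql, {"kind": "append", "incremental": "event_time"})
after = model_fingerprint(sql, {"kind": "append", "incremental": "updated_at"})

unchanged = needs_backfill(True, before, before)       # same definition, table exists
after_edit = needs_backfill(True, before, after)       # @incremental changed
missing_table = needs_backfill(False, before, before)  # target table does not exist
```

Here only the unchanged model with an existing target avoids a backfill; the other two cases trigger one.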
Why This Matters
Most tools require writing incremental filters and managing state manually.
OndatraSQL makes incremental the default. Combined with Smart CDC and Schema Evolution, this eliminates most pipeline maintenance work.
Ondatra Labs