Audits

Detect regressions — rolled back automatically if they fail

Audits detect regressions after data is written.

If an audit fails, the transaction is rolled back and the model fails.

Constraints stop bad rows. Audits catch bad results.

Why This Matters

Some data issues can’t be detected row by row.

Row counts suddenly drop
Distributions drift
Freshness breaks
Aggregates don’t match source systems

These are not row-level problems — they are dataset-level problems. That’s what audits are for.

When Audits Run

Audits run inside the same transaction as the data write.

This means:

Either everything succeeds
Or nothing is committed

If any audit fails:

The transaction is rolled back
No data is committed
Downstream models do not run

Usage

Add audits directly in your model:

-- @audit: PATTERN

You can define multiple audits per model.

-- @audit: row_count > 0
-- @audit: row_count_change < 20%
-- @audit: freshness(updated_at, 24h)
-- @audit: reconcile_sum(total, raw.orders.total)
-- @audit: distribution(status) STABLE(0.1)

If any of these fail, the entire model is rolled back.

Constraints vs Audits

Constraints:

Run before insert
Validate individual rows
Prevent bad data from entering

Audits:

Run after insert
Validate the dataset as a whole
Catch regressions and anomalies

Common Patterns

Pattern	Example	What it catches
`row_count > 0`	`row_count > 0`	Empty results
`freshness(col, duration)`	`freshness(updated_at, 24h)`	Stale data
`row_count_change < N%`	`row_count_change < 20%`	Sudden drops or spikes
`min(col) OP N`	`min(price) >= 0`	Out-of-range values
`max(col) OP N`	`max(price) <= 10000`	Out-of-range values
`reconcile_sum(col, other.col)`	`reconcile_sum(total, raw.orders.total)`	Source/target mismatch
`distribution(col) STABLE(N)`	`distribution(status) STABLE(0.1)`	Category drift

All 17 Patterns

Row Count

Pattern	Example	Description
`row_count OP N`	`row_count >= 100`	Check total row count (supports `>=`, `<=`, `>`, `<`, `=`)
`row_count_change < N%`	`row_count_change < 10%`	Row count change from previous run

Freshness

Pattern	Example	Description
`freshness(col, duration)`	`freshness(updated_at, 24h)`	Maximum age of most recent record (supports `h` for hours, `d` for days)

Statistical

Pattern	Example	Description
`mean(col) ...`	`mean(price) BETWEEN 10 AND 100`	Average value check (supports BETWEEN and comparisons)
`stddev(col) < N`	`stddev(amount) < 25`	Standard deviation threshold
`min(col) OP N`	`min(price) >= 0`	Minimum value check
`max(col) OP N`	`max(price) <= 10000`	Maximum value check
`sum(col) OP N`	`sum(amount) >= 0`	Aggregate sum check
`zscore(col) < N`	`zscore(amount) < 3.0`	Statistical outlier detection
`percentile(col, p) OP N`	`percentile(latency, 0.95) <= 500`	Quantile validation

Reconciliation

Pattern	Example	Description
`reconcile_count(table)`	`reconcile_count(raw.orders)`	Row counts must match
`reconcile_sum(col, other.col)`	`reconcile_sum(total, raw.orders.total)`	Aggregate sums must match

Schema

Pattern	Example	Description
`column_exists(col)`	`column_exists(email)`	Verify column exists in output
`column_type(col, type)`	`column_type(price, DECIMAL(18,2))`	Verify column type (supports precision)

Data Comparison

Pattern	Example	Description
`golden('path')`	`golden('tests/expected.csv')`	Compare output with golden dataset
`distribution(col) STABLE(threshold)`	`distribution(status) STABLE(0.1)`	Categorical distribution stability

Full Example

-- @kind: append
-- @incremental: event_time
-- @audit: row_count > 0
-- @audit: freshness(event_time, 24h)
-- @audit: row_count_change < 50%
-- @audit: min(amount) >= 0
-- @audit: zscore(amount) < 3.0
-- @audit: reconcile_sum(amount, raw.events.amount)

SELECT event_id, event_time, amount, category
FROM raw.events

Compared to Traditional Validation

Traditional approach:

Run tests after data is written
Failures require manual cleanup
Data may already be used downstream

OndatraSQL:

Audits run inside the transaction
Failures automatically roll back
Bad data never persists