DuckDB Configuration

How the runtime boots — DuckDB configuration via SQL files

OndatraSQL prepares DuckDB before running your pipeline. Everything in config/ is executed as SQL — no YAML, no config DSL.

This is where you configure DuckDB, define credentials, attach your catalog, and connect external sources.

Mental Model

The runtime loads configuration in two phases:

Before the catalog is attached → configure DuckDB itself

After the catalog is attached → define logic that depends on your data

Loading Order

Pre-catalog:

File	What it does
`settings.sql`	Configure DuckDB (memory, threads, etc.)
`secrets.sql`	Credentials — not stored in data, only used at runtime
`extensions.sql`	Load DuckDB extensions

Catalog:

File	What it does
`catalog.sql`	Attach DuckLake — without this, there’s no database to write to

Post-catalog:

File	What it does
`macros.sql`	Reusable SQL logic
`variables.sql`	Runtime variables
`sources.sql`	External database connections

All files except catalog.sql are optional. Missing files are silently skipped.

Schemas are created automatically. Organize models into directories — models/staging/orders.sql creates staging schema on first run.

Common Patterns

Connect to PostgreSQL:

-- config/sources.sql
ATTACH 'postgresql://${PG_USER}:${PG_PASSWORD}@${PG_HOST}/db' AS source (READ_ONLY);

Add a reusable macro:

-- config/macros.sql
CREATE MACRO is_active(status) AS status = 'active';

Use cloud storage:

-- config/secrets.sql
CREATE SECRET aws_chain (TYPE S3, PROVIDER credential_chain);

Set DuckDB memory limit:

-- config/settings.sql
SET memory_limit = '4GB';
SET threads = 4;

Built-in Runtime

OndatraSQL automatically provides variables and macros after config loads. No setup required. These power change detection, lineage, and incremental execution.

Variables

Variable	Description
`ondatra_run_time`	When the current run started (UTC)
`ondatra_load_id`	Unique identifier for this run
`curr_snapshot`	Current DuckLake snapshot ID
`prev_snapshot`	Previous snapshot ID (for CDC)
`dag_start_snapshot`	Snapshot at DAG start (consistent CDC across models)

Macros

Runtime

ondatra_now() · ondatra_load_id() · last_snapshot() · prev_snapshot() · dag_start_snapshot()

Change Data Capture

ondatra_cdc_changes() · ondatra_cdc_deletes() · ondatra_cdc_updates_before() · ondatra_cdc_summary() · ondatra_cdc_empty()

Schema & Metadata

ondatra_table_exists() · ondatra_column_exists() · ondatra_column_type() · ondatra_get_columns() · ondatra_schema_hash()

Data Quality

ondatra_row_count() · ondatra_has_rows() · ondatra_min_value() · ondatra_max_value() · ondatra_distinct_count() · ondatra_null_count() · ondatra_duplicate_count() · ondatra_table_diff() · ondatra_compare_tables()

Metadata

ondatra_get_sql_hash() · ondatra_get_commit_info() · ondatra_get_downstream()

Editing

ondatrasql edit settings
ondatrasql edit secrets
ondatrasql edit catalog
ondatrasql edit macros
ondatrasql edit variables
ondatrasql edit sources

Sandbox Mode

In sandbox mode, catalog.sql is skipped. Instead:

Production catalog → attached read-only
Sandbox catalog → attached writable

Models read from production, write to sandbox. All other config files load normally.

DuckDB Configuration

Mental Model

Loading Order

Common Patterns

Built-in Runtime

Variables

Macros

Editing

Sandbox Mode

settings.sql

secrets.sql

extensions.sql

catalog.sql

macros.sql

variables.sql

sources.sql