catalog.sql

On this page

Catalog: where your data lives

Phase: Catalog | Order: 4 | Required: Yes

One ATTACH statement tells OndatraSQL where to store metadata and Parquet data.

Backends

SQLite (default)

ATTACH 'ducklake:sqlite:ducklake.sqlite' AS lake
    (DATA_PATH 'ducklake.sqlite.files');
-- With cloud data path
ATTACH 'ducklake:sqlite:ducklake.sqlite' AS lake
    (DATA_PATH 's3://my-bucket/data/');

DuckDB

ATTACH 'ducklake:duckdb:ducklake_catalog.duckdb' AS lake
    (DATA_PATH 'ducklake.duckdb.files');

PostgreSQL

ATTACH 'ducklake:postgres:dbname=ducklake_catalog host=db.prod.internal user=ondatra password=${PG_PASSWORD}' AS lake
    (DATA_PATH 's3://my-bucket/data/');

Requires postgres in extensions.sql. Environment variables (${PG_PASSWORD}) are expanded before parsing.

Backend Comparison

SQLiteDuckDBPostgreSQL
SetupZero configZero configRequires a PostgreSQL server
Read accessOndatraSQL onlyOndatraSQL onlyAny PostgreSQL client
Multi-machineNoNoYes
Cloud dataYes (S3/GCS)Yes (S3/GCS)Yes
SandboxFile copyFile copyCREATE DATABASE TEMPLATE
MySQL is supported by DuckLake but not by OndatraSQL. Sandbox, CDC, and run-type detection are tested for SQLite, DuckDB, and PostgreSQL only.

PostgreSQL Architecture

Pipeline
OndatraSQL
DAG, CDC, validation, scheduling
Storage
Primary PostgreSQL
Metadata catalog
S3 / MinIO
Parquet data files
Analytics
Read replica
Streaming replication
Grafana
Metabase
Redash
Looker
Superset

Sandbox creates a temporary catalog copy via CREATE DATABASE TEMPLATE. This requires exclusive access to the primary, so all other connections are terminated during the fork. Read replicas are unaffected.

Options

Data Inlining

CALL lake.set_option('data_inlining_row_limit', 20);   -- default: 10
CALL lake.set_option('data_inlining_row_limit', 0);    -- disable

Deletion Vectors

CALL lake.set_option('write_deletion_vectors', true);

Encryption

ATTACH 'ducklake:sqlite:ducklake.sqlite' AS lake
    (DATA_PATH 'ducklake.sqlite.files', ENCRYPTED);

Each file gets a unique encryption key stored in the catalog. Decryption is transparent.