API Dict

The API dict declares the complete contract for a lib function. One dict per file. All values must be literals — no variables, no concatenation. The runtime parses it as AST without executing code.

API = {
    # Shared config (injected into all http.* calls)
    "base_url": "https://api.example.com",
    "auth": {"env": "API_KEY"},
    "headers": {"Accept": "application/json"},
    "timeout": 30,
    "retry": 3,
    "backoff": 1,
    "rate_limit": {"requests": 100, "per": "10s"},

    # Inbound
    "fetch": {
        "args": ["resource"],
        "page_size": 100,
    },

    # Outbound
    "push": {
        "args": ["spreadsheet_id", "range"],
        "batch_size": 100,
        "batch_mode": "sync",
    },
}

Top-level config (shared)

Injected into all http.* calls by both fetch() and push().

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| base_url | string | none | Prepended to relative URLs in http.* calls |
| auth | dict | none | Auth injection (see Auth patterns) |
| headers | dict | none | Default headers merged into every request |
| timeout | int | 30 | Request timeout in seconds |
| retry | int | 0 | Number of retries on 5xx/429 |
| backoff | int | 1 | Initial backoff in seconds (exponential) |
| rate_limit | dict | none | Proactive rate limiting: {"requests": N, "per": "Ns"} |

Per-call kwargs in http.get(url, timeout=60) override these defaults.
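The override rule above can be sketched as a plain dictionary merge. This is an illustrative Python stand-in, not the runtime's code; `SHARED` and `effective_options` are hypothetical names, and only scalar options are shown because the source does not say how per-call headers combine with the defaults.

```python
# Hypothetical shared defaults, mirroring some of the top-level fields above.
SHARED = {"timeout": 30, "retry": 3, "backoff": 1}

def effective_options(shared, **call_kwargs):
    """Return the options for one request: shared defaults, then per-call overrides."""
    merged = dict(shared)        # copy so the shared config is never mutated
    merged.update(call_kwargs)   # per-call kwargs win over the defaults
    return merged

# Equivalent in spirit to http.get(url, timeout=60)
opts = effective_options(SHARED, timeout=60)
```

Here `opts["timeout"]` is 60 for this one call, while `retry` and `backoff` still come from the shared config.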

Auth patterns

The runtime handles token refresh, header injection, and caching. Auth is only injected when the caller does NOT set auth= in the http.* call.
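As a rough illustration of where an API key ends up, the following Python sketch maps the three `{"env": ...}` shapes documented in this section to request headers or query parameters. `apply_api_key` is a hypothetical helper, and the key value is passed in directly rather than read from .env.

```python
def apply_api_key(auth, value):
    """Return (headers, params) for one of the three API-key shapes.

    auth  -- the "auth" dict from the API dict, e.g. {"env": "API_KEY"}
    value -- the secret already resolved from the environment (assumed here)
    """
    headers, params = {}, {}
    if "param" in auth:
        params[auth["param"]] = value              # ?api_key=<value>
    elif "header" in auth:
        headers[auth["header"]] = value            # X-Api-Key: <value>
    else:
        headers["Authorization"] = "Bearer " + value  # default: bearer token
    return headers, params
```

For example, `apply_api_key({"env": "API_KEY", "header": "X-Api-Key"}, "secret")` yields a single `X-Api-Key` header and no query parameters.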

API key from .env

"auth": {"env": "API_KEY"}                              # → Authorization: Bearer <value>
"auth": {"env": "API_KEY", "header": "X-Api-Key"}       # → X-Api-Key: <value>
"auth": {"env": "API_KEY", "param": "api_key"}           # → ?api_key=<value>

Google service account

"auth": {
    "service_account": {"env": "GAM_KEY_FILE"},
    "scope": "https://www.googleapis.com/auth/admanager",
}

service_account resolves the key file path from .env via the {"env": "..."} pattern. The runtime handles JWT signing and token refresh automatically.

OAuth2 provider (browser-based SaaS APIs)

"auth": {"provider": "hubspot"}

Register with ondatrasql auth <provider>. Tokens refresh automatically.

Basic auth

"auth": {
    "user": {"env": "API_USER"},
    "pass": {"env": "API_PASS"},
}

Resolves both values from .env and sends Authorization: Basic <base64>.
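For reference, the `Authorization: Basic <base64>` value is the base64 encoding of `user:pass`, as defined by RFC 7617. A minimal Python sketch (the function name is illustrative, and the credentials are passed in rather than resolved from .env):

```python
import base64

def basic_auth_header(user, password):
    """Build the value of the Authorization header for HTTP Basic auth."""
    raw = f"{user}:{password}".encode("utf-8")      # "user:pass" as bytes
    return "Basic " + base64.b64encode(raw).decode("ascii")

# Example credentials taken from RFC 7617
header = basic_auth_header("Aladdin", "open sesame")
```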

No authentication

Omit auth entirely:

API = {
    "base_url": "https://api.riksbank.se/swea/v1",
    "fetch": {"args": ["series"]},
}

Fetch section

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| args | list | [] | Parameter names passed from SQL |
| page_size | int | 0 | Rows per page (0 = single call) |
| supported_columns | list | none | Optional whitelist of valid column names. Validated at startup. |
| supported_kinds | list | none | Optional whitelist of valid model kinds. Validated at startup. |

SQL controls the schema — the runtime extracts column names and normalized types from the SELECT via DuckDB AST and passes them to fetch() as the columns kwarg.

If supported_columns is declared and SQL requests an unknown column, the model fails at parse time. If not declared, any column name is accepted.
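The whitelist rule can be sketched as follows. This is a hypothetical Python helper, not the runtime's implementation; the error type and message are assumptions.

```python
def check_columns(requested, supported=None):
    """Reject column names that are not in the declared whitelist.

    requested -- column names extracted from the SELECT
    supported -- the supported_columns list, or None when not declared
    """
    if supported is None:
        return  # no whitelist declared: any column name is accepted
    unknown = [name for name in requested if name not in supported]
    if unknown:
        raise ValueError("unsupported columns: " + ", ".join(unknown))
```

With `supported=["id", "email"]`, a query selecting `phone` would fail before any request is made.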

Args from SQL

API = {"fetch": {"args": ["resource", "options"]}}

SELECT * FROM my_api('users', '{"filter": "active"}')

Args are positional strings. For structured configuration, pass JSON and decode in Starlark:

def fetch(resource, options="", page=None):
    opts = json.decode(options) if options else {}

Async fetch

For APIs with asynchronous report generation (submit → poll → fetch results):

"fetch": {
    "args": ["options"],
    "page_size": 10000,
    "async": True,
    "poll_interval": "5s",
    "poll_timeout": "5m",
    "poll_backoff": 2,
},

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| async | bool | False | Enable async fetch mode |
| poll_interval | string | "5s" | Minimum wait between check() calls |
| poll_timeout | string | "5m" | Max total poll duration before timeout error |
| poll_backoff | int | 1 | Backoff multiplier for poll interval (e.g. 2 = 5s, 10s, 20s, capped at 30s) |

Async mode requires submit(), check(), and fetch_result() functions instead of fetch(). See Fetch Contract — Async fetch.
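To make the polling parameters concrete, here is a small Python sketch that computes the sequence of waits for a given configuration. It assumes the 30-second cap mentioned in the poll_backoff description is fixed and that polling stops once the next wait would exceed poll_timeout; both are assumptions, and `poll_schedule` is a hypothetical name.

```python
def poll_schedule(interval=5.0, backoff=2, timeout=300.0, cap=30.0):
    """Return the list of waits (in seconds) between check() calls."""
    waits = []
    elapsed = 0.0
    wait = interval
    while elapsed + wait <= timeout:     # stop before exceeding poll_timeout
        waits.append(wait)
        elapsed += wait
        wait = min(wait * backoff, cap)  # grow the interval, up to the cap
    return waits

schedule = poll_schedule()  # poll_interval "5s", poll_backoff 2, poll_timeout "5m"
```

With these settings the waits start at 5, 10, and 20 seconds and then stay at 30 seconds until the five-minute budget is used up.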

Push section

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| args | list | [] | Parameter names from @sink: name('arg1', 'arg2') |
| supported_kinds | list | none | Optional whitelist of valid model kinds. Validated at startup. |
| batch_size | int | 1 | Rows per push() call |
| batch_mode | string | "sync" | "sync", "atomic", or "async" |
| max_concurrent | int | 1 | Parallel batch workers |
| rate_limit | dict | inherited | Per-direction override |
| poll_interval | string | "30s" | Async polling interval |
| poll_timeout | string | "1h" | Async polling timeout |

Sink args

API = {"push": {"args": ["spreadsheet_id", "range"]}}

-- @sink: gsheets('1DYJCOd...', 'Sheet1')

Args are positional strings from the @sink directive. Mapped to push kwargs by name:

def push(rows=[], spreadsheet_id="", range=""):
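The positional-to-named mapping can be pictured as a simple zip of the declared names with the directive's values. The Python sketch below is illustrative only: `bind_args` is a hypothetical helper, the sheet id is a placeholder, and treating a count mismatch as an error is an assumption.

```python
def bind_args(names, values):
    """Pair declared arg names with positional values from the directive."""
    if len(values) != len(names):
        raise ValueError(
            "expected %d args (%s), got %d" % (len(names), ", ".join(names), len(values))
        )
    return dict(zip(names, values))

# "sheet-id" is a placeholder, not a real spreadsheet id
kwargs = bind_args(["spreadsheet_id", "range"], ["sheet-id", "Sheet1"])
```

The resulting dict can then be passed to `push()` as keyword arguments alongside `rows`.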

Push function signature

def push(rows=[], batch_number=1, kind="table", key_columns=[], columns=[]):

All parameters are kwargs. Declare only what your blueprint needs — the runtime filters automatically.
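One way to picture the filtering is a caller that inspects the function's declared parameters and passes only those. The sketch below uses Python's `inspect` module as a stand-in; how the runtime actually does this for Starlark functions is not stated in the source, and `call_with_declared` is a hypothetical name.

```python
import inspect

def call_with_declared(func, available):
    """Call func with only the kwargs it declares, dropping the rest."""
    accepted = inspect.signature(func).parameters
    return func(**{k: v for k, v in available.items() if k in accepted})

# A push() that only cares about rows and kind
def push(rows=(), kind="table"):
    return (len(rows), kind)

result = call_with_declared(
    push,
    {"rows": [{"id": 1}], "batch_number": 1, "kind": "merge", "columns": ["id"]},
)
```

Here `batch_number` and `columns` are silently dropped because `push` does not declare them.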

| Kwarg | Type | Description |
| --- | --- | --- |
| rows | list | Row dicts with column values + internal fields |
| batch_number | int | 1-based batch counter |
| kind | string | Model kind: table, append, merge, tracked |
| key_columns | list | Key column names: @unique_key for merge, @group_key for tracked (e.g. ["id"] or ["region", "year"]) |
| columns | list | Column names (sorted, excluding internal fields) |

Plus any sink args declared in push.args.
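The relationship between batch_size and batch_number can be sketched as below. This Python generator is an illustration under the assumption that rows are split in order into consecutive chunks; `batches` is a hypothetical name.

```python
def batches(rows, batch_size=1):
    """Yield (batch_number, chunk) pairs; batch_number starts at 1."""
    for start in range(0, len(rows), batch_size):
        yield start // batch_size + 1, rows[start:start + batch_size]

# Five rows with batch_size 2 produce three push() calls
chunks = list(batches([{"id": i} for i in range(5)], batch_size=2))
```

The last batch may hold fewer than batch_size rows.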

Batch modes

| Mode | Return | Behavior |
| --- | --- | --- |
| sync | {"rowid:change_type": status} | Per-row ack/nack |
| atomic | None | All-or-nothing |
| async | {"job_id": ...} | Job-based polling via poll() |

Literal values only

The API dict is parsed as Starlark AST — not executed. All values must be literals:

# Works — literal values
API = {"base_url": "https://api.example.com", "timeout": 30}

# Does NOT work — variable reference
BASE = "https://api.example.com"
API = {"base_url": BASE}

# Does NOT work — concatenation
API = {"base_url": "https://" + HOST}

This is by design. The dict is pure configuration — readable, validatable, and inspectable without running code. Dynamic values belong in the fetch() or push() function.
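The same idea can be demonstrated with Python's `ast` module, used here as a stand-in for the runtime's Starlark parser: the source is parsed, the `API` assignment is located, and its value is evaluated as a literal without running any code. `load_api_dict` is a hypothetical helper.

```python
import ast

def load_api_dict(source):
    """Extract the API dict from source text without executing it."""
    tree = ast.parse(source)
    for node in tree.body:
        if (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
            and node.targets[0].id == "API"
        ):
            # literal_eval raises ValueError on names, calls, or concatenation
            return ast.literal_eval(node.value)
    raise ValueError("no API assignment found")

config = load_api_dict('API = {"base_url": "https://api.example.com", "timeout": 30}')
```

A variable reference such as `API = {"base_url": BASE}` makes `ast.literal_eval` raise `ValueError`, which mirrors why non-literal values cannot work in a dict that is never executed.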

Validation

The runtime validates at startup:

  • fetch() params checked against args list
  • push() must accept rows parameter
  • batch_mode must be "sync", "atomic", or "async"
  • rate_limit.per must be valid duration
  • rate_limit.requests must be > 0
  • async: True requires submit(), check(), fetch_result() — not fetch()
  • scd2 kind with @sink is not allowed
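A few of these checks can be sketched in Python as follows. This is not the runtime's validator: the accepted duration units (s, m, h) are inferred from the examples on this page, and the function names and error messages are hypothetical.

```python
import re

_DURATION = re.compile(r"^(\d+)(s|m|h)$")   # assumed duration grammar
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}
BATCH_MODES = ("sync", "atomic", "async")

def parse_duration(text):
    """Convert a string like "10s" or "5m" to seconds."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError("invalid duration: %r" % (text,))
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]

def validate(api):
    """Return a list of error messages for an already-parsed API dict."""
    errors = []
    limit = api.get("rate_limit")
    if limit is not None:
        if limit.get("requests", 0) <= 0:
            errors.append("rate_limit.requests must be > 0")
        try:
            parse_duration(limit.get("per", ""))
        except ValueError:
            errors.append("rate_limit.per must be a valid duration")
    mode = api.get("push", {}).get("batch_mode", "sync")
    if mode not in BATCH_MODES:
        errors.append("batch_mode must be one of: " + ", ".join(BATCH_MODES))
    return errors
```

Collecting all errors into a list, rather than stopping at the first one, lets a single startup pass report every problem in the dict.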

See Fetch Contract and Push Contract for the complete function specs.