Analytics with DuckDB
Refract's event output is newline-delimited JSON (NDJSON). DuckDB can query it directly with SQL, no loading step needed.
Quick start
# Export events
refract export "Bitcoin" --format ndjson > bitcoin-events.jsonl
# Query with DuckDB
duckdb -c "SELECT event_type, count(*) as cnt
FROM 'bitcoin-events.jsonl'
GROUP BY event_type ORDER BY cnt DESC"
Example queries
Event type distribution
SELECT event_type, count(*) as cnt
FROM 'bitcoin-events.jsonl'
GROUP BY event_type
ORDER BY cnt DESC;
Citation churn over time
SELECT strftime(timestamp, '%Y-%m') as month,
count(*) as citation_events
FROM 'bitcoin-events.jsonl'
WHERE event_type IN ('citation_added', 'citation_removed', 'citation_replaced')
GROUP BY month
ORDER BY month;
Template disputes
SELECT section, count(*) as template_events
FROM 'bitcoin-events.jsonl'
WHERE event_type = 'template_added'
AND deterministic_facts[0]->>'fact' = 'template_changed'
GROUP BY section
ORDER BY template_events DESC;
DuckDB setup
# Install DuckDB
brew install duckdb
# Or: curl -O https://github.com/duckdb/duckdb/releases/download/v1.5.2/duckdb_cli-osx-universal.zip
Validation
# Count events — should match refract analyze output
duckdb -c "SELECT count(*) FROM 'events.jsonl'"
All queries work on native DuckDB with no extensions. Refract NDJSON follows the standard format that DuckDB's JSON scanner reads out of the box.