Downstream integration

Refract produces a deterministic event stream. Downstream systems consume that stream to interpret what the changes mean for their domain.

Integration surfaces

1. Structured events via CLI

# Export events as NDJSON
refract export "Bitcoin" --format ndjson > bitcoin-events.jsonl

# Export as ObservationReport with Merkle root
refract analyze "Bitcoin" --report > bitcoin-report.json

Each event carries a schemaVersion, a FactProvenance record (analyzer, version, parameters), and an eventId, which is a deterministic content hash.
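
A consumer reading the NDJSON export might parse it along the following lines. This is a minimal sketch, not Refract's own reader: the `ExportedEvent` interface is hypothetical and types only the fields named above, and `parseNdjson` is an illustrative helper. Real events are assumed to carry further fields, which pass through untyped.

```typescript
// Hypothetical minimal shapes: only the fields this document names.
interface FactProvenance {
  analyzer: string;
  version: string;
  parameters?: Record<string, unknown>;
}

interface ExportedEvent {
  eventId: string;
  schemaVersion: string;
  deterministicFacts?: { provenance?: FactProvenance }[];
  [extra: string]: unknown; // any other fields pass through untouched
}

// Parse NDJSON text: one JSON object per non-empty line.
function parseNdjson(text: string): ExportedEvent[] {
  const parsedEvents: ExportedEvent[] = [];
  text.split(/\r?\n/).forEach((line, index) => {
    if (line.trim() === "") return; // tolerate blank lines and a trailing newline
    const value = JSON.parse(line) as Partial<ExportedEvent>;
    if (typeof value.eventId !== "string" || typeof value.schemaVersion !== "string") {
      throw new Error(`line ${index + 1}: missing eventId or schemaVersion`);
    }
    parsedEvents.push(value as ExportedEvent);
  });
  return parsedEvents;
}

// Example input with made-up values.
const sample = [
  '{"eventId":"e1","schemaVersion":"0.4.0","deterministicFacts":[{"provenance":{"analyzer":"section-differ","version":"0.4.0","parameters":{"similarityThreshold":0.8}}}]}',
  '{"eventId":"e2","schemaVersion":"0.4.0"}',
  "",
].join("\n");

const parsed = parseNdjson(sample);
console.log(parsed.length, parsed[0]?.deterministicFacts?.[0]?.provenance?.analyzer);
// 2 section-differ
```

Rejecting lines without an eventId or schemaVersion up front keeps malformed input out of later stages, where the failure would be harder to trace.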

2. SDK events via adapter

import { buildStructuredEvents, EVENT_SCHEMA_VERSION } from "@refract-org/evidence-graph";
import { sectionDiffer, citationTracker, detectEditClusters } from "@refract-org/analyzers";

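// `revisions` is the list of page revisions to analyze, obtained elsewhere.
const events = buildStructuredEvents(revisions);
// Each event carries schemaVersion and a FactProvenance with the analyzer version and parameters.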

3. FactProvenance for auditability

Every event's deterministicFacts[0].provenance includes:

{
  "analyzer": "section-differ",
  "version": "0.4.0",
  "parameters": { "similarityThreshold": 0.8 }
}

When a consumer overrides a threshold, parameters records the effective value that was actually applied.
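
To act on this, a consumer can read the recorded value from the event instead of assuming one. A minimal sketch, assuming the provenance shape shown above; the helper name `effectiveParameter` and its `fallback` argument are hypothetical and not part of Refract.

```typescript
interface FactProvenance {
  analyzer: string;
  version: string;
  parameters?: Record<string, unknown>;
}

interface EventWithFacts {
  deterministicFacts?: { provenance?: FactProvenance }[];
}

// Return the numeric parameter recorded on the first fact, or `fallback`
// when the event carries no such parameter.
function effectiveParameter(ev: EventWithFacts, key: string, fallback: number): number {
  const value = ev.deterministicFacts?.[0]?.provenance?.parameters?.[key];
  return typeof value === "number" ? value : fallback;
}

// Made-up example events.
const overridden: EventWithFacts = {
  deterministicFacts: [
    {
      provenance: {
        analyzer: "section-differ",
        version: "0.4.0",
        parameters: { similarityThreshold: 0.85 },
      },
    },
  ],
};
const bare: EventWithFacts = {};

console.log(effectiveParameter(overridden, "similarityThreshold", 0.8)); // 0.85
console.log(effectiveParameter(bare, "similarityThreshold", 0.8)); // 0.8
```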

4. Schema versioning

Every event carries a schemaVersion matching the EVENT_SCHEMA_VERSION it was produced under. Because each stored event records the schema it was written against, historical observations are not silently invalidated when EventType gains new members.
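
One way a consumer might use the field is to set aside events whose schema version it was not built for, rather than misreading them. A minimal sketch: the `SUPPORTED_SCHEMA_VERSIONS` set and the function name are hypothetical, and the version strings are illustrative.

```typescript
interface VersionedEvent {
  eventId: string;
  schemaVersion: string;
}

// Versions this (hypothetical) consumer knows how to interpret.
const SUPPORTED_SCHEMA_VERSIONS: ReadonlySet<string> = new Set(["0.4.0"]);

// Split events into those the consumer understands and those to hold back for review.
function partitionBySchemaVersion<T extends VersionedEvent>(
  items: readonly T[],
): { accepted: T[]; quarantined: T[] } {
  const accepted: T[] = [];
  const quarantined: T[] = [];
  for (const item of items) {
    if (SUPPORTED_SCHEMA_VERSIONS.has(item.schemaVersion)) {
      accepted.push(item);
    } else {
      quarantined.push(item);
    }
  }
  return { accepted, quarantined };
}

const versionSplit = partitionBySchemaVersion([
  { eventId: "e1", schemaVersion: "0.4.0" },
  { eventId: "e2", schemaVersion: "0.3.2" },
]);
console.log(versionSplit.accepted.length, versionSplit.quarantined.length); // 1 1
```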

5. Config version pinning

AnalyzerConfig.$version is pinned to the CLI version, so downstream systems can prove which configuration was used for a given run.

refract analyze "Bitcoin" --similarity 0.85
# config.$version = "0.5.1"
# config.section.similarityThreshold = 0.85

6. ObservationReport for chain-of-custody

{
  "pageTitle": "Bitcoin",
  "observedAt": "2026-05-15T10:00:00Z",
  "revisionRange": { "from": 100, "to": 200 },
  "merkleRoot": "a1b2c3d4...",
  "eventCount": 47,
  "analyzerVersion": "0.5.1"
}

The Merkle root comes from the replay manifest, so a downstream system can verify that the events it holds match the observation.
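
Before any Merkle verification, a consumer can run cheap consistency checks against the report. The sketch below does not recompute the Merkle root, because the hashing scheme is not described here. The report fields follow the JSON above; the `fromRevisionId` and `toRevisionId` event fields are assumptions named after the `from_revision_id` and `to_revision_id` columns in the reference DDL, and `revisionRange` is assumed to hold inclusive revision IDs.

```typescript
interface ObservationReport {
  pageTitle: string;
  observedAt: string;
  revisionRange: { from: number; to: number };
  merkleRoot: string;
  eventCount: number;
  analyzerVersion: string;
}

// Hypothetical event fields (see the assumptions stated above).
interface RevisionSpan {
  fromRevisionId: number;
  toRevisionId: number;
}

// Return a list of human-readable mismatches; an empty list means the checks passed.
function reportMismatches(report: ObservationReport, spans: readonly RevisionSpan[]): string[] {
  const problems: string[] = [];
  if (spans.length !== report.eventCount) {
    problems.push(`expected ${report.eventCount} events, got ${spans.length}`);
  }
  const { from, to } = report.revisionRange;
  spans.forEach((span, i) => {
    if (span.fromRevisionId < from || span.toRevisionId > to) {
      problems.push(`event ${i} falls outside revisions ${from}-${to}`);
    }
  });
  return problems;
}

// Made-up report and events.
const exampleReport: ObservationReport = {
  pageTitle: "Bitcoin",
  observedAt: "2026-05-15T10:00:00Z",
  revisionRange: { from: 100, to: 200 },
  merkleRoot: "a1b2c3d4...",
  eventCount: 2,
  analyzerVersion: "0.5.1",
};

const cleanRun = reportMismatches(exampleReport, [
  { fromRevisionId: 100, toRevisionId: 101 },
  { fromRevisionId: 199, toRevisionId: 200 },
]);
const shortRun = reportMismatches(exampleReport, [{ fromRevisionId: 100, toRevisionId: 101 }]);
console.log(cleanRun.length, shortRun.length); // 0 1
```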

Baseline superiority validation

When validating an integrated signal, supply a Refract event summary to computeBaselineSuperiority(). The function adds Refract-derived baselines (revert count, citation flux, edit clusters) that the signal must outperform.

const result = computeBaselineSuperiority({
  integratedLeadTimeDays: signal.leadTimeDays,
  mentionCount: snapshot.metrics.mentionCount,
  revertCount: snapshot.metrics.revertCount,
  refractEventSummary: { totalEvents: 47, revertCount: 12 },
});
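
The `refractEventSummary` argument can be derived from the event stream rather than typed by hand. A minimal sketch: `summarizeEvents` is a hypothetical helper, and the caller supplies the revert test as a predicate because this document does not name the revert event type. The `eventType` field and the `"revert_detected"` value in the example are illustrative assumptions.

```typescript
interface RefractEventSummary {
  totalEvents: number;
  revertCount: number;
}

// Count all events and those the caller classifies as reverts.
function summarizeEvents<T>(
  items: readonly T[],
  isRevert: (item: T) => boolean,
): RefractEventSummary {
  return {
    totalEvents: items.length,
    revertCount: items.filter((item) => isRevert(item)).length,
  };
}

// Hypothetical events: field name and type values are assumptions.
const sampleEvents = [
  { eventType: "revert_detected" },
  { eventType: "sentence_modified" },
  { eventType: "revert_detected" },
];

const summary = summarizeEvents(sampleEvents, (e) => e.eventType === "revert_detected");
console.log(summary.totalEvents, summary.revertCount); // 3 2
```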

Principle

Refract's event stream is purely mechanical. All interpretation happens downstream. Refract provides the deterministic record; the consumer provides the judgment.

What downstream systems build

| Consumer | Builds on Refract |
|---|---|
| Healthcare decision intelligence | Feed structured events into a measurement pipeline that scores claims by clinical truth, ratification, economic stake, and feasibility. Each event carries the exact analyzer thresholds used. |
| AI training data curation | Score each claim by revert count, citation churn, talk page correlation, and template dispute history. Include only stable, well-sourced claims in training data. |
| Provenance-aware RAG | Enrich each retrieved chunk with its claim history: stable, recently changed, source-fragile, or contested. Use the signal to weight or filter results. |
| Regulatory monitoring | Run refract cron on drug pages, guidelines, and regulatory topics. Alert on citation removal, template disputes, or section reorganization. |
| Competitive intelligence | Use refract diff to compare how the same topic is framed across wikis (English vs German Wikipedia, Fandom vs independent wiki). Track divergence over time. |
| Fact-checking | Given a claim, query its lifecycle: first appearance, source additions, revert history, talk page activity. Return a verifiable provenance timeline. |
| Academic research | Export ObservationReport with Merkle-verifiable claim histories. Analyze claim stability across topics, time periods, and editorial environments. |
| Journalism forensics | Track how a specific claim about a person evolved. Detect coordinated editing, source softening, or removal without replacement. |
| Fan wiki canon tracking | Compare the same fictional universe across competing wikis. Detect retcon divergence and measure by how much. |
| Knowledge graph engineering | Use --depth forensic to capture category and wikilink change events. Build an entity graph that evolves with the public record. |

Complementary technologies

Refract pairs naturally with these modern tools. The event stream is standard JSON/NDJSON — anything that reads JSON or speaks HTTP can consume it.

| Category | Technology | How they fit |
|---|---|---|
| Vector databases | Pinecone, Weaviate, pgvector, Chroma | Store claim embeddings alongside stability metadata. Query: "find claims similar to X that are stable and well-sourced." |
| RAG frameworks | LangChain, LlamaIndex, Vercel AI SDK | Use Refract's stability/contestation signals as retrieval filters or reranking features. A LangChain document loader is available in the refract-py package. |
| AI coding agents | Claude Code, Cline, Codex CLI, OpenClaw | Agents connect via Refract's built-in MCP server (refract mcp) to read claim histories, track changes, and cite provenance in their reasoning. |
| Python SDK | refract-py (GitHub) | Typed dataclasses, pandas DataFrame integration, RefractError handling. Install: pip install refract-py (requires npm install -g @refract-org/cli). |
| MCP (Model Context Protocol) | Any MCP client (Claude Desktop, VS Code, Cursor, ChatGPT) | refract mcp is a native MCP server exposing tools for analyze, claim, export, cron, and classify. AI agents use these tools to retrieve claim history directly. |
| Data lakes & query | DuckDB, Apache Parquet, ClickHouse | Query refract export --format ndjson output with SQL. DuckDB can query JSONL files directly: SELECT event_type, count(*) FROM 'events.jsonl' GROUP BY event_type; |
| Streaming | Apache Kafka, Redpanda, Cloudflare Queues | Feed event streams into real-time claim monitoring pipelines. Each EvidenceEvent becomes a Kafka message keyed by claimId for stateful processing. |
| Visualization | Observable Framework, Mermaid, D3 | refract visualize --format mermaid produces Mermaid diagrams. Observable Framework has a dedicated @refract-org/observable data loader. D3 reads event JSONL directly. |
| Knowledge graphs | RDF, SPARQL, Neo4j | Convert wikilink_added/category_added events into triple statements. Build an evolving entity graph where each edge has a revision timestamp. |
| Model serving | OpenAI API, DeepSeek, Ollama, vLLM, Workers AI | Plug any OpenAI-compatible endpoint into refract classify at each BYO-inference boundary. Workers AI runs models at the edge without managing servers. |
| Local inference | WebGPU, MLX, llama.cpp | Run detection models directly on-device, with no API key needed. Refract defaults are mechanical (zero inference), but any boundary can be replaced with a local model via MCP sampling or Ollama. |
| Notebooks | Jupyter, Marimo, Observable notebooks | Load event JSONL into a DataFrame: pd.read_json("events.jsonl", lines=True). Analyze claim stability, citation churn, and edit cluster patterns interactively. Marimo's reactive runtime is particularly well-suited for live event stream analysis. |
| Serverless | Cloudflare Workers, D1, R2, Queues | Run refract via npx in a Worker, store structured events in D1, export to R2, queue re-observations. The entire infrastructure is edge-deployable with no servers to manage. |
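
As an illustration of the knowledge-graph row, wikilink_added and category_added events could be turned into timestamped edges roughly as follows. This is a sketch under stated assumptions: the two event type names come from this document, while the `eventType`, `pageTitle`, `target`, and `timestamp` field names and the `linksTo` / `inCategory` predicates are hypothetical.

```typescript
// Hypothetical event shape; only the two event type names come from the text.
interface LinkEvent {
  eventType: string;
  pageTitle: string;
  target: string; // assumed: the linked page or category
  timestamp: string; // assumed: ISO 8601 timestamp of the revision
}

interface Edge {
  subject: string;
  predicate: string;
  object: string;
  since: string;
}

// Illustrative predicate names for each graph-relevant event type.
const PREDICATE_BY_TYPE = new Map<string, string>([
  ["wikilink_added", "linksTo"],
  ["category_added", "inCategory"],
]);

// Convert graph-relevant events into edges; all other event types are skipped.
function toEdges(items: readonly LinkEvent[]): Edge[] {
  const result: Edge[] = [];
  for (const item of items) {
    const predicate = PREDICATE_BY_TYPE.get(item.eventType);
    if (predicate === undefined) continue;
    result.push({
      subject: item.pageTitle,
      predicate,
      object: item.target,
      since: item.timestamp,
    });
  }
  return result;
}

// Made-up example events.
const edges = toEdges([
  { eventType: "wikilink_added", pageTitle: "Bitcoin", target: "Blockchain", timestamp: "2026-05-15T10:00:00Z" },
  { eventType: "sentence_modified", pageTitle: "Bitcoin", target: "", timestamp: "2026-05-15T10:05:00Z" },
  { eventType: "category_added", pageTitle: "Bitcoin", target: "Cryptocurrencies", timestamp: "2026-05-15T10:10:00Z" },
]);
console.log(edges.map((e) => `${e.subject} ${e.predicate} ${e.object}`));
```

Each edge keeps the timestamp of the event that created it, which is what lets the graph be replayed as of a given date.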

Production ingestion

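When consuming Refract events in a production pipeline, persist them to a queryable table. The schema below is a reference DDL written in SQLite syntax, so it runs unchanged on D1 and SQLite. On PostgreSQL and other engines, replace the SQLite-specific datetime('now') default with the engine's equivalent.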

CREATE TABLE refract_events (
    event_id TEXT PRIMARY KEY,
    event_type TEXT NOT NULL,
    schema_version TEXT NOT NULL,
    from_revision_id INTEGER NOT NULL,
    to_revision_id INTEGER NOT NULL,
    section TEXT NOT NULL,
    fact TEXT NOT NULL,
    fact_detail TEXT,
    analyzer_name TEXT,
    analyzer_version TEXT,
    input_hashes TEXT,       -- JSON array of input hashes
    parameters_json TEXT,    -- JSON: effective FactProvenance parameters
    observed_at TEXT NOT NULL,
    batch_id TEXT NOT NULL,
    page_title TEXT NOT NULL,
    entity_id TEXT,
    created_at TEXT NOT NULL DEFAULT (datetime('now'))
);

CREATE INDEX idx_refract_events_batch ON refract_events(batch_id);
CREATE INDEX idx_refract_events_type ON refract_events(event_type);
CREATE INDEX idx_refract_events_page ON refract_events(page_title);
CREATE INDEX idx_refract_events_observed ON refract_events(observed_at);
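
Populating this table requires flattening each event into one row. The sketch below shows one possible mapping. The `IngestEvent` field names are assumptions chosen to mirror the DDL columns, not the confirmed shape of EvidenceEvent, and `toRow` is a hypothetical helper. `input_hashes` and `entity_id` are left null because their source is not specified here.

```typescript
// Hypothetical event shape; field names are assumptions mirroring the DDL columns.
interface FactProvenance {
  analyzer: string;
  version: string;
  parameters?: Record<string, unknown>;
}
interface IngestEvent {
  eventId: string;
  eventType: string;
  schemaVersion: string;
  fromRevisionId: number;
  toRevisionId: number;
  section: string;
  deterministicFacts?: { fact: string; detail?: string; provenance?: FactProvenance }[];
}
interface IngestContext {
  batchId: string;
  pageTitle: string;
  observedAt: string; // ISO 8601 timestamp of the observation
}
type RefractEventRow = Record<string, string | number | null>;

// Flatten one event into the columns of refract_events (created_at uses its default).
function toRow(ev: IngestEvent, ctx: IngestContext): RefractEventRow {
  const first = ev.deterministicFacts?.[0];
  if (first === undefined) {
    // The fact column is NOT NULL, so an event without facts cannot be stored.
    throw new Error(`event ${ev.eventId} has no deterministic fact`);
  }
  const parameters = first.provenance?.parameters;
  return {
    event_id: ev.eventId,
    event_type: ev.eventType,
    schema_version: ev.schemaVersion,
    from_revision_id: ev.fromRevisionId,
    to_revision_id: ev.toRevisionId,
    section: ev.section,
    fact: first.fact,
    fact_detail: first.detail ?? null,
    analyzer_name: first.provenance?.analyzer ?? null,
    analyzer_version: first.provenance?.version ?? null,
    input_hashes: null,
    parameters_json: parameters === undefined ? null : JSON.stringify(parameters),
    observed_at: ctx.observedAt,
    batch_id: ctx.batchId,
    page_title: ctx.pageTitle,
    entity_id: null,
  };
}

// Made-up example values.
const exampleRow = toRow(
  {
    eventId: "e1",
    eventType: "sentence_modified",
    schemaVersion: "0.4.0",
    fromRevisionId: 100,
    toRevisionId: 101,
    section: "History",
    deterministicFacts: [
      {
        fact: "example fact",
        provenance: { analyzer: "section-differ", version: "0.4.0", parameters: { similarityThreshold: 0.8 } },
      },
    ],
  },
  { batchId: "batch-1", pageTitle: "Bitcoin", observedAt: "2026-05-15T10:00:00Z" },
);
console.log(exampleRow.parameters_json); // {"similarityThreshold":0.8}
```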

Repository pattern

Use a single adapter file as the import boundary between Refract and your codebase. The adapter re-exports Refract functions and types; no other file imports from @refract-org/* directly.

// adapter.ts — single import boundary
export type { EvidenceEvent, FactProvenance, AnalyzerConfig } from '@refract-org/evidence-graph';
export { EVENT_SCHEMA_VERSION, DEFAULT_ANALYZER_CONFIG, createEventIdentity } from '@refract-org/evidence-graph';
export { sectionDiffer, citationTracker, revertDetector, detectEditClusters } from '@refract-org/analyzers';
export { buildStructuredEvents } from '../adapter/build-events';

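// repository.ts: D1 insert
// Types come through the adapter (path assumes repository.ts sits next to adapter.ts).
import type { EvidenceEvent } from './adapter';

// Inserts events and returns how many rows were actually written.
// INSERT OR IGNORE skips rows whose event_id already exists, so re-running a batch is safe.
export async function insertRefractEvents(
  db: D1Database,
  events: EvidenceEvent[],
  batchId: string,
  pageTitle: string,
): Promise<number> {
  let count = 0;
  for (const event of events) {
    // The first deterministic fact supplies the fact and provenance columns.
    const fact = event.deterministicFacts?.[0];
    const result = await db.prepare(`
      INSERT OR IGNORE INTO refract_events (...) VALUES (...)
    `).bind(/* event fields */).run();
    // meta.changes is 0 when the row was ignored as a duplicate.
    count += result.meta.changes;
  }
  return count;
}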
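
For larger runs, one round trip per event can be slow. D1 exposes a `batch()` method that sends several prepared statements together, so the loop above could be restructured roughly as follows. This is a sketch, not the project's repository code: `Stmt`, `BatchResult`, and `Db` are minimal stand-ins for the D1 types so the example type-checks on its own, `createFakeDb` is an in-memory stand-in used only for the demonstration, and the batch size of 50 is an arbitrary assumption.

```typescript
// Minimal structural stand-ins for the D1 types used here.
interface Stmt {
  bind(...values: unknown[]): Stmt;
}
interface BatchResult {
  meta: { changes: number };
}
interface Db {
  prepare(sql: string): Stmt;
  batch(statements: Stmt[]): Promise<BatchResult[]>;
}

// Split an array into consecutive groups of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const groups: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

// Insert rows in batches and return the number of rows actually written.
async function insertInBatches(
  db: Db,
  sql: string,
  rows: readonly unknown[][],
  batchSize = 50,
): Promise<number> {
  let inserted = 0;
  for (const group of chunk(rows, batchSize)) {
    const results = await db.batch(group.map((values) => db.prepare(sql).bind(...values)));
    for (const result of results) inserted += result.meta.changes;
  }
  return inserted;
}

// In-memory stand-in: treats the first bound value as the primary key and
// reports 0 changes for duplicates, mimicking INSERT OR IGNORE.
type FakeStmt = Stmt & { values: unknown[] };
function createFakeDb(): Db {
  const seen = new Set<unknown>();
  const makeStmt = (values: unknown[]): FakeStmt => ({
    values,
    bind: (...next: unknown[]) => makeStmt(next),
  });
  return {
    prepare: () => makeStmt([]),
    batch: async (statements) =>
      statements.map((statement) => {
        const key = (statement as FakeStmt).values[0];
        const isNew = !seen.has(key);
        seen.add(key);
        return { meta: { changes: isNew ? 1 : 0 } };
      }),
  };
}

// The SQL text is the same elided statement as above; the fake ignores it.
const demoSql = "INSERT OR IGNORE INTO refract_events (...) VALUES (...)";
insertInBatches(createFakeDb(), demoSql, [["e1"], ["e2"], ["e1"]], 2).then((n) => console.log(n)); // 2
```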

Migration guide (0.3.x → 0.4.x)

When upgrading from @refract-org/evidence-graph@0.3.x to 0.4.x:

  1. Add "sentence_modified" to any EventType whitelists in your code.
  2. The FactProvenance interface now has an optional parameters field. Reading it is optional; consumers that do read it gain provenance transparency.
  3. The EVENT_SCHEMA_VERSION constant ("0.4.0") and CLAIM_IDENTITY_VERSION ("claimidentityv1") are new exports.
  4. Every event now carries schemaVersion. Verify that your DDL can store this field.
  5. AnalyzerConfig now supports $version for config pinning. This is optional and needs no migration action.
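
For the whitelist step, a consumer might keep the allowed types in one place and report anything unrecognized instead of dropping it silently. A minimal sketch: the `KNOWN_EVENT_TYPES` list is hypothetical and contains only the three event type names that appear in this document, not the full EventType union.

```typescript
// Hypothetical allowlist; a real one would cover every EventType the consumer handles.
const KNOWN_EVENT_TYPES = ["sentence_modified", "wikilink_added", "category_added"] as const;
type KnownEventType = (typeof KNOWN_EVENT_TYPES)[number];

// Type guard: narrows a string to the allowlisted event types.
function isKnownEventType(value: string): value is KnownEventType {
  return (KNOWN_EVENT_TYPES as readonly string[]).indexOf(value) !== -1;
}

// Tally unrecognized event types so that a new EventType member is surfaced.
function tallyUnknownTypes(types: readonly string[]): Map<string, number> {
  const unknown = new Map<string, number>();
  for (const t of types) {
    if (!isKnownEventType(t)) {
      unknown.set(t, (unknown.get(t) ?? 0) + 1);
    }
  }
  return unknown;
}

// "future_event" is a made-up type name used to exercise the unknown path.
const tally = tallyUnknownTypes(["sentence_modified", "future_event", "future_event"]);
console.log(tally.get("future_event")); // 2
```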

See the version compatibility table for the full matrix.