Interpreting Refract output

Refract emits a deterministic event stream. This page explains how to read it.

The event envelope

Every event has the same structure:

{
  "eventId": "a3f5c2e1b7d409fa",
  "eventType": "sentence_first_seen",
  "claimId": "c7d8e9f0a1b23456",
  "fromRevisionId": 1280110001,
  "toRevisionId": 1280110100,
  "section": "Geology",
  "before": "",
  "after": "Earth's crust consists of tectonic plates...",
  "deterministicFacts": [
    {
      "fact": "New sentence detected",
      "detail": "First appearance of this text across revision range",
      "provenance": {
        "analyzer": "section-differ",
        "version": "0.5.1",
        "inputHashes": ["sha256:abc123..."]
      }
    }
  ],
  "layer": "observed",
  "timestamp": "2024-11-25T12:00:00Z",
  "schemaVersion": "0.4.0"
}

Field meanings

| Field | What it tells you |
| --- | --- |
| eventType | Kind of change: sentence, citation, template, revert, section, etc. See event taxonomy. |
| fromRevisionId / toRevisionId | The revision boundary where this change occurred. from is the parent revision; to is the revision containing the change. |
| section | Which section the change belongs to. (lead) means the page introduction. Not all event types use this field. |
| before / after | The text before and after the change. For additions, before is empty. For removals, after is empty. For modifications, both are present. |
| deterministicFacts | Why the engine produced this event. Always mechanical: pure functions of the wikitext. The provenance field identifies which analyzer, what version, and what parameters. |
| layer | Where the evidence came from: "observed" (deterministic), "policy_coded" (rules-based), "model_interpretation" (downstream only), "speculative" (low confidence), "unknown". |
| schemaVersion | Which schema version produced this event. Use this for compatibility checks when consuming events across package versions. |
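
The before/after rule and the schemaVersion check can be sketched in a few lines of Python. This is a minimal illustration, not part of Refract: the function names are hypothetical, and matching on the major.minor prefix is an assumed compatibility policy that the source does not specify.

```python
def change_kind(event: dict) -> str:
    """Classify an event from its before/after text, per the field table."""
    before = event.get("before") or ""
    after = event.get("after") or ""
    if before and after:
        return "modification"
    if after:
        return "addition"
    if before:
        return "removal"
    # Some event types may carry no text at all.
    return "no_text"


def is_compatible(event: dict, supported: str = "0.4") -> bool:
    """Assumed policy: accept events whose major.minor matches `supported`."""
    version = event.get("schemaVersion", "")
    major_minor = ".".join(version.split(".")[:2])
    return major_minor == supported
```

Applied to the example envelope above, change_kind would report an addition, because before is empty and after is not.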

Event types by category

Claim lifecycle (sentence_*)

These events track propositional content across revisions. A sentence is identified by its text; Refract detects when the same sentence appears, disappears, changes, or returns.

[Figure: Claim lifecycle across revisions]

Source changes (citation_*)

Citations are extracted from <ref> tags in wikitext.
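
To make the idea concrete, here is a rough sketch of pulling <ref> tags out of wikitext with a regular expression. It is an illustration only and not Refract's extractor, which may handle nesting and edge cases differently; the function name is hypothetical.

```python
import re

# Matches self-closing <ref ... /> tags and paired <ref ...>body</ref> tags.
# The \b keeps <references /> from matching.
REF_PATTERN = re.compile(
    r"<ref\b[^>]*?/>|<ref\b[^>]*>.*?</ref>",
    re.IGNORECASE | re.DOTALL,
)


def extract_refs(wikitext: str) -> list[str]:
    """Return every <ref> tag found in the wikitext, in document order."""
    return [match.group(0) for match in REF_PATTERN.finditer(wikitext)]
```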

Policy signals (template_*)

Templates like {{citation needed}}, {{NPOV}}, {{BLP sources}} are Wikipedia's dispute signals. Refract tracks them mechanically.
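
As a rough picture of what mechanical tracking could look like, the sketch below compares the set of template names in two revisions of a section. It is an assumption-laden illustration rather than Refract's analyzer: the helper names are hypothetical, and it tracks presence only, so repeated uses of the same template are not counted.

```python
import re

# Captures the template name between "{{" and the first "|" or "}}".
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")


def template_names(wikitext: str) -> set[str]:
    """Lower-cased names of all templates that appear in the wikitext."""
    return {m.group(1).strip().lower() for m in TEMPLATE_NAME.finditer(wikitext)}


def template_changes(before: str, after: str) -> dict[str, set[str]]:
    """Templates present in only one of the two revisions."""
    old, new = template_names(before), template_names(after)
    return {"added": new - old, "removed": old - new}
```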

Structural changes (section_*, lead_*, page_moved)

Dispute signals (revert_*, edit_cluster_*)

Talk page activity (talk_*)

These events correlate article revisions with their associated talk pages.

Reading patterns across events

Stability signal

A claim that has sentence_first_seen early in the timeline, no sentence_modified events, and is still present at the end of the revision range is stable.
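
One way to express this check over a list of parsed events is sketched below. Two things are assumptions: the early_cutoff_revision parameter is a hypothetical stand-in for "early in the timeline", and the absence of a sentence_removed event is used as a proxy for "still present at the end of the revision range".

```python
def is_stable(events: list[dict], claim_id: str, early_cutoff_revision: int) -> bool:
    """True if the claim was first seen early and never modified or removed."""
    claim_events = [e for e in events if e.get("claimId") == claim_id]
    types = {e["eventType"] for e in claim_events}
    seen_early = any(
        e["eventType"] == "sentence_first_seen"
        and e["toRevisionId"] <= early_cutoff_revision
        for e in claim_events
    )
    # Assumption: no sentence_removed event means the claim is still present.
    return (
        seen_early
        and "sentence_modified" not in types
        and "sentence_removed" not in types
    )
```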

Contestation signal

A claim with frequent sentence_modified events, revert_detected in its section, and template_added with dispute templates ({{NPOV}}, {{citation needed}}) is actively contested.
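
The three conditions can be combined as in the sketch below. Several details are assumptions rather than documented behavior: the min_modifications threshold standing in for "frequent", the presence of a section field on revert_detected and template_added events, and the template text appearing in the after field.

```python
# Dispute templates named in the text above, lower-cased for matching.
DISPUTE_TEMPLATES = ("npov", "citation needed")


def is_contested(
    events: list[dict], claim_id: str, section: str, min_modifications: int = 3
) -> bool:
    """True if the claim is often modified and its section shows dispute signals."""
    modifications = sum(
        1
        for e in events
        if e.get("claimId") == claim_id and e["eventType"] == "sentence_modified"
    )
    in_section = [e for e in events if e.get("section") == section]
    reverted = any(e["eventType"] == "revert_detected" for e in in_section)
    # Assumption: the added template's wikitext is carried in "after".
    disputed = any(
        e["eventType"] == "template_added"
        and any(name in (e.get("after") or "").lower() for name in DISPUTE_TEMPLATES)
        for e in in_section
    )
    return modifications >= min_modifications and reverted and disputed
```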

Source fragility signal

A citation that was added and then removed within a short revision window (or was replaced multiple times) is source-fragile — the page's evidence base for that claim is unstable.
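
A sketch of the add-then-remove case follows. The event type names citation_added and citation_removed are hypothetical (the text above only gives the citation_* prefix), as is the assumption that the citation text sits in after for additions and before for removals. The window is counted in revision boundaries that appear in the stream, since raw revision IDs are not consecutive per page. The "replaced multiple times" case is not covered.

```python
def fragile_citations(events: list[dict], max_window: int = 5) -> set[str]:
    """Citations removed within max_window revision boundaries of being added."""
    # Rank each boundary by order of appearance so the window counts
    # revisions seen in the stream, not distance between raw IDs.
    revisions = sorted({e["toRevisionId"] for e in events})
    rank = {rev: i for i, rev in enumerate(revisions)}

    added_at: dict[str, int] = {}
    fragile: set[str] = set()
    for e in sorted(events, key=lambda e: e["toRevisionId"]):
        position = rank[e["toRevisionId"]]
        if e["eventType"] == "citation_added":  # hypothetical event type name
            added_at[e["after"]] = position
        elif e["eventType"] == "citation_removed" and e["before"] in added_at:
            if position - added_at[e["before"]] <= max_window:
                fragile.add(e["before"])
    return fragile
```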

Canon change signal (Fandom wikis)

category_removed immediately followed by category_added with a different category in the same revision boundary is a canon reclassification. sentence_removed on a long-standing claim followed by sentence_first_seen with different text in the same section is a retcon.
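
The reclassification half of this pattern could be detected roughly as follows. The sketch assumes the events are already in stream order and that the category name is carried in before for category_removed and in after for category_added, which the source does not confirm. The retcon pattern is not covered here.

```python
def canon_reclassifications(events: list[dict]) -> list[tuple[str, str, int]]:
    """Find category_removed immediately followed by category_added
    with a different category at the same revision boundary."""
    found = []
    for prev, curr in zip(events, events[1:]):
        same_boundary = (
            prev["fromRevisionId"] == curr["fromRevisionId"]
            and prev["toRevisionId"] == curr["toRevisionId"]
        )
        if (
            prev["eventType"] == "category_removed"
            and curr["eventType"] == "category_added"
            and same_boundary
            and prev["before"] != curr["after"]  # assumed location of the name
        ):
            found.append((prev["before"], curr["after"], curr["toRevisionId"]))
    return found
```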

Downstream consumption

When building on Refract's event stream:

  1. Filter by eventType to isolate the signal you care about (e.g., citation events for source churn analysis).
  2. Group by section to find which parts of the page are most active.
  3. Chain by claimId to trace a specific proposition across its full lifecycle.
  4. Sort by timestamp to reconstruct the chronological sequence.
  5. Never interpret layer values you didn't set: model_interpretation is for downstream use only, and Refract never sets it.

The event stream is pure NDJSON — pipe it, filter it, join it, archive it.
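
Steps 1 through 4 of the list above can be sketched with the Python standard library alone. The function names are hypothetical and the code assumes one JSON object per line. Sorting on the raw timestamp string relies on every timestamp using the same UTC ISO 8601 format as the example envelope.

```python
import json
from collections import Counter, defaultdict


def load_events(lines) -> list[dict]:
    """Parse NDJSON: one JSON object per non-empty line."""
    return [json.loads(line) for line in lines if line.strip()]


def filter_by_prefix(events: list[dict], prefix: str) -> list[dict]:
    """Step 1: keep events whose eventType starts with the prefix."""
    return [e for e in events if e["eventType"].startswith(prefix)]


def section_activity(events: list[dict]) -> Counter:
    """Step 2: count events per section (not all events carry one)."""
    return Counter(e["section"] for e in events if "section" in e)


def chain_by_claim(events: list[dict]) -> dict[str, list[dict]]:
    """Steps 3 and 4: group by claimId, each chain in chronological order."""
    chains: dict[str, list[dict]] = defaultdict(list)
    for e in events:
        if "claimId" in e:
            chains[e["claimId"]].append(e)
    for chain in chains.values():
        chain.sort(key=lambda e: e["timestamp"])
    return dict(chains)
```

Because load_events accepts any iterable of lines, it works equally on an open file handle or on sys.stdin when the stream is piped in.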