Frontier Use Cases for Claim-Provenance Infrastructure
Core Primitive
Refract turns version history into:
claim + source + wording + placement + stability + time
It does not report what a document says now. It explains how a claim entered, changed, was sourced, challenged, moved, stabilized, weakened, or disappeared.
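As a rough illustration, the primitive can be modeled as one record per claim per revision. The sketch below is a minimal one under stated assumptions: the class name `ClaimObservation`, its field names, and the helper `history_for` are hypothetical and are not Refract's actual schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ClaimObservation:
    """One claim as observed in one revision (hypothetical schema)."""
    claim: str             # normalized statement being tracked
    source: Optional[str]  # citation supporting it, if any
    wording: str           # exact text used in this revision
    placement: str         # e.g. "lead", "body", "removed"
    stability: str         # e.g. "new", "contested", "stable"
    observed_on: date      # revision date


def history_for(observations, claim):
    """Return one claim's observations in chronological order."""
    return sorted(
        (o for o in observations if o.claim == claim),
        key=lambda o: o.observed_on,
    )
```

Ordering the observations by date is what turns a set of snapshots into a history that can be queried for entry, change, and disappearance.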
Retrieval & Search
AI Retrieval Provenance
AI systems typically retrieve text as if every sentence were equally stable. They lack signals about which claims have been contested, recently changed, or stabilized through editorial scrutiny.
Refract exposes per-claim metadata:
{
  "claim": "Entity X knew about the defect before launch",
  "status": "included_with_attribution",
  "first_seen": "2021-04-03",
  "stability": "stable_after_regulatory_source_added",
  "contestation": "previously_reverted_twice",
  "placement": "body_not_lead",
  "source_support": "regulatory_report_and_secondary_news"
}
This helps retrieval systems decide whether to present a claim directly, hedge it, attribute it, surface uncertainty, or deprioritize it.
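One way a retrieval layer might consume this metadata is a small rule table. The sketch below is illustrative only: the function name `presentation_mode`, the rule order, and the returned labels are assumptions, and the `"removed"` status value is hypothetical. Only the field names and the example values come from the record above.

```python
def presentation_mode(meta: dict) -> str:
    """Map per-claim metadata to a presentation choice (illustrative rules)."""
    status = meta.get("status", "")
    stability = meta.get("stability", "")
    contestation = meta.get("contestation", "")

    if status == "removed":  # hypothetical status value
        return "deprioritize"
    if not stability.startswith("stable"):
        return "surface_uncertainty"
    if status == "included_with_attribution":
        return "attribute"
    if contestation.startswith("previously_reverted"):
        return "hedge"
    return "present_directly"


EXAMPLE = {
    "claim": "Entity X knew about the defect before launch",
    "status": "included_with_attribution",
    "stability": "stable_after_regulatory_source_added",
    "contestation": "previously_reverted_twice",
}
print(presentation_mode(EXAMPLE))  # attribute
```

The rules only choose how to present a claim. They do not decide whether it is true.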
Claim-Level Search
Search across claim histories, not documents.
Example queries Refract can support:
- Claims about an entity that stabilized after a given date
- Claims removed as unsourced
- Claims supported by a specific source type
- Claims that moved from lead to body
- Claims that softened after external events
This is a new search primitive over version-history data.
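Three of the example queries can be sketched as plain filters over claim-history records. The record layout, the event types, and the function names below are all assumptions made for illustration, not a Refract query API.

```python
from datetime import date

# Hypothetical claim-history records; every field name is an assumption.
CLAIMS = [
    {
        "claim": "Entity X knew about the defect before launch",
        "entity": "Entity X",
        "events": [
            {"on": date(2021, 4, 3), "type": "added", "placement": "lead"},
            {"on": date(2021, 4, 9), "type": "moved", "placement": "body"},
            {"on": date(2021, 6, 1), "type": "stabilized", "placement": "body"},
        ],
    },
    {
        "claim": "Entity X recalled the product",
        "entity": "Entity X",
        "events": [
            {"on": date(2021, 5, 2), "type": "added", "placement": "body"},
            {"on": date(2021, 5, 3), "type": "removed", "reason": "unsourced"},
        ],
    },
]


def stabilized_after(claims, entity, cutoff):
    """Claims about an entity with a 'stabilized' event after cutoff."""
    return [
        c["claim"] for c in claims
        if c["entity"] == entity
        and any(e["type"] == "stabilized" and e["on"] > cutoff for e in c["events"])
    ]


def removed_as_unsourced(claims):
    """Claims with at least one removal whose reason is 'unsourced'."""
    return [
        c["claim"] for c in claims
        if any(e["type"] == "removed" and e.get("reason") == "unsourced"
               for e in c["events"])
    ]


def moved_lead_to_body(claims):
    """Claims whose recorded placement went directly from lead to body."""
    hits = []
    for c in claims:
        placements = [e["placement"] for e in c["events"] if "placement" in e]
        if any(a == "lead" and b == "body"
               for a, b in zip(placements, placements[1:])):
            hits.append(c["claim"])
    return hits
```

Each query inspects a claim's event sequence rather than the current text, which is the difference between claim-level search and document search.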
AI Evaluation & Temporal Leakage Detection
Use claim histories to test AI systems for temporal leakage.
Questions answerable with Refract data:
- Was this claim public before the model's knowledge cutoff?
- Did the supporting source appear later?
- Did the wording exist at the time?
- Is the model using future knowledge?
Refract provides temporal ground truth for benchmarking retrieval truthfulness, training-data audits, and knowledge-cutoff testing.
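The four questions reduce to date comparisons once first-appearance dates are known. A minimal sketch, assuming those dates have already been extracted; the function name `leakage_report` and its keys are hypothetical.

```python
from datetime import date


def leakage_report(claim_first_seen, source_first_seen, wording_first_seen, cutoff):
    """Compare first-appearance dates against a model's knowledge cutoff."""
    checks = {
        "claim_public_before_cutoff": claim_first_seen <= cutoff,
        "source_available_before_cutoff": source_first_seen <= cutoff,
        "wording_existed_before_cutoff": wording_first_seen <= cutoff,
    }
    # If any element postdates the cutoff, a model reproducing it may be
    # drawing on later knowledge. This is a signal, not proof.
    checks["possible_future_knowledge"] = not all(checks.values())
    return checks
```

For example, a claim first seen on 2021-04-03 whose supporting source only appeared on 2022-03-10 would be flagged against a 2022-01-01 cutoff.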
Claim History & Memory
Public Claim Memory
Create a temporal knowledge graph for claims.
The question is not "What does the article say now?" but "What has this claim said over time?"
Refract tracks across revisions:
- Origin (first appearance)
- Source support evolution
- Wording changes
- Removals and restorations
- Disputes and challenges
- Current status
Knowable-at-the-Time Reconstruction
Reconstruct what was publicly knowable at a specific date.
Example: As of March 1, 2023, what sources supported a given claim?
Because every revision is timestamped and extraction is deterministic, the reconstruction is auditable and draws only on material that existed on the chosen date, which guards against hindsight leakage.
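An as-of reconstruction can be sketched as a replay of source events up to the chosen date. The event layout (`on`, `action`, `source`) and the function name `sources_as_of` are assumptions for illustration.

```python
from datetime import date


def sources_as_of(source_events, as_of):
    """Replay add/remove events up to as_of; return sources present then."""
    present = set()
    for ev in sorted(source_events, key=lambda e: e["on"]):
        if ev["on"] > as_of:
            break  # later events must not influence the reconstruction
        if ev["action"] == "added":
            present.add(ev["source"])
        elif ev["action"] == "removed":
            present.discard(ev["source"])
    return present


# Hypothetical event log for one claim.
EVENTS = [
    {"on": date(2022, 11, 1), "action": "added", "source": "local_news"},
    {"on": date(2022, 12, 1), "action": "added", "source": "blog"},
    {"on": date(2023, 2, 1), "action": "removed", "source": "blog"},
    {"on": date(2023, 4, 15), "action": "added", "source": "regulatory_report"},
]
print(sources_as_of(EVENTS, date(2023, 3, 1)))  # {'local_news'}
```

Stopping at the first event after the as-of date is what keeps later sources out of the answer.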
Institutionalization of Knowledge
Detect when a claim moves from rumor into durable public reference.
Pattern Refract can surface:
rumor → local reporting → national reporting → institutional source → article body → article lead → stable public reference
Core question: When did this stop being chatter and become part of the durable public record?
Claim Survivorship
Measure which claims survive scrutiny.
Refract can classify claims by survival pattern:
- Failed claim (removed and never restored)
- Temporary claim (appeared briefly)
- Stable claim (survives extended period unchanged)
- Stable only with attribution
- Stable only in body, not lead
- Repeatedly contested
- Removed under policy concern
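The survival patterns above could be assigned by simple rules over a per-claim summary. This is a sketch under stated assumptions: the `ClaimSummary` fields, the thresholds (180 days, 7 days, 3 reverts), and the rule order are all hypothetical, and a real classifier would need more evidence than these few fields.

```python
from dataclasses import dataclass


@dataclass
class ClaimSummary:
    """Hypothetical per-claim rollup; all fields are assumptions."""
    present_now: bool
    days_present: int
    revert_count: int
    attributed: bool
    demoted_from_lead: bool
    removal_reason: str = ""  # e.g. "policy", "unsourced"; "" if never removed


def survival_pattern(s, stable_days=180, brief_days=7, contested_reverts=3):
    """Assign one survival label using illustrative thresholds."""
    if not s.present_now:
        if s.removal_reason == "policy":
            return "removed_under_policy_concern"
        if s.days_present <= brief_days:
            return "temporary"
        return "failed"
    if s.revert_count >= contested_reverts:
        return "repeatedly_contested"
    if s.days_present >= stable_days:
        if s.attributed:
            return "stable_only_with_attribution"
        if s.demoted_from_lead:
            return "stable_only_in_body"
        return "stable"
    return "undetermined"
```

The `undetermined` label is deliberate: a claim that is present but young has not yet shown a survival pattern.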
Source & Evidence Integrity
Source Escalation & Degradation
Track how source support changes across revisions.
Escalation patterns Refract detects:
unsourced → local source → national source → regulatory source → court ruling → academic analysis
Degradation patterns:
secondary source removed → primary source remains → citation-needed tag → claim removed
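If source types are given an ordinal rank, escalation and degradation become a question of direction. The ranking below follows the order of the escalation example above; treating it as a strict scale, and the names `SOURCE_TIER` and `source_trend`, are assumptions.

```python
# Ordinal ranks following the escalation example; the scale is an assumption.
SOURCE_TIER = {
    "unsourced": 0,
    "local": 1,
    "national": 2,
    "regulatory": 3,
    "court": 4,
    "academic": 5,
}


def source_trend(tiers):
    """Classify a chronological list of source tiers by direction of change."""
    ranks = [SOURCE_TIER[t] for t in tiers]
    steps = [b - a for a, b in zip(ranks, ranks[1:])]
    went_up = any(s > 0 for s in steps)
    went_down = any(s < 0 for s in steps)
    if went_up and went_down:
        return "mixed"
    if went_up:
        return "escalating"
    if went_down:
        return "degrading"
    return "flat"


print(source_trend(["unsourced", "local", "national", "regulatory"]))  # escalating
```

The input is the strongest source tier supporting the claim at each revision, in chronological order.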
Article Voice & Attribution
Distinguish who is making a claim, not just what the claim says.
States Refract detects:
- Direct assertion in article voice
- Attributed to a named source
- Attributed to critics or regulators
- Reported allegation
- Disputed claim
- Removed claim
Example: "The company concealed the defect" versus "Regulators alleged that the company failed to disclose the defect" should not be treated as equivalent.
Semantic Intensity Drift
Detect language getting stronger, weaker, broader, or narrower.
Example shifts Refract can surface:
- caused → contributed to
- concealed → failed to disclose
- proved → alleged
- will → may
- riot → protest → unrest
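A simple lexicon lookup can flag shifts like the first four above. The sketch assumes a hand-built list of (stronger, weaker) term pairs taken from those examples; the names `SHIFTS` and `intensity_drift` are hypothetical, and whole-word matching will miss inflections and paraphrases.

```python
import re

# (stronger, weaker) term pairs taken from the examples above.
SHIFTS = [
    ("caused", "contributed to"),
    ("concealed", "failed to disclose"),
    ("proved", "alleged"),
    ("will", "may"),
]


def _has(term, text):
    """True if term occurs in text as a whole word or phrase."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def intensity_drift(old, new):
    """Return (direction, from_term, to_term) for the first matching pair."""
    for strong, weak in SHIFTS:
        if _has(strong, old) and _has(weak, new) and not _has(strong, new):
            return ("weakened", strong, weak)
        if _has(weak, old) and _has(strong, new) and not _has(weak, new):
            return ("strengthened", weak, strong)
    return None


print(intensity_drift(
    "The company concealed the defect",
    "The company failed to disclose the defect",
))  # ('weakened', 'concealed', 'failed to disclose')
```

Word-boundary matching keeps "may" from matching inside words such as "mayor".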
Misinformation & Circular Sourcing
Track weak, false, or fringe claims as they move through public systems.
Patterns Refract detects:
- Claim introduced without source
- Reverted as unsourced
- Reintroduced with weak source
- Challenged and removed
- Returns in softened form
Citogenesis detection: a claim originates on Wikipedia, gets repeated by an external source, and Wikipedia later cites that source to verify the original claim, creating a self-referential loop.
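The loop leaves a recognizable date ordering: the claim appears first, the external source is published afterward, and the citation to that source is added later still. A minimal check, assuming the three dates are known; the function name is hypothetical, and a match is a signal for review rather than proof, since the source may have had an independent basis.

```python
from datetime import date


def is_possible_citogenesis(claim_first_seen, source_published, citation_added):
    """True when the cited source postdates the claim it is used to support."""
    return claim_first_seen < source_published <= citation_added


print(is_possible_citogenesis(
    date(2020, 1, 10),  # claim first appears, unsourced
    date(2020, 3, 5),   # external article published
    date(2020, 4, 1),   # article added as a citation
))  # True
```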
Source Dependency Mapping
Map which claims depend on which sources.
If a source is corrected, discredited, or retracted, Refract can identify all affected claims across the knowledge base, enabling cascade analysis.
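Cascade analysis can be sketched as an inverted index from sources to claims. The input shape (a mapping from claim to a set of source identifiers), the labels, and the function names are assumptions for illustration.

```python
from collections import defaultdict


def build_source_index(claim_sources):
    """Invert {claim: set of sources} into {source: set of claims}."""
    index = defaultdict(set)
    for claim, sources in claim_sources.items():
        for source in sources:
            index[source].add(claim)
    return index


def affected_claims(claim_sources, retracted):
    """Label each claim that cited the retracted source."""
    index = build_source_index(claim_sources)
    result = {}
    for claim in index.get(retracted, set()):
        remaining = claim_sources[claim] - {retracted}
        result[claim] = "still_supported" if remaining else "unsupported"
    return result


# Hypothetical dependency data.
DEPENDENCIES = {
    "claim A": {"study_1"},
    "claim B": {"study_1", "regulator_report"},
    "claim C": {"news_story"},
}
print(affected_claims(DEPENDENCIES, "study_1"))
```

Separating claims that lose all support from those that keep another source helps prioritize review after a retraction.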
Editorial Mechanics
Editorial Consensus Mapping
Show how a page or claim reaches a stable form.
Signals Refract surfaces:
- Talk-page discussion activity
- Revert cycles
- Policy references (BLP, NPOV, RS, UNDUE)
- Source replacement patterns
- Compromise wording emergence
- Page protection events
- Post-discussion stabilization
Prominence & Placement
Track not just whether a claim exists, but where it appears.
States Refract detects: lead, infobox, body, controversy section, footnote, caption, table, category, talk page only, removed.
Meaningful transitions: body → lead (increased prominence), lead → body (reduced prominence), removed → restored (narrative resurrection).
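Transitions can be labeled by comparing placements on a prominence scale. The partial ranking below is an assumption that covers only some of the listed states, and `label_transition` is a hypothetical name; placements outside the ranking raise `KeyError`.

```python
# Partial prominence ranking; an assumption covering only a few states.
PROMINENCE = {
    "removed": 0,
    "talk page only": 1,
    "footnote": 2,
    "body": 3,
    "lead": 4,
}


def label_transition(before, after):
    """Label a placement change between two revisions."""
    if before == after:
        return "unchanged"
    if before == "removed":
        return "restored"
    if after == "removed":
        return "removed"
    delta = PROMINENCE[after] - PROMINENCE[before]
    return "increased_prominence" if delta > 0 else "reduced_prominence"


print(label_transition("body", "lead"))  # increased_prominence
```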
Cross-Language Narrative Divergence
Compare the same claim across language editions.
Refract can surface differences in: wording, sources cited, prominence, stability timelines, and terminology across language versions of the same topic.
Change Monitoring
Public Reference Monitoring
Monitor how an entity or topic changes in public reference sources.
What Refract surfaces:
- New claims added
- Claims moving into prominent positions
- Regulatory or institutional sources added or removed
- Policy templates applied (BLP, NPOV)
- Revert chains triggered
- Page protections activated
- Cross-language spread detected
Knowledge Volatility Metrics
Create quantitative metrics from claim-history data:
- Claim volatility over time
- Source churn rate
- Lead section instability
- Controversy intensity
- Cross-language divergence rate
- Edit-burst frequency
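Two of these metrics can be sketched with simple ratios. The definitions below (changes per 30 days, and source additions plus removals divided by the number of distinct sources seen) are assumptions chosen for illustration, not Refract's formulas.

```python
from datetime import date


def claim_volatility(change_dates, window_start, window_end):
    """Wording changes per 30 days within the window (assumed definition)."""
    days = (window_end - window_start).days
    if days <= 0:
        raise ValueError("window_end must be after window_start")
    n = sum(1 for d in change_dates if window_start <= d <= window_end)
    return n * 30 / days


def source_churn(added, removed, distinct_sources):
    """Source additions plus removals per distinct source (assumed definition)."""
    if distinct_sources <= 0:
        return 0.0
    return (added + removed) / distinct_sources


changes = [date(2023, 1, 15), date(2023, 2, 10), date(2023, 3, 20)]
print(claim_volatility(changes, date(2023, 1, 1), date(2023, 4, 1)))  # 1.0
```

Normalizing by window length makes claims with histories of different lengths comparable.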
Change Summaries
Generate structured summaries from claim-history analysis.
Refract can produce outputs like:
Three public claims changed materially this period.
- One stabilized after regulatory sourcing.
- One moved from body to lead.
- One remains disputed after repeated reverts.
Each statement links to the supporting evidence.
Domain Extensions
Policy, Regulation & Standards Tracking
Apply the same engine to public rules and guidance documents.
Changes Refract can detect: requirements added or weakened, mandatory → optional transitions, deadline extensions, scope narrowing, enforcement language removed.
Enterprise Knowledge Governance
Apply claim provenance to internal knowledge bases.
Detect across internal documentation: unsupported or unsourced claims, stale procedures, contradictions between docs, obsolete product claims, and policy drift.
Corporate Disclosure Tracking
Track changes in publicly filed or published corporate documents.
Refract can surface when commitments, risk factors, deadlines, or claims change in investor presentations, earnings materials, ESG reports, terms of service, or privacy policies.
Legal-Document History
Apply the same engine to contracts, policies, and versioned legal documents.
Detect: obligations added or narrowed, warranties softened, termination rights expanded, indemnity strengthened, compliance requirements removed.
Example shifts: shall → may, all damages → direct damages, must notify within 24 hours → must notify without undue delay.
Universal Document-History
General Document-History Operating System
The broadest application. For any versioned document system, Refract answers:
- What changed?
- What claim or obligation did it affect?
- What source or evidence supported it?
- Was it challenged?
- Did it stabilize?
- What is its current status?
This applies across: public wikis, policy drafts, standards documents, open-source documentation, collaborative prose, and any versioned text corpus.
Synthesis
The frontier is not summarizing version history.
The frontier is turning version history into claim-level provenance.
The reusable primitive is:
claim + source + wording + placement + stability + time
That primitive can power use cases across retrieval, research, monitoring, governance, and knowledge management, all without deciding what is true.