Current AI systems have an epistemology problem. They produce outputs without distinguishing knowledge from belief from speculation. They can't track where their claims come from. They don't quantify uncertainty appropriately. And they certainly don't revise claims when evidence changes.
This leads to hallucination, calibration failures, brittle reasoning, and epistemic opacity. The model states falsehoods with confidence, can't explain why it believes things, and falls apart when it encounters contradictions.
For the Mnemonic Hive evolution of Simplex, I'm tackling this head-on. Agents need to know what they know, acknowledge what they don't, and revise beliefs appropriately when evidence demands. This post details the epistemic architecture I've designed to make that possible.
The Problem of Machine Epistemology
Here's where traditional AI systems fall short:
- They don't distinguish knowledge from belief from speculation
- They don't track the provenance of their claims
- They don't quantify uncertainty appropriately
- They don't revise claims when evidence changes
I've spent considerable time thinking about what a well-designed epistemic architecture should satisfy. I've landed on seven core desiderata:
D1. Graded Belief — Beliefs should have degrees of confidence, not binary truth values. BEL(φ, c) where confidence c ∈ [0, 1].
D2. Source Tracking — Every belief should trace to its origins. Where did this come from? Who said it? When?
D3. Type Classification — Beliefs should be categorized by epistemic status. Is this a fact, an opinion, or an inference?
D4. Temporal Dynamics — Beliefs have lifespans. Old information becomes less reliable. Confidence should decay appropriately.
D5. Revision Rationality — Beliefs should update according to rational principles. I'm drawing on AGM theory here.
D6. Consistency Maintenance — Contradictions should be detected and resolved, not ignored.
D7. Introspective Access — The system should be able to query and explain its own epistemic state.
Truth Categorization: Not All Beliefs Are Equal
One of the key decisions I've made is implementing a formal truth categorization system. Not all beliefs have the same epistemic status, and treating them uniformly is a mistake.
I've defined four truth categories, each with distinct properties:
ABSOLUTE Truths
These are propositions that are empirically verifiable, context-independent, and have a single correct answer. Things like "Python 3.12 was released in October 2023" or "The SHA-256 hash of 'hello' is 2cf24dba..."
For absolute truths, I've set strict confidence requirements:
- Minimum confidence for assertion: 0.9
- Promotion to persistent memory tier: 0.95
- Any contradiction triggers immediate review
Absolute truths can only be overridden by another absolute with higher confidence and reliability, or by explicit user correction.
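To make those rules concrete, here's a minimal Python sketch of the assertion gate and override check for absolute truths. The 0.9 and 0.95 thresholds come from the list above; the class and function names are illustrative, not the actual Hive implementation.

```python
from dataclasses import dataclass

ABSOLUTE_ASSERT_MIN = 0.90   # minimum confidence to state as fact
ABSOLUTE_PROMOTE_MIN = 0.95  # promotion to the persistent memory tier

@dataclass
class AbsoluteBelief:
    statement: str
    confidence: float
    source_reliability: float

def can_assert(b: AbsoluteBelief) -> bool:
    return b.confidence >= ABSOLUTE_ASSERT_MIN

def should_promote(b: AbsoluteBelief) -> bool:
    return b.confidence >= ABSOLUTE_PROMOTE_MIN

def handle_challenge(existing: AbsoluteBelief, challenger: AbsoluteBelief,
                     user_correction: bool = False) -> str:
    """Any contradiction of an absolute truth triggers review; the challenger wins
    only if it is also absolute with higher confidence and reliability, or if the
    user explicitly corrects the record."""
    if user_correction:
        return "override"
    if (challenger.confidence > existing.confidence
            and challenger.source_reliability > existing.source_reliability):
        return "override"
    return "flag_for_review"

fact = AbsoluteBelief("Python 3.12 released October 2023", 0.95, 0.9)
rumor = AbsoluteBelief("Python 3.12 released September 2023", 0.7, 0.4)
print(can_assert(fact), should_promote(fact))  # True True
print(handle_challenge(fact, rumor))           # flag_for_review
```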
CONTEXTUAL Truths
These are propositions that are true within specific contexts, domains, or conditions. "React is the best framework for component-based UIs" is true in the context of frontend JavaScript development—meaningless in embedded systems.
The critical insight here is that contextual truths from non-overlapping domains don't contradict each other. "Use dependency injection for testability" (enterprise OOP context) and "Avoid unnecessary abstraction" (embedded systems context) can coexist peacefully.
I've implemented a context model that tracks domains, conditions, and temporal validity. When querying beliefs, context matching adjusts effective confidence:
effective_confidence = belief.confidence × context_match_score
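As a rough illustration of how context matching scales confidence at query time, here's a small Python sketch. Only the effective_confidence formula is from above; the Context structure and the particular overlap scoring are assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Context:
    domain: str
    conditions: frozenset = field(default_factory=frozenset)

def context_match_score(belief_ctx: Context, query_ctx: Context) -> float:
    """Crude overlap score in [0, 1]: exact domain match plus shared conditions.
    A real matcher would use embeddings or a domain ontology."""
    if belief_ctx.domain != query_ctx.domain:
        return 0.2  # weak cross-domain relevance (assumed floor)
    if not belief_ctx.conditions:
        return 1.0
    shared = len(belief_ctx.conditions & query_ctx.conditions)
    return 0.5 + 0.5 * shared / len(belief_ctx.conditions)

def effective_confidence(confidence: float, belief_ctx: Context, query_ctx: Context) -> float:
    # effective_confidence = belief.confidence × context_match_score
    return confidence * context_match_score(belief_ctx, query_ctx)

# A Postgres-specific belief queried from a matching context keeps its full confidence
belief_ctx = Context("postgres", frozenset({"oltp"}))
query_ctx = Context("postgres", frozenset({"oltp", "cloud"}))
print(effective_confidence(0.8, belief_ctx, query_ctx))  # 0.8
```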
OPINIONS
These are propositions expressing personal preferences, values, or subjective judgments. "I prefer functional programming paradigms" or "Dark mode is more pleasant to use."
The key decision here: opinions can contradict without being inconsistent. This is fundamentally different from how we treat factual contradictions. An agent can hold that "the user prefers concise responses" and "the user appreciates detailed explanations" simultaneously—these might apply to different contexts or represent evolving preferences.
I'm tracking opinion consistency over time. High consistency means stable preferences. Low consistency might indicate genuine uncertainty or context-dependent preferences.
INFERRED Truths
These are propositions derived from patterns, observations, or reasoning rather than direct evidence. "User is a morning person" (inferred from activity patterns) or "Prefers concise responses" (inferred from editing behavior).
For inferred truths, I'm tracking the entire reasoning chain—what evidence was used, what type of reasoning (deductive, inductive, abductive), and how confidence propagates through the chain:
- Deductive inference: conclusion confidence = min(premise confidences)
- Inductive inference: confidence = weighted average of premise confidences × 0.8 discount factor
- Abductive inference: confidence = min(premise confidences) × 0.6 (heavy discount for hypothesis generation)
Inferred beliefs are always open to revision. They're hypotheses, not facts.
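A minimal sketch of these propagation rules, assuming uniform premise weights in the inductive case:

```python
def propagate_confidence(premise_confidences: list[float], mode: str) -> float:
    """Confidence propagation for the three inference types, assuming uniform
    premise weights in the inductive case."""
    if not premise_confidences:
        raise ValueError("at least one premise required")
    if mode == "deductive":
        return min(premise_confidences)
    if mode == "inductive":
        weighted_avg = sum(premise_confidences) / len(premise_confidences)
        return weighted_avg * 0.8  # inductive discount
    if mode == "abductive":
        return min(premise_confidences) * 0.6  # heavy discount for hypothesis generation
    raise ValueError(f"unknown inference mode: {mode}")

# "User is a morning person", inferred inductively from three activity observations
print(round(propagate_confidence([0.9, 0.8, 0.7], "inductive"), 3))  # 0.64
```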
The Mathematics of Confidence
I'm treating confidence as subjective probability: c(φ) = P(φ is true | agent's evidence and reasoning)
Total confidence is computed from four components:
- Source reliability (R) — How trustworthy is the source? Weight: 30%
- Recency (T) — How fresh is the information? Weight: 20%
- Corroboration (C) — How many independent sources confirm this? Weight: 30%
- Contradiction (D) — How many sources contradict this? Weight: 20% (penalty)
The formula:
confidence = 0.30 × R + 0.20 × T + 0.30 × min(C/5, 1) - 0.20 × min(D/3, 1)
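In code, the whole computation is a few lines. The weights and saturation points mirror the formula above; the final clamp to [0, 1] is an assumption.

```python
def total_confidence(source_reliability: float, recency: float,
                     corroborating: int, contradicting: int) -> float:
    """The weighted formula above: R and T are in [0, 1]; corroboration
    saturates at 5 independent sources, contradiction at 3."""
    c = (0.30 * source_reliability
         + 0.20 * recency
         + 0.30 * min(corroborating / 5, 1.0)
         - 0.20 * min(contradicting / 3, 1.0))
    return max(0.0, min(1.0, c))  # clamp to [0, 1] (assumed)

# A fresh claim from a fairly reliable source, confirmed twice, never contradicted
print(round(total_confidence(0.8, 1.0, corroborating=2, contradicting=0), 3))  # 0.56
```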
Recency Decay
Confidence decays over time using exponential decay with a configurable half-life:
T(t) = e^(-λt) where λ = ln(2) / half_life
After one half-life, recency = 0.5. After two half-lives, recency = 0.25. I'm using a default half-life of 30 days for most beliefs, but this is configurable per truth category.
I've also factored in access patterns. Frequently accessed beliefs decay slower—if you're regularly retrieving and confirming a belief, it's probably still relevant.
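Here's a sketch of the decay curve with an access-pattern adjustment folded in. The exponential form and 30-day default are from above; the specific adjustment (each recent access stretching the effective half-life by 10%) is an illustrative assumption.

```python
import math

def recency(age_days: float, half_life_days: float = 30.0, recent_accesses: int = 0) -> float:
    """Exponential decay with a configurable half-life. The access-pattern
    adjustment (each recent access stretching the effective half-life by 10%)
    is an illustrative assumption, not a number from the design."""
    effective_half_life = half_life_days * (1.0 + 0.1 * recent_accesses)
    lam = math.log(2) / effective_half_life
    return math.exp(-lam * age_days)

print(round(recency(30), 3))                     # one half-life  -> 0.5
print(round(recency(60), 3))                     # two half-lives -> 0.25
print(round(recency(60, recent_accesses=5), 3))  # frequently accessed, decays slower (~0.4)
```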
Bayesian Updates
When new evidence arrives, I update confidence using Bayes' rule in odds form:
O(H|E) = P(E|H)/P(E|¬H) × O(H)
The likelihood ratio determines how strongly evidence affects belief:
- Supporting evidence: likelihood ratio 1-4 depending on strength
- Contradicting evidence: likelihood ratio 0.25-1 depending on strength
- Neutral evidence: likelihood ratio 1.0 (no change)
This is weighted by source reliability—evidence from unreliable sources has a muted effect.
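A sketch of the odds-form update. Modeling the reliability weighting as pulling the likelihood ratio toward 1.0 (no effect) for unreliable sources is my assumption about how a "muted effect" could be implemented.

```python
def bayes_update(prior: float, likelihood_ratio: float,
                 source_reliability: float = 1.0) -> float:
    """Odds-form Bayesian update: O(H|E) = LR × O(H), converted back to a probability."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    # Unreliable sources pull the likelihood ratio toward 1.0 (no effect)
    lr = 1.0 + (likelihood_ratio - 1.0) * source_reliability
    prior_odds = prior / (1.0 - prior)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1.0 + posterior_odds)

print(round(bayes_update(0.6, 3.0), 3))                          # strong support: 0.818
print(round(bayes_update(0.6, 3.0, source_reliability=0.3), 3))  # same evidence, shaky source: 0.706
print(round(bayes_update(0.6, 0.25), 3))                         # strong contradiction: 0.273
```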
Belief Revision: The AGM Approach
For belief revision, I'm implementing the AGM framework from formal epistemology—Alchourrón, Gärdenfors, and Makinson's work from the 1980s. It defines three fundamental operations:
Expansion
Adding a new belief when it doesn't contradict existing beliefs. Simple case—just add it to the belief set.
Contraction
Removing a belief without adding new information. The key principle here is minimal change—remove the target belief while preserving as much else as possible.
I don't actually delete beliefs during contraction. Instead, I mark them as inactive and archive them. This preserves history and allows for potential revival if future evidence supports the belief again.
Critically, I track belief dependencies. When a belief is contracted, beliefs that depend on it have their confidence reduced. If that confidence drops too low, those dependent beliefs get contracted too—a cascade of rational revision.
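Here's roughly what contraction with a dependency cascade might look like. The 0.7 confidence penalty for dependents and the 0.3 contraction threshold are placeholder values, not numbers from the design.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    statement: str
    confidence: float
    active: bool = True
    depends_on: list[str] = field(default_factory=list)

CONTRACTION_THRESHOLD = 0.3  # assumed floor below which dependents are contracted too
DEPENDENCY_PENALTY = 0.7     # assumed multiplier when a supporting belief is removed

def contract(beliefs: dict[str, Node], target: str) -> None:
    """Deactivate (never delete) a belief, then cascade through its dependents."""
    beliefs[target].active = False  # archived, revivable if new evidence returns
    for name, node in beliefs.items():
        if node.active and target in node.depends_on:
            node.confidence *= DEPENDENCY_PENALTY
            if node.confidence < CONTRACTION_THRESHOLD:
                contract(beliefs, name)  # cascade of rational revision

beliefs = {
    "api_is_v2": Node("The service runs API v2", 0.9),
    "use_v2_auth": Node("Use the v2 auth flow", 0.4, depends_on=["api_is_v2"]),
}
contract(beliefs, "api_is_v2")
print(beliefs["use_v2_auth"].active, round(beliefs["use_v2_auth"].confidence, 2))
# False 0.28: the dependent fell below the threshold and was contracted as well
```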
Revision
Adding a new belief that contradicts something, requiring us to choose what to give up. This is the hard case.
I'm implementing this using the Levi identity: K * φ = (K - ¬φ) + φ. First contract the contradicting beliefs, then expand with the new belief.
The resolution decision uses epistemic entrenchment—how "rooted" a belief is in the system:
entrenchment = 0.30 × confidence
+ 0.20 × age_factor
+ 0.20 × corroboration_factor
+ 0.15 × access_frequency
+ 0.15 × dependent_beliefs
Higher-ranked truth categories win over lower-ranked ones. ABSOLUTE beats CONTEXTUAL beats OPINION beats INFERRED. Within the same category, entrenchment decides.
One nuance I've added: when beliefs are very close in entrenchment, I check if they might be contextually different. If so, both can coexist with their respective contexts explicitly marked.
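Putting the pieces together, a sketch of the resolution decision: entrenchment uses the weighted formula above, category rank settles cross-category conflicts, and a small epsilon (assumed to be 0.05) triggers the coexistence check. The dict-based belief representation is a stand-in for real belief objects.

```python
def entrenchment(confidence: float, age_factor: float, corroboration_factor: float,
                 access_frequency: float, dependent_beliefs: float) -> float:
    """Weighted entrenchment score from the formula above; inputs normalized to [0, 1]."""
    return (0.30 * confidence
            + 0.20 * age_factor
            + 0.20 * corroboration_factor
            + 0.15 * access_frequency
            + 0.15 * dependent_beliefs)

CATEGORY_RANK = {"ABSOLUTE": 4, "CONTEXTUAL": 3, "OPINION": 2, "INFERRED": 1}
ENTRENCHMENT_EPSILON = 0.05  # "very close" margin for the coexistence check (assumed value)

def resolve(existing: dict, incoming: dict) -> str:
    """Decide which of two contradicting beliefs survives. Per the Levi identity,
    the loser is contracted and the winner expanded; this function only makes the choice."""
    rank_e = CATEGORY_RANK[existing["category"]]
    rank_i = CATEGORY_RANK[incoming["category"]]
    if rank_e != rank_i:
        return "existing" if rank_e > rank_i else "incoming"
    gap = existing["entrenchment"] - incoming["entrenchment"]
    if abs(gap) < ENTRENCHMENT_EPSILON:
        return "coexist"  # likely contextually different; keep both with contexts marked
    return "existing" if gap > 0 else "incoming"

old = {"category": "CONTEXTUAL", "entrenchment": entrenchment(0.8, 0.9, 0.6, 0.7, 0.5)}
new = {"category": "CONTEXTUAL", "entrenchment": entrenchment(0.7, 0.2, 0.4, 0.1, 0.0)}
print(resolve(old, new))  # "existing": the older, better-corroborated belief wins
```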
Contradiction Detection
Detecting contradictions isn't trivial. I'm using multiple methods:
Semantic Opposition
High embedding similarity with opposite meaning. If two beliefs are about the same topic but express opposing views, that's a contradiction. I use a small NLI (Natural Language Inference) model to classify the relationship as entailment, neutral, or contradiction.
Explicit Negation
Pattern matching for explicit negation structures. "X is true" vs "X is not true" or "X is false." This catches the obvious cases with high confidence.
Numerical Contradiction
Same quantity, different values. "Project has 3 modules" vs "Project has 5 modules"—these can't both be true at the same time.
Temporal Contradiction
Beliefs about the same thing at incompatible times. "User joined in 2020" vs "User joined in 2018" (assuming we're talking about the same event).
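Two of these checks, explicit negation and numerical contradiction, are simple enough to sketch directly; the semantic-opposition path would plug an embedding-plus-NLI classifier into the same dispatcher. The regex heuristics below are deliberately crude illustrations, not the production detectors.

```python
import re

NUM = re.compile(r"\d+(?:\.\d+)?")

def _norm(s: str) -> str:
    return re.sub(r"\s+", " ", s.strip().lower())

def numerical_contradiction(a: str, b: str) -> bool:
    """Same quantity phrasing, different values: compare numbers once the
    non-numeric text matches exactly (a real system would match semantically)."""
    nums_a, nums_b = NUM.findall(a), NUM.findall(b)
    skeleton_a, skeleton_b = NUM.sub("#", _norm(a)), NUM.sub("#", _norm(b))
    return bool(nums_a and nums_b) and skeleton_a == skeleton_b and nums_a != nums_b

def explicit_negation(a: str, b: str) -> bool:
    """Catches 'X is enabled' vs 'X is not enabled' style pairs."""
    a_n, b_n = _norm(a), _norm(b)
    return a_n != b_n and a_n.replace(" is not ", " is ") == b_n.replace(" is not ", " is ")

def contradicts(a: str, b: str) -> bool:
    # A semantic-opposition check (embedding similarity + NLI) would slot in here too.
    return numerical_contradiction(a, b) or explicit_negation(a, b)

print(contradicts("Project has 3 modules", "Project has 5 modules"))    # True
print(contradicts("The cache is enabled", "The cache is not enabled"))  # True
print(contradicts("Project has 3 modules", "Project has 3 modules"))    # False
```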
Epistemic Coherence
Beyond individual belief operations, I'm implementing coherence metrics for the overall belief system:
Consistency (40% weight): No logical contradictions. Measured as 1 - (contradictions / pairs checked).
Calibration (25% weight): Confidence matches accuracy. Uses historical verification data where available.
Connectedness (20% weight): Beliefs support each other. Higher when beliefs have corroborating evidence from other beliefs.
Stability (15% weight): Beliefs don't fluctuate wildly without cause. Low variance in confidence changes = high stability.
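The combined score is just the weighted sum. This sketch assumes the calibration, connectedness, and stability inputs arrive already normalized to [0, 1].

```python
def coherence_score(contradictions: int, pairs_checked: int,
                    calibration: float, connectedness: float, stability: float) -> float:
    """Weighted coherence metric from the breakdown above. Calibration,
    connectedness, and stability are assumed to arrive pre-normalized to [0, 1]."""
    consistency = 1.0 - (contradictions / pairs_checked if pairs_checked else 0.0)
    return (0.40 * consistency
            + 0.25 * calibration
            + 0.20 * connectedness
            + 0.15 * stability)

# 3 contradictions across 200 checked pairs, with decent calibration and stability
print(round(coherence_score(3, 200, calibration=0.8, connectedness=0.6, stability=0.9), 3))  # 0.849
```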
A maintenance cycle runs periodically to:
- Detect and resolve contradictions
- Apply confidence decay
- Prune low-confidence beliefs
- Consolidate redundant beliefs
- Update entrenchment scores
Multi-Agent Epistemic Coordination
In a Mnemonic Hive, multiple agents need to coordinate their beliefs. I've designed two key mechanisms:
Collective Belief Formation
When the hive needs to form a collective view on a topic, individual agent beliefs are aggregated using weighted voting. The weight considers:
- Expertise in the topic (40%)
- Track record / reputation (40%)
- Recent activity in topic (20%)
Similar beliefs are clustered, and the dominant cluster forms the collective belief. The agreement level (what fraction of agents share this belief) is tracked alongside confidence.
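A sketch of the aggregation step, with semantic clustering reduced to exact-string grouping to keep the example short. The 40/40/20 weight split comes from the list above; everything else is illustrative.

```python
from collections import defaultdict

def agent_weight(expertise: float, reputation: float, recent_activity: float) -> float:
    """Voting weight from the 40/40/20 split above; all inputs in [0, 1]."""
    return 0.40 * expertise + 0.40 * reputation + 0.20 * recent_activity

def collective_belief(votes: list[tuple[str, float, float]]) -> tuple[str, float, float]:
    """votes = [(statement, agent_confidence, agent_weight), ...]. Clustering is
    reduced to exact-string grouping here; the real system would cluster
    semantically similar beliefs before voting."""
    clusters: dict[str, list[tuple[float, float]]] = defaultdict(list)
    for statement, conf, weight in votes:
        clusters[statement].append((conf, weight))
    # Dominant cluster = largest total voting weight
    statement, members = max(clusters.items(), key=lambda kv: sum(w for _, w in kv[1]))
    total_w = sum(w for _, w in members)
    confidence = sum(c * w for c, w in members) / total_w  # weight-averaged confidence
    agreement = len(members) / len(votes)                  # fraction of agents in the cluster
    return statement, confidence, agreement

votes = [
    ("Use Postgres for the event store", 0.8, agent_weight(0.9, 0.8, 0.5)),
    ("Use Postgres for the event store", 0.7, agent_weight(0.6, 0.7, 0.9)),
    ("Use Kafka for the event store",    0.9, agent_weight(0.4, 0.5, 0.3)),
]
print(collective_belief(votes))  # Postgres wins, confidence ≈ 0.75, agreement = 2/3
```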
Belief Propagation
When one agent learns something, it can propagate that belief to others. I've defined three propagation types:
- INFORM — Share as evidence (strength: 0.5)
- TEACH — Share with high confidence (strength: 0.8)
- SUGGEST — Share as low-confidence option (strength: 0.3)
The receiving agent treats the propagated belief as evidence and revises its own beliefs accordingly. This allows the hive to converge on shared understanding while still respecting individual agent expertise.
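One way the receiving side might fold a propagated belief into its own confidence, reusing the odds-form update from earlier. Only the three propagation strengths come from the design; the strength-to-likelihood-ratio mapping and the default sender reliability are assumptions.

```python
PROPAGATION_STRENGTH = {"INFORM": 0.5, "TEACH": 0.8, "SUGGEST": 0.3}

def receive_propagated(own_confidence: float, sender_confidence: float,
                       propagation_type: str, sender_reliability: float = 0.7) -> float:
    """Fold a propagated belief into the receiver's confidence via the odds-form
    update. Mapping strength to a likelihood ratio and the default sender
    reliability are illustrative assumptions; only the strengths are from the design."""
    strength = PROPAGATION_STRENGTH[propagation_type]
    # Stronger, more reliable, more confident signals map to larger likelihood ratios
    likelihood_ratio = 1.0 + 3.0 * strength * sender_reliability * sender_confidence
    prior_odds = own_confidence / (1.0 - own_confidence)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)

# Another agent shares a belief the receiver was on the fence about (0.5)
print(round(receive_propagated(0.5, 0.9, "TEACH"), 3))    # 0.715, a strong push
print(round(receive_propagated(0.5, 0.9, "SUGGEST"), 3))  # 0.610, a gentler nudge
```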
Implementation in Simplex
In the evolving Simplex language, beliefs become first-class citizens. Here's a sketch of the syntax:
```
// Belief literals with truth categories
let fact = belief::absolute("Python 3.12 released October 2023",
    confidence: 0.95,
    source: documentation);

let preference = belief::opinion("Prefer functional style",
    confidence: 0.8);

let inferred = belief::inferred("User is morning person",
    confidence: 0.7,
    evidence: [activity_patterns]);

// Belief queries with context
let relevant = agent.beliefs
    .query("database design")
    .in_context(domain: "postgres")
    .min_confidence(0.6);

// Belief revision
agent.beliefs.revise(new_evidence);
```
The type system ensures that belief operations are handled correctly. You can't accidentally treat an opinion as an absolute truth. The compiler enforces epistemic hygiene.
Why This Matters
This isn't an academic exercise. The epistemic architecture directly addresses the failures of current AI systems:
- Hallucination — Prevented by requiring confidence thresholds for assertions. Low-confidence beliefs aren't stated as facts.
- Calibration failures — Addressed by tracking actual accuracy against stated confidence and adjusting.
- Brittle reasoning — Handled by formal contradiction detection and resolution.
- Epistemic opacity — Solved by maintaining full provenance and reasoning chains.
An agent with this architecture doesn't just process information—it knows things. It can explain why it believes what it believes. It can acknowledge uncertainty. It can change its mind when evidence demands.
That's the foundation for AI systems that are actually trustworthy.
This is Part 3 of the Simplex Evolution Series. Part 1: The Mnemonic Hive covers the overall vision. Part 2: Evolving Simplex details the language extensions for cognitive agents.