System-2 FW: Post-Generation Verification (Model-Agnostic)

1. Overview

System-2 FW formalizes decision semantics for LLM systems by explicitly separating generation, verification, and acceptance as independent architectural concerns. The key claim is architectural: sampling a plausible answer is not equivalent to judging it acceptable.

Design goal: produce inspectable, deterministic accept/reject/abstain decisions under explicitly stated assumptions (“profiles”), without modifying the underlying LLM.

2. Why “System-2” is Needed

Common pipeline (implicit)

User prompt $\rightarrow$ LLM $\rightarrow$ Output $\Rightarrow$ (implicitly) Accept

In many deployed systems, “the model produced it” is treated as a weak accept signal. Failures become “degraded successes” instead of explicit decision failures.

System-2 pipeline (explicit)

Prompt $\rightarrow$ Generate candidates $\rightarrow$ Verify $\rightarrow$ Aggregate $\rightarrow$ Outcome

The architecture treats refusal/abstention as first-class outcomes. If no candidate satisfies the active constraints, the system rejects (or abstains) rather than forcing a response.

3. How it Works

3.1 Profiles = explicit assumption sets

A profile is a selectable set of assumptions and constraints: domain applicability, regulatory thresholds, physical limits, temporal ordering rules, etc. Profiles are logged and inspectable, so outcomes are attributable to stated assumptions rather than opaque defaults.

Key property: the same generated candidate can be evaluated under different profiles, yielding different outcomes, without ambiguity.
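The key property can be illustrated with a minimal sketch. The source does not define a concrete profile schema, so the `Profile` class, the `evaluate` helper, the profile names, and the numeric limits below are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical candidate representation: a flat mapping of named quantities.
Candidate = Dict[str, float]


@dataclass(frozen=True)
class Profile:
    """A named, inspectable set of constraints (assumed schema)."""
    name: str
    # Each constraint maps a candidate to True (satisfied) or False (violated).
    constraints: Dict[str, Callable[[Candidate], bool]]


def evaluate(candidate: Candidate, profile: Profile) -> Dict[str, bool]:
    """Return a per-constraint report keyed by constraint name."""
    return {name: check(candidate) for name, check in profile.constraints.items()}


# Two illustrative profiles that differ only in a made-up amount limit.
retail = Profile("retail", {"max_amount": lambda c: c["amount"] <= 10_000})
institutional = Profile("institutional", {"max_amount": lambda c: c["amount"] <= 1_000_000})

candidate = {"amount": 50_000.0}
report_retail = evaluate(candidate, retail)
report_institutional = evaluate(candidate, institutional)

# The same candidate yields different reports, each attributable to a named profile.
print(retail.name, report_retail)
print(institutional.name, report_institutional)
```

Because each report is keyed by constraint name and produced under a named profile, a differing outcome can be traced to the assumption that caused it.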

3.2 Two-axis verification (minimal sufficient factorization)

Verification is decomposed into two orthogonal axes:

Structural consistency: reference integrity, inferential validity, relational coherence
Semantic compatibility: domain rules, temporal/type constraints, regulatory/physical limits

The point is practical separability: some outputs are structurally valid yet semantically invalid, while others are semantically plausible yet structurally invalid, so a single unified axis leaves blind spots.
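A small sketch of the separability argument follows. The `Candidate` fields and the two checks are assumptions made for illustration: one check stands in for the structural axis (reference integrity) and one for the semantic axis (a temporal ordering rule).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    cited_ids: List[str]   # references the candidate points to
    known_ids: List[str]   # references that actually exist
    start_year: int
    end_year: int


def structurally_consistent(c: Candidate) -> bool:
    # Reference integrity: every cited id must resolve to a known reference.
    return all(ref in c.known_ids for ref in c.cited_ids)


def semantically_compatible(c: Candidate) -> bool:
    # Temporal ordering rule: an interval cannot end before it starts.
    return c.start_year <= c.end_year


def verify_axes(c: Candidate) -> Tuple[bool, bool]:
    """Report the two axes separately instead of collapsing them into one flag."""
    return structurally_consistent(c), semantically_compatible(c)


# Structure holds, semantics fails: references resolve but the dates are reversed.
a = Candidate(["r1"], ["r1", "r2"], 2020, 2010)
# Semantics holds, structure fails: dates are ordered but a citation dangles.
b = Candidate(["r9"], ["r1", "r2"], 2010, 2020)

print(verify_axes(a))
print(verify_axes(b))
```

A single pass/fail flag would report both candidates identically. Keeping the axes apart preserves which kind of failure occurred.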

3.3 Hard vs. soft constraints

Within a profile, constraints can be hard (any violation forces rejection) or soft (influences ranking/score). This keeps enforcement strict where needed (finance/medical/legal) while allowing exploratory behavior in low-stakes settings.
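The hard/soft distinction might be sketched as follows. The `Constraint` type, the weights, and the choice of a weighted satisfied fraction as the soft score are assumptions; the source does not specify how soft constraints are scored.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Constraint:
    name: str
    check: Callable[[str], bool]
    hard: bool
    weight: float = 1.0  # only used for soft constraints


def apply_constraints(text: str, constraints: List[Constraint]) -> Tuple[List[str], float]:
    """Return (names of violated hard constraints, soft score in [0, 1])."""
    hard_violations = [c.name for c in constraints if c.hard and not c.check(text)]
    soft = [c for c in constraints if not c.hard]
    total = sum(c.weight for c in soft)
    earned = sum(c.weight for c in soft if c.check(text))
    # Assumed scoring rule: weighted fraction of satisfied soft constraints.
    score = earned / total if total > 0 else 1.0
    return hard_violations, score


constraints = [
    Constraint("non_empty", lambda t: len(t.strip()) > 0, hard=True),
    Constraint("concise", lambda t: len(t) <= 40, hard=False, weight=1.0),
    Constraint("ends_with_period", lambda t: t.endswith("."), hard=False, weight=3.0),
]

violations, score = apply_constraints("The limit is 5 units.", constraints)
blank_violations, blank_score = apply_constraints("   ", constraints)

print(violations, score)
print(blank_violations, blank_score)
```

Hard violations are returned as a list rather than folded into the score, so a high soft score can never mask a violation that should force rejection.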

4. Deterministic Result Aggregation

Verifiers are heterogeneous (rule-based, symbolic, statistical, hybrid), but they must report in a common format. Crucially, aggregation is deterministic and rule-based: it does not invoke any language model, and it does not regenerate or silently revise assumptions to force acceptance.

Canonical fusion (sketch)
if structural hard violation under profile $P$ → REJECT
else if semantic hard violation under $P$ → REJECT
else if soft score $S$ is below the threshold set by $P$ → REJECT
else ACCEPT

Abstention / UNK: if semantic verification reports an unresolved factual conflict under the active profile, the aggregation outcome is UNK (abstain), regardless of structural satisfaction.
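The fusion rule and the abstention rule can be combined in a short sketch. Two points are assumptions rather than statements from the source: the abstention check is placed first (one literal reading of "regardless of structural satisfaction"), and a soft score below the threshold is treated as REJECT. The `Report` fields are hypothetical names for the common verifier format.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    ACCEPT = "ACCEPT"
    REJECT = "REJECT"
    UNK = "UNK"


@dataclass(frozen=True)
class Report:
    """Hypothetical common format for collected verifier results."""
    structural_hard_violation: bool
    semantic_hard_violation: bool
    unresolved_factual_conflict: bool
    soft_score: float


def aggregate(report: Report, threshold: float) -> Outcome:
    """Pure, rule-based fusion: no model call, no randomness, no regeneration."""
    # Assumption: abstention takes precedence over the rest of the cascade.
    if report.unresolved_factual_conflict:
        return Outcome.UNK
    if report.structural_hard_violation:
        return Outcome.REJECT
    if report.semantic_hard_violation:
        return Outcome.REJECT
    # Assumption: a soft score below the profile threshold means REJECT.
    if report.soft_score < threshold:
        return Outcome.REJECT
    return Outcome.ACCEPT


clean = Report(False, False, False, 0.9)
conflicted = Report(False, False, True, 0.9)
weak = Report(False, False, False, 0.2)

print(aggregate(clean, 0.5), aggregate(conflicted, 0.5), aggregate(weak, 0.5))
```

Since `aggregate` is a pure function of the report and the threshold, identical inputs always produce the same outcome, which is what makes the decision auditable.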

5. Component Roles

Generator

Produces one or more candidates. No claim that candidates are valid by virtue of being generated.

generate(prompt) → {c₁, …, cₙ}

Verifiers

Each verifier checks one constraint family and reports satisfaction or violation, plus optional localized failure information. Verifiers do not propose alternative candidates, revise assumptions, or decide the profile.

verify(candidate, profile) → result

Orchestrator

Routes candidates to verifiers, runs them sequentially or in parallel, and collects results. Not a decision-maker.

Aggregator

Deterministic fusion under the active profile. Outputs: ACCEPT, REJECT, or UNK, plus a trace.
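The four roles might be wired together as in the sketch below. Everything concrete here is assumed for illustration: `generate` is a fixed stub rather than an LLM call, the two verifiers and the `max_total` profile key are invented, and the policy of returning the first accepted candidate (and UNK over REJECT when no candidate is accepted) is one possible choice the source does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Profile = Dict[str, int]  # hypothetical: a flat mapping of named limits


@dataclass(frozen=True)
class Result:
    verifier: str
    satisfied: bool
    unresolved: bool = False  # set when a factual conflict cannot be settled
    detail: str = ""          # optional localized failure info


Verifier = Callable[[str, Profile], Result]
TraceEntry = Tuple[str, str, List[Result]]


def generate(prompt: str) -> List[str]:
    # Stand-in for the generator; fixed candidates keep the sketch runnable.
    return ["total=12", "total=7", "total"]


def well_formed(candidate: str, profile: Profile) -> Result:
    _, sep, value = candidate.partition("=")
    ok = sep == "=" and value.isdecimal()
    return Result("well_formed", ok, detail="" if ok else "expected key=<int>")


def within_limit(candidate: str, profile: Profile) -> Result:
    value = candidate.partition("=")[2]
    ok = value.isdecimal() and int(value) <= profile["max_total"]
    return Result("within_limit", ok, detail="" if ok else "missing or over max_total")


def orchestrate(candidate: str, profile: Profile, verifiers: List[Verifier]) -> List[Result]:
    # Routing only: run every verifier and collect results; no decision here.
    return [verify(candidate, profile) for verify in verifiers]


def aggregate(results: List[Result]) -> str:
    if any(r.unresolved for r in results):
        return "UNK"
    if any(not r.satisfied for r in results):
        return "REJECT"
    return "ACCEPT"


def run(prompt: str, profile: Profile, verifiers: List[Verifier]) -> Tuple[str, Optional[str], List[TraceEntry]]:
    trace: List[TraceEntry] = []
    for candidate in generate(prompt):
        results = orchestrate(candidate, profile, verifiers)
        outcome = aggregate(results)
        trace.append((candidate, outcome, results))
        if outcome == "ACCEPT":
            return outcome, candidate, trace
    # No acceptable candidate: abstain if any candidate was unresolved, else reject.
    final = "UNK" if any(o == "UNK" for _, o, _ in trace) else "REJECT"
    return final, None, trace


verifiers = [well_formed, within_limit]
outcome, chosen, trace = run("What is the total?", {"max_total": 10}, verifiers)
strict_outcome, strict_chosen, strict_trace = run("What is the total?", {"max_total": 5}, verifiers)
print(outcome, chosen, len(trace))
print(strict_outcome, strict_chosen, len(strict_trace))
```

Under the stricter profile no candidate passes, so the pipeline returns an explicit REJECT with a full trace instead of forcing one of the candidates through.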

6. What This Paper Claims (and does not)

Claims

Decision semantics should be explicit: generation, verification, and acceptance are separate concerns.
Profiles externalize assumptions; verification is performed under stated assumption sets.
Two-axis verification is a minimal sufficient factorization for common failure modes.
Aggregation must be deterministic and profile-governed; abstention is a valid outcome.

Non-claims

No new base model or training method is proposed.
No claim that verifiers solve truth “in general”; they enforce constraints under explicit profiles.
No claim that a single metric/score can replace explicit accept/reject/abstain semantics.

7. Intended Use

This architecture is for settings where “sounds right” is insufficient: domain-critical pipelines that need explicit refusal, traceability, and stable decision policy.

Practical outcome: where today’s systems hallucinate, a System-2 FW system can legitimately return “no acceptable candidate under the current profile” (REJECT/UNK) rather than presenting an unsupported answer with unwarranted confidence.