L4 Semantic Verification — Glossary

Numerical agreement (L1) and classification framing-fidelity (L3) catch many kinds of drift, but they cannot catch every misrepresentation. A claim can quote correct numbers and carry a correct label while still telling a substantively wrong story, whether through selective omission, misattribution of conclusions, or generative-AI confabulation that sounds right but is not supported by what it cites. L4 is the layer that catches these cases.

L4 is Active via an external claim-checker module. The layer takes the claim’s prose and the cited-source excerpts and asks: does this prose faithfully represent what the source says, or does it overstate, understate, misattribute, or fabricate? The check is not a paraphrase test; it is a semantic-fidelity test against the cited evidence.
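
As a rough illustration of the inputs described above, the sketch below bundles a claim's prose with its cited-source excerpts and renders the fidelity question as a prompt. The names L4Input and build_fidelity_prompt, and the prompt wording itself, are assumptions made for illustration; they are not the claim-checker module's actual interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class L4Input:
    """Hypothetical container for what the layer is described as taking."""
    claim_prose: str
    source_excerpts: List[str]


def build_fidelity_prompt(item: L4Input) -> str:
    """Render the semantic-fidelity question (the wording is an assumption)."""
    # Number the excerpts so a judgment can point back at a specific one.
    excerpts = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(item.source_excerpts, start=1)
    )
    return (
        "Cited-source excerpts:\n"
        f"{excerpts}\n\n"
        "Claim prose:\n"
        f"{item.claim_prose}\n\n"
        "Does the prose faithfully represent what the excerpts say, or does it "
        "overstate, understate, misattribute, or fabricate?"
    )
```

Keeping the excerpts and the prose in separate, labeled sections is one way to make it clear that the question is about fidelity to the cited evidence rather than similarity of wording.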

L4 emits one of four layer-grain outcomes: verified when prose and source are semantically aligned; contradicted when prose conflicts with source; ambiguous when the language model cannot reach a confident judgment within scope; and not_run when the claim is structurally outside the L4 scope (e.g., a pure numerical claim with no prose to evaluate). These layer-grain outcomes feed into the per-claim rollup that produces the canonical verification_status. For example, an L4 ambiguous outcome on a substrate class that structurally cannot support multi-source corroboration contributes to a per-claim ungrounded rollup.
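
The four outcomes and the one rollup rule stated above can be sketched as follows. This is a minimal sketch, not the rollup itself: the enum name L4Outcome, the function contributes_to_ungrounded, and the boolean parameter are hypothetical, and only the single rule given in this entry is encoded. How the other outcomes combine into verification_status is not specified here and is left out.

```python
from enum import Enum


class L4Outcome(Enum):
    """The four layer-grain outcomes named in this entry."""
    VERIFIED = "verified"          # prose and source are semantically aligned
    CONTRADICTED = "contradicted"  # prose conflicts with source
    AMBIGUOUS = "ambiguous"        # no confident judgment within scope
    NOT_RUN = "not_run"            # claim is structurally outside L4 scope


def contributes_to_ungrounded(
    outcome: L4Outcome, can_multi_source_corroborate: bool
) -> bool:
    """Return True when the L4 outcome pushes the per-claim rollup toward ungrounded.

    Assumption: the only rule encoded is the one stated in the text, namely
    ambiguous on a substrate class that cannot multi-source-corroborate.
    """
    return outcome is L4Outcome.AMBIGUOUS and not can_multi_source_corroborate
```

For instance, contributes_to_ungrounded(L4Outcome.AMBIGUOUS, False) is True, while the same outcome on a substrate class that can corroborate across sources returns False under this sketch.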

Because L4 uses a language model to judge, its outputs are themselves subject to the verdict-scope-boundary discipline — L4 attests claim-vs-cited-source consistency, not cited-source-vs-reality. The cited source is the authority for what is true; L4 checks fidelity to that authority, not the authority of the source itself. If the source is wrong, L4 will not catch it.

Language-model judgments carry their own uncertainty. The verdict-of-record discipline (Mattermost post under the veritas handle per H-1) makes that uncertainty inspectable: every L4 verdict is auditable to the model output that produced it.
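
One way to keep a verdict auditable to the model output that produced it is to store the two together and fingerprint the output. The sketch below is an assumption about what such a record might contain; the class L4VerdictRecord and its fields (claim_id, outcome, model_output) are hypothetical, and it does not describe the actual format of the Mattermost post.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class L4VerdictRecord:
    """Hypothetical verdict record that carries the raw model output with it."""
    claim_id: str
    outcome: str       # one of: verified, contradicted, ambiguous, not_run
    model_output: str  # the unedited text the language model returned

    def to_json(self) -> str:
        """Serialize the record plus a SHA-256 digest of the model output."""
        payload = asdict(self)
        payload["model_output_sha256"] = hashlib.sha256(
            self.model_output.encode("utf-8")
        ).hexdigest()
        # Sorted keys give a stable serialization for the same record.
        return json.dumps(payload, sort_keys=True)
```

The digest lets a reviewer confirm that the model output attached to a verdict is the same text that was recorded when the verdict was made.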