Verification Report
Verification Report: Methodology
This report documents the independent verification of the Methodology research page.
Every factual claim on the methodology page was checked against its source data before publication. This is the first verification report on the site to draw on two separate bodies of source data: the methodology page combines quantitative validation results with the published record on research retractions, and each claim was checked against whichever of the two it came from. The source and the individual result for each claim are listed claim by claim below.
How the claims were checked
The claims on the methodology page come from two distinct bodies of evidence:
- Validation results — 16 claims covering the quantitative validation figures and how the methodology is described.
- Retraction-record claims — 7 claims covering published retraction figures and their surrounding context.
Each claim was checked at the level that fits what it asserts:
- Numerical checks confirm that a stated figure matches its source data exactly.
- Framing checks confirm that the wording faithfully represents what the source actually says — that a summary doesn’t imply more than the underlying evidence supports.
A claim is marked as framing-checked only if its wording was actually put through that check; claims that assert a number rather than a characterization are confirmed numerically. The two checks are reported separately rather than blended, so “the figure is correct” is never overstated into “the interpretation is correct,” or the reverse.
One claim carries an openly documented correction. An earlier draft described a retraction trend as “accelerating year over year”; the framing check found that the source did not support that characterization. The wording was corrected, the claim was re-checked, and it then passed. We keep that history visible rather than removing it — it appears as a worked example on the methodology page itself, showing what the framing check catches and how a correction is recorded.
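As an illustration only, the separation between the two checks and the visible correction history could be modelled along the following lines. This is a minimal sketch, not the pipeline's actual data model: the `ClaimRecord` class, its field names, the `EXAMPLE-CLAIM` identifier, and the placeholder corrected wording are all assumptions made for this example. Only the quoted draft phrase comes from the report.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClaimRecord:
    """Hypothetical record that keeps the two check types separate."""
    claim_id: str
    wording: str
    numerical_ok: Optional[bool] = None  # None means the check was not applied
    framing_ok: Optional[bool] = None    # None means the check was not applied
    history: list = field(default_factory=list)  # earlier wordings and their framing results

    def correct(self, new_wording: str, framing_ok: bool) -> None:
        """Keep the superseded wording in the history, then store the re-checked wording."""
        self.history.append((self.wording, self.framing_ok))
        self.wording = new_wording
        self.framing_ok = framing_ok

    @property
    def verified(self) -> bool:
        """Verified only if at least one check was applied and no applied check failed."""
        applied = [r for r in (self.numerical_ok, self.framing_ok) if r is not None]
        return bool(applied) and all(applied)


# Hypothetical usage mirroring the correction described above.
claim = ClaimRecord(
    claim_id="EXAMPLE-CLAIM",                # placeholder identifier
    wording="accelerating year over year",   # draft phrase quoted in the report
    framing_ok=False,                        # the framing check rejected the draft
)
first_result = claim.verified
claim.correct("corrected wording (placeholder)", framing_ok=True)
print(first_result, claim.verified, len(claim.history))
```

Storing each check as its own optional field means a passing numerical check can never stand in for a framing check that was not run, and the `history` list keeps the rejected wording on record instead of overwriting it.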
Summary
| Item | Detail |
|---|---|
| Claims checked | 23 total (16 validation results + 7 retraction-record) |
| Result | All 23 confirmed: none unsupported, none uncheckable, none contradicted, and none left unchecked |
| Source data | Two bodies of evidence: the quantitative validation results and the retraction-record sources (the exact versions checked were captured at build time and are recorded below) |
| Verified | 2026-05-21 |
| Verified by | Veritas v1.1 |
Claim-by-Claim Verification
| Claim ID | Claim | Source | Strength | Status |
|---|---|---|---|---|
| CLM-METHOD-001 | Verified cohort: 7 DMS entries | NeuroAutomata verified-cohort DMS entry count | strong | CLM-METHOD-001 Verified |
| CLM-METHOD-014 | Verified cohort: 6 distinct proteins (5 median-contributing + Calmodulin, held out as a disclosed weakness) | NeuroAutomata verified-cohort distinct-protein count | strong | CLM-METHOD-014 Verified |
| CLM-METHOD-002 | Cohort median Spearman rho = 0.5298 (N=7 DMS entries) | NeuroAutomata verified-cohort Spearman rho median (per-DMS-entry grain) | strong | CLM-METHOD-002 Verified |
| CLM-METHOD-003 | Cohort sorted Spearman values: [0.3989, 0.409, 0.484, 0.5298, 0.534, 0.634, 0.679] | NeuroAutomata verified-cohort per-DMS Spearman rho values, sorted ascending | strong | CLM-METHOD-003 Verified |
| CLM-METHOD-004 | ESM-2 650M ProteinGym aggregate Spearman baseline = 0.414 | ESM-2 650M ProteinGym cross-leaderboard aggregate Spearman baseline (Average_Spearman column, accessed 2026-05-08) | strong | CLM-METHOD-004 Verified |
| CLM-METHOD-005 | Cohort outperforms ProteinGym ESM-2 baseline by +28.0% relative | Relative delta of cohort median measured against the ESM-2 650M cross-leaderboard aggregate baseline (cohort-frame) | strong | CLM-METHOD-005 Verified |
| CLM-METHOD-006 | Calmodulin Tier-2 stratum: rho = 0.2116 | Calmodulin (CALM1_HUMAN, Weile et al. 2017) DMS — Tier-2 named-weakness stratum, Spearman correlation | strong | CLM-METHOD-006 Verified |
| CLM-METHOD-006b | Calmodulin Tier-2 stratum: 1,813 variants | Calmodulin (CALM1_HUMAN, Weile et al. 2017) DMS — Tier-2 named-weakness stratum, variant count | strong | CLM-METHOD-006b Verified |
| CLM-METHOD-007 | Calmodulin Tier-2 stratum: -48.9% relative to cross-leaderboard baseline | Relative delta of the Calmodulin named-weakness stratum measured against the ESM-2 650M cross-leaderboard aggregate baseline (cohort-frame) | strong | CLM-METHOD-007 Verified |
| CLM-METHOD-008 | Median taken at per-DMS-entry grain (N=7), not per-protein grain | Cohort-median grain convention — computed per-DMS-entry, not per-protein | strong | CLM-METHOD-008 Verified |
| CLM-METHOD-009 | Baseline sourced from catalog-first frozen archive; not regenerated | Frozen-archive provenance for the ESM-2 650M cross-leaderboard baseline (catalog-first, accessed 2026-05-08) | strong | CLM-METHOD-009 Verified |
| CLM-METHOD-010 | Surface-class dual-field convention: cohort-frame cites cross-leaderboard, per-post cites per-protein | Convention for which baseline each page type compares against (summary pages vs individual protein pages) | strong | CLM-METHOD-010 Verified |
| CLM-METHOD-011 | Named-weakness routing: explicit disclosure, not collapsed, not framed as coverage failure | Disclosure context for the openly-stated Calmodulin binding-affinity weakness | strong | CLM-METHOD-011 Verified |
| CLM-METHOD-012 | Methodology figures are verified-claim-grain, not prose-grain | Internal knowledge base decision record | strong | CLM-METHOD-012 Verified |
| CLM-METHOD-013 | Pinned validation summary authored and independently verified by Veritas | Independent verification record backing the cohort substrate's verified status | strong | CLM-METHOD-013 Verified |
| CLM-METHOD-015 | Methodology page = surface-class A cohort-frame only; per-protein figures live in per-post pages | Reference to the page-type figure convention (which comparisons appear on summary vs per-protein pages) | strong | CLM-METHOD-015 Verified |
| CLM-METHOD-TRUST-001 | AI-related retractions surged sharply in 2023 and remained elevated into 2024 | Frontiers in Research Metrics and Analytics 2025 bibliometric review of AI-related retractions | strong | CLM-METHOD-TRUST-001 Verified |
| CLM-METHOD-TRUST-002 | 51.1% of retracted articles retain above-average citations after retraction | Frontiers in Research Metrics and Analytics 2025 bibliometric review of AI-related retractions | strong | CLM-METHOD-TRUST-002 Verified |
| CLM-METHOD-TRUST-003 | Median publication-to-retraction time: 550 days (~1.5 years) | Frontiers in Research Metrics and Analytics 2025 bibliometric review of AI-related retractions | strong | CLM-METHOD-TRUST-003 Verified |
| CLM-METHOD-TRUST-004 | ChatGPT 4o-mini × 6,510 evaluations of retracted articles flagged zero retractions | Wiley Learned Publishing 2025 — 'ChatGPT tends to ignore retractions' (empirical evaluation) | strong | CLM-METHOD-TRUST-004 Verified |
| CLM-METHOD-TRUST-005a | AI-generated content with fabricated citations is documented in the published literature (Zhang et al. 2024 and the Harvard Misinformation Review audit; Retraction Watch case studies cited as adjacent context only) | Zhang et al. 2024 *Surfaces and Interfaces* retraction (Crossref RETRACTED: prefix) and the Harvard Misinformation Review audit of GPT-fabricated papers (2024). Retraction Watch case studies (cat-3 journalism-aggregator, below gate-3 PASS) are cited as adjacent context only. | moderate | CLM-METHOD-TRUST-005a Verified |
| CLM-METHOD-TRUST-005b | Independent verification of AI inference-tool validation pages remains uncommon | Observation from absence: no documented cases of independent-verification implementations were found in the record of inference-tool validation pages as of the 2026-05-21 review. | moderate | CLM-METHOD-TRUST-005b Verified |
| CLM-METHOD-TRUST-006 | Internal-pipeline near-miss: an initial PASS on an internal usage statistic was overturned on re-verification against the open primary source | Internal-pipeline audit-trail artifact (upstream content-gen discussion, 2026-05-11) | strong | CLM-METHOD-TRUST-006 Verified |
Substantiated against the cohort-frame validation claim list (version a67dc48) and the editorial validation claim list (version 7fa5ce4), with the verification result on record at version 86ymym1tobb59nna399j8uxr8a. All three versions were captured at build time, providing a cross-checked source trail across three independent pointers.
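Several of the numerical claims in the table above are derived from other figures in the same table, so a reader can recompute them independently. The sketch below does that using only values quoted in this report. It assumes that "relative delta" means (value − baseline) / baseline expressed as a percentage and rounded to one decimal place, and that the year conversion uses 365.25 days; the function and variable names are illustrative and are not taken from the verification pipeline.

```python
from statistics import median

# Figures copied from the claim table in this report.
cohort_rho = [0.3989, 0.409, 0.484, 0.5298, 0.534, 0.634, 0.679]  # CLM-METHOD-003
baseline = 0.414         # CLM-METHOD-004, ESM-2 650M aggregate Spearman
calmodulin_rho = 0.2116  # CLM-METHOD-006
retraction_days = 550    # CLM-METHOD-TRUST-003


def relative_delta_pct(value: float, reference: float) -> float:
    """Percentage difference of value relative to reference (assumed formula), one decimal."""
    return round((value - reference) / reference * 100, 1)


n_entries = len(cohort_rho)                                       # CLM-METHOD-001
cohort_median = median(cohort_rho)                                # CLM-METHOD-002
cohort_delta = relative_delta_pct(cohort_median, baseline)        # CLM-METHOD-005
calmodulin_delta = relative_delta_pct(calmodulin_rho, baseline)   # CLM-METHOD-007
retraction_years = round(retraction_days / 365.25, 1)             # CLM-METHOD-TRUST-003

print(n_entries, cohort_median, cohort_delta, calmodulin_delta, retraction_years)
# prints: 7 0.5298 28.0 -48.9 1.5
```

With seven entries the median is simply the fourth sorted value, which is why the per-DMS-entry grain (N=7) in CLM-METHOD-008 matters: a per-protein median would be taken over a different set of values.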