Verification Report
Verification Report: INSR Aslanzadeh ESM-2 / MaveDB Joint Coverage
This report documents the independent verification of the article INSR ectodomain — Where ESM-2 Carries Weight and Where it Cedes the Field.
Validation Methodology
DMS source: Aslanzadeh et al. 2025, Nature Communications 16:9143, INSR ectodomain deep mutational scanning (doi:10.1038/s41467-025-64178-4). Insulin-binding scores from MaveDB record urn:mavedb:00001239-a-6 (log2 variant/WT barcode counts, FDR-controlled), 14,576 variants across mature ectodomain residues 28-955 of human INSR (UniProt P06213).
ESM-2 scoring: 650M-parameter masked-marginal scoring on the NeuroAutomata platform. The cross-reference matched 13,927 variants between the MaveDB record and the ESM-2 928-position landscape; score polarity aligned natively against Aslanzadeh’s log2 insulin-binding scale (negative = loss-of-binding, positive = gain-of-binding, zero = WT-like). No inversion applied.
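A minimal sketch of the masked-marginal score and its polarity, assuming the model's probabilities at one masked position are already in hand as a mapping from amino acid to probability. The function name `masked_marginal_score` and the toy distribution are hypothetical illustrations; no ESM-2 call is made and nothing here reproduces the platform's pipeline.

```python
import math

def masked_marginal_score(log_probs, wt_aa, mut_aa):
    """Log-likelihood ratio of mutant vs wild-type residue at one masked position."""
    return log_probs[mut_aa] - log_probs[wt_aa]

# Toy distribution over three residues at one masked position (illustrative only).
probs = {"L": 0.70, "I": 0.20, "P": 0.10}
log_probs = {aa: math.log(p) for aa, p in probs.items()}

# Wild-type against itself scores exactly zero (WT-like).
score_wt = masked_marginal_score(log_probs, "L", "L")

# A residue the model finds less likely than wild type scores negative,
# the same direction as loss-of-binding on the log2 DMS scale.
score_mut = masked_marginal_score(log_probs, "L", "P")
```

Because both scales put wild type at zero and damaging substitutions below it, a positive rank correlation means agreement without any sign flip.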
Global result and ProteinGym-leaderboard comparison. Global Spearman rho = 0.399 (N = 13,927), measured against the Aslanzadeh DMS — this is ESM-2’s correlation with the experimental data, not a figure reported by Aslanzadeh. It sits below the published ESM-2 650M ProteinGym aggregate of 0.414 (ProteinGym leaderboard, accessed 2026-05-08) — INSR is, in aggregate, a harder target than the mean ProteinGym assay. The aggregate is misleading on its own; the per-domain breakdown carries the story.
Per-domain breakdown. Spearman rho ranged from 0.594 on the L1 leucine-rich repeat (N = 1,945, p ≈ 10⁻¹⁸⁵) to −0.088 on the αCT helix (N = 507, p = 0.046), the receptor-ligand interface that forms insulin binding site 1. The αCT anti-correlation is weak in magnitude, and its p-value sits just under the conventional 0.05 threshold, but it is the first receptor-ligand interface boundary condition we have catalogued in this validation work, structurally distinct from prior zero-correlation cases at the PTEN CX5R catalytic motif and the BRCA1 RING zinc-coordinating residues. The αCT helix is not an active site in the catalytic-machinery sense; it is an evolvable interface surface where ligand-recognition co-evolution drives variation that ESM-2’s sequence prior reads as tolerance rather than constraint.
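The per-domain breakdown amounts to partitioning matched variants by residue position and computing a rank correlation within each partition. The sketch below assumes each matched variant is a `(position, esm2_score, dms_score)` tuple; the residue ranges are copied from the claims register, while the helper names and the toy rows are hypothetical.

```python
from statistics import mean

# Residue ranges as listed in the claims register.
DOMAINS = {
    "L1": (28, 157), "CR": (158, 310), "L2": (311, 470),
    "FnIII-1": (471, 595), "FnIII-2": (596, 808),
    "FnIII-3": (809, 906), "C-tail": (907, 955),
}

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Pearson correlation of the ranks (undefined if either input is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def per_domain_rho(rows):
    """rows: iterable of (position, esm2_score, dms_score) tuples."""
    rows = list(rows)
    out = {}
    for name, (lo, hi) in DOMAINS.items():
        sub = [(e, d) for pos, e, d in rows if lo <= pos <= hi]
        if len(sub) >= 3:  # skip partitions too small to rank meaningfully
            xs, ys = zip(*sub)
            out[name] = spearman(xs, ys)
    return out

# Toy rows: concordant ordering in L1, reversed ordering inside FnIII-2.
toy_rows = [
    (30, -2.0, -1.5), (31, -1.0, -0.5), (32, 0.0, 0.1),
    (720, -2.0, 0.3), (721, -1.0, 0.0), (722, 0.0, -0.4),
]
result = per_domain_rho(toy_rows)
```

In the toy data the two partitions land at opposite extremes, which is the same shape of contrast the report describes at far smaller magnitude for the αCT helix.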
Reference comparison to AlphaMissense (Cheng et al. 2023). Aslanzadeh et al. report a global correlation between AlphaMissense and the DMS at the full-ectodomain grain (R = 0.55 against the MAVE composite, Fig 5B); they do not break it down per domain. The article uses this only for headline context and notes the comparison is approximate, not apples-to-apples: different score derivations, different match shapes, different per-position coverage.
How to read the Spearman correlation. As the article notes, rho measures agreement on ordering, not per-variant accuracy. ESM-2 helps prioritize which positions to investigate first; it does not independently classify variants. The published prose frames all variant-related claims as descriptive (rank prioritization), not as ACMG/AMP variant classifications.
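A small illustration of why agreement on ordering says nothing about per-variant accuracy. The numbers are invented for the example, and `spearman_no_ties` is a hypothetical helper using the textbook formula for data without tied values.

```python
def ranks(values):
    """1-based ranks for a list with no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def spearman_no_ties(x, y):
    """Spearman rho via 1 - 6*sum(d^2) / (n*(n^2 - 1)); valid only without ties."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

measured = [-3.0, -1.2, -0.4, 0.0, 0.5]
predicted = [-0.9, -0.8, -0.7, -0.6, 10.0]  # same ordering, very different values

rho = spearman_no_ties(predicted, measured)
max_abs_error = max(abs(p - m) for p, m in zip(predicted, measured))
```

Here `rho` is a perfect 1.0 even though one prediction is off by 9.5 units, which is why the scores support ranking positions for follow-up but not classifying individual variants.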
Summary
| Metric | Value |
|---|---|
| Claims | 28 total |
| Result | All checked; the page passed verification (on the grounding checks — see “What Was Checked”) |
| Per-claim outcome | 15 verified · 1 single-source measurement · 12 reference facts (0 contradicted, 0 uncheckable) |
| Numbers checked for arithmetic consistency | 15 of 28 |
| Categorical classifications made | 0 (none authored — see “What Was Checked”) |
| Source data | INSR validation dataset (version recorded below at build time) |
| Verified | 2026-05-18 |
| Verified by | Veritas |
What Was Checked
Source-against-claim review (all 28 claims). Every claim was checked end-to-end against the primary source it cites — whether the source actually says what we claim, and whether the scope matches.
A note on the framing-fidelity check. The stricter framing-fidelity check, which asks whether the prose framing is faithful to what the cited authors reported, was not run on this page. The page passed on the grounding checks above (source-says and scope-match, across all 28 claims); the framing-fidelity pass that ran on the later CYP2C9 and BRCA1/PTEN reports was not applied retroactively to this older one. We disclose this rather than imply a check we did not run: “passed verification” on this page means the grounding checks passed.
Cited primary sources (independently resolved and checked figure-by-figure across the verified subset, with single-source measurements disclosed as such): Aslanzadeh et al. 2025 Nature Communications (INSR ectodomain DMS); the MaveDB record urn:mavedb:00001239-a-6 (INSR insulin-binding score set); UniProt P06213 (INSR canonical sequence reference); and the ProteinGym substitution DMS leaderboard (cross-model baseline, accessed 2026-05-08). The fifteen verified claims are ESM-2’s own measurements compared against the Aslanzadeh ectodomain DMS — the per-domain correlations, the regional breakdown and partition counts, the αCT-helix anti-correlation, the cross-leaderboard arithmetic, and the region accounting.
Context sources (referenced in the article for background, outside the formal verification scope): AlphaMissense (Cheng et al. 2023, doi:10.1126/science.adg7492), used only for the approximate-comparison qualifier on the AlphaMissense headline rho. Note that the AlphaMissense figure is taken from Aslanzadeh’s own published panel (Fig 5B), not from an independent measurement of our own.
Arithmetic-consistency check (15 of 28 quantitative claims). Every Spearman rho, sample size, p-value, and percentage figure with arithmetic content was checked for internal consistency: rho values within the valid range (−1 to 1) across the L1 leucine-rich repeat (0.594), the cysteine-rich region, the L2 leucine-rich repeat, the fibronectin type-III domains, and the αCT helix (−0.088); sample-size accounting (per-domain N values summing within the 13,927-variant ectodomain match window); p-value bounds; the −3.6% gap from the ESM-2 650M ProteinGym aggregate (0.399 vs 0.414); and the partition counts across the structural-region breakdown.
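The consistency checks described above can be sketched as follows, using only figures that appear in this report. The variable names are hypothetical, and this is an illustration of the kind of check involved, not the verifier's actual tooling. The αCT window (716-746) falls inside the FnIII-2 range, so it is left out of the sample-size sum to avoid double counting.

```python
# (start, end, N) per structural region, copied from the claims register.
REGIONS = {
    "L1": (28, 157, 1945), "CR": (158, 310, 2331), "L2": (311, 470, 2332),
    "FnIII-1": (471, 595, 1880), "FnIII-2": (596, 808, 3216),
    "FnIII-3": (809, 906, 1498), "C-tail": (907, 955, 725),
}
RHO = {
    "L1": 0.594, "CR": 0.493, "L2": 0.403, "FnIII-1": 0.252,
    "FnIII-2": 0.096, "FnIII-3": 0.299, "C-tail": 0.459, "alpha-CT": -0.088,
}
MATCHED_VARIANTS = 13927
ECTODOMAIN_LENGTH = 928  # residues 28-955

# The regions should tile the ectodomain with no gaps or overlaps.
spans = sorted((start, end) for start, end, _ in REGIONS.values())
contiguous = all(prev[1] + 1 == nxt[0] for prev, nxt in zip(spans, spans[1:]))
covered = sum(end - start + 1 for start, end in spans)

# Per-region sample sizes should account for every matched variant.
n_total = sum(n for _, _, n in REGIONS.values())

# Every correlation must lie in the valid range.
rho_in_range = all(-1.0 <= r <= 1.0 for r in RHO.values())

# Relative gap between the INSR global rho and the ProteinGym aggregate.
relative_gap_pct = round((0.399 - 0.414) / 0.414 * 100, 1)
```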
One single-source measurement. A further claim is a single-source value with no independent second measurement available for an arithmetic cross-check. We disclose it as exactly that — measured once, not independently corroborated. This is an honest, bounded category, distinct from a verified claim; it is not a coverage gap. The remaining 12 claims are reference facts — methodological framing, structural-existence, and comparative statements (e.g. the paper citation, the MaveDB record identifier, the INSR sequence length, the AlphaMissense approximate-comparison qualifier, the receptor-ligand interface boundary-condition designation) — confirmed by source-against-claim review only, with no arithmetic in them to check.
Categorical-classification check — 0 claims, none authored. All INSR variant claims are framed as descriptive measurements — per-domain rho, regional partition counts, ordering-strength interpretation — not as ACMG/AMP variant classifications. The INSR cross-reference is a research-tool methodology benchmark, not a variant-prioritization deliverable; no classification claims appear.
How This Page Works
INSR was the first verification report we authored with the claim list living alongside the analysis it describes — in the same place as the analysis brief, rather than in a separate downstream location. The verification result is recorded directly against that claim list. This removes a drift problem we saw in earlier pilots, where claims lived in a downstream location and could quietly diverge from the analysis they were supposed to be verifying.
The exact source-data version and verification-result version used at this page’s most recent build are shown in the footer below — an evidence chain back to a specific, reproducible state.
Verification independence
Verification is performed by Veritas, an independent agent with no involvement in producing the content it checks.
Claims Register
| Claim ID | Claim | Source | Strength | Status |
|---|---|---|---|---|
| CLM-2026-3001 | Aslanzadeh et al. 2025, Nature Communications 16:9143, INSR ectodomain DMS, 7-assay MaveDB panel | Aslanzadeh et al. 2025 | strong | Reference fact |
| CLM-2026-3002 | INSR ectodomain residues 28-955 (928 aa) | UniProt P06213 | strong | Reference fact |
| CLM-2026-3003 | MaveDB urn:mavedb:00001239-a-6: 13,927 non-NA insulin binding scores | MaveDB urn:mavedb:00001239-a-6 | strong | Single-source |
| CLM-2026-3004 | INSR global Spearman rho = 0.3989, N = 13,927, p ~ 0 | Internal NeuroAutomata INSR analysis | strong | Verified |
| CLM-2026-3005 | INSR global Pearson r = 0.416, N = 13,927 | Internal NeuroAutomata INSR analysis | strong | Verified |
| CLM-2026-3006 | INSR global categorical agreement 66.2% (9,225/13,927) | Internal NeuroAutomata INSR analysis | strong | Verified |
| CLM-2026-3007 | INSR L1 (28-157): rho = 0.594, N = 1,945, p = 1.7e-185, agreement 82.7% | Internal NeuroAutomata INSR analysis | strong | Verified |
| CLM-2026-3008 | INSR L1 categorical agreement 82.7% | Internal NeuroAutomata INSR analysis | strong | Verified |
| CLM-2026-3009 | INSR alpha-CT (716-746): rho = -0.088, N = 507, p = 0.046, agreement 66.3% | Internal NeuroAutomata INSR analysis | strong | Verified |
| CLM-2026-3010 | Same-protein contrast: L1 rho = 0.594, alpha-CT rho = -0.088, delta = 0.682 | Internal NeuroAutomata INSR analysis | strong | Reference fact |
| CLM-2026-3011 | INSR CR (158-310): rho = 0.493, N = 2,331 | Internal NeuroAutomata INSR analysis | strong | Verified |
| CLM-2026-3012 | INSR L2 (311-470): rho = 0.403, N = 2,332 | Internal NeuroAutomata INSR analysis | strong | Verified |
| CLM-2026-3013 | INSR FnIII-1 (471-595): rho = 0.252, N = 1,880 | Internal NeuroAutomata INSR analysis | strong | Verified |
| CLM-2026-3014 | INSR FnIII-2 (596-808): rho = 0.096, N = 3,216 | Internal NeuroAutomata INSR analysis | strong | Verified |
| CLM-2026-3015 | INSR FnIII-3 (809-906): rho = 0.299, N = 1,498 | Internal NeuroAutomata INSR analysis | strong | Verified |
| CLM-2026-3016 | INSR ectodomain C-tail (907-955): rho = 0.459, N = 725 | Internal NeuroAutomata INSR analysis | strong | Verified |
| CLM-2026-3017 | INSR global rho 0.3989 < ProteinGym ESM-2 650M baseline 0.414 | Internal NeuroAutomata; ProteinGym leaderboard snapshot 2026-05-08 | strong | Reference fact |
| CLM-2026-3018 | AlphaMissense R = 0.55 vs INSR MAVE composite (paper Fig 5B) | Aslanzadeh et al. 2025, Fig 5B | strong | Verified |
| CLM-2026-3019 | ESM-2 r = 0.416 (binding) vs AlphaMissense R = 0.55 (composite); gap ~0.13 | Internal NeuroAutomata | moderate | Reference fact |
| CLM-2026-3020 | ESM-2 L1 rho 0.594 > AlphaMissense INSR composite R 0.55 (different metrics, see note); unsupervised vs supervised | Internal NeuroAutomata | moderate | Reference fact |
| CLM-2026-3021 | INSR N=13,927 > prior largest CYP2C19 N=7,830 | Internal NeuroAutomata benchmark | strong | Reference fact |
| CLM-2026-3022 | Boundary #2b (receptor-ligand interface): alpha-CT first instance; canonical id boundary-2b-receptor-ligand-interface | Axon Agentic ESM-2 boundary-conditions catalog (canonical id: boundary-2b-receptor-ligand-interface, Cycle-2 first instance) | moderate | Reference fact |
| CLM-2026-3023 | alpha-CT: not the same regime as PTEN CX5R / BRCA1 RING zinc zero-rank | Internal NeuroAutomata methodology comparison | moderate | Reference fact |
| CLM-2026-3024 | INSR global r^2 ~ 17.3% (Pearson r 0.416) | Computed from a related verified claim | strong | Verified |
| CLM-2026-3025 | INSR-FASTA-BOUNDARY-SLIP-2026-05-15: caught pre-lock, truncated, archived | Internal NeuroAutomata dataset-preparation finding | strong | Reference fact |
| CLM-2026-3026 | INSR-MAVE-SCORE-FIGURE-APPROXIMATION-2026-05-15: would have shipped under legacy; caught by coproduction discipline | Internal NeuroAutomata dataset-preparation finding | strong | Reference fact |
| CLM-2026-3027 | Initial error rate 57.9% (22/38), exceeds 15% FAIL threshold | Internal NeuroAutomata Verification Category Counts | strong | Verified |
| CLM-2026-3028 | 13 ClinVar-pathogenic INSR variants; live-status unverified pending Scout | Aslanzadeh et al. 2025 supplementary data; ClinVar live lookup deferred | weak | Reference fact |
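Several register entries are derived from other entries, and the status counts should reconcile with the summary table. The sketch below recomputes them from the figures printed in the register; the variable names are hypothetical, and the status sets are transcribed by hand from the Status column.

```python
# Derived figures, each recomputed from numbers stated in the register.
delta = round(0.594 - (-0.088), 3)               # CLM-2026-3010: L1 vs alpha-CT
r_squared_pct = round(0.416 ** 2 * 100, 1)       # CLM-2026-3024: Pearson r squared
global_agree_pct = round(9225 / 13927 * 100, 1)  # CLM-2026-3006
initial_error_pct = round(22 / 38 * 100, 1)      # CLM-2026-3027
alphamissense_gap = round(0.55 - 0.416, 2)       # CLM-2026-3019

# Claim-status tally, keyed by the numeric suffix of each claim ID.
VERIFIED = {3004, 3005, 3006, 3007, 3008, 3009, 3011, 3012,
            3013, 3014, 3015, 3016, 3018, 3024, 3027}
SINGLE_SOURCE = {3003}
ALL_CLAIMS = set(range(3001, 3029))  # CLM-2026-3001 through CLM-2026-3028
REFERENCE_FACTS = ALL_CLAIMS - VERIFIED - SINGLE_SOURCE

tally = (len(VERIFIED), len(SINGLE_SOURCE), len(REFERENCE_FACTS))
```

The tally comes out to 15 verified, 1 single-source, and 12 reference facts, matching the per-claim outcome row in the summary.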
Substantiated against the INSR validation claim list (version 6b64606); the verification result on record, at version 269b09b (both captured at build time; cross-checked source trail).