Verification Report

Verification Report: INSR Aslanzadeh ESM-2 / MaveDB Joint Coverage

This report documents the independent verification of the article INSR ectodomain — Where ESM-2 Carries Weight and Where it Cedes the Field.


Validation Methodology

DMS source: Aslanzadeh et al. 2025, Nature Communications 16:9143, INSR ectodomain deep mutational scanning (doi:10.1038/s41467-025-64178-4). Insulin-binding scores from MaveDB record urn:mavedb:00001239-a-6 (log2 variant/WT barcode counts, FDR-controlled), 14,576 variants across mature ectodomain residues 28-955 of human INSR (UniProt P06213).

ESM-2 scoring: 650M-parameter masked-marginal scoring on the NeuroAutomata platform. The cross-reference matched 13,927 variants between the MaveDB record and the ESM-2 928-position landscape; score polarity aligned natively against Aslanzadeh’s log2 insulin-binding scale (negative = loss-of-binding, positive = gain-of-binding, zero = WT-like). No inversion applied.
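As a rough illustration of the polarity convention described above: masked-marginal scoring is commonly defined as the log-probability of the mutant residue minus that of the wild-type residue at a masked position. The sketch below uses a hypothetical probability table in place of real ESM-2 output. The function name and all numbers are assumptions for illustration, not the NeuroAutomata implementation.

```python
import math


def masked_marginal_score(probs_at_masked_pos, wt, mut):
    """Return log p(mut) - log p(wt) at one masked position.

    probs_at_masked_pos maps amino-acid letters to the model's probability
    for that residue when the position is masked. The values used below are
    hypothetical, not ESM-2 output.
    """
    return math.log(probs_at_masked_pos[mut]) - math.log(probs_at_masked_pos[wt])


# Hypothetical probabilities at a single masked position whose wild type is leucine.
toy_probs = {"L": 0.60, "I": 0.20, "V": 0.15, "P": 0.01}

score_wt = masked_marginal_score(toy_probs, wt="L", mut="L")            # WT vs itself
score_conservative = masked_marginal_score(toy_probs, wt="L", mut="I")  # mildly disfavored
score_disruptive = masked_marginal_score(toy_probs, wt="L", mut="P")    # strongly disfavored
```

Under this convention the wild-type residue scores zero and disfavored substitutions score negative, which runs in the same direction as the log2 insulin-binding scale, so no sign inversion is needed before correlating the two.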

Global result and ProteinGym-leaderboard comparison. Global Spearman rho = 0.399 (N = 13,927), measured against the Aslanzadeh DMS — this is ESM-2’s correlation with the experimental data, not a figure reported by Aslanzadeh. It sits below the published ESM-2 650M ProteinGym aggregate of 0.414 (ProteinGym leaderboard, accessed 2026-05-08) — INSR is, in aggregate, a harder target than the mean ProteinGym assay. The aggregate is misleading on its own; the per-domain breakdown carries the story.

Per-domain breakdown. Spearman rho ranged from 0.594 on the L1 leucine-rich repeat (N = 1,945, p ≈ 10⁻¹⁸⁵) to −0.088 on the αCT helix (N = 507, p = 0.046) — the receptor-ligand interface that forms insulin binding site 1. This anti-correlation is the first receptor-ligand interface boundary condition we have catalogued in this validation work, structurally distinct from prior zero-correlation cases at the PTEN CX5R catalytic motif and the BRCA1 RING zinc-coordinating residues. The αCT helix is not an active site in the catalytic-machinery sense; it is an evolvable interface surface where ligand-recognition co-evolution drives variation that ESM-2’s sequence prior reads as tolerance rather than constraint.

Reference comparison to AlphaMissense (Cheng et al. 2023). Aslanzadeh et al. report a global correlation between AlphaMissense and the DMS at the full-ectodomain grain (R = 0.55 against the INSR MAVE composite, Fig 5B; see CLM-2026-3018); they do not break it down per domain. The article uses this only for headline context and notes that the comparison is approximate rather than like-for-like: the score derivations, match shapes, and per-position coverage all differ.

How to read the Spearman correlation. As the article notes, rho measures agreement on ordering, not per-variant accuracy. ESM-2 helps prioritize which positions to investigate first; it does not independently classify variants. The published prose frames all variant-related claims as descriptive (rank prioritization), not as ACMG/AMP variant classifications.
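The distinction between ordering and per-variant accuracy can be made concrete with a small sketch. The data below are hypothetical toy values, not INSR scores, and the helper names are assumptions. The implementation is a plain-Python Spearman correlation (Pearson correlation computed on average ranks).

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed inputs."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy scores: same ordering, very different scales.
dms_scores = [-3.0, -1.5, -0.2, 0.0, 0.4]
model_scores = [-12.0, -11.9, -0.5, -0.4, -0.3]

rho = spearman_rho(dms_scores, model_scores)
raw_pearson = pearson(dms_scores, model_scores)
```

In this toy case the two lists agree perfectly on ordering, so rho is 1 up to floating-point error, while the Pearson correlation on the raw values is lower because the scales differ. That is why rho supports rank prioritization but, on its own, says nothing about classifying any single variant.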


Summary

Metric | Value
Claims | 28 total
Result | All checked; the page passed verification (on the grounding checks; see “What Was Checked”)
Per-claim outcome | 15 verified · 1 single-source measurement · 12 reference facts (0 contradicted, 0 uncheckable)
Numbers checked for arithmetic consistency | 15 of 28
Categorical classifications made | 0 (none authored; see “What Was Checked”)
Source data | INSR validation dataset (version recorded below at build time)
Verified | 2026-05-18
Verified by | Veritas

What Was Checked

Source-against-claim review (all 28 claims). Every claim was checked end-to-end against the primary source it cites — whether the source actually says what we claim, and whether the scope matches.

A note on the framing-fidelity check. The stricter framing-fidelity check, which asks whether the prose framing is faithful to what the cited authors reported, was not run on this page. This page passed on the grounding checks above (source-says and scope-match, across all 28 claims); the framing-fidelity pass that ran on the later CYP2C9 and BRCA1/PTEN reports was not applied retroactively to this older one. We disclose this openly rather than imply a check we did not run: “passed verification” on this page means the grounding checks passed.

Cited primary sources (independently resolved and checked figure-by-figure across the verified subset, with single-source measurements disclosed as such): Aslanzadeh et al. 2025 Nature Communications (INSR ectodomain DMS); the MaveDB record urn:mavedb:00001239-a-6 (INSR insulin-binding score set); UniProt P06213 (INSR canonical sequence reference); and the ProteinGym substitution DMS leaderboard (cross-model baseline, accessed 2026-05-08). The fifteen verified claims are ESM-2’s own measurements compared against the Aslanzadeh ectodomain DMS — the per-domain correlations, the regional breakdown and partition counts, the αCT-helix anti-correlation, the cross-leaderboard arithmetic, and the region accounting.

Context sources (referenced in the article for background, outside the formal verification scope): AlphaMissense (Cheng et al. 2023, doi:10.1126/science.adg7492), used only for the approximate-comparison qualifier on the AlphaMissense headline correlation. Note that the AlphaMissense figure (R = 0.55) is taken from Aslanzadeh et al.’s own published panel (Fig 5B), not from an independent measurement of our own.

Arithmetic-consistency check (15 of 28 quantitative claims). Every Spearman rho, sample size, p-value, and percentage figure with arithmetic content was checked for internal consistency: rho values within the valid range (−1 to 1) across the L1 leucine-rich repeat (0.594), the cysteine-rich region, the L2 leucine-rich repeat, the fibronectin type-III domains, and the αCT helix (−0.088); sample-size accounting (per-domain N values summing within the 13,927-variant ectodomain match window); p-value bounds; the −3.6% gap from the ESM-2 650M ProteinGym aggregate (0.399 vs 0.414); and the partition counts across the structural-region breakdown.
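A minimal sketch of what such consistency checks might look like, using the per-domain figures transcribed from the Claims Register in this report. The variable names are assumptions, and treating the αCT helix (716-746) as nested inside FnIII-2 (596-808) is an inference from the residue ranges rather than something the report states. This is not the verifier's actual code.

```python
# (rho, N) per structural region, transcribed from the Claims Register.
domain_stats = {
    "L1 (28-157)": (0.594, 1945),
    "CR (158-310)": (0.493, 2331),
    "L2 (311-470)": (0.403, 2332),
    "FnIII-1 (471-595)": (0.252, 1880),
    "FnIII-2 (596-808)": (0.096, 3216),
    "FnIII-3 (809-906)": (0.299, 1498),
    "C-tail (907-955)": (0.459, 725),
}
# Assumed to be a sub-region of FnIII-2, so it is excluded from the N sum.
alpha_ct = (-0.088, 507)

MATCHED_TOTAL = 13927          # variants matched between MaveDB and ESM-2
GLOBAL_RHO = 0.399             # global Spearman rho as reported
PROTEINGYM_ESM2_650M = 0.414   # leaderboard aggregate as reported

# Check 1: every rho lies in the valid range [-1, 1].
all_rhos = [rho for rho, _ in domain_stats.values()] + [alpha_ct[0]]
rhos_in_range = all(-1.0 <= rho <= 1.0 for rho in all_rhos)

# Check 2: the seven contiguous regions account for the matched variants.
n_sum = sum(n for _, n in domain_stats.values())

# Check 3: relative gap between the global rho and the ProteinGym aggregate.
relative_gap_pct = round(
    (GLOBAL_RHO - PROTEINGYM_ESM2_650M) / PROTEINGYM_ESM2_650M * 100, 1
)
```

With these inputs the seven region counts sum to exactly 13,927 and the relative gap rounds to −3.6%, matching the figures quoted above.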

One single-source measurement. One further claim (CLM-2026-3003, the count of 13,927 non-NA insulin-binding scores in the MaveDB record) is a single-source value with no independent second measurement available for an arithmetic cross-check. We disclose it as exactly that: measured once, not independently corroborated. This is an honest, bounded category, distinct from a verified claim; it is not a coverage gap. The remaining 12 claims are reference facts: methodological framing, structural-existence, and comparative statements (e.g. the paper citation, the INSR sequence length, the AlphaMissense approximate-comparison qualifier, the receptor-ligand interface boundary-condition designation), confirmed by source-against-claim review only, with no arithmetic in them to check.
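The three-way split can be tallied directly from the Status column of the Claims Register. The sketch below is illustrative only: the dictionary layout is an assumption, and the claim numbers are the final four digits of each claim ID as listed in the register.

```python
# Claim numbers (suffix of CLM-2026-NNNN) grouped by the Status column
# of the Claims Register.
status_to_claims = {
    "Verified": [3004, 3005, 3006, 3007, 3008, 3009, 3011, 3012,
                 3013, 3014, 3015, 3016, 3018, 3024, 3027],
    "Single-source": [3003],
    "Reference fact": [3001, 3002, 3010, 3017, 3019, 3020, 3021,
                       3022, 3023, 3025, 3026, 3028],
}

# Per-status counts and the overall total.
counts = {status: len(ids) for status, ids in status_to_claims.items()}
total_claims = sum(counts.values())

# Every ID from 3001 to 3028 should appear exactly once across the groups.
all_ids = sorted(i for ids in status_to_claims.values() for i in ids)
ids_complete = all_ids == list(range(3001, 3029))
```

The counts come out as 15 verified, 1 single-source, and 12 reference facts, for 28 claims in total with no claim ID missing or duplicated.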

Categorical-classification check — 0 claims, none authored. All INSR variant claims are framed as descriptive measurements — per-domain rho, regional partition counts, ordering-strength interpretation — not as ACMG/AMP variant classifications. The INSR cross-reference is a research-tool methodology benchmark, not a variant-prioritization deliverable; no classification claims appear.


How This Page Works

INSR was the first verification report we authored with the claim list living alongside the analysis it describes — in the same place as the analysis brief, rather than in a separate downstream location. The verification result is recorded directly against that claim list. This removes a drift problem we saw in earlier pilots, where claims lived in a downstream location and could quietly diverge from the analysis they were supposed to be verifying.

The exact source-data version and verification-result version used at this page’s most recent build are shown in the footer below — an evidence chain back to a specific, reproducible state.


Verification independence

Verification is performed by Veritas, an independent agent with no involvement in producing the content it checks.

Claims Register

Claim ID | Claim | Source [source type] | Strength | Status
CLM-2026-3001 | Aslanzadeh et al. 2025, Nature Communications 16:9143, INSR ectodomain DMS, 7-assay MaveDB panel | Aslanzadeh et al. 2025 [Data source] | strong | Reference fact
CLM-2026-3002 | INSR ectodomain residues 28-955 (928 aa) | UniProt P06213 [Supporting citation] | strong | Reference fact
CLM-2026-3003 | MaveDB urn:mavedb:00001239-a-6: 13,927 non-NA insulin binding scores | MaveDB urn:mavedb:00001239-a-6 [Data source] | strong | Single-source
CLM-2026-3004 | INSR global Spearman rho = 0.3989, N = 13,927, p ~ 0 | Internal NeuroAutomata INSR analysis [Data source] | strong | Verified
CLM-2026-3005 | INSR global Pearson r = 0.416, N = 13,927 | Internal NeuroAutomata INSR analysis [Data source] | strong | Verified
CLM-2026-3006 | INSR global categorical agreement 66.2% (9,225/13,927) | Internal NeuroAutomata INSR analysis [Data source] | strong | Verified
CLM-2026-3007 | INSR L1 (28-157): rho = 0.594, N = 1,945, p = 1.7e-185, agreement 82.7% | Internal NeuroAutomata INSR analysis [Data source] | strong | Verified
CLM-2026-3008 | INSR L1 categorical agreement 82.7% | Internal NeuroAutomata INSR analysis [Data source] | strong | Verified
CLM-2026-3009 | INSR alpha-CT (716-746): rho = -0.088, N = 507, p = 0.046, agreement 66.3% | Internal NeuroAutomata INSR analysis [Data source] | strong | Verified
CLM-2026-3010 | Same-protein contrast: L1 rho = 0.594, alpha-CT rho = -0.088, delta = 0.682 | Internal NeuroAutomata INSR analysis [Data source] | strong | Reference fact
CLM-2026-3011 | INSR CR (158-310): rho = 0.493, N = 2,331 | Internal NeuroAutomata INSR analysis [Data source] | strong | Verified
CLM-2026-3012 | INSR L2 (311-470): rho = 0.403, N = 2,332 | Internal NeuroAutomata INSR analysis [Data source] | strong | Verified
CLM-2026-3013 | INSR FnIII-1 (471-595): rho = 0.252, N = 1,880 | Internal NeuroAutomata INSR analysis [Data source] | strong | Verified
CLM-2026-3014 | INSR FnIII-2 (596-808): rho = 0.096, N = 3,216 | Internal NeuroAutomata INSR analysis [Data source] | strong | Verified
CLM-2026-3015 | INSR FnIII-3 (809-906): rho = 0.299, N = 1,498 | Internal NeuroAutomata INSR analysis [Data source] | strong | Verified
CLM-2026-3016 | INSR ectodomain C-tail (907-955): rho = 0.459, N = 725 | Internal NeuroAutomata INSR analysis [Data source] | strong | Verified
CLM-2026-3017 | INSR global rho 0.3989 < ProteinGym ESM-2 650M baseline 0.414 | Internal NeuroAutomata; ProteinGym leaderboard snapshot 2026-05-08 [Supporting citation] | strong | Reference fact
CLM-2026-3018 | AlphaMissense R = 0.55 vs INSR MAVE composite (paper Fig 5B) | Aslanzadeh et al. 2025, Fig 5B [Supporting citation] | strong | Verified
CLM-2026-3019 | ESM-2 r = 0.416 (binding) vs AlphaMissense R = 0.55 (composite); gap ~0.13 | Internal NeuroAutomata [Supporting citation] | moderate | Reference fact
CLM-2026-3020 | ESM-2 L1 rho 0.594 > AlphaMissense INSR composite R 0.55 (different metrics, see note); unsupervised vs supervised | Internal NeuroAutomata [Supporting citation] | moderate | Reference fact
CLM-2026-3021 | INSR N=13,927 > prior largest CYP2C19 N=7,830 | Internal NeuroAutomata our benchmark [Supporting citation] | strong | Reference fact
CLM-2026-3022 | Boundary #2b (receptor-ligand interface): alpha-CT first instance; canonical id boundary-2b-receptor-ligand-interface | Axon Agentic ESM-2 boundary-conditions catalog (canonical id: boundary-2b-receptor-ligand-interface, Cycle-2 first instance) [Builds on] | moderate | Reference fact
CLM-2026-3023 | alpha-CT: not the same regime as PTEN CX5R / BRCA1 RING zinc zero-rank | Internal NeuroAutomata methodology comparison [Qualifying note] | moderate | Reference fact
CLM-2026-3024 | INSR global r^2 ~ 17.3% (Pearson r 0.416) | Computed from a related verified claim [Data source] | strong | Verified
CLM-2026-3025 | INSR-FASTA-BOUNDARY-SLIP-2026-05-15: caught pre-lock, truncated, archived | Internal NeuroAutomata dataset-preparation finding [Supporting citation] | strong | Reference fact
CLM-2026-3026 | INSR-MAVE-SCORE-FIGURE-APPROXIMATION-2026-05-15: would have shipped under legacy; caught by coproduction discipline | Internal NeuroAutomata dataset-preparation finding [Supporting citation] | strong | Reference fact
CLM-2026-3027 | Initial error rate 57.9% (22/38), exceeds 15% FAIL threshold | Internal NeuroAutomata Verification Category Counts [Data source] | strong | Verified
CLM-2026-3028 | 13 ClinVar-pathogenic INSR variants; live-status unverified pending Scout | Aslanzadeh et al. 2025 supplementary data; ClinVar live lookup deferred [Qualifying note] | weak | Reference fact
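
Several register entries are derived figures that can be recomputed from other numbers in the same register. The sketch below recomputes them; the variable names are assumptions, and the inputs are transcribed from the claims as listed, not from the underlying dataset.

```python
# Inputs transcribed from the Claims Register; comments give the claim ID.
agreement_pct = round(9225 / 13927 * 100, 1)   # CLM-2026-3006: categorical agreement
contrast_delta = round(0.594 - (-0.088), 3)    # CLM-2026-3010: L1 rho minus alpha-CT rho
r_squared_pct = round(0.416 ** 2 * 100, 1)     # CLM-2026-3024: Pearson r squared, as %
initial_error_pct = round(22 / 38 * 100, 1)    # CLM-2026-3027: initial error rate
ectodomain_length = 955 - 28 + 1               # CLM-2026-3002: residues 28-955 inclusive
alphamissense_gap = round(0.55 - 0.416, 2)     # CLM-2026-3019: R 0.55 vs r 0.416

FAIL_THRESHOLD_PCT = 15.0                      # threshold quoted in CLM-2026-3027
exceeds_fail_threshold = initial_error_pct > FAIL_THRESHOLD_PCT
```

Each recomputed value agrees with the figure stated in the corresponding claim: 66.2%, 0.682, 17.3%, 57.9%, 928 aa, and a gap of about 0.13.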

Substantiated against the INSR validation claim list (version 6b64606). The verification result on record is at version 269b09b. Both versions were captured at build time (cross-checked source trail).