Verification Report

Verification Report: TPMT VAMP-seq Abundance / ESM-2 Cross-Reference

This report documents the independent verification of the article TPMT: ESM-2 / VAMP-seq Complementarity.


Validation Methodology

DMS source: Matreyek et al. 2018, Nature Genetics 50(6):874–882 (doi:10.1038/s41588-018-0122-z). The Matreyek paper introduced VAMP-seq (Variant Abundance by Massively Parallel Sequencing) and applied the assay to multiple proteins including TPMT, the human thiopurine S-methyltransferase for which CPIC guidelines recommend pre-treatment genotyping before 6-mercaptopurine, azathioprine, and thioguanine therapy. An EGFP-TPMT fusion was expressed in HEK293T landing-pad cells; cells were FACS-sorted into 4 fluorescence bins and deep-sequenced. Enrich2 scored 8 biological replicates, with scores normalized so that WT synonymous variants sit at 1.0 and the median nonsense variant at positions 51–219 sits at 0.0. Scores are deposited at MaveDB as urn:mavedb:00000013-b-1 (CC0 license; 3,689 of 4,655 possible missense variants scored).
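
The two-anchor normalization and the variant counts described above can be sketched in a few lines. This is an illustration only: it assumes a simple linear rescaling between the nonsense and WT synonymous anchors, and the function name `normalize_abundance` and the raw input values are hypothetical, not taken from the Enrich2 pipeline.

```python
def normalize_abundance(raw, wt_synonymous, nonsense_median):
    """Linearly rescale a raw score so that the WT synonymous anchor
    maps to 1.0 and the nonsense-median anchor maps to 0.0.
    (Assumed form of the normalization; illustrative only.)"""
    span = wt_synonymous - nonsense_median
    if span == 0:
        raise ValueError("anchors must differ")
    return (raw - nonsense_median) / span


# Counts stated in the report: 245 positions, 19 alternative residues each.
SEQUENCE_LENGTH = 245
ALTERNATIVES_PER_POSITION = 19
possible_missense = SEQUENCE_LENGTH * ALTERNATIVES_PER_POSITION
scored_missense = 3689
coverage = scored_missense / possible_missense

print(possible_missense)       # 4655
print(round(coverage, 3))      # 0.792
```

Under this form, a variant whose raw score equals the nonsense median lands at 0.0 and one matching WT synonymous lands at 1.0; values above 1.0 remain possible for variants more abundant than WT.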

ESM-2 scoring: 650M-parameter masked-marginal scoring on the NeuroAutomata platform across all 245 positions of TPMT (UniProt P51580). ESM-2 score polarity already matches VAMP-seq abundance (positive = stable/abundant, negative = unstable/degraded), so no sign inversion was applied. Cross-referencing variants between the ESM-2 landscape and VAMP-seq scores yielded N = 3,685 matched variants after WT-identity filtering. Position stratification used SAM-binding active-site annotations to partition the matched variants into active-site (N = 14), non-active-site annotated (N = 150), and unannotated (N = 3,521) strata.
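
A minimal sketch of masked-marginal scoring and the cross-reference step, under stated assumptions. It does not call the ESM-2 model: the per-residue log-probabilities at a masked position are supplied as a plain dictionary, every numeric value below is invented for illustration, and the function names are hypothetical. The exact-zero filter is one plausible way to implement WT-identity filtering, since a substitution to the wild-type residue scores exactly 0.0 under this definition.

```python
import math


def masked_marginal_score(log_probs_at_masked_pos, wt_aa, mut_aa):
    """Score = log p(mutant) - log p(wild type) at the masked position.
    Negative values mean the model finds the mutant less plausible."""
    return log_probs_at_masked_pos[mut_aa] - log_probs_at_masked_pos[wt_aa]


def cross_reference(esm_scores, vamp_scores):
    """Pair variants present in both score sets and drop WT-identity
    entries, which score exactly 0.0 by construction."""
    matched = {}
    for variant, esm in esm_scores.items():
        if variant in vamp_scores and esm != 0.0:
            matched[variant] = (esm, vamp_scores[variant])
    return matched


# Toy log-probabilities for one masked position (invented values).
toy_log_probs = {"A": math.log(0.60), "P": math.log(0.01), "T": math.log(0.05)}

esm = {
    "A80P": masked_marginal_score(toy_log_probs, "A", "P"),
    "A80T": masked_marginal_score(toy_log_probs, "A", "T"),
    "A80A": masked_marginal_score(toy_log_probs, "A", "A"),  # WT identity
}
vamp = {"A80P": 0.3, "A80T": 0.9, "A80A": 1.0}  # toy abundance values

matched = cross_reference(esm, vamp)
print(sorted(matched))  # ['A80P', 'A80T']

# The three strata reported above should add up to the matched total.
assert 14 + 150 + 3521 == 3685
```

Because both scales share the same polarity, the matched pairs can be fed to a rank correlation directly, with no sign flip.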

Global Spearman rho = 0.530 (N = 3,685, p = 7.4 × 10⁻²⁶⁶); Pearson r = 0.535. This sits above the ProteinGym ESM-2 650M baseline of 0.414 across 217 assays, a positive but modest gap (+0.116) consistent with VAMP-seq abundance selection on a stable single-domain enzyme. The companion NUDT15 benchmark (rho = 0.526), another pharmacogenomic enzyme in the same thiopurine metabolic pathway scored by VAMP-seq, sits within 0.004, which suggests that rho ≈ 0.53 may be characteristic of thiopurine-pathway enzymes under VAMP-seq abundance selection.
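
For readers who want to see what the Spearman figure measures, here is a small dependency-free sketch: rank both score lists (averaging ranks across ties) and take the Pearson correlation of the ranks. The function names and the four toy data points are illustrative assumptions, not the benchmark's code or data.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rho = Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy example: ESM-2-like scores versus abundance-like scores.
toy_esm = [-7.0, -5.5, -4.3, -1.0]
toy_abundance = [0.30, 0.29, 0.60, 0.90]
print(round(spearman(toy_esm, toy_abundance), 3))  # 0.8
```

Only the ordering matters: any monotone rescaling of either score leaves rho unchanged, which is why the two methods can be compared without putting them on a common scale.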

The regional breakdown locates the structural boundary. The SAM-binding active-site stratum reaches only Spearman rho = 0.222 at N = 14 (p = 0.45, non-significant), consistent with the small-N, specificity-engineering boundary at active sites observed in PTEN, BRCA1, and prior bacterial enzyme benchmarks. Non-active-site annotated positions (rho = 0.524, N = 150) and unannotated positions (rho = 0.531, N = 3,521) recover the global signal cleanly. The article frames the active-site result as a small-N caveat rather than a discriminative finding: the underlying boundary is methodological, not biological.
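
To see why rho = 0.222 at N = 14 is non-significant, one can apply the usual t-approximation for a correlation coefficient, t = rho · sqrt((N − 2) / (1 − rho²)) with N − 2 degrees of freedom. The sketch below assumes that approximation (the report does not state how its p-values were computed) and integrates the Student t density numerically with Simpson's rule so it needs only the standard library. Function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_from_rho(rho, n, steps=2000):
    """Two-sided p-value for a correlation via the t-approximation.
    Suitable for moderate p-values; very small tails would need a
    more careful numerical method than this subtraction."""
    df = n - 2
    t = abs(rho) * math.sqrt(df / (1 - rho * rho))
    # Simpson's rule on [0, t]; steps must be even.
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3          # P(0 <= T <= t)
    return 1 - 2 * central_half           # P(|T| > t)


p_active_site = two_sided_p_from_rho(0.222, 14)
print(round(p_active_site, 2))
```

With only 12 degrees of freedom the test statistic is below 1, so the active-site stratum cannot distinguish a weak positive correlation from none, which matches the report's small-N reading.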

Clinical pharmacogenomic alleles (TPMT*2 A80P, *3B A154T, *3C Y240C, *5 L49S, *7 H227Q, *8 R215H) were checked for whether ESM-2’s directional ranking agrees with VAMP-seq abundance. The article frames these as descriptive ESM-2 directional concordance against published clinical phenotype severity (*2 / *3B correctly flagged as low-abundance; *3C borderline; *5 / *7 activity loss not abundance-driven), not as ACMG/AMP variant classifications. The underlying record matches that conservative framing: clinical-allele scores are recorded as quantitative measurements on the VAMP-seq abundance scale with a descriptive concordance reading, not as classification predictions — authored that way from the start, the same approach taken on the earlier CYP2C9 report. The classification check found no mismatch between that record and the published prose.


Summary

| Metric | Value |
| --- | --- |
| Claims | 33 total |
| Result | All checked; the page passed verification (5 numerically verified / 12 single-source measurements / 16 reference facts; 0 contradicted, 0 uncheckable) |
| Source data | TPMT validation dataset (version recorded below at build time) |
| Verified | 2026-05-18 |
| Verified by | Veritas |

Claim-by-Claim Verification

What was checked, and how each claim resolves against its primary source, is shown claim by claim below. The exact source-data version and verification-result version used at this page’s most recent build are shown in the page footer.


What Was Verified

Framing-fidelity check — all 33 claims. Every claim was checked against its primary source for whether the source actually entails it, whether the scope matches, and whether the prose framing is faithful to what the cited authors reported. Key sources: Matreyek et al. 2018 Nat Genet (TPMT VAMP-seq DMS), UniProt P51580 (TPMT canonical reference), MaveDB urn:mavedb:00000013-b-1 (deposited score set), PharmVar TPMT (clinical allele inventory), the CPIC TPMT / thiopurine guidelines (clinical-context framing), and the ProteinGym substitution DMS leaderboard (cross-model baseline, accessed 2026-05-08).

Numerical check. Of the 33 claims, 5 carried figures that were arithmetic-verified with explicit numeric-range and consistency checks. A further 12 are single-source DMS measurements — single values from one deposited record, with no independent second source available to cross-check them against by construction. We disclose these as exactly that: measured once, not independently corroborated. This is an honest, bounded category, distinct from a verified claim; it is not a coverage gap. The remaining 16 are reference facts (existence, methodological, comparative, literature-sourced, and causal statements) with no arithmetic in them to check — confirmed by the framing-fidelity review above.
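
The three-way split can be re-tallied from the status column of the claim table. The snippet below is a simple consistency sketch: the ID suffixes are transcribed from that table, and the variable names are illustrative.

```python
# Claim ID suffixes (after "CLM-TPMT-") grouped by reported status.
verified = ["010", "018", "030", "040", "041"]
single_source = [
    "002", "005b", "005c", "012", "013", "014",
    "015", "016", "020", "032", "033", "043",
]
reference_fact = [
    "001", "003", "004", "005a", "005d", "006", "011", "017",
    "019", "031", "042", "050", "051", "052", "053", "054",
]

all_ids = verified + single_source + reference_fact

# Every claim should appear in exactly one category.
assert len(all_ids) == len(set(all_ids)), "a claim is listed twice"

counts = (len(verified), len(single_source), len(reference_fact))
print(counts, sum(counts))  # (5, 12, 16) 33
```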

Classification check — 0 classification claims. The TPMT clinical-allele claims were written from the start as quantitative measurements on the VAMP-seq abundance scale with a descriptive concordance reading — the same approach taken on the earlier CYP2C9 report — not as classification predictions. No mismatch was found between that record and the published prose. ESM-2 is a ranking tool, not a classifier; categorical thresholds do not generalize across proteins.

One consistency layer not yet active. A further consistency check is planned but was not active for this report.

On retroactive verification. This is an older post, originally written before our current per-claim verification discipline existed; it was brought under verification retroactively — the claim list was written against the already-published article and the existing source-verification records, then independently verified. The source-data version and verification-result version used at this page’s most recent build are both shown in the footer, an evidence chain back to a specific, reproducible state.


Verification independence

Verification is performed by Veritas, an independent agent with no involvement in producing the content it checks.

| Claim ID | Claim | Source | Strength | Status |
| --- | --- | --- | --- | --- |
| CLM-TPMT-001 | ESM-2 (Evolutionary Scale Model, 650M parameters, facebook/esm2_t33_650M_UR50D) is a protein language model trained on evolutionary sequence data. Masked-marginal scoring produces a per-variant log-likelihood that ranks mutations by predicted evolutionary plausibility. | Supporting citation: Lin, Z. et al. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379(6637), 1123–1130. | strong | Reference fact |
| CLM-TPMT-002 | The published ESM-2 650M baseline on ProteinGym substitution Spearman is ρ = 0.414 (live leaderboard CSV, rank 45 of 97 models). | Data source: ProteinGym zero-shot substitution DMS leaderboard, per-model Spearman summary CSV (accessed 2026-05-08). Notin et al. 2023 introduced the v1.0 217-assay substitution benchmark. | strong | Single-source |
| CLM-TPMT-003 | ESM-2 650M is the peak of the ESM-2 size scaling on ProteinGym substitution Spearman: 650M = 0.414, 3B = 0.406, 15B = 0.400. The 3B and 15B variants perform below the 650M variant on this benchmark. | Data source: ProteinGym zero-shot substitution DMS leaderboard, per-model Spearman summary CSV, rows for ESM2 8M / 35M / 150M / 650M / 3B / 15B. Frozen-archive snapshot at `pinned leaderboard snapshot`. | strong | Reference fact |
| CLM-TPMT-004 | VAMP-seq (Variant Abundance by Massively Parallel Sequencing) measures intracellular protein abundance for thousands of variants in parallel using GFP fusions and fluorescent cell sorting. Introduced for PTEN (4,112 variants) and TPMT (3,689 variants) in Matreyek et al. 2018. | Supporting citation: Matreyek, K.A., Starita, L.M., Stephany, J.J., et al. (2018). Multiplex assessment of protein variant abundance by massively parallel sequencing. Nature Genetics, 50(6), 874–882. | strong | Reference fact |
| CLM-TPMT-005a | Matreyek et al. 2021's integration of PTEN abundance and activity DMS data yields a four-class taxonomy with the following composition: WT-like (51%), loss of abundance only (22%), loss of activity only (6%), and loss of both (21%). | Builds on: Matreyek, K.A., Stephany, J.J., Ahler, E., & Fowler, D.M. (2021). Integrating thousands of PTEN variant activity and abundance measurements reveals variant subgroups and new dominant negatives in cancers. Genome Medicine, 13, 165. | strong | Reference fact |
| CLM-TPMT-005b | Matreyek et al. 2021 reports 4,721 PTEN variant abundance measurements via VAMP-seq, extending Matreyek 2018's prior 4,112-missense baseline with additional missense, synonymous, and nonsense variants. | Builds on: Matreyek, K.A., Stephany, J.J., Ahler, E., & Fowler, D.M. (2021). Integrating thousands of PTEN variant activity and abundance measurements reveals variant subgroups and new dominant negatives in cancers. Genome Medicine, 13, 165. | strong | Single-source |
| CLM-TPMT-005c | Matreyek et al. 2021 reports 7,244 PTEN variant phosphatase activity measurements via yeast functional rescue (Mighell-pattern assay). | Builds on: Matreyek, K.A., Stephany, J.J., Ahler, E., & Fowler, D.M. (2021). Integrating thousands of PTEN variant activity and abundance measurements reveals variant subgroups and new dominant negatives in cancers. Genome Medicine, 13, 165. | strong | Single-source |
| CLM-TPMT-005d | Matreyek et al. 2021's 4-class taxonomy explicitly places C124S, R130G, R130Q, and G129E in the 'loss of activity only' subgroup: catalytically dead PTEN variants that retain near-wild-type abundance. | Builds on: Matreyek, K.A., Stephany, J.J., Ahler, E., & Fowler, D.M. (2021). Integrating thousands of PTEN variant activity and abundance measurements reveals variant subgroups and new dominant negatives in cancers. Genome Medicine, 13, 165. | strong | Reference fact |
| CLM-TPMT-006 | The Clinical Pharmacogenetics Implementation Consortium (CPIC) guideline for thiopurine dosing recommends pre-treatment TPMT and NUDT15 genotyping to adjust starting doses of azathioprine, mercaptopurine, and thioguanine. The 2025 update (Maillard et al. 2026) supersedes the 2018 update (Relling et al. 2019). | Supporting citation: Maillard, M., Schwab, M., Whirl-Carrillo, M., et al. (2026). Clinical Pharmacogenetics Implementation Consortium (CPIC) Guideline for Thiopurine Dosing Based on TPMT and NUDT15 Genotypes: 2025 Update. Clinical Pharmacology & Therapeutics, 119(4), 916–927. | strong | Reference fact |
| CLM-TPMT-010 | ESM-2 ranks 3,685 TPMT variants against Matreyek 2018 VAMP-seq abundance scores with global Spearman ρ = 0.5298 (p = 7.4 × 10⁻²⁶⁶, Pearson r = 0.5353). | Data source: Internal NeuroAutomata benchmark, TPMT × Matreyek 2018 VAMP-seq cross-reference. | strong | Verified |
| CLM-TPMT-011 | The internal benchmark scores 3,685 TPMT variants, 4 fewer than the 3,689-variant VAMP-seq set in Matreyek 2018. The 4-variant filter excludes WT-identity substitutions (Q47S, L243L, T244T, E245E) via the `score != 0.0` filter. | Data source: Internal NeuroAutomata benchmark, filter. | moderate | Reference fact |
| CLM-TPMT-012 | TPMT *2 (A80P): ESM-2 score = −7.03 (strongly deleterious); VAMP-seq abundance = 0.318 (severe loss-of-abundance). Both assays catch the variant. Mechanism: enhanced proteasomal degradation (Tai et al. 1997). Most common decreased-function allele in Caucasians. | Data source: Internal NeuroAutomata benchmark, TPMT clinical-allele cross-reference. ESM-2 score from mutation-score table row A80 column P; VAMP-seq abundance from position 80, alt P. | strong | Single-source |
| CLM-TPMT-013 | TPMT *3B (A154T, alone): ESM-2 score = −5.45 (deleterious); VAMP-seq abundance = 0.286 (severe loss-of-abundance). Both assays catch the variant. Also part of the TPMT *3A compound (A154T+Y240C), the most common deficiency allele in Caucasians and Southwest Asians. | Data source: Internal NeuroAutomata benchmark, TPMT clinical-allele cross-reference. ESM-2 score from mutation-score table row A154 column T; VAMP-seq abundance from position 154, alt T. | strong | Single-source |
| CLM-TPMT-014 | TPMT *3C (Y240C): ESM-2 score = −4.26 (deleterious); VAMP-seq abundance = 0.600 (moderate loss). Both assays catch the variant. Most common decreased-function allele in East Asians and African-Americans. | Data source: Internal NeuroAutomata benchmark, TPMT clinical-allele cross-reference. ESM-2 score from mutation-score table row Y240 column C; VAMP-seq abundance from position 240, alt C. | strong | Single-source |
| CLM-TPMT-015 | TPMT *5 (L49S): ESM-2 score = −7.68 (strongly deleterious, at the stronger end of the TPMT distribution where mean = −5.02, std = 3.82); VAMP-seq abundance = 0.790 (WT-like). ESM-2 catches the variant; VAMP-seq misses it. Mechanism: activity loss without abundance loss, the canonical activity-without-abundance pattern. | Data source: Internal NeuroAutomata benchmark, TPMT clinical-allele cross-reference. ESM-2 score from mutation-score table row L49 column S; VAMP-seq abundance from position 49, alt S. | strong | Single-source |
| CLM-TPMT-016 | TPMT *7 (H227Q): ESM-2 score = −3.27 (moderately deleterious: above the TPMT distribution mean of −5.02 by ~0.5σ, so less deleterious than the distribution mean but still clearly negative); VAMP-seq abundance = 0.770 (WT-like). ESM-2 catches the variant; VAMP-seq misses it. Same activity-without-abundance pattern as *5. | Data source: Internal NeuroAutomata benchmark, TPMT clinical-allele cross-reference. ESM-2 score from mutation-score table row H227 column Q; VAMP-seq abundance from position 227, alt Q. | moderate | Single-source |
| CLM-TPMT-017 | ESM-2 ranks all five canonical TPMT clinical alleles (*2, *3B, *3C, *5, *7) as deleterious from sequence alone. VAMP-seq abundance catches the proteolytic-degradation subset (*2, *3B, *3C); the activity-loss subset (*5, *7) is the VAMP-seq blind spot. ESM-2 fills this canonical pharmacogenomic gap via evolutionary-conservation signal, covering the activity-without-abundance subset. | Data source: Synthesis claim (per-allele dataset paths inherit). | strong | Reference fact |
| CLM-TPMT-018 | Position-stratified TPMT correlation: active-site (n=14, ρ=0.222, p=0.45; not statistically significant); non-active-site (n=150, ρ=0.5244, p=5.7×10⁻¹²); unannotated positions (n=3,521, ρ=0.5313, p=8.1×10⁻²⁵⁶). The active-site stratum's small sample size and non-significance preclude strong claims at that stratum; the non-active-site and unannotated strata both show ρ close to the global value. | Data source: Internal NeuroAutomata benchmark, JSON field `position_type_stratification.{active_site,non_active_site,unknown}`. | strong | Verified |
| CLM-TPMT-019 | TPMT global ρ = 0.5298 is approximately 28% above the published ESM-2 650M ProteinGym baseline of 0.414 (delta = +0.116 absolute, or +28% relative). | Supporting citation: Derived from the TPMT ρ = 0.5298 and the published baseline 0.414. Computation: (0.5298 − 0.414) / 0.414 = 0.280. | strong | Reference fact |
| CLM-TPMT-020 | TPMT categorical agreement between ESM-2 ranking and VAMP-seq abundance is 13.2% globally excluding neutral (488 agree / 3,197 disagree / 0 neutral). The low categorical agreement combined with ρ = 0.5298 reflects that ESM-2 and VAMP-seq agree on ranking while disagreeing on per-variant categorical class: exactly the orthogonal-not-redundant property that makes the two methods complementary. | Data source: Internal NeuroAutomata benchmark, JSON field `categorical_agreement`. | strong | Single-source |
| CLM-TPMT-030 | ESM-2 ranks 4,112 PTEN variants against Matreyek 2018 VAMP-seq abundance with global Spearman ρ = 0.4836 (N = 4,112, p = 4.6 × 10⁻²⁴⁰). Position stratification: non-active-site ρ = 0.5357 (N = 205); active-site ρ = −0.0113 (N = 84, not significant). | Data source: Internal NeuroAutomata benchmark, JSON fields `global_correlation` and `position_type_stratification`. | strong | Verified |
| CLM-TPMT-031 | The PTEN active-site / non-active-site contrast (ρ = −0.011 vs ρ = 0.536) reflects the same biological mechanism the Matreyek 2021 4-class taxonomy makes explicit: 6% of PTEN variants are 'loss of activity only', catalytically dead but structurally stable. ESM-2 catches them via evolutionary-conservation signal; VAMP-seq's abundance metric does not. | Supporting citation: Synthesis of the Matreyek 2021 4-class taxonomy (CLM-TPMT-005a) and the PTEN stratified correlation (CLM-TPMT-030). | strong | Reference fact |
| CLM-TPMT-032 | Catalytic-dead-but-stable PTEN cancer variants are explicitly classified 'loss of activity only' by Matreyek 2021: C124S (VAMP-seq abundance 1.14), R130G (1.09), G129E (0.76), R130Q (WT-like per Part 3 coverage). Each retains near-WT abundance while abolishing catalytic function. | Data source: Internal NeuroAutomata benchmark, abundance scores from pten-matreyek/ positions 124 (C→S, score 1.1375), 130 (R→G, score 1.0852), 129 (G→E, score 0.7553). Taxonomy class from Matreyek 2021 (DOI 10.1186/s13073-021-00984-x). | strong | Single-source |
| CLM-TPMT-033 | Akt1 T308 phosphorylation correlates with tumor incidence at Pearson r = 0.76 across Matreyek 2021's PTEN dominant-negative validation set. Novel dominant negatives discovered through the abundance+activity integration include R130P (observed 8 times across breast/uterine/esophageal tumors) and D92H (3 breast cancer occurrences). | Supporting citation: Matreyek, K.A., Stephany, J.J., Ahler, E., & Fowler, D.M. (2021). Integrating thousands of PTEN variant activity and abundance measurements reveals variant subgroups and new dominant negatives in cancers. Genome Medicine, 13, 165. | strong | Single-source |
| CLM-TPMT-040 | Calmodulin (CALM1_HUMAN_Weile_2017 binding-affinity assay) achieves ρ = 0.212 (N = 1,813) against ESM-2 ranking, flagged `failed_threshold: true` in the 5-assay benchmark. Calmodulin's category is `Binding`, distinct from the single-protein-stability and organismal-fitness assays in the benchmark. ESM-2 performance on protein-protein binding assays is the canonical weakness category. | Data source: Internal NeuroAutomata benchmark, JSON `assays[CALM1_HUMAN_Weile_2017]`. | strong | Verified |
| CLM-TPMT-041 | The 5-assay benchmark (median ρ = 0.515) spans single-protein-stability (Beta-lactamase 0.7315, BRCA1 0.5147, UBC9 0.4726), organismal-fitness (PTEN-Mighell 0.5185 yeast-growth assay), and binding-affinity (Calmodulin 0.2116, `failed_threshold: true`). TPMT ρ = 0.5298 sits approximately at the benchmark median. | Data source: Internal NeuroAutomata benchmark, JSON `summary.median_rho: 0.515` and `assays[]` array. | strong | Verified |
| CLM-TPMT-042 | The benchmark median delta vs published baseline is +24.3% (median 0.515 vs baseline 0.414). Note that this is a different metric from the TPMT-specific delta (+28%): the 24.3% applies to the 5-assay median; the 28% applies to TPMT ρ = 0.5298 specifically. | Supporting citation: Internal NeuroAutomata benchmark, JSON `summary.delta_vs_baseline_pct: 24.3`. Derived comparison: (0.5147 − 0.414) / 0.414 = 0.243. | strong | Reference fact |
| CLM-TPMT-043 | ESM-2 cannot reliably detect dominant-negative variants where the variant retains WT-like abundance AND retains some catalytic activity but disrupts complex function via partner-binding or oligomerization. Example: PTEN P38S (VAMP-seq abundance 1.14, melanoma-enriched at 10.4% of missense variants) shows WT-like abundance to VAMP-seq AND ranks unremarkably by ESM-2, a true blind spot for both assays. | Data source: Internal NeuroAutomata benchmark, JSON record position 38, alt S (P38S abundance 1.1361 + notes 'ESM-2 cannot detect dominant-negative mechanism'). | moderate | Single-source |
| CLM-TPMT-050 | TPMT's abundance-driven decreased-function mechanism (proteolytic degradation of *2, *3B, *3C variants) generalizes to receptor-class and ion-channel-class proteins where stability is the rate-limiting step for function. Matreyek-lab DMS datasets for INSR, LDLR, KCNE1, and STIM1 (each cited individually in CLM-TPMT-051 through -054) are adjacent bench-DMS surfaces for the same ESM-2 cross-reference approach. | Supporting citation: Synthesis claim spanning four per-paper primary citations. | moderate | Reference fact |
| CLM-TPMT-051 | The Matreyek lab contributed to deep mutational scanning of the human insulin receptor (INSR) ectodomain in 2025, scoring ~14,000 variants. | Supporting citation: Aslanzadeh, V., Brierley, G.V., Kumar, R., et al. (2025). Deep mutational scanning of the human insulin receptor ectodomain to inform precision therapy for insulin resistance. Nature Communications, 16, 9143. Matreyek listed as co-author. | moderate | Reference fact |
| CLM-TPMT-052 | The Matreyek lab contributed to a 2026 Science paper on the functional landscape of coding variation in LDLR (low-density lipoprotein receptor), a familial-hypercholesterolemia gene. | Supporting citation: Tabet, D.R., Coté, A.G., Lancaster, M.C., et al. (including Matreyek, K.A., Fowler, D.M., Roth, F.P., senior) (2026). The functional landscape of coding variation in the familial hypercholesterolemia gene LDLR. Science, 391(6787). Published 2026-02-19. | moderate | Reference fact |
| CLM-TPMT-053 | The Matreyek lab contributed to high-throughput functional mapping of KCNE1 (arrhythmia ion-channel gene) variants in 2024, scoring 2,554 variants for cell-surface expression (2,534 for function). | Supporting citation: Muhammad, A., Calandranis, M.E., Li, B., et al. (including Matreyek, K.A., Fowler, D.M., Roden, D.M., Glazer, A.M., senior) (2024). High-throughput functional mapping of variants in an arrhythmia gene, KCNE1, reveals novel biology. Genome Medicine, 16, 73. Published 2024-05-30. | moderate | Reference fact |
| CLM-TPMT-054 | The Matreyek lab published comprehensive mutational characterization of the calcium-sensing STIM1 EF-hand in 2025, scoring 706 of 720 possible single-amino-acid variants (Kamath and Matreyek). | Supporting citation: Kamath, N.D., & Matreyek, K.A. (2025). Comprehensive mutational characterization of the calcium-sensing STIM1 EF-hand reveals residues essential for structure and function. Genetics, 231(2), iyaf146. Published October 2025. | moderate | Reference fact |
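
Several of the derived figures in the claim table (CLM-TPMT-015, -016, -019, -020, -041, -042) are simple arithmetic on values stated in the same table, so they can be re-derived directly. The snippet below is a sketch of such a recomputation using only the published numbers; the variable names are illustrative and this is not the verification pipeline itself.

```python
from statistics import median

BASELINE = 0.414      # ProteinGym ESM-2 650M baseline (CLM-TPMT-002)
TPMT_RHO = 0.5298     # global Spearman rho (CLM-TPMT-010)

# CLM-TPMT-019: relative gap between TPMT rho and the baseline.
tpmt_relative_delta = (TPMT_RHO - BASELINE) / BASELINE

# CLM-TPMT-041 / -042: median of the five benchmark assays and its delta.
assay_rho = {
    "Beta-lactamase": 0.7315,
    "BRCA1": 0.5147,
    "UBC9": 0.4726,
    "PTEN-Mighell": 0.5185,
    "Calmodulin": 0.2116,
}
benchmark_median = median(assay_rho.values())
median_relative_delta = (benchmark_median - BASELINE) / BASELINE

# CLM-TPMT-020: categorical agreement rate.
agree, disagree = 488, 3197
agreement_rate = agree / (agree + disagree)

# CLM-TPMT-015 / -016: offsets from the TPMT score distribution, in sigma.
MEAN, STD = -5.02, 3.82
z_l49s = (-7.68 - MEAN) / STD    # *5, below the mean
z_h227q = (-3.27 - MEAN) / STD   # *7, above the mean

print(round(tpmt_relative_delta, 3))          # 0.28
print(benchmark_median)                       # 0.5147
print(round(median_relative_delta * 100, 1))  # 24.3
print(round(agreement_rate * 100, 1))         # 13.2
print(round(z_l49s, 2), round(z_h227q, 2))    # -0.7 0.46
```

The two relative deltas differ because they use different numerators: 0.5298 for TPMT alone and 0.5147 for the five-assay median.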

Substantiated against the TPMT validation claim list (version 2fb2dbf); the verification result on record is at version 269b09b (both captured at build time; cross-checked source trail).