Verification Report
Verification Report: TPMT VAMP-seq Abundance / ESM-2 Cross-Reference
This report documents the independent verification of the article TPMT: ESM-2 / VAMP-seq Complementarity.
Validation Methodology
DMS source: Matreyek et al. 2018, Nature Genetics 50(6):874–882 (doi:10.1038/s41588-018-0122-z). The Matreyek paper introduced VAMP-seq — Variant Abundance by Massively Parallel Sequencing — and applied the assay to multiple proteins including TPMT, the human thiopurine S-methyltransferase whose deficiency mandates pre-treatment genotyping under CPIC guidelines for 6-mercaptopurine, azathioprine, and thioguanine therapy. EGFP-TPMT fusion was expressed in HEK293T landing-pad cells; cells were FACS-sorted into 4 fluorescence bins and deep-sequenced; Enrich2 scored 8 biological replicates against WT synonymous (normalized to 1.0) and median nonsense at positions 51–219 (normalized to 0.0). Scores are deposited at MaveDB as urn:mavedb:00000013-b-1 (CC0 license, N = 3,689 missense variants scored of 4,655 possible).
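The two-point normalization described above can be sketched as a linear rescale. This is a minimal illustration, not Enrich2's implementation: the function name `normalize_abundance` and the raw input values are hypothetical. Only the anchor points (WT synonymous = 1.0, nonsense median = 0.0), the 245-residue length, and the 3,689-variant count come from the text.

```python
def normalize_abundance(raw, wt_synonymous, nonsense_median):
    """Linearly rescale a raw score so the WT synonymous reference maps to 1.0
    and the nonsense median maps to 0.0 (assumed form of the normalization)."""
    return (raw - nonsense_median) / (wt_synonymous - nonsense_median)

# Hypothetical raw values, chosen only to exercise the function.
raw_wt, raw_nonsense = 2.0, -1.0
wt_norm = normalize_abundance(raw_wt, raw_wt, raw_nonsense)
nonsense_norm = normalize_abundance(raw_nonsense, raw_wt, raw_nonsense)

# Coverage arithmetic from the text: 245 positions x 19 alternative residues.
possible_missense = 245 * 19
coverage = 3689 / possible_missense

print(wt_norm, nonsense_norm, possible_missense, round(coverage, 3))
```

Under these assumptions the possible-missense count reproduces the 4,655 quoted above, and roughly 79% of possible missense variants carry a score.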
ESM-2 scoring: 650M-parameter masked-marginal scoring on the NeuroAutomata platform across all 245 positions of TPMT (UniProt P51580). ESM-2 score polarity matches VAMP-seq abundance natively (positive = stable/abundant, negative = unstable/degraded), so both metrics agree on direction and no inversion was applied. Cross-referencing variants between the ESM-2 landscape and the VAMP-seq scores yielded N = 3,685 matched variants after WT-identity filtering. Position stratification used SAM-binding active-site annotations to partition the matched variants into active-site (N = 14), non-active-site annotated (N = 150), and unannotated (N = 3,521) strata.
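A minimal sketch of the cross-referencing step follows. It assumes both score tables are keyed by a variant string such as `A80P`, and that the `score != 0.0` filter named in the claim table applies to the ESM-2 score. That last point is an assumption, not something the report states. The function name `cross_reference` and the `L243L` abundance value are hypothetical; the A80P values are taken from claim CLM-TPMT-012.

```python
def cross_reference(esm2_scores, vamp_scores):
    """Keep variants present in both tables whose ESM-2 score is not exactly 0.0.

    Assumption: a 0.0 ESM-2 score marks a WT-identity substitution.
    """
    matched = {}
    for variant, esm2 in esm2_scores.items():
        if esm2 != 0.0 and variant in vamp_scores:
            matched[variant] = (esm2, vamp_scores[variant])
    return matched

esm2_scores = {"A80P": -7.03, "L243L": 0.0}   # L243L entry is illustrative
vamp_scores = {"A80P": 0.318, "L243L": 1.0}   # 1.0 is a hypothetical value

matched = cross_reference(esm2_scores, vamp_scores)

# Count bookkeeping from the report: 3,689 scored variants minus 4 filtered.
matched_n = 3689 - 4
print(sorted(matched), matched_n)
```

Keying on the variant string keeps the join independent of row order in either table.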
Global Spearman rho = 0.530 (N = 3,685, p = 7.4 × 10⁻²⁶⁶); Pearson r = 0.535. This sits above the ProteinGym ESM-2 650M baseline of 0.414 across 217 assays by 0.116 (about 28% relative), a positive but modest gap consistent with VAMP-seq abundance selection on a stable single-domain enzyme. The companion NUDT15 benchmark (rho = 0.526), another pharmacogenomic enzyme in the same thiopurine metabolic pathway scored by VAMP-seq, sits within 0.004, which suggests that rho ≈ 0.53 is characteristic of thiopurine-pathway enzymes under VAMP-seq abundance selection.
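For readers who want to see what the rank correlation measures, here is a standard-library sketch of Spearman's rho as the Pearson correlation of average ranks. The input lists are toy values, not TPMT data, and the helper names are illustrative. The report does not say which implementation produced its figures.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy scores: one adjacent pair is swapped in rank order.
esm2 = [-7.0, -5.5, -4.3, -1.2, -0.4]
abundance = [0.30, 0.25, 0.60, 0.80, 0.95]
rho = spearman(esm2, abundance)

# Gap between the TPMT and NUDT15 benchmarks quoted above.
nudt15_gap = round(0.530 - 0.526, 3)
print(round(rho, 3), nudt15_gap)
```

Because only ranks enter the calculation, the two score scales never need to be put on common units.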
The regional breakdown locates the structural boundary. The SAM-binding active-site domain reaches Spearman rho = 0.222 at N = 14 (p = 0.45, not significant), consistent with the active-site boundary (small N plus specificity engineering) observed in PTEN, BRCA1, and prior bacterial enzyme benchmarks. Non-active-site annotated positions (rho = 0.524, N = 150) and unannotated positions (rho = 0.531, N = 3,521) recover the global signal cleanly. The article frames the active-site result as a small-N caveat rather than a discriminative finding: the underlying boundary is methodological, not biological.
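The non-significance of the active-site stratum can be illustrated with the common t-approximation for a correlation coefficient, t = rho · sqrt((N − 2) / (1 − rho²)) with N − 2 degrees of freedom. This is a sketch under that assumption. The report does not state how its p-values were computed, and the function names here are illustrative. The Student t density is integrated numerically so that only the standard library is needed.

```python
import math

def t_pdf(t, df):
    """Student t density, using lgamma to stay stable for large df."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef) * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(rho, n, steps=20000):
    """Two-sided p-value for a correlation via the t-approximation.

    Intended for moderate |rho| and small n; requires |rho| < 1 and even steps.
    """
    df = n - 2
    t = abs(rho) * math.sqrt(df / (1 - rho * rho))
    # Simpson's rule for the integral of the density over [0, t].
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3
    return 1 - 2 * central_mass

p_active_site = two_sided_p(0.222, 14)
print(round(p_active_site, 2))
```

With only 14 variants, a rho of 0.222 corresponds to a t statistic below 1, so the approximation lands close to the reported p = 0.45.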
Clinical pharmacogenomic alleles (TPMT*2 A80P, *3B A154T, *3C Y240C, *5 L49S, *7 H227Q, *8 R215H) were checked for whether ESM-2’s directional ranking agrees with VAMP-seq abundance. The article frames these as descriptive ESM-2 directional concordance against published clinical phenotype severity (*2 / *3B correctly flagged as low-abundance; *3C borderline; *5 / *7 activity loss not abundance-driven), not as ACMG/AMP variant classifications. The underlying record matches that conservative framing: clinical-allele scores are recorded as quantitative measurements on the VAMP-seq abundance scale with a descriptive concordance reading, not as classification predictions — authored that way from the start, the same approach taken on the earlier CYP2C9 report. The classification check found no mismatch between that record and the published prose.
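The descriptive reading above can be made concrete by placing each allele's ESM-2 score on the TPMT score distribution. The sketch below uses the per-allele values from claims CLM-TPMT-012 through CLM-TPMT-016 and the mean and standard deviation quoted in CLM-TPMT-015. It deliberately applies no classification threshold. The helper name `z_score` is illustrative.

```python
TPMT_MEAN, TPMT_STD = -5.02, 3.82  # distribution summary quoted in CLM-TPMT-015

# allele: (ESM-2 score, VAMP-seq abundance), as listed in the claim table
alleles = {
    "*2 A80P": (-7.03, 0.318),
    "*3B A154T": (-5.45, 0.286),
    "*3C Y240C": (-4.26, 0.600),
    "*5 L49S": (-7.68, 0.790),
    "*7 H227Q": (-3.27, 0.770),
}

def z_score(esm2_score):
    """Distance from the TPMT mean ESM-2 score, in standard deviations."""
    return (esm2_score - TPMT_MEAN) / TPMT_STD

rows = {name: (round(z_score(esm2), 2), abundance)
        for name, (esm2, abundance) in alleles.items()}

for name, (z, abundance) in rows.items():
    print(f"{name:10s} z = {z:+.2f}  abundance = {abundance:.3f}")
```

Read side by side, *5 and *7 keep abundance near 0.8 while their ESM-2 scores stay negative, which is the activity-without-abundance pattern the claim table describes.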
Summary
| Metric | Value |
|---|---|
| Claims | 33 total |
| Result | All checked; the page passed verification (5 numerically verified / 12 single-source measurements / 16 reference facts; 0 contradicted, 0 uncheckable) |
| Source data | TPMT validation dataset (version recorded below at build time) |
| Verified | 2026-05-18 |
| Verified by | Veritas |
Claim-by-Claim Verification
What was checked, and how each claim resolves against its primary source, is shown claim by claim below. The exact source-data version and verification-result version used at this page’s most recent build are shown in the page footer.
What Was Verified
Framing-fidelity check — all 33 claims. Every claim was checked against its primary source for whether the source actually entails it, whether the scope matches, and whether the prose framing is faithful to what the cited authors reported. Key sources: Matreyek et al. 2018 Nat Genet (TPMT VAMP-seq DMS), UniProt P51580 (TPMT canonical reference), MaveDB urn:mavedb:00000013-b-1 (deposited score set), PharmVar TPMT (clinical allele inventory), the CPIC TPMT / thiopurine guidelines (clinical-context framing), and the ProteinGym substitution DMS leaderboard (cross-model baseline, accessed 2026-05-08).
Numerical check. Of the 33 claims, 5 carried figures that were arithmetic-verified with explicit numeric-range and consistency checks. A further 12 are single-source DMS measurements — single values from one deposited record, with no independent second source available to cross-check them against by construction. We disclose these as exactly that: measured once, not independently corroborated. This is an honest, bounded category, distinct from a verified claim; it is not a coverage gap. The remaining 16 are reference facts (existence, methodological, comparative, literature-sourced, and causal statements) with no arithmetic in them to check — confirmed by the framing-fidelity review above.
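As an illustration of what numeric-range and consistency checks of this kind might look like, the sketch below tests a few figures quoted in this report. The check functions are hypothetical and are not the checks Veritas actually ran, which the report does not describe in detail.

```python
def check_rho(value):
    """A correlation coefficient must lie in [-1, 1]."""
    return -1.0 <= value <= 1.0

def check_partition(total, parts):
    """The parts of a partition must sum to the stated total."""
    return sum(parts) == total

# Claim categories as reported in the summary table.
tally = {"numerically verified": 5, "single-source": 12, "reference fact": 16}

results = {
    "claim tally sums to 33": check_partition(33, tally.values()),
    "global rho in range": check_rho(0.5298),
    "strata sum to matched N": check_partition(3685, [14, 150, 3521]),
    "agree + disagree + neutral = N": check_partition(3685, [488, 3197, 0]),
}

for name, passed in results.items():
    print(f"{'PASS' if passed else 'FAIL'}  {name}")
```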
Classification check — 0 classification claims. The TPMT clinical-allele claims were written from the start as quantitative measurements on the VAMP-seq abundance scale with a descriptive concordance reading — the same approach taken on the earlier CYP2C9 report — not as classification predictions. No mismatch was found between that record and the published prose. ESM-2 is a ranking tool, not a classifier; categorical thresholds do not generalize across proteins.
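A toy example shows why a ranking metric and a categorical comparison can diverge. The scores and both thresholds below are hypothetical and chosen only for illustration. They are not TPMT data, and the report does not define such cutoffs.

```python
def spearman_no_ties(x, y):
    """Spearman rho via 1 - 6*sum(d^2) / (n*(n^2 - 1)); valid only without ties."""
    n = len(x)
    rank_x = {v: r for r, v in enumerate(sorted(x), start=1)}
    rank_y = {v: r for r, v in enumerate(sorted(y), start=1)}
    d2 = sum((rank_x[a] - rank_y[b]) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Toy scores with identical rank order on two different scales.
esm2 = [-8.0, -6.0, -4.0, -2.0, -1.0]
abundance = [0.2, 0.4, 0.6, 0.8, 1.0]

# Hypothetical cutoffs: every negative ESM-2 score is called "deleterious",
# but only abundance below 0.5 is called "low".
esm2_deleterious = [s < 0.0 for s in esm2]
abundance_low = [a < 0.5 for a in abundance]

rho = spearman_no_ties(esm2, abundance)
agreement = sum(e == a for e, a in zip(esm2_deleterious, abundance_low)) / len(esm2)
print(rho, agreement)
```

The ranking agrees perfectly while the categorical calls agree on only two of five variants, because the outcome depends on where each cutoff is placed rather than on the ordering.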
One consistency layer not yet active. A further consistency check is planned but was not active for this report.
On retroactive verification. This is an older post, originally written before our current per-claim verification discipline existed; it was brought under verification retroactively — the claim list was written against the already-published article and the existing source-verification records, then independently verified. The source-data version and verification-result version used at this page’s most recent build are both shown in the footer, an evidence chain back to a specific, reproducible state.
Verification independence
Verification is performed by Veritas, an independent agent with no involvement in producing the content it checks.
| Claim ID | Claim | Source | Strength | Status |
|---|---|---|---|---|
| CLM-TPMT-001 | ESM-2 (Evolutionary Scale Model, 650M parameters, facebook/esm2_t33_650M_UR50D) is a protein language model trained on evolutionary sequence data. Masked-marginal scoring produces a per-variant log-likelihood that ranks mutations by predicted evolutionary plausibility. | Lin, Z. et al. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379(6637), 1123–1130. | strong | CLM-TPMT-001 Reference fact |
| CLM-TPMT-002 | The published ESM-2 650M baseline on ProteinGym substitution Spearman is ρ = 0.414 (live leaderboard CSV, rank 45 of 97 models). | ProteinGym zero-shot substitution DMS leaderboard, per-model Spearman summary CSV (accessed 2026-05-08). Notin et al. 2023 introduced the v1.0 217-assay substitution benchmark. | strong | CLM-TPMT-002 Single-source |
| CLM-TPMT-003 | ESM-2 650M is the peak of the ESM-2 size scaling on ProteinGym substitution Spearman: 650M = 0.414, 3B = 0.406, 15B = 0.400. The 3B and 15B variants perform below the 650M variant on this benchmark. | ProteinGym zero-shot substitution DMS leaderboard, per-model Spearman summary CSV — rows for ESM2 8M / 35M / 150M / 650M / 3B / 15B. Frozen-archive snapshot at `pinned leaderboard snapshot`. | strong | CLM-TPMT-003 Reference fact |
| CLM-TPMT-004 | VAMP-seq (Variant Abundance by Massively Parallel sequencing) measures intracellular protein abundance for thousands of variants in parallel using GFP fusions and fluorescent cell sorting. Introduced for PTEN (4,112 variants) and TPMT (3,689 variants) in Matreyek et al. 2018. | Matreyek, K.A., Starita, L.M., Stephany, J.J., et al. (2018). Multiplex assessment of protein variant abundance by massively parallel sequencing. Nature Genetics, 50(6), 874–882. | strong | CLM-TPMT-004 Reference fact |
| CLM-TPMT-005a | Matreyek et al. 2021's integration of PTEN abundance and activity DMS data yields a four-class taxonomy with the following composition: WT-like (51%), loss of abundance only (22%), loss of activity only (6%), and loss of both (21%). | Matreyek, K.A., Stephany, J.J., Ahler, E., & Fowler, D.M. (2021). Integrating thousands of PTEN variant activity and abundance measurements reveals variant subgroups and new dominant negatives in cancers. Genome Medicine, 13, 165. | strong | CLM-TPMT-005a Reference fact |
| CLM-TPMT-005b | Matreyek et al. 2021 reports 4,721 PTEN variant abundance measurements via VAMP-seq, extending Matreyek 2018's prior 4,112-missense baseline with additional missense, synonymous, and nonsense variants. | Matreyek, K.A., Stephany, J.J., Ahler, E., & Fowler, D.M. (2021). Integrating thousands of PTEN variant activity and abundance measurements reveals variant subgroups and new dominant negatives in cancers. Genome Medicine, 13, 165. | strong | CLM-TPMT-005b Single-source |
| CLM-TPMT-005c | Matreyek et al. 2021 reports 7,244 PTEN variant phosphatase activity measurements via yeast functional rescue (Mighell-pattern assay). | Matreyek, K.A., Stephany, J.J., Ahler, E., & Fowler, D.M. (2021). Integrating thousands of PTEN variant activity and abundance measurements reveals variant subgroups and new dominant negatives in cancers. Genome Medicine, 13, 165. | strong | CLM-TPMT-005c Single-source |
| CLM-TPMT-005d | Matreyek et al. 2021's 4-class taxonomy explicitly places C124S, R130G, R130Q, and G129E in the 'loss of activity only' subgroup: catalytically dead PTEN variants that retain near-wild-type abundance. | Matreyek, K.A., Stephany, J.J., Ahler, E., & Fowler, D.M. (2021). Integrating thousands of PTEN variant activity and abundance measurements reveals variant subgroups and new dominant negatives in cancers. Genome Medicine, 13, 165. | strong | CLM-TPMT-005d Reference fact |
| CLM-TPMT-006 | The Clinical Pharmacogenetics Implementation Consortium (CPIC) guideline for thiopurine dosing recommends pre-treatment TPMT and NUDT15 genotyping to adjust starting doses of azathioprine, mercaptopurine, and thioguanine. The 2025 update (Maillard et al. 2026) supersedes the 2018 update (Relling et al. 2019). | Maillard, M., Schwab, M., Whirl-Carrillo, M., et al. (2026). Clinical Pharmacogenetics Implementation Consortium (CPIC) Guideline for Thiopurine Dosing Based on TPMT and NUDT15 Genotypes: 2025 Update. Clinical Pharmacology & Therapeutics, 119(4), 916–927. | strong | CLM-TPMT-006 Reference fact |
| CLM-TPMT-010 | ESM-2 ranks 3,685 TPMT variants against Matreyek 2018 VAMP-seq abundance scores with global Spearman ρ = 0.5298 (p = 7.4 × 10⁻²⁶⁶, Pearson r = 0.5353). | Internal NeuroAutomata benchmark — TPMT × Matreyek 2018 VAMP-seq cross-reference | strong | CLM-TPMT-010 Verified |
| CLM-TPMT-011 | The internal benchmark scores 3,685 TPMT variants — 4 fewer than the 3,689-variant VAMP-seq set in Matreyek 2018. The 4-variant filter excludes WT-identity substitutions (Q47S, L243L, T244T, E245E) via the `score != 0.0` filter. | Internal NeuroAutomata benchmark — filter | moderate | CLM-TPMT-011 Reference fact |
| CLM-TPMT-012 | TPMT *2 (A80P): ESM-2 score = −7.03 (strongly deleterious); VAMP-seq abundance = 0.318 (severe loss-of-abundance). Both assays catch the variant. Mechanism: enhanced proteasomal degradation (Tai et al. 1997). Most common decreased-function allele in Caucasians. | Internal NeuroAutomata benchmark — TPMT clinical-allele cross-reference. ESM-2 score from mutation-score table row A80 column P; VAMP-seq abundance from position 80, alt P. | strong | CLM-TPMT-012 Single-source |
| CLM-TPMT-013 | TPMT *3B (A154T, alone): ESM-2 score = −5.45 (deleterious); VAMP-seq abundance = 0.286 (severe loss-of-abundance). Both assays catch the variant. Also part of the TPMT *3A compound (A154T+Y240C), the most common deficiency allele in Caucasians and Southwest Asians. | Internal NeuroAutomata benchmark — TPMT clinical-allele cross-reference. ESM-2 score from mutation-score table row A154 column T; VAMP-seq abundance from position 154, alt T. | strong | CLM-TPMT-013 Single-source |
| CLM-TPMT-014 | TPMT *3C (Y240C): ESM-2 score = −4.26 (deleterious); VAMP-seq abundance = 0.600 (moderate loss). Both assays catch the variant. Most common decreased-function allele in East Asians and African-Americans. | Internal NeuroAutomata benchmark — TPMT clinical-allele cross-reference. ESM-2 score from mutation-score table row Y240 column C; VAMP-seq abundance from position 240, alt C. | strong | CLM-TPMT-014 Single-source |
| CLM-TPMT-015 | TPMT *5 (L49S): ESM-2 score = −7.68 (strongly deleterious — at the stronger end of the TPMT distribution where mean = −5.02, std = 3.82); VAMP-seq abundance = 0.790 (WT-like). ESM-2 catches the variant; VAMP-seq misses it. Mechanism: activity-loss without abundance-loss — the canonical activity-without-abundance pattern. | Internal NeuroAutomata benchmark — TPMT clinical-allele cross-reference. ESM-2 score from mutation-score table row L49 column S; VAMP-seq abundance from position 49, alt S. | strong | CLM-TPMT-015 Single-source |
| CLM-TPMT-016 | TPMT *7 (H227Q): ESM-2 score = −3.27 (moderately deleterious — above the TPMT distribution mean of −5.02 by ~0.5σ, less deleterious than the population mean but still clearly negative); VAMP-seq abundance = 0.770 (WT-like). ESM-2 catches the variant; VAMP-seq misses it. Same activity-without-abundance pattern as *5. | Internal NeuroAutomata benchmark — TPMT clinical-allele cross-reference. ESM-2 score from mutation-score table row H227 column Q; VAMP-seq abundance from position 227, alt Q. | moderate | CLM-TPMT-016 Single-source |
| CLM-TPMT-017 | ESM-2 ranks all five canonical TPMT clinical alleles (*2, *3B, *3C, *5, *7) as deleterious from sequence alone. VAMP-seq abundance catches the proteolytic-degradation subset (*2, *3B, *3C); the activity-loss subset (*5, *7) is the VAMP-seq blind spot. ESM-2 fills the gap via evolutionary-conservation signal — covering the activity-without-abundance subset — the canonical pharmacogenomic gap. | Synthesis claim (per-allele dataset paths inherit). | strong | CLM-TPMT-017 Reference fact |
| CLM-TPMT-018 | Position-stratified TPMT correlation: active-site (n=14, ρ=0.222, p=0.45, not statistically significant); non-active-site (n=150, ρ=0.5244, p=5.7×10⁻¹²); unannotated positions (n=3,521, ρ=0.5313, p=8.1×10⁻²⁵⁶). The active-site stratum's small sample size and non-significance preclude strong claims at that stratum; the non-active-site and unannotated strata both show ρ close to the global value. | Internal NeuroAutomata benchmark — JSON field `position_type_stratification.{active_site,non_active_site,unknown}` | strong | CLM-TPMT-018 Verified |
| CLM-TPMT-019 | TPMT global ρ = 0.5298 is approximately 28% above the published ESM-2 650M ProteinGym baseline of 0.414 (delta = +0.116, or +28% relative / +11.6 percentage points absolute). | Derived from the TPMT ρ = 0.5298 and the published baseline 0.414. Computation: (0.5298 − 0.414) / 0.414 = 0.280. | strong | CLM-TPMT-019 Reference fact |
| CLM-TPMT-020 | TPMT categorical agreement between ESM-2 ranking and VAMP-seq abundance is 13.2% globally excluding neutral (488 agree / 3,197 disagree / 0 neutral). The low categorical agreement combined with ρ = 0.5298 reflects that ESM-2 and VAMP-seq agree on ranking while disagreeing on per-variant categorical class — exactly the orthogonal-not-redundant property that makes the two methods complementary. | Internal NeuroAutomata benchmark — JSON field `categorical_agreement` | strong | CLM-TPMT-020 Single-source |
| CLM-TPMT-030 | ESM-2 ranks 4,112 PTEN variants against Matreyek 2018 VAMP-seq abundance with global Spearman ρ = 0.4836 (N = 4,112, p = 4.6 × 10⁻²⁴⁰). Position stratification: non-active-site ρ = 0.5357 (N = 205); active-site ρ = −0.0113 (N = 84, not significant). | Internal NeuroAutomata benchmark — JSON fields `global_correlation` and `position_type_stratification` | strong | CLM-TPMT-030 Verified |
| CLM-TPMT-031 | The PTEN active-site / non-active-site contrast (ρ = −0.011 vs ρ = 0.536) reflects the same biological mechanism the Matreyek 2021 4-class taxonomy makes explicit: 6% of PTEN variants are 'loss of activity only' (catalytically dead but structurally stable). ESM-2 catches them via evolutionary-conservation signal; VAMP-seq's abundance metric does not. | Synthesis: Matreyek 2021 4-class taxonomy (CLM-TPMT-005a) combined with PTEN stratified correlation (CLM-TPMT-030). | strong | CLM-TPMT-031 Reference fact |
| CLM-TPMT-032 | Catalytic-dead-but-stable PTEN cancer variants are explicitly classified 'loss of activity only' by Matreyek 2021: C124S (VAMP-seq abundance 1.14), R130G (1.09), G129E (0.76), R130Q (WT-like per Part 3 coverage). Each retains near-WT abundance while abolishing catalytic function. | Internal NeuroAutomata benchmark — abundance scores from pten-matreyek/ positions 124 (C→S, score 1.1375), 130 (R→G, score 1.0852), 129 (G→E, score 0.7553). Taxonomy class from Matreyek 2021 (DOI 10.1186/s13073-021-00984-x). | strong | CLM-TPMT-032 Single-source |
| CLM-TPMT-033 | Akt1 T308 phosphorylation correlates with tumor incidence at Pearson r = 0.76 across Matreyek 2021's PTEN dominant-negative validation set. Novel dominant negatives discovered through the abundance+activity integration include R130P (observed 8 times across breast/uterine/esophageal tumors) and D92H (3 breast cancer occurrences). | Matreyek, K.A., Stephany, J.J., Ahler, E., & Fowler, D.M. (2021). Integrating thousands of PTEN variant activity and abundance measurements reveals variant subgroups and new dominant negatives in cancers. Genome Medicine, 13, 165. | strong | CLM-TPMT-033 Single-source |
| CLM-TPMT-040 | Calmodulin (CALM1_HUMAN_Weile_2017 binding-affinity assay) achieves ρ = 0.212 (N = 1,813) against ESM-2 ranking, flagged `failed_threshold: true` in the 5-assay benchmark. Calmodulin's category is `Binding`, distinct from the single-protein-stability and organismal-fitness assays in the benchmark. ESM-2 performance on protein-protein binding assays is the canonical weakness category. | Internal NeuroAutomata benchmark — JSON `assays[CALM1_HUMAN_Weile_2017]` | strong | CLM-TPMT-040 Verified |
| CLM-TPMT-041 | The 5-assay benchmark (median ρ = 0.515) spans single-protein-stability (Beta-lactamase 0.7315, BRCA1 0.5147, UBC9 0.4726), organismal-fitness (PTEN-Mighell 0.5185 yeast-growth assay), and binding-affinity (Calmodulin 0.2116, `failed_threshold: true`) assays. TPMT ρ = 0.5298 sits approximately at the benchmark median. | Internal NeuroAutomata benchmark — JSON `summary.median_rho: 0.515` and `assays[]` array | strong | CLM-TPMT-041 Verified |
| CLM-TPMT-042 | The benchmark median delta vs published baseline is +24.3% (median 0.515 vs baseline 0.414). Note this is a different metric than TPMT-specific delta (+28%): the 24.3% applies to the 5-assay median; the 28% applies to TPMT ρ = 0.5298 specifically. | Internal NeuroAutomata benchmark — JSON `summary.delta_vs_baseline_pct: 24.3`. Derived comparison: (0.5147 − 0.414) / 0.414 = 0.243. | strong | CLM-TPMT-042 Reference fact |
| CLM-TPMT-043 | ESM-2 cannot reliably detect dominant-negative variants where the variant retains WT-like abundance AND retains some catalytic activity but disrupts complex function via partner-binding or oligomerization. Example: PTEN P38S (VAMP-seq abundance 1.14, melanoma-enriched at 10.4% of missense variants) shows WT-like abundance to VAMP-seq AND ranks unremarkably by ESM-2 — a true blind spot for both assays. | Internal NeuroAutomata benchmark — JSON record position 38, alt S (P38S abundance 1.1361 + notes 'ESM-2 cannot detect dominant-negative mechanism') | moderate | CLM-TPMT-043 Single-source |
| CLM-TPMT-050 | TPMT's abundance-driven decreased-function mechanism (proteolytic degradation of *2, *3B, *3C variants) generalizes to receptor-class and ion-channel-class proteins where stability is the rate-limiting step for function. Matreyek-lab DMS datasets for INSR, LDLR, KCNE1, and STIM1 (each cited individually in CLM-TPMT-051 through -054) are adjacent bench-DMS surfaces for the same ESM-2 cross-reference approach. | Synthesis claim spanning four per-paper primary citations. | moderate | CLM-TPMT-050 Reference fact |
| CLM-TPMT-051 | The Matreyek lab contributed to deep mutational scanning of the human insulin receptor (INSR) ectodomain in 2025, scoring ~14,000 variants. | Aslanzadeh, V., Brierley, G.V., Kumar, R., et al. (2025). Deep mutational scanning of the human insulin receptor ectodomain to inform precision therapy for insulin resistance. Nature Communications, 16, 9143. Matreyek listed as co-author. | moderate | CLM-TPMT-051 Reference fact |
| CLM-TPMT-052 | The Matreyek lab contributed to a 2026 Science paper on the functional landscape of coding variation in LDLR (low-density lipoprotein receptor), a familial-hypercholesterolemia gene. | Tabet, D.R., Coté, A.G., Lancaster, M.C., et al. (including Matreyek, K.A., Fowler, D.M., Roth, F.P., senior) (2026). The functional landscape of coding variation in the familial hypercholesterolemia gene LDLR. Science, 391(6787). Published 2026-02-19. | moderate | CLM-TPMT-052 Reference fact |
| CLM-TPMT-053 | The Matreyek lab contributed to high-throughput functional mapping of KCNE1 (arrhythmia ion-channel gene) variants in 2024, scoring 2,554 variants for cell-surface expression (2,534 for function). | Muhammad, A., Calandranis, M.E., Li, B., et al. (including Matreyek, K.A., Fowler, D.M., Roden, D.M., Glazer, A.M., senior) (2024). High-throughput functional mapping of variants in an arrhythmia gene, KCNE1, reveals novel biology. Genome Medicine, 16, 73. Published 2024-05-30. | moderate | CLM-TPMT-053 Reference fact |
| CLM-TPMT-054 | The Matreyek lab published comprehensive mutational characterization of the calcium-sensing STIM1 EF-hand in 2025, scoring 706 of 720 possible single-amino-acid variants (Kamath and Matreyek). | Kamath, N.D., & Matreyek, K.A. (2025). Comprehensive mutational characterization of the calcium-sensing STIM1 EF-hand reveals residues essential for structure and function. Genetics, 231(2), iyaf146. Published October 2025. | moderate | CLM-TPMT-054 Reference fact |
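The derived figures in claims CLM-TPMT-019, CLM-TPMT-020, CLM-TPMT-041, and CLM-TPMT-042 can be recomputed from the values quoted in the table. The sketch below does this with the standard library. The helper name `relative_delta_pct` is illustrative, and the inputs are copied from the claim rows rather than read from the underlying JSON.

```python
import statistics

BASELINE = 0.414   # ProteinGym ESM-2 650M baseline (CLM-TPMT-002)
TPMT_RHO = 0.5298  # global TPMT Spearman rho (CLM-TPMT-010)

# Per-assay rho values listed in CLM-TPMT-041.
assay_rhos = {
    "Beta-lactamase": 0.7315,
    "BRCA1": 0.5147,
    "UBC9": 0.4726,
    "PTEN-Mighell": 0.5185,
    "Calmodulin": 0.2116,
}

def relative_delta_pct(value, reference):
    """Relative difference of value over reference, in percent."""
    return (value - reference) / reference * 100

tpmt_delta = relative_delta_pct(TPMT_RHO, BASELINE)
median_rho = statistics.median(assay_rhos.values())
median_delta = relative_delta_pct(median_rho, BASELINE)

# Categorical agreement from CLM-TPMT-020: 488 agree, 3,197 disagree.
agreement_pct = 488 / (488 + 3197) * 100

print(round(tpmt_delta, 1), median_rho, round(median_delta, 1),
      round(agreement_pct, 1))
```

The recomputed values match the table: about +28% for TPMT, a median of 0.5147 (reported as 0.515) with a +24.3% delta, and 13.2% categorical agreement.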
Substantiated against the TPMT validation claim list (version 2fb2dbf); the verification result on record, at version 269b09b (both captured at build time; cross-checked source trail).