An Independent Check Caught a Claim We'd Overstated
An independent check corrected one of our own live ESM-2 claims about disordered regions — and overturned an internal assessment too. Why we don't self-certify.
Before building a per-position confidence view for our variant scorer, we pre-registered a pass/fail bar: witnessed, no escape clause. It failed. We didn't ship it.
ESM-2 ranks germline TP53 DNA-binding-domain variants reliably (Spearman rho 0.46–0.68) but carries no usable signal in the disordered regions (0 to −0.26).
Across 13,927 INSR ectodomain variants, ESM-2 ranks at ρ=0.594 on the L1 leucine-rich repeat and ρ=−0.088 at the αCT helix: same protein, same model, same assay. The contrast is structural.
Across the 5 canonical TPMT clinical alleles, ESM-2 ranks every one deleterious from sequence alone. VAMP-seq catches *2, *3B, *3C. ESM-2 covers *5 and *7.
ESM-2 vs 5,949 BRCA1 and PTEN variants, validated against SGE and VAMP-seq. Where it works, where it's blind, and why the combination matters.
ESM-2 predictions vs 6,142 CYP2C9 variants from the largest pharmacogenomic DMS dataset. What we found, including where it fails.
We built NeuroAutomata to make ESM-2 protein variant scoring accessible without setup. Validation results, including the one protein where it failed.
How I built a multi-agent system for natural-language queries over Human Protein Atlas data, from naive RAG to an AI verification architecture
Detailed validation methodology, reproducibility protocols, and AI agent architecture for the HPA natural language query system