ACMG PP3/BP4 — Glossary

The American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) jointly defined a framework for classifying genetic variants into five tiers: pathogenic, likely pathogenic, uncertain significance (VUS), likely benign, and benign. Within this framework, PP3 and BP4 are the two criteria that allow computational (in silico) predictions to contribute evidence.

How PP3 and BP4 Work

PP3 (Pathogenic, Supporting): Multiple lines of computational evidence support a deleterious effect on the gene or gene product. Tools used for this purpose include conservation-based predictors (SIFT), predictors that combine sequence and structure features (PolyPhen-2), genome-wide and ensemble scores (CADD, REVEL), and protein language models (ESM-2, AlphaMissense).

BP4 (Benign, Supporting): Multiple lines of computational evidence suggest no impact on the gene or gene product.

Both criteria contribute “supporting” level evidence, the weakest evidence strength in the framework, so neither can reclassify a VUS on its own. They must be combined with other evidence types (functional data, population frequency, co-segregation) to shift a variant’s classification.
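
The "multiple lines of computational evidence" wording is often read as a concordance requirement: the criterion applies only when the tools consulted agree, and discordant predictions yield no in silico evidence. The following is a minimal Python sketch of that reading. The function name, the categorical call labels, and the `min_tools` parameter are hypothetical, and a real laboratory workflow would define its own rules and thresholds.

```python
from typing import Dict, Literal, Optional

# Hypothetical categorical call per tool, after the lab has applied its own thresholds.
Call = Literal["deleterious", "benign", "uncertain"]


def in_silico_criterion(calls: Dict[str, Call], min_tools: int = 2) -> Optional[str]:
    """Return "PP3", "BP4", or None from per-tool categorical calls.

    Assumption: strict concordance. Any disagreement, or any "uncertain"
    call, means neither criterion is applied.
    """
    values = list(calls.values())
    if len(values) < min_tools:
        return None  # not enough independent lines of evidence
    if all(c == "deleterious" for c in values):
        return "PP3"
    if all(c == "benign" for c in values):
        return "BP4"
    return None  # discordant or uncertain predictions


concordant = in_silico_criterion({"SIFT": "deleterious", "REVEL": "deleterious"})
discordant = in_silico_criterion({"SIFT": "deleterious", "REVEL": "benign"})
```

Here `concordant` is "PP3" and `discordant` is None. Either way, the result is a single supporting-level criterion that still has to be combined with other evidence types.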

Computational Tools Used for PP3/BP4

Tool	Method	Year
SIFT	Sequence conservation	2003
PolyPhen-2	Sequence + structure features	2010
CADD	Genome-wide scoring	2014
REVEL	Ensemble meta-predictor	2016
AlphaMissense	Protein language model (DeepMind)	2023
ESM-2	Protein language model (Meta AI)	2023

All of these tools are designated Research Use Only (RUO). Clinical laboratories validate and incorporate them under their own Laboratory Developed Test (LDT) workflows.

Why It Matters for ESM-2

ESM-2 masked marginal scores provide PP3/BP4-level computational evidence for missense variants. The scores are not clinical diagnoses; each is one input into a multi-criteria classification decision. Every other computational predictor in the field occupies the same regulatory position.
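
As an illustration of what a masked marginal score is, the sketch below assumes the common formulation: the variant position is masked, the model predicts a probability for each amino acid at that position, and the score is the log-probability of the mutant residue minus that of the wild-type residue. The probabilities are made-up toy values rather than ESM-2 output, and the function name is hypothetical.

```python
import math
from typing import Dict


def masked_marginal_score(probs_at_masked_pos: Dict[str, float], wt: str, mt: str) -> float:
    """log p(mutant) - log p(wild type) at a single masked position."""
    return math.log(probs_at_masked_pos[mt]) - math.log(probs_at_masked_pos[wt])


# Toy distribution over a few residues at one masked position
# (illustrative numbers only, not produced by any model).
toy_probs = {"L": 0.60, "I": 0.25, "V": 0.10, "P": 0.05}

# Leucine -> proline: the model considers proline far less likely here.
score = masked_marginal_score(toy_probs, wt="L", mt="P")
```

A negative score means the model finds the mutant residue less plausible than the wild type in that sequence context. Turning such a score into PP3 or BP4 requires thresholds that a laboratory has calibrated and validated itself, which this sketch does not attempt.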