ACMG PP3/BP4
Also known as: PP3, BP4, ACMG computational evidence, PP3/BP4
ACMG criteria that allow computational variant effect predictions to count as pathogenic (PP3) or benign (BP4) evidence in clinical variant classification, per the Richards et al. 2015 guidelines.
Source: Richards S et al. 'Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the ACMG and AMP.' Genet Med 2015;17(5):405-424. https://doi.org/10.1038/gim.2015.30
The American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) jointly defined a framework for classifying genetic variants into five tiers: pathogenic, likely pathogenic, uncertain significance (VUS), likely benign, and benign. Within this framework, PP3 and BP4 are the two criteria that allow computational (in silico) predictions to contribute evidence.
How PP3 and BP4 Work
PP3 (Pathogenic, Supporting): Multiple lines of computational evidence support a deleterious effect on the gene or gene product. In practice, the predictors used include conservation- and feature-based tools (SIFT, PolyPhen-2), ensemble and genome-wide integrative scores (REVEL, CADD), and protein language models (ESM-2, AlphaMissense).
BP4 (Benign, Supporting): Multiple lines of computational evidence suggest no impact on the gene or gene product.
Under the 2015 guidelines, both criteria contribute only “supporting”-level evidence, so neither can reclassify a VUS on its own. They must be combined with other evidence types (functional data, population frequency, co-segregation) to shift a variant’s classification.
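The effect of that "supporting" weight can be illustrated with a minimal sketch of the evidence-combining rules. The thresholds below are transcribed from the combining table in Richards et al. 2015 and should be checked against the paper before any real use; the `Evidence` class, the `classify` function, and the simplified handling of conflicting evidence are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Counts of met criteria at each strength level (hypothetical container)."""
    pvs: int = 0  # pathogenic, very strong
    ps: int = 0   # pathogenic, strong
    pm: int = 0   # pathogenic, moderate
    pp: int = 0   # pathogenic, supporting (PP3 is counted here)
    ba: int = 0   # benign, stand-alone
    bs: int = 0   # benign, strong
    bp: int = 0   # benign, supporting (BP4 is counted here)


def classify(e: Evidence) -> str:
    """Apply a simplified version of the 2015 combining rules."""
    pathogenic = (
        (e.pvs >= 1 and (e.ps >= 1 or e.pm >= 2
                         or (e.pm >= 1 and e.pp >= 1) or e.pp >= 2))
        or e.ps >= 2
        or (e.ps >= 1 and (e.pm >= 3
                           or (e.pm >= 2 and e.pp >= 2)
                           or (e.pm >= 1 and e.pp >= 4)))
    )
    likely_pathogenic = (
        (e.pvs >= 1 and e.pm >= 1)
        or (e.ps >= 1 and e.pm >= 1)
        or (e.ps >= 1 and e.pp >= 2)
        or e.pm >= 3
        or (e.pm >= 2 and e.pp >= 2)
        or (e.pm >= 1 and e.pp >= 4)
    )
    benign = e.ba >= 1 or e.bs >= 2
    likely_benign = (e.bs >= 1 and e.bp >= 1) or e.bp >= 2

    # Simplification: treat evidence as contradictory only when a
    # pathogenic-side tier and a benign-side tier are both reached.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "Uncertain significance"
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely benign"
    return "Uncertain significance"


pp3_alone = classify(Evidence(pp=1))             # PP3 with nothing else
pp3_with_more = classify(Evidence(pm=2, pp=2))   # two moderate, PP3 plus one more supporting
bp4_alone = classify(Evidence(bp=1))             # BP4 with nothing else
```

In this sketch, a variant with PP3 or BP4 as its only met criterion stays at uncertain significance; the classification moves only once other criteria are counted alongside it.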
Computational Tools Used for PP3/BP4
| Tool | Method | Year |
|---|---|---|
| SIFT | Sequence conservation | 2003 |
| PolyPhen-2 | Sequence + structure features | 2010 |
| REVEL | Ensemble meta-predictor | 2016 |
| CADD | Genome-wide scoring | 2014 |
| AlphaMissense | Protein language model (DeepMind) | 2023 |
| ESM-2 | Protein language model (Meta AI) | 2023 |
All of these tools are designated Research Use Only (RUO). Clinical laboratories validate and incorporate them under their own Laboratory Developed Test (LDT) workflows.
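Because PP3 and BP4 both call for multiple lines of computational evidence, a laboratory workflow might require concordance across predictors before applying either criterion. The following is a rough sketch of that idea, not a validated procedure: the predictor names, the cutoff values, and the rule that at least two tools must agree are all hypothetical placeholders that a laboratory would replace with its own validated choices. Scores are assumed to be oriented so that higher means more damaging.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical cutoffs per predictor: (benign_at_or_below, deleterious_at_or_above).
# Placeholder numbers only; these are NOT published or validated thresholds.
CUTOFFS = {
    "predictor_a": (0.2, 0.8),
    "predictor_b": (0.3, 0.7),
}


def call_tool(tool: str, score: float) -> str:
    """Turn one predictor's score into a categorical call."""
    benign_max, deleterious_min = CUTOFFS[tool]
    if score >= deleterious_min:
        return "deleterious"
    if score <= benign_max:
        return "benign"
    return "indeterminate"


def computational_criterion(scores: dict[str, float]) -> Optional[str]:
    """Return "PP3", "BP4", or None when the tools disagree or are too few."""
    if len(scores) < 2:
        return None  # a single tool is not "multiple lines" of evidence
    calls = {call_tool(tool, score) for tool, score in scores.items()}
    if calls == {"deleterious"}:
        return "PP3"
    if calls == {"benign"}:
        return "BP4"
    return None  # discordant or indeterminate: apply neither criterion


concordant = computational_criterion({"predictor_a": 0.9, "predictor_b": 0.75})
discordant = computational_criterion({"predictor_a": 0.9, "predictor_b": 0.2})
```

Returning `None` for discordant or indeterminate results keeps the computational evidence out of the classification entirely rather than forcing a call.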
Why It Matters for ESM-2
ESM-2 masked marginal scores provide PP3/BP4-level computational evidence for missense variants. The scores are not clinical diagnoses; they are one input into a multi-criteria classification decision. This is the same regulatory posture that applies to the other computational predictors listed above.
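This entry does not define the masked marginal score, so the sketch below assumes one commonly used definition: mask the variant position, read the model's predicted amino acid distribution at that site, and take the log-probability of the mutant residue minus that of the wild-type residue. No real model is called here; `toy_log_probs` stands in for model output, and its values are invented for illustration.

```python
import math


def masked_marginal_score(log_probs_at_site: dict, wt: str, mut: str) -> float:
    """Log-likelihood ratio of mutant vs. wild-type residue at a masked site.

    Negative values mean the model finds the mutant residue less likely
    than the wild-type residue in this sequence context.
    """
    return log_probs_at_site[mut] - log_probs_at_site[wt]


# Toy stand-in for a model's log-probabilities at one masked position.
# Only three residues are shown; the numbers are made up.
toy_log_probs = {
    "A": math.log(0.70),  # wild-type residue, strongly favored
    "V": math.log(0.20),  # conservative substitution
    "P": math.log(0.01),  # disfavored substitution
}

score_a_to_v = masked_marginal_score(toy_log_probs, wt="A", mut="V")
score_a_to_p = masked_marginal_score(toy_log_probs, wt="A", mut="P")
```

In this toy case the A-to-P substitution scores lower than A-to-V. Converting such a score into PP3 or BP4 would still require cutoffs that a laboratory has validated for its own workflow.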