ACMG PP3/BP4

Also known as: PP3, BP4, ACMG computational evidence, PP3/BP4

ACMG/AMP criteria that allow computational variant effect predictions to count as supporting evidence for pathogenicity (PP3) or for a benign effect (BP4) in clinical variant classification, per the Richards et al. (2015) guidelines.

Source: Richards S et al. 'Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the ACMG and AMP.' Genet Med 2015;17(5):405-424. https://doi.org/10.1038/gim.2015.30

The American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) jointly defined a framework for classifying genetic variants into five tiers: pathogenic, likely pathogenic, uncertain significance (VUS), likely benign, and benign. Within this framework, PP3 and BP4 are the two criteria that allow computational (in silico) predictions to contribute evidence.

How PP3 and BP4 Work

PP3 (Pathogenic, Supporting): Multiple lines of computational evidence support a deleterious effect on the gene or gene product. Tools applied under this criterion today include sequence- and structure-based predictors (SIFT, PolyPhen-2), ensemble and genome-wide scores (REVEL, CADD), and protein language models (ESM-2, AlphaMissense).

BP4 (Benign, Supporting): Multiple lines of computational evidence suggest no impact on the gene or gene product.

Both criteria contribute evidence at the “supporting” level, the weakest strength in the framework, so neither can reclassify a VUS on its own. They must combine with other evidence types (functional data, population frequency, co-segregation) to shift a variant’s classification.
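To make the "supporting only" point concrete, here is a simplified sketch of the combining rules as summarized from Richards et al. (2015). The function names and the prefix-based strength lookup are illustrative assumptions, strength-modified codes are not handled, and the published guideline remains the authoritative statement of the rules.

```python
# Simplified sketch of the ACMG/AMP combining rules (Richards et al. 2015).
# Helper names are hypothetical; evidence strength is inferred from the code
# prefix (PVS, PS, PM, PP, BA, BS, BP), so strength-modified codes are rejected.

def count_by_strength(codes):
    counts = {"PVS": 0, "PS": 0, "PM": 0, "PP": 0, "BA": 0, "BS": 0, "BP": 0}
    for code in codes:
        prefix = code.rstrip("0123456789")  # "PP3" -> "PP", "PVS1" -> "PVS"
        if prefix not in counts:
            raise ValueError(f"unknown evidence code: {code}")
        counts[prefix] += 1
    return counts


def classify(codes):
    c = count_by_strength(codes)
    pvs, ps, pm, pp = c["PVS"], c["PS"], c["PM"], c["PP"]
    ba, bs, bp = c["BA"], c["BS"], c["BP"]

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs >= 1 and bp >= 1) or bp >= 2

    path_call = "pathogenic" if pathogenic else (
        "likely pathogenic" if likely_pathogenic else None)
    benign_call = "benign" if benign else (
        "likely benign" if likely_benign else None)

    # Contradictory or insufficient evidence leaves the variant a VUS.
    if path_call and benign_call:
        return "uncertain significance"
    return path_call or benign_call or "uncertain significance"


print(classify(["PP3"]))                       # a lone supporting criterion
print(classify(["PM1", "PM2", "PP1", "PP3"]))  # PP3 combined with other evidence
```

Under these rules a lone PP3 or BP4 leaves the variant at uncertain significance, while PP3 alongside two moderate criteria and one more supporting criterion reaches likely pathogenic.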

Computational Tools Used for PP3/BP4

| Tool | Method | Year |
| --- | --- | --- |
| SIFT | Sequence conservation | 2003 |
| PolyPhen-2 | Sequence + structure features | 2010 |
| REVEL | Ensemble meta-predictor | 2016 |
| CADD | Genome-wide scoring | 2014 |
| AlphaMissense | Protein language model (DeepMind) | 2023 |
| ESM-2 | Protein language model (Meta AI) | 2023 |

All of these tools are designated Research Use Only (RUO). Clinical laboratories validate and incorporate them under their own Laboratory Developed Test (LDT) workflows.
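As an illustration of how a laboratory workflow might turn tool outputs into PP3 or BP4, the sketch below maps a score to an evidence code and then requires agreement across tools. Everything in it is an assumption made for illustration: the function names, the cutoffs 0.7 and 0.3 (placeholders, not validated thresholds), the convention that higher scores mean more deleterious (which does not hold for every tool), and the rule that at least two informative tools must agree.

```python
# Hypothetical sketch: mapping predictor scores to PP3 / BP4.
# Cutoffs are placeholders; a laboratory would use its own validated values.

def evidence_from_score(score, pathogenic_cutoff, benign_cutoff):
    """Return "PP3", "BP4", or None for a score where higher = more deleterious."""
    if benign_cutoff >= pathogenic_cutoff:
        raise ValueError("benign_cutoff must be below pathogenic_cutoff")
    if score >= pathogenic_cutoff:
        return "PP3"
    if score <= benign_cutoff:
        return "BP4"
    return None  # indeterminate zone: no computational evidence applied


def concordant_evidence(calls, min_tools=2):
    """Apply a code only if enough tools are informative and all of them agree."""
    informative = [call for call in calls if call is not None]
    if len(informative) >= min_tools and len(set(informative)) == 1:
        return informative[0]
    return None


# Illustrative scores from three hypothetical tools on the same variant.
calls = [evidence_from_score(s, 0.7, 0.3) for s in (0.91, 0.85, 0.55)]
print(calls, concordant_evidence(calls))
```

Requiring concordance reflects the "multiple lines of computational evidence" wording of both criteria; when tools disagree, this sketch applies neither code.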

Why It Matters for ESM-2

ESM-2 masked marginal scores provide PP3/BP4-level computational evidence for missense variants. The scores are not clinical diagnoses; they are one input into a multi-criteria classification decision. This is the same regulatory posture as every other computational predictor in the field.
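For a single missense variant, a masked marginal score is commonly computed by masking the variant position, reading the model's predicted amino-acid probabilities there, and taking the log-ratio of the mutant to the wild-type residue. The sketch below shows only that final arithmetic. The function name is hypothetical, the model call is omitted, and the probabilities are made-up numbers rather than real ESM-2 output.

```python
import math

# Hypothetical sketch of the masked marginal log-ratio for one substitution.
# `probs_at_masked_pos` stands in for a language model's predicted amino-acid
# distribution at the masked variant position; the values below are invented.

def masked_marginal_score(probs_at_masked_pos, wt_aa, mt_aa):
    """log p(mutant) - log p(wild type) at the masked position."""
    return math.log(probs_at_masked_pos[mt_aa]) - math.log(probs_at_masked_pos[wt_aa])


probs = {"A": 0.50, "V": 0.25, "G": 0.15, "P": 0.10}  # illustrative only
score = masked_marginal_score(probs, wt_aa="A", mt_aa="V")
print(round(score, 3))
```

A negative score means the model finds the mutant residue less likely than the wild type at that position. How negative a score must be before it counts toward PP3 is a calibration question that this sketch does not address.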