Boundary Condition — Glossary

A boundary condition is a documented threshold, defined by a specific protein context or mutation type, at which ESM-2’s prediction accuracy shifts from strong to weak. Boundary conditions are engineering specifications rather than mere limitations: knowing where the model works and where it does not lets engineers and researchers plan appropriate verification steps.

The 14 Boundary Conditions

Across 25 protein analyses in the ESM-2 Benchmark Series, we’ve identified 14 specific boundary conditions. Five key examples, listed under their original condition numbers:

#	Condition	ESM-2 Performance	Notes
1	Non-active-site stability	Strong (rho 0.67–0.89)	Core use case
2	Active-site specificity engineering	Weak (rho 0.0–0.29)	Evolvable residues
9	Intrinsically disordered regions	Near-zero	IDPs lack evolutionary constraint
10	Drug-resistance mutations	Weak under selection	Escape mutations are evolutionarily novel
14	Active-site heme binding (CYP450)	Strong (rho 0.811)	Exception — universally conserved chemistry
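
The glossary does not say which correlation the table's "rho" denotes. Assuming it is Spearman's rank correlation between model scores and experimental measurements, the sketch below shows how such a value could be computed. The function names and the toy numbers are illustrative assumptions, not part of the benchmark series.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(predicted, measured):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(predicted), average_ranks(measured)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)


# Toy numbers for illustration only; these are NOT benchmark data.
toy_scores = [-4.1, -0.3, -2.7, -1.2, -3.5]        # hypothetical model scores
toy_measurements = [0.10, 0.95, 0.40, 0.55, 0.30]  # hypothetical assay values
rho = spearman_rho(toy_scores, toy_measurements)
```

Because rho depends only on rank order, it rewards a model for sorting mutations correctly even when the raw scores and the measurements sit on different scales.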

Why We Document Them Openly

Scientific tools are most useful when their failure modes are known. Documenting boundary conditions:

Allows researchers to assess applicability before running an experiment
Prevents false confidence in predictions for edge cases
Enables appropriate downstream verification planning
Invites community testing to refine or extend the boundaries

Each post in the ESM-2 Benchmark Series explicitly discloses which boundary conditions apply to the protein being analyzed.
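
As a rough illustration of such a disclosure, the sketch below stores the five table rows shown above as records and builds a short text summary for one analysis. The class, the function names, and the example tagging are hypothetical; the series' actual tooling is not described in this glossary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundaryCondition:
    number: int
    name: str
    performance: str  # wording copied from the table
    note: str


# Only the five rows shown in the table; the other nine are not listed there.
KNOWN_CONDITIONS = {
    bc.number: bc
    for bc in (
        BoundaryCondition(1, "Non-active-site stability",
                          "Strong (rho 0.67–0.89)", "Core use case"),
        BoundaryCondition(2, "Active-site specificity engineering",
                          "Weak (rho 0.0–0.29)", "Evolvable residues"),
        BoundaryCondition(9, "Intrinsically disordered regions",
                          "Near-zero", "IDPs lack evolutionary constraint"),
        BoundaryCondition(10, "Drug-resistance mutations",
                          "Weak under selection",
                          "Escape mutations are evolutionarily novel"),
        BoundaryCondition(14, "Active-site heme binding (CYP450)",
                          "Strong (rho 0.811)",
                          "Exception: universally conserved chemistry"),
    )
}


def disclose(protein, condition_numbers):
    """Build a plain-text disclosure for the conditions tagged on one analysis."""
    lines = [f"Boundary conditions for {protein}:"]
    for n in sorted(condition_numbers):
        bc = KNOWN_CONDITIONS.get(n)
        if bc is None:
            lines.append(f"  #{n}: not among the five rows listed here")
        else:
            lines.append(f"  #{n} {bc.name}: {bc.performance}")
    return "\n".join(lines)


def needs_extra_verification(condition_numbers):
    """True if any tagged, known condition is not rated Strong in the table."""
    return any(
        n in KNOWN_CONDITIONS
        and not KNOWN_CONDITIONS[n].performance.startswith("Strong")
        for n in condition_numbers
    )


# Hypothetical tagging for an example analysis.
report = disclose("example CYP450 enzyme", [14, 1])
flag = needs_extra_verification([1, 9])
```

Keeping the table wording verbatim in each record means the disclosure text cannot drift from the published ratings.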

Boundary Conditions vs. “Limitations”

The term “limitation” implies an unfortunate shortcoming. A “boundary condition” is a precise characterization: it states where the model transitions from reliable to unreliable, and why. The distinction matters for rigorous use of any computational tool.