Boundary Condition
Also known as: prediction boundary, model boundary, accuracy boundary
A documented context where ESM-2's prediction accuracy transitions from reliable to unreliable. Boundary conditions define where to trust the model — and where to verify independently.
A boundary condition is a documented threshold, defined by a specific protein context or mutation type, at which ESM-2’s prediction accuracy shifts from strong to weak. Boundary conditions are engineering specifications, not just limitations: knowing where a model works and where it does not lets engineers and researchers plan appropriate verification steps.
The 14 Boundary Conditions
Across 25 protein analyses in the ESM-2 Benchmark Series, we’ve identified 14 specific boundary conditions. Five key examples are shown here, with the # column giving each condition's number:
| # | Condition | ESM-2 Performance | Notes |
|---|---|---|---|
| 1 | Non-active-site stability | Strong (rho 0.67–0.89) | Core use case |
| 2 | Active-site specificity engineering | Weak (rho 0.0–0.29) | Evolvable residues |
| 9 | Intrinsically disordered regions | Near-zero | IDPs lack evolutionary constraint |
| 10 | Drug-resistance mutations | Weak under selection | Escape mutations are evolutionarily novel |
| 14 | Active-site heme binding (CYP450) | Strong (rho 0.811) | Exception — universally conserved chemistry |
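The rho values in the table are correlation coefficients between model predictions and experimental measurements. The text does not name the exact statistic, so the sketch below assumes Spearman's rank correlation, a common choice for this kind of benchmark. The function names `average_ranks` and `spearman_rho` are illustrative and are not taken from the Benchmark Series.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(predicted, measured):
    """Spearman's rho, computed as the Pearson correlation of the rank vectors.

    Assumption: the benchmark's "rho" is Spearman's rank correlation.
    """
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rp, rm = average_ranks(predicted), average_ranks(measured)
    mp, mm = mean(rp), mean(rm)
    cov = sum((a - mp) * (b - mm) for a, b in zip(rp, rm))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_m = sum((b - mm) ** 2 for b in rm)
    if var_p == 0 or var_m == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (var_p * var_m) ** 0.5
```

Only the ordering of the values matters here: a model that ranks mutations in the same order as the assay scores rho = 1 even if its raw scores are on a different scale.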
Why We Document Them Openly
Scientific tools are most useful when their failure modes are known. Documenting boundary conditions:
- Allows researchers to assess applicability before running an experiment
- Prevents false confidence in predictions for edge cases
- Enables planning of appropriate downstream verification
- Invites community testing to refine or extend the boundaries
Each post in the ESM-2 Benchmark Series explicitly discloses which boundary conditions apply to the protein being analyzed.
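As a rough illustration of what such a disclosure could look like in code, the sketch below stores the five conditions from the table above in a small catalog and builds one disclosure line per applicable condition. The `BoundaryCondition` class, the `CATALOG` dictionary, the `disclosure` helper, and the rule that only "Strong" performance maps to "trust" are all assumptions made for this example; the series does not describe its own tooling.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BoundaryCondition:
    number: int
    name: str
    performance: str
    # (low, high) rho as reported in the table; None where no number is given.
    rho_range: Optional[Tuple[float, float]]
    note: str


# Only the five conditions listed in the table; the other nine are not shown.
CATALOG = {
    1: BoundaryCondition(1, "Non-active-site stability", "Strong",
                         (0.67, 0.89), "Core use case"),
    2: BoundaryCondition(2, "Active-site specificity engineering", "Weak",
                         (0.0, 0.29), "Evolvable residues"),
    9: BoundaryCondition(9, "Intrinsically disordered regions", "Near-zero",
                         None, "IDPs lack evolutionary constraint"),
    10: BoundaryCondition(10, "Drug-resistance mutations",
                          "Weak under selection", None,
                          "Escape mutations are evolutionarily novel"),
    # The table reports a single value, stored here as a degenerate range.
    14: BoundaryCondition(14, "Active-site heme binding (CYP450)", "Strong",
                          (0.811, 0.811),
                          "Exception: universally conserved chemistry"),
}


def disclosure(applicable_numbers):
    """Build one disclosure line per boundary condition that applies.

    Hypothetical rule: "Strong" means trust, anything else means verify.
    Raises KeyError for condition numbers not present in CATALOG.
    """
    lines = []
    for n in sorted(applicable_numbers):
        bc = CATALOG[n]
        action = "trust" if bc.performance == "Strong" else "verify independently"
        lines.append(f"#{bc.number} {bc.name}: {bc.performance} ({action})")
    return lines
```

For example, `disclosure([9, 1])` returns a line for condition 1 marked "trust" followed by a line for condition 9 marked "verify independently".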
Boundary Conditions vs. “Limitations”
The term “limitation” implies an unfortunate shortcoming. A “boundary condition” is a precise, engineered characterization: here is where the model transitions from reliable to unreliable, and here is why. The distinction matters for rigorous use of any computational tool.