Glossary

Published March 30, 2026 · Updated June 25, 2026

Definitions for AI, bioinformatics, and life sciences terms used across blog posts and research.

28 terms

AI / ML

Boundary Condition (also: prediction boundary, model boundary): A documented context where ESM-2's prediction accuracy transitions from reliable to unreliable. Boundary conditions define where to trust the model and where to verify independently.
Context Window (also: context length, context size): An AI model's short-term working memory: the maximum amount of text it can hold and process at once before older content is dropped.
LLM (also: Large Language Model, language model): An AI model trained on vast amounts of text that can understand and generate human language. Examples: GPT-5, Claude, Gemini.
Masked Marginal Scoring (also: masked marginals, zero-shot scoring): ESM-2's zero-shot method for scoring variant effects. It masks each position and measures how surprising the mutant amino acid is relative to wild-type.
Model Context Protocol (also: MCP): An open standard that lets AI models connect to external tools and data sources, such as databases, APIs, and file systems, through a universal interface without custom integration code.
Multi-Agent System (also: MAS, multi-agent architecture): A network of specialized, autonomous AI agents that collaborate and divide complex tasks among themselves to solve problems too difficult for a single model.
RAG (also: Retrieval-Augmented Generation, retrieval augmented generation): A technique where an AI model retrieves relevant documents from a database before generating a response, grounding answers in real data to reduce hallucinations.
Tokens (also: token, tokenization): The basic units of text an AI model processes, each roughly a word or word fragment. A sentence of about 10 words is about 13–15 tokens.
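
As an illustration of the Masked Marginal Scoring entry above, the score for a single substitution is commonly written as a log-odds ratio: the log-probability the model assigns to the mutant residue at the masked position, minus the log-probability it assigns to the wild-type residue. The sketch below assumes that formulation. The function name `masked_marginal_score` and the probability values are hypothetical and do not come from a real ESM-2 run.

```python
import math

def masked_marginal_score(masked_probs, wild_type, mutant):
    """Log-odds of the mutant versus the wild-type residue at one masked position.

    masked_probs: dict mapping amino-acid letters to the probability the model
    assigns at the masked position (assumed to be supplied by the caller).
    """
    return math.log(masked_probs[mutant]) - math.log(masked_probs[wild_type])

# Hypothetical probabilities for one masked position (made-up numbers,
# restricted to three residues for brevity).
probs_at_position = {"A": 0.60, "V": 0.30, "D": 0.10}

# Negative score: the mutant D is less expected than the wild-type A.
score_a_to_d = masked_marginal_score(probs_at_position, wild_type="A", mutant="D")

# A less disruptive substitution scores closer to zero.
score_a_to_v = masked_marginal_score(probs_at_position, wild_type="A", mutant="V")
```

Under this convention a score of zero means the mutant is as expected as the wild-type residue, and more negative scores mean the model finds the mutant more surprising.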

Bioinformatics

Proteomics

Genomics

Statistics

Spearman Rho (also: Spearman ρ, Spearman rank correlation): A rank correlation coefficient (−1 to +1) that measures whether two variables agree in order, not magnitude. The primary metric for variant effect benchmarks.
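
To make the "order, not magnitude" point concrete, here is a minimal standard-library sketch that computes Spearman's rho as the Pearson correlation of ranks, with tied values sharing their average rank. The helper names and the example scores are illustrative assumptions, not data from any benchmark.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical predicted scores and measured effects on very different scales.
predicted = [0.10, 0.40, 0.35, 0.80]
measured = [1.0, 300.0, 20.0, 9000.0]

# Same ordering in both lists, so rho is at (or within rounding of) +1.
rho = spearman_rho(predicted, measured)
```

Because only the ranks enter the calculation, rescaling either list with any strictly increasing transformation leaves rho unchanged.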

General

RUO (also: Research Use Only): A regulatory designation meaning the tool provides research scores, not clinical diagnoses. The same label is used by REVEL, CADD, AlphaMissense, and PolyPhen-2.