Deep Mutational Scanning

Also known as: DMS, deep mutational scan, multiplexed variant assay

A laboratory technique that measures the functional effect of every possible single amino acid substitution across a protein in one pooled experiment. Widely regarded as the gold standard for experimental variant effect data.

Source: Fowler DM & Fields S. 'Deep mutational scanning: a new style of protein science.' Nat Methods 2014;11(8):801-807. https://doi.org/10.1038/nmeth.3027

Deep mutational scanning (DMS) combines saturation mutagenesis (creating every possible single amino acid variant) with a high-throughput functional assay (measuring all of the variants together in a pooled competition). The result is a comprehensive single-substitution fitness landscape: every position × every amino acid substitution → measured functional effect.
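To make the "every position × every amino acid substitution" idea concrete, here is a minimal sketch that enumerates the single-substitution library for a sequence. The function name `single_substitutions`, the toy sequence, and the variant naming scheme (wild-type residue, 1-based position, new residue) are illustrative assumptions, not part of any specific DMS pipeline.

```python
# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(wild_type: str) -> list[str]:
    """List every single amino acid substitution of a protein sequence.

    Variants are named like "M1A": wild-type residue, 1-based position,
    substituted residue. (Naming scheme chosen here for illustration.)
    """
    variants = []
    for position, wt_residue in enumerate(wild_type, start=1):
        for new_residue in AMINO_ACIDS:
            if new_residue != wt_residue:  # skip the wild-type residue itself
                variants.append(f"{wt_residue}{position}{new_residue}")
    return variants


toy_protein = "MKTAY"  # hypothetical 5-residue sequence
variants = single_substitutions(toy_protein)
print(len(variants))  # 5 positions x 19 substitutions = 95
print(variants[:3])   # ['M1A', 'M1C', 'M1D']
```

Each position contributes 19 substitutions (20 amino acids minus the wild-type residue), so the library size grows linearly with protein length.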

How It Works

  1. Library creation: Saturation mutagenesis generates every possible single amino acid substitution across the protein (e.g., 19 substitutions × 490 positions = 9,310 variants for a 490-aa protein)
  2. Functional selection: Variants compete under a selection pressure relevant to the protein’s function (enzymatic activity, binding, stability, organismal fitness)
  3. Deep sequencing: Next-generation sequencing counts variant frequencies before and after selection
  4. Score calculation: Each variant's enrichment or depletion relative to wild-type, usually expressed as a log ratio of its frequency after versus before selection, gives a fitness score per variant
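The score calculation in step 4 can be sketched as a log2 enrichment ratio normalized to wild-type. This is a simplified illustration, not the method of any particular analysis package: the function name `enrichment_scores`, the `"WT"` key, the example counts, and the pseudocount of 0.5 are all assumptions made for this example. Published pipelines typically add error models and replicate handling on top of this.

```python
import math


def enrichment_scores(pre_counts, post_counts, wild_type="WT", pseudocount=0.5):
    """Compute a log2 enrichment score per variant, relative to wild-type.

    pre_counts / post_counts map variant name -> read count before and
    after selection. The pseudocount avoids division by zero and log(0)
    for variants that drop out of the pool.
    """
    wt_ratio = (post_counts[wild_type] + pseudocount) / (
        pre_counts[wild_type] + pseudocount
    )
    scores = {}
    for variant, pre in pre_counts.items():
        if variant == wild_type:
            continue
        post = post_counts.get(variant, 0)  # absent after selection = 0 reads
        ratio = (post + pseudocount) / (pre + pseudocount)
        scores[variant] = math.log2(ratio / wt_ratio)
    return scores


# Hypothetical read counts for a wild-type and two variants.
pre = {"WT": 1000, "A2G": 100, "K3E": 100}
post = {"WT": 2000, "A2G": 200, "K3E": 25}

for variant, score in enrichment_scores(pre, post).items():
    print(f"{variant}: {score:+.2f}")
```

In this toy input, A2G tracks wild-type and scores close to 0, while K3E is depleted and scores strongly negative. Dividing by the wild-type ratio is what makes scores comparable across experiments with different sequencing depths.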

Scale

A typical DMS experiment characterizes 5,000–15,000 variants at once. Measuring the same variants one by one with traditional biochemistry would take years.

Why It’s the Gold Standard

  • Each variant is measured under identical experimental conditions
  • Replicate experiments allow statistical confidence estimates
  • Results are protein-specific and directly reflect the biological function being assayed
  • Datasets are commonly deposited in MaveDB for community reuse
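As a rough illustration of how replicate experiments support confidence estimates, the sketch below summarizes per-variant scores across replicates as a mean and a standard error. The function name `summarize_replicates` and the example scores are hypothetical, and real analyses often use more elaborate error models than a plain standard error.

```python
import math
import statistics


def summarize_replicates(replicate_scores):
    """Return {variant: (mean score, standard error)} across replicates.

    replicate_scores maps variant name -> list of scores, one per
    replicate. The standard error is undefined (NaN) for fewer than two
    replicates.
    """
    summary = {}
    for variant, scores in replicate_scores.items():
        mean = statistics.mean(scores)
        if len(scores) > 1:
            sem = statistics.stdev(scores) / math.sqrt(len(scores))
        else:
            sem = float("nan")
        summary[variant] = (mean, sem)
    return summary


# Hypothetical scores from three replicate selections.
replicates = {
    "A2G": [0.0, 0.2, 0.1],
    "K3E": [-3.1, -2.8, -3.0],
}

for variant, (mean, sem) in summarize_replicates(replicates).items():
    print(f"{variant}: mean={mean:+.2f}, SEM={sem:.2f}")
```

A small standard error relative to the mean score suggests the measured effect is reproducible rather than sequencing noise.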

Limitations

  • Requires months of specialized lab work and custom cell lines/systems
  • Each protein needs a tailored assay (no universal DMS protocol)
  • Measures one functional phenotype at a time (activity ≠ stability ≠ binding)
  • Expensive: typically $50,000–$200,000 per dataset