RAG — Glossary | Axon Agentic

Retrieval-Augmented Generation (RAG) is a technique that combines a retrieval system (typically a vector database) with a generative language model. Instead of relying solely on knowledge baked into model weights at training time, RAG fetches relevant documents at query time and includes them in the model’s context window.

Basic RAG Pipeline

User Query
    ↓
Embed query → vector representation
    ↓
Search vector database → retrieve top-k relevant chunks
    ↓
Inject retrieved chunks into LLM context
    ↓
LLM generates response grounded in retrieved content
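
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and the final LLM call is omitted, so the sketch stops at the assembled prompt. All function names are illustrative assumptions.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase bag-of-words counts, not a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (stand-in for a vector search)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, retrieved: list[str]) -> str:
    """Inject the retrieved chunks into the context that would be sent to the LLM."""
    context = "\n".join(f"- {c}" for c in retrieved)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


chunks = [
    "Albumin is a protein produced mainly in the liver.",
    "Insulin is a hormone secreted by the pancreas.",
    "Vector databases store embeddings for similarity search.",
]
query = "Which organ produces albumin?"
top = retrieve(query, chunks, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

In a real system the prompt would be passed to a language model; swapping `embed` for a learned embedding model and `retrieve` for a vector-database query leaves the overall flow unchanged.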

Naive RAG vs. Verification-First RAG

Naive RAG passes retrieved chunks directly to the model and accepts whatever it generates. That is often acceptable for general-knowledge questions, but it is insufficient for scientific data, where retrieved values need to be checked before they are reported.

Verification-first RAG (used in HPA systems) adds a validation layer:

Retrieved biological data is cross-checked against ground-truth APIs
Metric calculations (tau scores, fold enrichment) are performed dynamically rather than retrieved as pre-computed values
Synthesis agents flag conflicts between data sources before returning results
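
A validation layer of this kind might look like the sketch below. It is written under stated assumptions, not taken from any particular system: the reference values would normally come from a ground-truth API call (a plain dictionary stands in here), tau is computed with the standard tissue-specificity formula on raw values (it is often applied to log-transformed expression in practice), and fold enrichment is assumed to mean the highest value divided by the second-highest. The function names and the 5% tolerance are illustrative.

```python
import math


def tau_score(expression: list[float]) -> float:
    """Tissue-specificity index: 0 = uniform expression, 1 = expressed in one tissue only."""
    if len(expression) < 2:
        raise ValueError("tau needs at least two tissues")
    x_max = max(expression)
    if x_max <= 0:
        return 0.0  # convention chosen for this sketch: no expression means not specific
    return sum(1 - x / x_max for x in expression) / (len(expression) - 1)


def fold_enrichment(expression: list[float]) -> float:
    """Assumed definition: highest value divided by second-highest (needs two or more values)."""
    top, second = sorted(expression, reverse=True)[:2]
    return float("inf") if second == 0 else top / second


def find_conflicts(retrieved: dict[str, float], reference: dict[str, float],
                   rel_tol: float = 0.05) -> list[str]:
    """List tissues whose retrieved value is missing from, or disagrees with, the reference."""
    conflicts = []
    for tissue, value in retrieved.items():
        ref = reference.get(tissue)
        if ref is None or not math.isclose(value, ref, rel_tol=rel_tol):
            conflicts.append(tissue)
    return conflicts


# Made-up numbers for illustration only; not real expression data.
retrieved = {"liver": 100.0, "kidney": 10.0, "brain": 0.0}   # values found in retrieved chunks
reference = {"liver": 100.0, "kidney": 25.0, "brain": 0.0}   # stand-in for a ground-truth API response

conflicts = find_conflicts(retrieved, reference)
values = list(reference.values())
tau = tau_score(values)            # computed from reference data, not read from a chunk
fold = fold_enrichment(values)
print(conflicts, tau, fold)
```

The point of the design is that metrics are derived from validated values at query time, and any disagreement between sources is surfaced to the caller instead of being silently resolved.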

Limitations of RAG for Biological Research

Retrieval quality depends heavily on the embedding model and the chunking strategy
Retrieved chunks may come from outdated database releases
RAG alone does not guarantee biological accuracy — it must be paired with domain-specific validation logic
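
One way to reduce the stale-data risk is to store the source release alongside each chunk at indexing time and filter on it before generation. The sketch below assumes such metadata exists; the `Chunk` fields, the `expression_db` source name, and the release numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str   # hypothetical database name recorded at indexing time
    release: int  # hypothetical release number recorded at indexing time


def split_by_freshness(chunks: list[Chunk], current_release: dict[str, int]):
    """Separate chunks indexed from the current release from stale or unknown ones."""
    fresh, stale = [], []
    for c in chunks:
        latest = current_release.get(c.source)
        if latest is not None and c.release >= latest:
            fresh.append(c)
        else:
            stale.append(c)  # outdated release, or a source we cannot verify
    return fresh, stale


chunks = [
    Chunk("Expression summary for gene X", "expression_db", 23),
    Chunk("Older expression summary for gene X", "expression_db", 21),
    Chunk("Note of unknown provenance", "other_db", 5),
]
fresh, stale = split_by_freshness(chunks, {"expression_db": 23})
print(len(fresh), len(stale))
```

Only the fresh chunks would be passed on to the model; stale ones can be dropped or re-fetched from the source. This addresses versioning only and does not replace domain-specific validation.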