Research · Verification · Disclosure

Research and Disclosure at Axon Agentic

Published May 6, 2026 · Updated May 7, 2026

On this page

This page organizes Axon Agentic's published evidence: the validation work that tests our products against experimental data, the independent audit trail for the claims we publish, and the operational disclosures that describe who is responsible for what. It exists because scientifically literate readers — protein engineers, clinical researchers, and computational biologists evaluating new tools — should be able to check methodology, trace numbers to sources, and understand who is accountable before deciding whether to rely on this work.

Three artifact classes are organized here, in this order:

Validation is our science. It covers what the systems do and where they fail: deep mutational scanning benchmarks, known failure modes, and methods. Validation artifacts report on the systems themselves.

Verification is the audit of our claims. It answers a different question: not whether the system works, but whether the claims we publish about the system are accurate. These are independent fact-checks run by Veritas — an AI agent that operates separately from the agents that produce content — and are exposed as public artifacts rather than internal quality checks.

Disclosure covers how Axon Agentic operates: who is on the team, what AI agents can and cannot do, and where to find NeuroAutomata's privacy policy.

The sections below list the available artifacts with a description of what each one contains and when to read it.

Validation — What the systems do and where they fail

Validation at Axon Agentic means applying our systems to publicly available experimental datasets and measuring how well the predictions align with measured biology. The datasets are peer-reviewed deep mutational scanning assays, saturation genome editing studies, and variant abundance measurements: experiments that exist independently of us and predate our analysis. We contribute the computational predictions; the experimental ground truth is external.

This is not a regulatory validation. Nothing here constitutes a clinical study or regulatory submission. Results are designated Research Use Only, the same designation used by REVEL, CADD, AlphaMissense, and PolyPhen-2.

The limitation disclosures below are not footnotes. The calmodulin result is listed at the same level as the positive results because accurate failure disclosure is part of what this page is for.

NeuroAutomata: ESM-2 validation across five proteins, median Spearman rho 0.515 (internal benchmark)

ESM-2 650M is the protein language model at the core of NeuroAutomata. It was developed by Meta AI Research (Lin et al. 2023, Science) and predicts variant effects from sequence alone using masked marginal scoring: no structure required, no experimental data needed at inference time.
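As a rough illustration of masked marginal scoring: the position of interest is masked, the model predicts a distribution over amino acids at that site, and the substitution is scored as the log-probability of the mutant residue minus that of the wild-type residue. The sketch below assumes a hypothetical `masked_log_probs(sequence, position)` callable and uses a toy stand-in for the model. It is not the NeuroAutomata implementation and does not call any real ESM library function.

```python
import math
from typing import Callable, Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical signature: given a sequence and a 0-based position to mask,
# return log-probabilities for each amino acid at that position.
LogProbFn = Callable[[str, int], Dict[str, float]]


def masked_marginal_score(sequence: str, position: int, mutant: str,
                          masked_log_probs: LogProbFn) -> float:
    """Score a substitution as log p(mutant) - log p(wild type) at the masked site."""
    wild_type = sequence[position]
    log_probs = masked_log_probs(sequence, position)
    return log_probs[mutant] - log_probs[wild_type]


def toy_log_probs(sequence: str, position: int) -> Dict[str, float]:
    """Toy stand-in for a real model: favours the residue already at the site."""
    wild_type = sequence[position]
    weights = {aa: (10.0 if aa == wild_type else 1.0) for aa in AMINO_ACIDS}
    total = sum(weights.values())
    return {aa: math.log(w / total) for aa, w in weights.items()}


# Arbitrary example sequence; position 3 is 'A', substituted with 'W'.
score = masked_marginal_score("MKTAYIAK", 3, "W", toy_log_probs)
same = masked_marginal_score("MKTAYIAK", 3, "A", toy_log_probs)
```

Under this convention a negative score means the model finds the mutant less plausible than the wild type, and substituting a residue for itself scores exactly zero.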

The published ProteinGym zero-shot substitution benchmark places ESM-2 650M at a mean Spearman rho of 0.414 across 217 substitution DMS assays, ranking 45 of 97 models on the live leaderboard CSV (accessed 2026-05-08). That is the external published number.
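For readers less familiar with the metric, Spearman rho is the Pearson correlation computed on ranks rather than raw values, so it rewards getting the ordering of variants right regardless of scale. The following is a minimal standard-library sketch with made-up numbers. It is not the benchmark code, and the variable names are illustrative.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Spearman rho: the Pearson correlation of the two rank vectors."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    rx, ry = average_ranks(predicted), average_ranks(measured)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Made-up illustrative numbers, not data from any assay.
model_scores = [-2.3, -0.1, -4.0, -1.2, -0.5]
assay_values = [0.40, 0.95, 0.10, 0.90, 0.70]
rho = spearman_rho(model_scores, assay_values)
```

In this toy case only two of the five variants swap places between the two rankings, which gives a rho of 0.9.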

Our internal validation — run on a 5-protein subset and self-reported — produced a median Spearman rho of 0.515 (internal benchmark — see validation details). The five proteins and their individual results:

Protein	Spearman rho	Experimental assay
Beta-lactamase	0.731	DMS (activity)
PTEN	0.519	VAMP-seq (abundance)
BRCA1	0.515	SGE (function scores)
UBC9	0.473	DMS (activity)
GB1	0.276	DMS (fitness)
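
The headline median can be reproduced directly from the five per-protein values in the table above. A short check using only the Python standard library:

```python
from statistics import median

# Per-protein Spearman rho values copied from the table above.
rho_by_protein = {
    "Beta-lactamase": 0.731,
    "PTEN": 0.519,
    "BRCA1": 0.515,
    "UBC9": 0.473,
    "GB1": 0.276,
}

# With five values, the median is the third value once sorted.
median_rho = median(rho_by_protein.values())
median_protein = sorted(rho_by_protein, key=rho_by_protein.get)[len(rho_by_protein) // 2]
```

With an odd number of proteins the median is a single observed value, here the BRCA1 result, rather than an average of two.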

The BRCA1 entry is the full ProteinGym BRCA1_HUMAN_Findlay_2018 dataset. Domain-level recomputations from the same Findlay 2018 saturation genome editing data show the BRCT domain alone at rho 0.534 (internal benchmark) and the RING domain at 0.409 (internal benchmark) — these are narrower scopes of the same dataset, not a separate benchmark. Sample sizes: BRCT N=1,262; RING N=575.

For methods, protein-specific results, and the full variant tables, see the validation details page or the NeuroAutomata product page.

ESM-2 limitation: protein-protein binding correlations are weak (calmodulin rho 0.212)

Calmodulin is a calcium-binding protein whose functional effect depends on binding-affinity changes at protein-protein interfaces rather than on structural stability or folding. ESM-2 scores evolutionary plausibility (patterns learned from sequence conservation across millions of proteins), not binding energy. On the calmodulin DMS dataset, ESM-2 produced a Spearman rho of 0.212 against experimental data (internal benchmark, see validation details).

Protein-protein binding contexts are a known weak spot for this approach. Mutations that affect an interaction interface without destabilizing the fold are outside what evolutionary sequence conservation captures well. If your protein's function is primarily binding-affinity-driven, interpret ESM-2 scores cautiously and pair with structural or experimental data.

HPA multi-agent system: 18-month journey from naive RAG to verification-first architecture

The HPA multi-agent system translates natural language queries into structured queries against the Human Protein Atlas JSON API. Over 18 months of development, the architecture evolved from a naive RAG baseline through iterative revision to a verification-first multi-agent pipeline. The journey post documents the architecture decisions, the cross-validation approach used to confirm query results against HPA's own API responses, and what failed along the way.

Read: Building an AI Multi-Agent System for Human Protein Atlas Data — 18-month journey

Validation methodology: HPA multi-agent system

The companion methodology post covers reproducibility protocols, agent role specifications, and the cross-validation procedure used to confirm query results against Human Protein Atlas API responses. It is intended for readers who want to understand the technical decisions behind the validation approach rather than just the results.

Read: Validation methodology — HPA multi-agent system

Verification — Independent audit of published claims

Validation tells you whether the system works. Verification tells you whether the claims we publish about it are accurate. These are not the same question.

Veritas is the AI agent responsible for verification at Axon Agentic. It operates separately from the agents that produce content — Amara (marketing), Astro (engineering), and Kiran (product). Veritas cannot edit content produced by other agents. Its findings cannot be overridden by other AI agents; any exception requires Jonathan Agoot's explicit written approval, recorded with a reason.

What Veritas does and does not cover is documented on the methodology page. The hub points there rather than reproducing the implementation here.

Machine-readable claims YAML: every published claim with its source and verification status

Every factual claim on Axon Agentic's public-facing pages is recorded in a structured YAML manifest. Each entry carries the claim text, source citation, evidence strength, and Veritas verification status. The files are served as plain text and are designed to be readable by both humans and AI language model systems that process structured data.

The claims manifest for this page and all published Axon Agentic content: axonagentic.ai/research/claims.yaml
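
As a sketch of how a reader might work with such a manifest once it has been parsed into a list of dictionaries (with any YAML parser), the snippet below checks entries for the four items described above and filters by status. The field names, the `"verified"` status value, and the sample entries are all hypothetical placeholders. The real manifest's keys and values may differ, so consult the file itself for the actual schema.

```python
from typing import Dict, List

# Hypothetical field names mirroring the four items described above;
# the real manifest's keys and status values may differ.
REQUIRED_FIELDS = ("claim", "source", "evidence_strength", "verification_status")


def missing_fields(entry: Dict[str, str]) -> List[str]:
    """List required fields that are absent or empty in one manifest entry."""
    return [field for field in REQUIRED_FIELDS if not entry.get(field)]


def unverified(entries: List[Dict[str, str]],
               ok_status: str = "verified") -> List[Dict[str, str]]:
    """Return entries whose status is anything other than ok_status."""
    return [e for e in entries if e.get("verification_status") != ok_status]


# Illustrative entries only; the values are placeholders, not copied from the manifest.
entries = [
    {"claim": "example claim A", "source": "example source",
     "evidence_strength": "internal benchmark", "verification_status": "verified"},
    {"claim": "example claim B", "source": "",
     "evidence_strength": "external", "verification_status": "pending"},
]

incomplete = [e["claim"] for e in entries if missing_fields(e)]
needs_review = [e["claim"] for e in unverified(entries)]
```

Keeping the completeness check separate from the status filter lets a reader distinguish an entry that is malformed from one that is well-formed but not yet verified.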

Veritas verification reports: per-artifact fact-checks with source citations and verdicts

Each substantive public-facing post has a paired Veritas verification report — a fact-check of the quantitative claims and source citations in that piece. Reports list each claim, the cited source, the verification result, and any corrections applied before publication.

Available reports are listed at /verification.

Disclosure — How Axon Agentic operates

Verification audits the science. Disclosure audits the operation — who is on the team, what they can and cannot do, and what data practices cover the product.

AI Agent Staff: six AI agents and one human, with documented roles and approval requirements

Axon Agentic's team is six AI agents and one human. The human is Jonathan Agoot, who founded the company and is responsible for everything it publishes. No agent can approve its own factual claims or route around the Veritas verification step. No agent has access to customer data, personal information, or financial systems.

The AI Agent Staff page documents each agent's role, what it does and does not access, and the approval requirements for each category of output.

NeuroAutomata privacy policy: platform-specific data handling for neuroautomata.axonagentic.ai

NeuroAutomata's privacy policy documents the data practices for the protein analysis platform. This is a platform-specific policy scoped to neuroautomata.axonagentic.ai — it is not an Axon Agentic site-wide policy.

Read: NeuroAutomata privacy policy

Frequently Asked Questions

What is the difference between validation and verification at Axon Agentic?

Validation tests whether the systems produce accurate predictions against experimental data. Verification independently checks whether the claims published about those results are accurate. Most companies publish the first and skip the second.

What model and scoring method does the NeuroAutomata benchmark use?

ESM-2 650M (Meta AI Research, Lin et al. 2023, Science) with masked marginal scoring. Published ProteinGym aggregate for this model: Spearman rho 0.414, rank 45 of 97 models on the live leaderboard CSV (OATML-Markslab/ProteinGym, accessed 2026-05-08). Internal validation on a 5-protein subset produced a median of 0.515 — that is a subset result, not the full benchmark.

Are Axon Agentic's results peer-reviewed?

No. The validation work is evidence-published: methodology documented, numbers traceable to source data, independent verification by Veritas, claims YAML publicly linkable. The experimental datasets benchmarked against (ProteinGym DMS, SGE, and VAMP-seq assays) are peer-reviewed.

How do I verify a specific claim Axon Agentic has made?

Each published claim has a source citation and Veritas verification status in the claims YAML for that page, served at axonagentic.ai/research/claims.yaml. If a claim cites a peer-reviewed paper, the DOI or PMID is listed. If it is an internal measurement, the YAML identifies it as such and links to the validation page. Disagreements can be sent to Jonathan Agoot via the contact form.

How do I cite NeuroAutomata or Axon Agentic's research?

Use the plain attribution strings in the "How to cite this work" section below.

What does Axon Agentic not publish, and why?

Internal pipeline architecture, environment configurations, and per-agent operational routing are not enumerated publicly. Stating what agents can and cannot do is the appropriate trust signal; enumerating how the internal system is wired is not.

How to cite this work

For NeuroAutomata:

Axon Agentic (2026). NeuroAutomata: ESM-2 protein analysis platform, on the latest production release. https://neuroautomata.axonagentic.ai

For the HPA multi-agent system:

Axon Agentic (2025). HPA Multi-Agent System: Verification-first natural language queries for Human Protein Atlas data. https://axonagentic.ai/blog/ai-natural-language-human-protein-atlas-18-month-journey

No DOI is assigned to this work at this time. No CITATION.cff is available (repositories are private).

Work with us

If you are building AI systems for scientific research and want to discuss architecture or the verification approach, get in touch.

If you want to run your protein sequence through NeuroAutomata (score variants, view the mutation landscape, or explore ESM-2 predictions), the platform is available at neuroautomata.axonagentic.ai.