Building The Reasoning Engine at Axiom

How hierarchical planning, verification, and self-play converge to mathematical superintelligence

For centuries, mathematics has been humanity's most powerful sandbox – a place where we construct logical frameworks to understand reality's most complex systems. Why do grid cells fire in a hexagonal pattern? Why do quantum energy levels align with the spacing of primes?

Yet mathematical progress has always been shackled by a cruel, fundamental bottleneck: the scarcity of extraordinary minds. When Évariste Galois revolutionized algebra in a fevered night before his fatal duel at twenty, he left behind ideas so far ahead of their time that decades passed before his contemporaries could grasp them. Srinivasa Ramanujan, before he succumbed to illness and malnutrition, channeled thousands of formulas from his dreams into notebooks – results so profound that mathematicians spent nearly a century proving what he had intuited.

Even when breakthroughs do occur, the extreme fragmentation of modern mathematics means experts in different subfields often cannot understand each other's work, so vital connections between domains remain hidden for decades. This combination of scarce genius and siloed knowledge creates an extraordinarily long pipeline from discovery to application – fundamental theorems discovered today might take generations before their full implications reshape technology and society – a delay longer than even Hardy anticipated in A Mathematician's Apology.

We're on a moonshot mission to change that. Axiom is building a reasoning engine capable of mathematical discovery at a scale and speed previously unimaginable – an AI mathematician.

The Timing

Three trends are colliding for the first time in history.

First, neural networks have stepped beyond pattern matching into scalable reasoning, improving as compute, model size, and data grow.

Second, mathematical formalization has come of age through languages like Lean: by the Curry–Howard correspondence, proofs become executable programs, and programming languages are no longer just tools for producing outputs but instruments for certifying properties of abstract objects.
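The correspondence is concrete: a theorem statement is a type, and a term of that type is a machine-checked proof. A minimal Lean 4 sketch (an illustrative example, not output from our system):

```lean
-- Under Curry–Howard, the proposition `p ∧ q → q ∧ p` is a type,
-- and this function is simultaneously a program inhabiting that type
-- and a verified proof of the implication.
theorem and_swap (p q : Prop) : p ∧ q → q ∧ p :=
  fun h => ⟨h.2, h.1⟩
```

If the term type-checks, the theorem holds: the compiler itself is the proof checker.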

And lastly, LLMs have crossed a critical threshold in code generation, reliably producing high-quality code across many languages – including formal specification languages – and serving as a strong prior over the otherwise infinite action space of mathematics.

This synergy creates an unprecedented opportunity: reasoning engines that can conjecture and prove an unbounded stream of theorems with zero human involvement.

The Convergence

Autoformalization is the Natural Language Compiler

Our data factory

Imagine you were a programmer in the 1950s. Your day-to-day was punching machine code into cards.

In the 1970s, you were wrestling with FORTRAN. By the 1990s, maybe C++. Today? You're basically just talking to the computer in English. Turing's childhood dream is now your reality with coding agents.

Each generation got to think a little less about the machine and a little more about the fun problem they were actually trying to solve.

Modern compilers are one-way DAGs. They take Python and transform it down the stack through multiple representations until it becomes machine code. There's some upward flow – as you type in your IDE, a partial compilation happens via the Language Server Protocol, feeding information back up to create those squiggly lines and suggestions. But compilers never go fully back up the abstraction ladder. Disassembly exists, but it doesn't reconstruct your original high-level intent.

Mathematics needs something compilers never achieved: a true bidirectional cycle. For thousands of years, mathematicians have thought in high-level, intuitive leaps, not formal logic. Yet math is undergoing a modernization. With proofs now spanning hundreds of pages, they are often riddled with hidden errors. Time and again, when a proof stubbornly resists formalization, the informal human source turns out to be wrong – just recently, mistakes were found and fixed during the formalization effort for Fermat's Last Theorem. The bottom formal layer catches what high-level intuition misses.

Meanwhile, autoformalization – the process of converting natural-language proofs into Lean code – is a form of hierarchical planning, bridging the abstraction layers.

Going down the stack: Autoformalization translates intuitive mathematical reasoning into formal proof – the compiler's traditional direction.

Going up the stack: Autoinformalization translates formal proofs back into human intuition – something compilers never truly do.
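As a toy illustration of the downward direction, take the informal claim "the sum of two even numbers is even" and one possible Lean 4 formalization. This is a hand-written sketch that assumes Mathlib's `Even` predicate, not output from our system:

```lean
-- Informal: "the sum of two even numbers is even."
-- Formal (Mathlib defines `Even a` as `∃ r, a = r + r`):
theorem even_add_even {a b : ℕ} (ha : Even a) (hb : Even b) :
    Even (a + b) := by
  obtain ⟨x, hx⟩ := ha  -- a = x + x
  obtain ⟨y, hy⟩ := hb  -- b = y + y
  exact ⟨x + y, by omega⟩
```

Autoinformalization runs the other way: given the formal term, recover the one-sentence intuition a mathematician would actually state.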

When combined, these create an interactive prover in natural language, freeing mathematicians to explore dozens of strategies and rapidly prototype ideas while the machine handles formalization.

But here's the most powerful part: machine discoveries at the formal level feed back up. The machine finds patterns, lemmas, and connections in the formal space that humans never intuited, then surfaces them as new high-level insights. The mathematician's intuition becomes augmented by discoveries happening in a space they couldn't naturally explore.

The compiler revolution took decades; the mathematical compiler revolution is happening now.

Formal Verification Guides Mathematical World Modeling

Our algorithmic core

You are a gold prospector in 1849. Everyone brings you shiny rocks claiming they've struck it rich.

Experienced prospectors examine them: "Looks like gold to me."

But when confronted with exotic ores, even experts disagree. Their pattern matching fails on things they've never seen.

Then someone brings an assayer's scale. The metal either dissolves in acid or it doesn't. Binary truth.

When you write a proof, it's either correct or it's not. Formal verifiers like Lean provide perfect ground truth, while model judges are pattern matchers that fail when pushed to evaluate genuinely novel proofs. From an engineering angle, verification gives us efficiently scalable rewards for learning.
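To make the engineering point concrete, here is a minimal, self-contained Python sketch of verification-as-reward. The `toy_verify` checker is a hypothetical stand-in for the Lean kernel – in practice the reward would come from the proof checker, not from arithmetic – but the shape of the signal is the same: binary, exact, and cheap to scale.

```python
import random

def binary_reward(verify, candidate) -> float:
    # Formal verification collapses judgment to ground truth:
    # reward is 1.0 iff the checker accepts, never "looks right to me".
    return 1.0 if verify(candidate) else 0.0

def toy_verify(candidate, target=10) -> bool:
    # Hypothetical stand-in for the Lean kernel: a "theorem" is a target
    # sum, a "proof" is a pair of numbers, and checking is exact.
    a, b = candidate
    return a + b == target

def sample_and_score(n=1000, seed=0) -> float:
    # Sample candidate "proofs" and score them with the binary reward,
    # mimicking how verified rollouts would be scored during training.
    rng = random.Random(seed)
    rewards = [
        binary_reward(toy_verify, (rng.randint(0, 10), rng.randint(0, 10)))
        for _ in range(n)
    ]
    return sum(rewards) / n
```

The key design property is that the reward admits no partial credit and no drift: a candidate either dissolves in the acid or it doesn't.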

And here's the philosophical reason why we need formal verification: our self-consistent, observer-supporting universe follows rules that can be captured mathematically – from the laws of physics to probability theory. Mathematics is the consistent language of our consistent universe, and formal languages like Lean let us operate consistently in the mathematical world model.

We are training systems on mathematics as reality's minimal simulation – by learning to navigate the world of mathematics one grounded step at a time, the hope is that the AI acquires fundamental rules that our reality must follow. Video generation models learn physics too. Sometimes one ponders: where do abstract reasoning and spatial reasoning join?

Conjecturing-Proving Loop Realizes Self-Improving AI

Our discovery engine

Being able to test whether gold is real is one thing; finding new veins is harder. Working in concert with verification, we enter the chapter of scientific discovery. Imagine you're in the middle of an ocean. Sailing towards new lands, you naturally start daydreaming about mathematics:

Your map is the Knowledge Base – showing where you've been. The entire corpus of mathematics indexed into a knowledge graph: definitions, theorems, and proofs. Formalized mathematics as a searchable, Internet-scale dataset.

Your ship is the Conjecturer – navigating uncharted territories. It spots distant landmasses through fog: "something valuable three days west." Built for open-ended exploration beyond known results, it samples out of distribution and generalizes with leaps guided by intrinsic motivations.

But when you spot an unknown island on the horizon, how do you know if it's India or the West Indies? The shape looks right, the distance seems plausible, but an educated guess isn't certainty. You ask the experienced captain whose wisdom you trust – that is the Prover. Successful proofs extend the knowledge base. Failed attempts provide signals for improving both the Conjecturer and the Prover. While formal verification turns "might be true" into "is true," counterexample construction shows "is false." Both grow the library.

The loop is self-reinforcing. More verified theorems mean a richer knowledge base. A richer knowledge base enables more sophisticated conjectures. More proof attempts (successful and failed) train better models. Better models generate more interesting conjectures and find proofs faster.
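A deliberately tiny Python sketch of the loop's shape. The names (`conjecture`, `prove`, `discovery_loop`) and the additive toy domain are illustrative assumptions, standing in for a neural Conjecturer and a Lean-backed Prover:

```python
import itertools
import random

def conjecture(kb, rng):
    # Conjecturer: combine known facts, occasionally leaping beyond
    # the safe combination (the +1) to propose something riskier.
    a, b = rng.sample(sorted(kb), 2)
    return a + b + rng.choice([0, 1])

def prove(kb, claim, limit=3):
    # Prover: brute-force search for a derivation of `claim` as a sum
    # of up to `limit` known facts; returns the derivation or None.
    for k in range(1, limit + 1):
        for combo in itertools.combinations_with_replacement(sorted(kb), k):
            if sum(combo) == claim:
                return combo
    return None

def discovery_loop(seed_facts, steps=50, seed=0):
    rng = random.Random(seed)
    kb = set(seed_facts)
    for _ in range(steps):
        claim = conjecture(kb, rng)
        if prove(kb, claim) is not None:
            kb.add(claim)  # verified conjectures enrich the knowledge base
    return kb
```

Even in this toy, the self-reinforcing dynamic is visible: each verified conjecture enlarges `kb`, which widens the space of combinations the Conjecturer can propose next.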

Axiom is building the AlphaGo for mathematics, but with infinite branching.

The Path Forward

The implications extend far beyond pure mathematics. Every complex system humans want to understand – from protein folding to quantum field theory and economic models – ultimately reduces to mathematical structures. A reasoning engine that can autonomously explore mathematical space and generate new theories doesn't just solve math problems; it provides a general-purpose tool for understanding reality.

Our founding team brings together precisely the expertise needed for this revolution. We were among the first to apply AI to compilers, bringing deep experience in programming languages and compiler technology. Our work spans from AI for mathematical discovery to pioneering self-improving systems. We're building reasoning engines that can operate in the mathematical world model at superhuman scale to tackle our most complex challenges.

The mathematical renaissance isn't coming. It's here.

Axiom is here.