The (near) Future of AI in Math

I believe that the way mathematicians conduct research will be fundamentally different in the near future. Below, I’ll describe my unique perspective, developed from working as a mathematician and also in big tech developing AI solutions. Whilst AI will play a pivotal role, I believe the key next step is the adoption of software best-practices.

Current State of Mathematical Research

This the current lifecycle of a research paper

Brainstorm to strategize for ideas to explore
Convert Idea into a concrete strategy
Convince oneself that strategy will work
Work on the implementation details
Edit and verify correctness of paper
Submit Paper to a journal
Third Party reviews paper for correctness/value

To an outsider, it may …

Current State of Mathematical Research

This the current lifecycle of a research paper

Brainstorm to strategize for ideas to explore
Convert Idea into a concrete strategy
Convince oneself that strategy will work
Work on the implementation details
Edit and verify correctness of paper
Submit Paper to a journal
Third Party reviews paper for correctness/value

To an outsider, it may come to a surprise that it is not uncommon for step (7) to be the most painful step in this journey. Reviews can take years and are somewhat subjective. Reviewers are unpaid and need to balance this duty with other competing goals. This is especially painful for early-career mathematicians. A postdoc can be awaiting a career changing decision whilst a paper sits idly in an anonymous review’s mailbox.

Verifying correctness of a paper is a thorny issue. Even after self and peer review, it is not uncommon to find papers with unfixable mathematical flaws. There are many stories of top-tier mathematicians with such errors.

Computer Formalization of Mathematical Proofs

These problems can largely be solved with software engineering. Formal theorem provers, like lean, provide a way to confirm proofs are correct. Thus, if new papers were written in lean, there would be no need for humans to check the validity of the proof. This may half the work for writers of the paper and decrease the work of a reviewer — assuming they actually do check validity — by an order of magnitude.

This sounds great, but in practice there are serious challenges. First off, writing a formal proof in lean currently quite challenging on its own. Mathematicians typically skip details that other experts can infer. Computers do not allow such a luxury. Every tiny step must be justified. Whilst this is a challenge that is immediately apparent when starting to use lean (see my previous blog post), I do not see this as a long term obstacle. The use of computers allows for automation and tooling, which will ease the burden in the long run.

Tooling

Good tooling makes software development quicker, easier and less error prone — and even more enjoyable. For instance, a good IDE, can allow developers to catch and avoid bugs quickly and seamlessly navigate large code bases. Lean also offers tooling (called tactics) that allows developers to go from a statement that is “almost” proved to fully proved.

Another promising direction for good tooling is AI. Tools like Github Copilot and Claude Code give useful autocomplete and even fully code based on structured requirements. Currently these tools have limited effectiveness for Lean specific tasks — likely due to limited available data for good lean proofs. However, I do not see this as a long term blocker.

There is a virtuous cycle: as the tooling gets better, more theorems will be converted to lean, further improving the available data.

Code Migration

In order for computer formalization to work at scale, we need to formalize existing math. This is a monumental challenge. Think of this a code migration, where:

Legacy code (i.e. math papers) is not formal with incomplete — and sometimes incorrect — implementations
Contributions have been made from hundreds of thousands of developers over the course of 2000 years.
The subject matter experts typically have no experience in software engineering
Research communities are largely siloed, with projects usually having less than 5 team members

Code migrations at much smaller scales are already incredibly challenging. Currently, what exists is one large code repository that is used to host all the different code (monorepo). Inside this repository, we can find various folders containing proofs coming from different areas of mathematics. The monorepo approach is known to work at large scale systems, such as at Meta or Google.

The Vision

We wonder which of the steps

Brainstorm to strategize for ideas to explore
Convert Idea into a concrete strategy
Convince oneself that strategy will work
Work on the implementation details
Edit and verify correctness of paper
Submit Paper to a journal
Third Party reviews paper for correctness/value

can be transformed by adopting the advances in software engineering and AI. Already, mathematicians are using ChatGPT to brainstorm research ideas and even come up with proofs of simple lemmas. The latter can easily save hours of time.

Converting an idea to a concrete strategy and convincing oneself it will work is a trickier task. This usually involves deep domain expertise if one wants to be at the cutting edge of research.

Writing the implementation details and checking them is a long process. This usually involves breaking a large problem into many sub-problems and then dilligently verifying the correctness of every sub-problem. While not quite ready now, this is where good tooling could play a pivotal role. I believe a system that takes a statement of a lemma as an input and (usually) provides a correct formal proof as output should be within reach. We have some early signs of this, both through ChatGPT and also by using AI helpers within the lean code base itself. Nevertheless, there is much room for improvement. For a deep dive, Terry Tao posted a series of Youtube videos working through the same lemma with several different AI helpers.

Once we have formally verified the paper, all that is left to do is for the reviewer to provide feedback to decide on whether the paper will be accepted. This usually involves understanding the novelty and impact of a paper. For domain experts, this is a quick process. Furthermore, this is something that researchers should already be doing, as it is important to have a mental map of the research in one’s area.

Conclusion

Mathematics will inevitably adopt the tooling from software engineering and AI. Painful steps in mathematical research will be largely replaced, such as verification and certain implementation details. AI and software tooling will be a force multiplier, allowing individual mathematicians to accomplish much more than they can today.

Current State of Mathematical Research

Current State of Mathematical Research

Computer Formalization of Mathematical Proofs

Tooling

Code Migration

The Vision

Conclusion

Similar Posts