For the past few months I have noticed a gradual increase in anxiety levels among my academic colleagues. Some have tried AI models for solving mathematical problems and been surprised by how effective they can be. Others have never touched AI but heard that others are using it and are worried that they’ll be left behind. Others have heard that an OpenAI model is solving open problems and are worried that it will solve the problems that they have been working on. At some level there is a fundamental panic that hard mathematical problem solving may no longer be such a highly-valued skill if AI models can be used for this task, and an accompanying worry that society may lose interest in funding a large number of human mathematicians to do research as a consequence.
I have been using AI recently to assist with my mathematical research in various ways. It certainly helps with things like performing series expansions and controlling remainders, finding and applying useful inequalities, understanding new concepts by providing simple example calculations when prompted, and explaining theorems from older textbooks and papers using more accessible notation. I’ve also found it to be very useful for proofreading draft papers, both the mathematics and the text, to check for inconsistencies in the proofs and notation and for typographical errors in the writing. I also use it to rephrase sentences that read awkwardly, but I prefer to draft and edit text myself in the first instance. I have sometimes tested whether an AI model can completely solve a mathematical problem that I am thinking about from a single prompt, but usually the answer is no. Sometimes, however, if I have fleshed out the proof strategy well enough and can guess which directions are the dead ends, I can guide the AI to the solution with a few prompts and some careful checking and corrections along the way. I’ve also often found myself having developed a new algorithm that should work well on a class of problems, without being familiar with good example applications within this class. AI can certainly help find these examples through literature searches, write the code to implement them and, somewhat mercifully, make the graphs look pretty for me. I have experimented with both free and paid models, and find that the paid ones are slightly better, but not by so much that the free ones are out of their league.
But let’s assume that AI models keep advancing at the rate they have over the past two years. Is there still a future for mathematicians if chatbots can reliably do high-level mathematics well?
Perhaps I am too much of an optimist, but I think the answer is yes. It’s certainly true that AI seems to be developing into an increasingly powerful tool for mathematical problem-solving. It’s also true that certain mathematicians became famous exclusively by being powerful problem solvers. Maybe in the future this brand of mathematician will be less successful or even die out, having lost their edge if others can augment their abilities with AI tools. But mathematical problem solving, in general, does not encompass the entirety of mathematics, and personally I have always felt that other parts of the subject are undervalued.
The latest OpenAI claim that an in-house model has solved a challenging Erdős problem is a good example to study in more detail. On the one hand, the AI model solved a mathematical problem that no human has managed to solve since it was posed 80 years ago. On the other hand, this problem only existed because Paul Erdős thought of it. We only consider it to be an important problem because a human mathematician of some renown had the creativity to pose it, and the mathematical skill to state it in such a precise manner. These other parts of mathematics, which require broad understanding, taste, vision and creativity as well as precision and fundamental skill in the language of mathematics, make up a much bigger part of our subject than is often acknowledged in popular culture.
I am not sure that I have ever personally worked on a paper that began from a single well-formed conjecture and simply consisted of proving or disproving it. There are such papers, but my experience is usually a bit different. I might start with a vague idea of something that might be true, then try to think of some concrete statements that would help me understand whether it is, and try to prove or disprove these. Usually this results in some surprises, and along the way I start to formulate a clearer idea of what really might be true, and gradually through this iterative process the story reveals itself. After achieving a first set of results that I am satisfied with, there are usually follow-on questions and natural extensions that might enhance the story, and the degree to which I will explore them depends on how interesting or useful I think they might be, how likely I think a reviewer is to ask about them, and how much energy I have. I can see that access to powerful AI models will modify some parts of this process. It will make it easier to fill in many of the intermediate problem-solving steps. It will probably also mean that I explore more extensions than I might have otherwise. These often consist of comparing what I have done to some similar work and combining some of the techniques and ideas to see if this results in improvements. This is something that AI can help with, so that I do not have to read through several technical papers and do lots of hand-calculations that might lead nowhere. This kind of speculative work is easier now, so I’m more likely to do it. But the initial idea about what to explore, the broad directions to take, the reactions to initial results, the assessment of what is an interesting result and what is not: these parts will still be me.
Having access to AI-enhanced problem-solving abilities might also make me a braver mathematician, more willing to take on certain questions that may have previously ended in me getting stuck on some technical problem that a mathematician with a better grasp of the appropriate proof-techniques could easily solve.
It is possible, although maybe also a bit dangerous, to compare my experience of using AI models to that of supervising a PhD student. Please, if you will, suspend any reservations you might have and allow me to conduct this small thought experiment. The role I play as PhD supervisor consists of deciding on the right problem, translating it into bite-sized chunks and then feeding these to the student. Gradually as the student progresses the chunks grow bigger, with the hope that at some point they start to flesh out their own problems, develop their own tastes and hone their creativity and vision. In time the goal is to work as equal partners, with my own research enhanced by the ideas and opinions that they offer, as well as their mathematical skills when combined with my own. When I compare this process to that of interacting with an AI model, I think that the AI can manage to process the bite-sized chunks very effectively. But I don’t see how the current suite of models will progress much beyond this, at least not without searching through and summarising the thoughts and opinions of a substantial number of contemporary human mathematicians to assess the current trends and tastes of the field, and these thoughts and opinions must be continually replenished and updated over time. I also never considered myself to be less of a mathematician once I started supervising and the students began doing a lot of the calculations for me, and I still spend a healthy amount of time doing calculations myself to solidify my own understanding and check that everything makes sense. I suspect that this will still be the case in the age of AI-assisted mathematics.