Daniel Litt is a professor of mathematics at the University of Toronto. He has been active in evaluating AI models for several years and is generally seen as a skeptic pushing back against hype. He has published a very interesting statement updating his thoughts:
In March 2025 I made a bet with Tamay Besiroglu, cofounder of RL environment company Mechanize, that AI tools would not be able to autonomously produce papers I judge to be at a level comparable to that of the best few papers published in 2025, at comparable cost to human experts, by 2030. I gave him 3:1 odds at the time; I now expect to lose this bet.
Much of what I’ll say here is not factually very different from what I’ve written before. I’ve slowly updated my timelines over the past year, but if one wants to speculate about the long-term future of math research, a difference of a few years is not so important. My trigger for writing this post is that, despite all of the above, I think I was not correctly calibrated as to the capabilities of existing models, let alone near-future models. This was more apparent in the mood of my comments than their content, which was largely cautious.
To be sure, the models are not yet as original or creative as the very best human mathematicians (who is?), but:
Can an LLM invent the notion of a scheme, or of a perfectoid space, or whatever your favorite mathematical object is? (Could I? Could you? Obviously this is a high bar, and not necessary for usefulness.) Can it come up with a new technique? Execute an argument that isn’t “routine for the right expert”? Make an interesting new definition? Ask the right question?
…I am skeptical that there is any mystical aspect of mathematics research intrinsically inaccessible to models, but it is true that human mathematics research relies on discovering analogies and philosophies, and performing other non-rigorous tasks where model performance is as yet unclear.