Imagine that any math problem you can write down on a piece of paper that a team of Fields medalists can solve, AI can as well. Until recently, I would've predicted that that was an AGI-complete problem. Of course people used to think grandmaster-level chess would require AGI. Until 2022 I was sure that commonsense reasoning and being able to explain jokes would require AGI.
If an AI with subhuman general intelligence can be a superhuman mathematical intelligence, that will be another big update for me.
FAQ
1. What if AGI happens first?
This is a conditional prediction market. If AGI, as defined in my other market, happens first, this resolves N/A.
2. Does the AI need to max out the FrontierMath benchmark for this to resolve YES?
Yes, and every math benchmark, plus gold-medal performance on the International Math Olympiad. Even acing the Putnam.
3. What if it's essentially true but there are rare exceptions?
The spirit of the question is that we'd only consider an AI failure to be an exception if it failed for a reason other than being insufficiently brilliant at math, like tricksy wording or trick questions. The posing of the question has to be non-adversarial.
4. What about a book-length question?
Tentative answer so far: The problem has to be posed on a single human-readable sheet of paper or equivalent. But a question can cite any peer-reviewed math paper as background. (Dumping an impenetrable tome on the arXiv doesn't count.) If you have an example where this feels limiting, let me know. My suspicion is that all interesting math problems can be posed on a single page and in any case it won't harm the spirit of this question to limit ourselves to such.
5. What about research taste?
That's a big part of being a mathematician and isn't required for this market. The AI just has to be superhuman at answering questions, not asking them.
6. What about cost and speed?
The AI has to dominate the best humans on all metrics. We'll find an authoritative source for the market value of mathematicians' time if it comes down to that.
7. What about availability to the public?
Not required. If there's any doubt about the veracity of claims that this has been achieved, we'll discuss and delay resolution as needed.
8. What if the AI is sometimes super- and sometimes sub-human at math?
In some senses that's already the case, but there may be ambiguous edge cases. As an extreme example, imagine that the AI is so blatantly superhuman that it cracks a famous open problem, yet it's routinely stumped or wrong on problems human mathematicians can do. For the spirit of this market, we'll try to assess whether we'd consider a human with the AI's math abilities to be the greatest mathematician of all time. (Or the greatest raw math prodigy of all time -- see the FAQ on research taste for the distinction between solving problems and knowing what questions to ask. The latter is a key part of being a successful mathematician and is explicitly not part of this prediction.)
Related markets
https://manifold.markets/dreev/in-what-year-will-we-have-agi
https://manifold.markets/jack/will-an-ai-outcompete-the-best-huma-cj3ul7a2g2
https://manifold.markets/MatthewBarnett/will-an-ai-achieve-85-performance-o
https://manifold.markets/Manifold/what-will-be-the-best-performance-o-nzPCsqZgPc
This other market is 50-50 on SOME Millennium Prize problem being solved by the beginning of 2030, by human, AI, or a collaboration, and that market does not condition on there not being AGI:
https://manifold.markets/Inosen_Infinity/will-at-least-one-of-the-remaining
Feels like we are well on our way to this happening. In just two years we’ve gone from “guessing randomly on most AMC questions” to “able to get an average USAMO score.” Math also feels like an especially scalable field, because checking its work is easy to automate, and one where having encyclopedic knowledge of every theorem and proof technique (and the ability to try them far faster than a human) would be very useful.
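(The "checking is easy to automate" point is what proof assistants deliver. As a toy illustration, in Lean 4 a proof either compiles or it doesn't, so no human grading is in the loop; the theorem name here is made up for the example:)

```lean
-- Machine-checked proof: Lean's kernel either accepts this term or
-- rejects it, with no human judgment required. That mechanical
-- verifiability is what makes math unusually amenable to automated
-- self-checking at scale.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```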
@dominic We also went from the first flight in 1903 to the moon landing in 1969, but it's 2025 and we still haven't been to Mars or even back to the Moon recently.
When it comes to USAMO/IMO, everyone knew geometry and functional equations would be easy; beyond that, so far it's basically just solved some simple one-step problems 1 and 4, with tons and tons of computation time. LLMs are basically useless here; only AlphaProof can do this. AlphaProof is also quite weird: it's just given the statements and it proves things until it finds the answer. Olympiads are more suited to that than research problems.
There is no question that what AI has done so far is impressive, and I don't mean to be a super AI skeptic. I do think I'll live to see AI "solve math" but 4.5 years is way too quick. We are still a long way off.
@nathanwei Not referring to AlphaProof here, just looking at the success of Gemini 2.5 Pro / o3 on the USAMO benchmark: https://matharena.ai/. o1 basically could not solve any USAMO problems whereas o3 can solve 1/4 about half the time, which is a pretty big step up. It's likely to slow down, but how fast? How long do you expect it to take before a model aces USAMO? I wouldn't be surprised if it happens within a year from today.
@dominic I would be very surprised if an LLM aces USAMO within a year. This is an LLM? Not AlphaProof? AlphaProof has some chance to sweep USAMO/IMO within a year (I'm not as bullish as some others, but it's not impossible) but I think that LLMs have no chance.
@nathanwei Yeah, referring to LLMs. I don't think it's guaranteed by any means, but I wouldn't bet against an early 2026 LLM getting 90%+ on USAMO, and my average expectation is probably like 70%. If the rapid progress stops I'd definitely become much less optimistic however.
@nathanwei Yeah, this would still be gobsmacking, and it's breaking my brain trying to decide what I think the right probability is.
@dreev This is also CONDITIONAL on no "AGI" right? What if we have "AGI" but not "ASI" and this has not been done yet? Does this resolve NO?
@nathanwei Correct, and we're defining AGI the same way my other market does, in terms of automating away essentially all human labor that can be done remotely, i.e., via the internet. See FAQ 1.
@bohaska Great question, like imagine that the AI is so blatantly superhuman that it cracks a famous open problem, yet it's routinely stumped or wrong on problems human mathematicians can do. I'm thinking the way to resolve in that case should be based on whether we'd consider a human with the AI's math abilities to be the greatest mathematician of all time.
Is that sounding fair? (I'm genuinely asking; don't trade based on this until I update the FAQ!)
@dreev I think that if AI does not make human mathematicians obsolete, I'll resolve NO. An AI that could write 300 Annals papers by constructing counterexamples to lots of conjectures might be considered the greatest mathematician of all time by some metrics but it would be stupid to resolve YES because of that.
@nathanwei No no, problem solving is right in the title, and see the FAQ on research taste. It is not required to make human mathematicians obsolete for this to resolve YES.
@Sebastianus Do they though? Or is it that people have always thought of AGI as roughly "the intelligence of humans" or "able to think, learn and solve problems across an arbitrary range of domains," and similar, while the narrow capabilities that were thought to be bottlenecks to AGI turned out to be easier than AGI?
@DavidHiggs Exactly. We want to find the simplest sufficient condition for AGI. We're not moving goalposts; we just keep learning that candidates like chess, explaining jokes, and perhaps all of math aren't sufficient after all. This is a hard meta problem.
@dreev Actually, as soon as I've said that, I'm thinking 50% is too high. So I'm going to dive in after all. As usual, I'll be extremely mindful of my conflict of interest and I commit to making the resolution fair, outsourcing the final decision if needed. I'll also be entirely transparent about my thinking. Ask me anything!