Benchmark Gap #4: Once a single AI model solves >= 95% of miniF2F, MATH, and MMLU STEM, how many months will it be before an AI is listed as a (co) first author on a published math paper?
37 expected
This question is meant to measure the gap between solving the main math-based benchmarks that existed at the time of market creation and contributing to real-world mathematics.
The co-first-author requirement is loose: I will also accept an AI being credited with significant contributions to both deciding what to prove and the actual proof. Merely contributing to the proof is not enough: I am trying to get at "the AI does the work of a mathematician", not "the AI does the work of a proof assistant". I would also accept, for instance, the human author of the paper saying that they would have named the AI as a coauthor if it were human, or that the result could not have been obtained without the AI's assistance.
This question is managed and resolved by Manifold.
Related questions
Will an AI achieve >85% performance on the FrontierMath benchmark before 2028?
61% chance
Will an AI achieve >30% performance on the FrontierMath benchmark before 2026?
28% chance
Benchmark Gap #5: Once a single AI model solves >= 95% of miniF2F, MATH, and MMLU STEM, will it be less than two years before AI models are used as entry-level data science / data analysis / statistics workers?
67% chance
Will an AI score over 30% on FrontierMath Benchmark in 2025?
26% chance
Will an AI co-author a mathematics research paper published in a reputable journal before the end of 2026?
38% chance
Will an AI achieve >85% performance on the FrontierMath benchmark before 2027?
32% chance
When will an AI win the $5 million AI Math Olympiad Prize?
Will AIs be widely recognized as having developed a new, innovative, foundational mathematical theory before 2030?
32% chance
What year will the first AI exceed 80% on MLE-bench?
Benchmark Gap #3: Once a model achieves superhuman performance on a competitive programming benchmark, will it be less than 2 years before there are "entry level" AI programmers in industry use?
73% chance