Will OpenAI's next-gen math-focused model score at least 95% on the MATH benchmark?
Ṁ1802 · resolved Sep 16
Resolved NO
Resolves YES if OpenAI's next-generation math-focused model scores 95% or higher on the MATH benchmark.
If next-generation general models (e.g. GPT-4), code models (e.g. Codex), or any other models specialized for reasoning are released before the math model and score 95% or higher, this question also resolves YES.
Benchmarking on a subset of MATH is acceptable.
Using tools (e.g. a calculator) and code is allowed.
This question is managed and resolved by Manifold.
Why is this resolving yes? I would have thought no? https://github.com/openai/simple-evals?tab=readme-ov-file#benchmark-results
Related questions
Will any model get above human level (92%) on the Simple Bench benchmark before September 1st, 2025?
36% chance
Will OpenAI Release a Model Capable of Reliably performing Gradeschool Math from Reasoning by Jan 1, 2025?
77% chance
Will OpenAI models achieve ≥90% on SimpleBench by the end of 2025?
38% chance
Will OpenAI o1 (or any direct iteration) get gold on any International Math Olympiad by the end of 2025?
54% chance
Will OpenAI have the most accurate LLM across most benchmarks by EOY 2024?
39% chance
Will OpenAI's next major LLM (after GPT-4) surpass 70% accuracy on the GPQA benchmark?
66% chance
Will OpenAI release their o1 model before 2025?
83% chance
By the end of Q2 2025 will an open source model beat OpenAI’s o1 model?
61% chance
By the end of Q1 2025 will an open source model beat OpenAI’s o1 model?
24% chance
Will OpenAI's next major LLM (after GPT-4) surpass 74% accuracy on the GPQA benchmark?
81% chance