Will Gemini achieve a score above 90% on the MATH benchmark? | Manifold

Resolved YES on Sep 16

The current SOTA is 84.3% from GPT-4 Code Interpreter. Code & tool use is allowed.

I should have specified the exact model. What I intended was the first Gemini 1.0 family, not the entire Gemini series. My bad, guys. Since the question itself can be read as covering the whole Gemini series, I am resolving this to YES.

Since this market has no restrictions on public availability or zero-shot evaluation, I think it should probably already resolve YES per the Gemini 1.5 report.

This is a different MATH benchmark from the one Google reported on. And I don't see it beating GPT-4 by so much, given that most of the other scores were very close.

limit order for yes 10

Related questions

Will Gemini achieve a higher score on the SAT compared to GPT-4?

Will Gemini-1.5-Pro-Exp-0801 Score Above 1165 in Scale AI's Math Evaluation

Will Gemini-1.5-Pro-Exp-0801 Score Above 1165 in Scale AI's Coding Evaluation

Will Gemini Ultra be on the chatbot arena leaderboard before the end of 2024?

Will "Gemini [Ultra, 1.0] smash GPT-4 by 5x"?

Will Gemini exceed the performance of GPT-4 on the 2022 AMC 10 and AMC 12 exams?

Will Gemini outperform GPT-4 at mathematical theorem-proving?

Will Gemini-1.5-Pro-Exp-0801 Score Above 90.35 (current #1) in Scale AI's Instruction Following Evaluation

Will Gemini Ultra outperform GPT-4V on visual reasoning by the end of 2024?

What will be true of Gemini 2?
