Will o1 (not preview) achieve a better score on LiveBench coding than Claude 3.5 Sonnet 10/22?
Ṁ75
Jan 1
75% chance
Per LiveBench.ai, Claude 3.5 Sonnet 10/22 scores 67.13 on coding, while o1-preview scores only 50.85.
Resolves when o1 is added to the LiveBench leaderboard.
This question is managed and resolved by Manifold.
Related questions
Before February 2025, will a Gemini model exceed Claude 3.5 Sonnet 10/22's Global Average score on LiveBench?
55% chance
Will Claude 3.5 Opus beat OpenAI's best released model on the arena.lmsys.org leaderboard?
29% chance
What SimpleBench percentile range will full o1 achieve?
How well will OpenAI's o1 (not o1-preview) do on the ARC prize when it's released if tested?
Will Claude 3.5 Opus be able to draw me in tic-tac-toe while playing as O at least 1/3 of the time?
68% chance
Before February 2025, will a Gemini model exceed Claude 3.5 Sonnet 10/22's Global Average score on Simple Bench?
55% chance
Will I judge GPT-5 to be smarter than o1 (not preview) after both are released?
77% chance
Will GPT-5 perform better than o1 (not preview) at AIME 2024, Codeforces elo, GPQA, or the 2024 ioi?
66% chance
What will Claude 3.5 Opus's reported 0-shot performance on GPQA Diamond be upon release?
Is Claude 3.5 Sonnet a distilled or quantized version of a larger model?
43% chance