Will Claude 3.5 Opus beat OpenAI's best released model on the arena.lmsys.org leaderboard? | Manifold

49

100Ṁ5382

2026

9% chance

OpenAI's best released model could be GPT-4, GPT-4o, or something else. A model does not count as an OpenAI model unless it is publicly available to try and is known to be from OpenAI (e.g. it cannot be a secret, pseudonymous release). If arena.lmsys.org is not available at the time, its successor site or the most similar leaderboard will be used.

Resolves YES if Claude 3.5 Opus is ranked above all OpenAI models one week after it is added to the leaderboard.

Update 2025-01-01 (PST) (AI summary of creator comment):
- Models must be listed on lmarena to be counted.
- Examples:
  - o1 pro does not count since it's not on the arena.
  - Regular o1 does count.
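
The resolution rule and the update above can be read as a simple check on a leaderboard snapshot taken one week after Claude 3.5 Opus is listed. The following is only an illustrative sketch, not the creator's actual procedure: the `Entry` type, the `resolves_yes` function, the vendor labels, and the ranks in the example are all hypothetical. It also assumes that a tie does not count as "ranked above", which the market description does not specify.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """One row of a leaderboard snapshot (hypothetical structure)."""
    name: str    # model name as shown on the leaderboard
    vendor: str  # publicly known developer, e.g. "OpenAI"
    rank: int    # 1 is the top of the leaderboard


def resolves_yes(snapshot: list[Entry], target: str = "Claude 3.5 Opus") -> Optional[bool]:
    """Return True/False for this snapshot, or None if the target is not listed.

    Only models present in the snapshot are considered, so a model that is
    not on the arena (such as o1 pro in the update above) never counts.
    """
    target_rows = [e for e in snapshot if e.name == target]
    if not target_rows:
        return None
    target_rank = target_rows[0].rank
    openai_ranks = [e.rank for e in snapshot if e.vendor == "OpenAI"]
    # Assumption: the target needs a strictly better (smaller) rank than every
    # listed OpenAI model. With no OpenAI models listed this is vacuously True.
    return all(target_rank < r for r in openai_ranks)


# Hypothetical snapshot; these ranks are made up for illustration.
example = [
    Entry("o1", "OpenAI", 1),
    Entry("Claude 3.5 Opus", "Anthropic", 2),
    Entry("GPT-4o", "OpenAI", 3),
]
print(resolves_yes(example))
```

In this made-up snapshot the check fails because regular o1 is listed and ranked higher; o1 pro is absent from the list, so it plays no role.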

Technical AI Timelines

If the model is not on lmarena, then it will not count. For example, o1 pro does not count now since it's not on the arena. Regular o1 does count.

People are also trading

Will Claude 3.5 Opus have a higher Chat Arena Elo than GPT-5?

What will Claude 3.5 Opus's reported 0-shot performance on GPQA Diamond be upon release?

What will be the *first* ELO Rating of Claude 3.5 Opus in the LMSYS Arena?

Will Claude Opus be ranked in the top 20 on the Chatbot Arena Leaderboard two years from today (3/10/24)?

Will Claude 3.5 Opus be available via API by end of 2025?

Will the top model by OpenAI rank 3rd (or lower) behind 2 other model families at any point before 2026?

Will Claude 3.5 Opus be able to draw me in tic-tac-toe while playing as O at least 1/3 of the time?

Will Gary Marcus dunk on OpenAI's next big model release by saying that the model still fails in predictable ways?
