Will there be a model with a 69%+ Chatbot Arena win rate against gpt-o1 before June 1st, 2025? | Manifold

56

1kṀ15k

Jun 2

87% chance

Before June 1st, 2025, will any model have a win rate of 69.00%+ against all versions of OpenAI's 'o1' family on Chatbot Arena? The win rate is determined by the Fraction of Model A Wins for All Non-tied A vs. B Battles table on Arena's website.
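As a rough illustration of the metric, the win rate with ties excluded is the candidate's wins divided by all decided battles. The sketch below is not Arena's code; the function name and the win/loss counts are hypothetical and chosen only to show where the 69.00% line falls.

```python
def non_tied_win_rate(wins_a: int, wins_b: int) -> float:
    """Fraction of decided (non-tied) battles won by model A.

    Ties are assumed to be excluded before counting, matching the
    'Non-tied A vs. B Battles' wording in the market description.
    """
    decided = wins_a + wins_b
    if decided == 0:
        raise ValueError("no non-tied battles to compute a win rate from")
    return wins_a / decided


# Hypothetical counts: with 69 decided battles, 48 wins clears 69.00%
# while 47 wins does not.
print(non_tied_win_rate(48, 21) >= 0.69)  # True
print(non_tied_win_rate(47, 22) >= 0.69)  # False
```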

The following naming patterns will count as 'o1':

o1
*-o1: gpt-o1, chatgpt-o1, openai-o1, etc.
o1-*: o1-mini, o1-preview, o1-beta, o1-advanced-2025-01-01, etc.
*-o1-*: gpt-o1-latest, chatgpt-o1-advanced, etc.

Examples of what won’t count: o2, gpt-o2, gpt-o1b-latest, gpt-o10, gpt-ao1-latest, etc.
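One way to read the patterns above is that 'o1' must appear as a complete hyphen-separated token in the model name. The following is a minimal sketch of that reading, not an official resolution tool; the function name is hypothetical, and the case-sensitive comparison is an assumption.

```python
def is_o1_family(model_name: str) -> bool:
    """Return True if 'o1' is a whole hyphen-delimited token of the name.

    Covers the four listed shapes: o1, *-o1, o1-*, and *-o1-*.
    Assumption: matching is case-sensitive and hyphens are the only separator.
    """
    return "o1" in model_name.split("-")


# Names taken from the market description.
counts = ["o1", "gpt-o1", "o1-mini", "o1-advanced-2025-01-01", "gpt-o1-latest"]
excluded = ["o2", "gpt-o2", "gpt-o1b-latest", "gpt-o10", "gpt-ao1-latest"]

print(all(is_o1_family(name) for name in counts))        # True
print(any(is_o1_family(name) for name in excluded))      # False
```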

Additional resolution criteria

There must be at least 69 battles (excluding ties) between the new model and o1, to give the results statistical power. Arena publishes the battle count in the Battle Count for Each Combination of Models (without Ties) table.
The new model must stay above 69.00% for at least 1 week, to ensure it's not a fluke.
Market can resolve to Yes early.
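The criteria above can be sketched as a check over daily snapshots of the Arena tables. This is one possible interpretation, not the market creator's procedure. All names here (`Snapshot`, `resolves_yes`, and so on) are hypothetical, and the sketch assumes that the 69-battle minimum applies to each o1 variant separately, that a win rate of exactly 69.00% qualifies, and that the full week must elapse before June 1st, 2025.

```python
from dataclasses import dataclass
from datetime import date, timedelta

WIN_RATE_THRESHOLD = 0.69          # 69.00%+
MIN_BATTLES = 69                   # non-tied battles required
MIN_STREAK = timedelta(days=7)     # "at least 1 week"
DEADLINE = date(2025, 6, 1)        # "before June 1st, 2025"


@dataclass
class Snapshot:
    day: date
    # For one candidate model: o1-family name -> (win rate, non-tied battles).
    vs_o1: dict[str, tuple[float, int]]


def snapshot_qualifies(snapshot: Snapshot) -> bool:
    """True if the candidate clears both bars against every listed o1 variant."""
    return bool(snapshot.vs_o1) and all(
        rate >= WIN_RATE_THRESHOLD and battles >= MIN_BATTLES
        for rate, battles in snapshot.vs_o1.values()
    )


def resolves_yes(snapshots: list[Snapshot]) -> bool:
    """True if a qualifying streak spans at least a week before the deadline."""
    streak_start = None
    for snapshot in sorted(snapshots, key=lambda s: s.day):
        if snapshot.day >= DEADLINE:
            break
        if snapshot_qualifies(snapshot):
            if streak_start is None:
                streak_start = snapshot.day
            if snapshot.day - streak_start >= MIN_STREAK:
                return True
        else:
            streak_start = None  # a dip below the bar resets the week
    return False
```

Because the market can resolve to Yes early, the function returns as soon as one full qualifying week is found rather than waiting for the deadline.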

Edge cases

If Chatbot Arena stops publishing the win rate table, then the last published win rate will be used as the final rate.
Same applies if the Arena shuts down for any reason, or if 'o1' is no longer ranked, or if OpenAI shuts down 'o1' for any reason.
Hacks, glitches or bugs will not count.

Current state

As of Sep 19th, 2024, o1-preview holds the #1 spot on the Arena. The sample size is small, but here's how it stacks up against other model families:

gpt-4o: o1-preview has a 55.43% win rate vs chatgpt-4o-latest-20240903
claude: 57.45% vs claude-3-5-sonnet-20240620
grok: 59.62% vs grok-2-2024-08-13
gemini: 68.52% vs gemini-1.5-pro-exp-0827
gpt-4: 75.00% vs gpt-4-1106-preview

Technical AI Timelines

Chatbot Arena Leaderboard

People are also trading

Will GPT-4-Turbo be ranked in the top 20 on the Chatbot Arena Leaderboard at the end of 2025?

Will there be an AI language model that strongly surpasses ChatGPT and other OpenAI models before the end of 2025?

Will GPT-5 top the LLMSys Chatbot Arena leaderboard within a month of its release?

Top Chatbot Arena model uses hidden CoT on July 25th, 2025

In 2028, will I use a chatbot that can win >25% of Turing test games (defined within) where I am the judge?

Will chatgpt stop calling itself a "chatbot" by 2027?

What will be the highest ELO on Chatbot Arena on Jan 1, 2025?

opened a Ṁ42,069 YES at 85% order

Convinced by Deep Research

sold Ṁ17 NO

Gemini exp 1206 already at 63%. I think this is pretty likely from gemini 2 pro.

This says nothing about model capabilities, but what the overall board is. :)

opened a Ṁ1,000 YES at 51% order

@traders GPT-o3 has been announced this week. Put up a small limit at 51%.

bought Ṁ20 YES

69% probability of 69%+ win rate, is that on purpose lol?

god has infinite foresight and decided it for us

@TheAllMemeingEye opening a large limit NO at 68% to snipe the OCD traders

pure evil

bought Ṁ250 NO

now that it’s ruined im bringing it to sensible probabilities
