EOY 2025: Will open LLMs match closed-source LLMs on coding to within 50 ELO points? | Manifold

39% chance

On December 31, 2025, will the best closed-source LLM on the LMSys code arena outperform the best open-weights LLM by less than 50 points?

As of July 27, 2024, the gap is 58 ELO points.
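For intuition about what a gap of this size means, the sketch below converts a rating gap into an expected head-to-head win rate. It assumes the standard Elo logistic curve (base 10, scale 400); the function name `expected_win_rate` is illustrative, and the leaderboard's exact fitting procedure may differ.

```python
def expected_win_rate(gap: float) -> float:
    """Expected score of the higher-rated model under the standard Elo curve.

    Assumption: ratings follow the usual base-10, 400-point logistic scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


# 50 is the market threshold; 58 is the gap quoted for July 27, 2024.
for gap in (50, 58):
    print(f"{gap}-point gap -> {expected_win_rate(gap):.1%} expected win rate")
```

Under that assumption, both gaps correspond to the stronger model winning somewhat under 60% of head-to-head comparisons.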

If LMSys ceases to exist or to evaluate models, I will resolve to 50%.

If a model is open-weights but the LMSys eval uses an API (e.g. deepseekv2-API), it still qualifies as open-weights, unless I get evidence that the API version was different enough to affect this question; in that case I would resolve to 50%.
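The resolution rules above can be sketched as a small function. This is only an illustration of the stated criteria, not an official resolver: the name `resolve`, the use of `None` to mean "LMSys no longer evaluates models", and the ratings in the usage lines are all hypothetical.

```python
from typing import Optional

THRESHOLD = 50.0  # points, from the market question


def resolve(
    best_closed: Optional[float],
    best_open: Optional[float],
    api_version_differs: bool = False,
) -> str:
    """Return "YES", "NO", or "50%" following the criteria in the description.

    Assumption: None for either rating stands for LMSys having ceased to
    exist or to evaluate models.
    """
    if best_closed is None or best_open is None:
        return "50%"
    # Evidence that the evaluated API differed from the open weights.
    if api_version_differs:
        return "50%"
    gap = best_closed - best_open
    # "By less than 50 points" is strict; a negative gap (open ahead) also passes.
    return "YES" if gap < THRESHOLD else "NO"


# Hypothetical ratings chosen to reproduce a 58-point gap.
print(resolve(1258.0, 1200.0))
print(resolve(1240.0, 1200.0))
```

A gap of exactly 50 points resolves to "NO" in this sketch, since the question asks for less than 50.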

Chart from https://x.com/maximelabonne/status/1779801605702836454. It shows all-question ELO, whereas this market resolves on coding-only ELO; the trend is similar.

https://x.com/amebagpt/status/1836875571906666836

The LMSYS main arena gap over time (1st vs 2nd, not necessarily OS)

If no one objects, I'll update the question to read: "We'll go along with any LMSys evaluation updates: e.g. if there's a code-hard or code-style control, etc., we'll use whatever the fanciest LMSys eval ends up being, as long as it's code-only."

For clarification: if an open-source LLM overtakes the best closed-source one, will the market resolve as "Yes"?

Yes

Thanks for the clarification. I would buy "Yes". I expect that even in the worst case, open source will advance at a similar speed to closed source. I think the Arena will eventually saturate, which will artificially shrink the gap between the top tiers.
