Who will have the best LLM at the end of 2024 (as decided by ChatBot Arena)?


791

Ṁ750k

Dec 31

Google: 78%

OpenAI: 19%

Anthropic: 1.4%

xAI

I was browsing Twitter and saw a post by Karpathy speaking positively about ChatBot Arena, a platform that ranks LLMs based on human ratings. As expected, OpenAI is holding positions 1, 2, and 3. I wonder which company will be #1 at the end of 2024.

Screenshot of the rankings table taken on the 13th of December:

This question is managed and resolved by Manifold.


25 Comments

746 Holders

4.4k Trades


@traders Based on the comments below, I think it makes sense to resolve this question based on the ELO rating in case of a tie in "rank." When I created this question, a tie was not an option, so I doubt anyone even traded based on this assumption.

I created a similar question that only uses the rank. Feel free to trade on it.
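The tie-break rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (best rank wins, and a tie in rank goes to the higher Elo rating), not how Manifold or the Arena leaderboard computes anything. The `Entry` type, the `resolve_winner` helper, and the company names and scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    company: str  # organization that owns the model
    model: str    # model name as listed on the leaderboard
    rank: int     # leaderboard rank (1 is best; ties are possible)
    elo: float    # Elo-style rating shown next to the rank


def resolve_winner(entries):
    """Return the company with the best rank, breaking rank ties by higher Elo."""
    if not entries:
        raise ValueError("leaderboard is empty")
    # Sort key: lowest rank first, then highest Elo among equal ranks.
    best = min(entries, key=lambda e: (e.rank, -e.elo))
    return best.company


# Made-up numbers for illustration only, not real leaderboard data.
leaderboard = [
    Entry("CompanyA", "model-a", 1, 1365.0),
    Entry("CompanyB", "model-b", 1, 1374.0),
    Entry("CompanyC", "model-c", 3, 1300.0),
]
winner = resolve_winner(leaderboard)
```

With these made-up entries, CompanyA and CompanyB are tied at rank 1, so the higher Elo decides it in favor of CompanyB.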


How come o1 isn't on the list on the chatbot arena?

Gemini flash 2.0 strawberry in the api

https://ai.google.dev/gemini-api/docs/thinking-mode

10k limit order @75% for anyone feeling brave

bought Ṁ500 YES

@WillSorenson it is slightly short of exp 1206. Are you assuming a thinking 1206 will be added?

@Usaar33 It appears more pleasant than o1 to me so it makes it unlikely o1 will top the charts. The following all have to go right for OAI to win:

1. They have to release a new model today
2. It has to actually be better in the dimensions that chatbot arena evaluates
3. Chatbot arena has to update it in time.

Possible! Not more than a 20% chance.

Google deepmind was and is severely underrated by this market. The odds are looking more reasonable now though

@AJama The rumor is that OpenAI will release GPT-4.5 soon.

@NeuralBets i would give it an 80-90% chance that OpenAI releases a new model as part of their 12 days of christmas but I am not sure they will make it available to LMSYS before end of the year - i am too deep at this point anyways so 🤷‍♂️

i am too deep at this point anyways so 🤷‍♂️

hah same 😅

opened a Ṁ10,000 YES at 40% order

@Bayesian right now this position represents ~80% of my mana net worth but i am doubling down and put a large limit order at 40% on openai

@JasonDavies @EliLifland FYI

@Soli it should be said that a new model doesn’t mean that it will become #1. Reason 1: google may have fine-tuned to perform way better on lmsys. Reason 2: google may have another fine-tuned model ready to answer any score release from OAI. Maybe google ceo and PM have their compensation tied to end-year performance on LMSYS

@mathvc true, openai released the new preview model over the api yesterday (which is still not ranked in LMSYS) and I expect another major announcement sooon so we shalll seee how it goes

Gemini 1206 is now the top model in all categories by a small margin, yet people think OAI will be better at the end of the year (59% at the moment). Do people believe in a new release? GPT-4.5?

@mathvc I think it's more a question of how often the leader board is updated.

I agree with your stance, I just don't know if I want more exposure to this market with my novice level of understanding of the subject.

is openai planning on doing another update before end of year? they used to be like every 2 weeks, but google lately has started that schedule

@NoahRich gpt 4.5 will be released this year

@Soli it is not announced anywhere officially

bought Ṁ50 YES

@mathvc i know, it's a prediction

damn, it didn't even take the next 2 weekly openai update for the gemini model to drop below 1.

bought Ṁ250 YES

@JasonDavies Google limit order of NO at 45%

https://x.com/lmarena_ai/status/1859673146837827623


interesting shift

bought Ṁ500 YES

@NoahRich i don’t think it is that interesting, if anything it shows google is out of the race for spot #1 this year. openai will pass them with the next minor update to 4o. they won’t even need to release a new model to pass google.

@Soli both could release another minor update in the time. there have even been reports shared here previously suggesting the potential lol

besides, it's interesting that google on its own has even reached this point. up until now they have been pretty far off, especially given their compute potential. interesting if they are finally starting to make use of their leg up in funding potential and compute

@NoahRich google reached this point already in july/august when they were ranked #1 for 1-2 weeks (see this other market that resolved yes) so imo no new information here that would be relevant for 2024, since there is still a large enough gap between openai and google that can’t be closed this year. However, for 2025 it is a different story and Google indeed might fully catch up instead of being 1-2 months behind.

/Soli/which-companies-will-outrank-openai

/Soli/who-will-have-the-best-llm-at-the-e-382ae559b471

@Soli Must've been right before I joined Manifold then! I joined in late August I think and at that time OpenAI was already leading. Thanks for sharing.
