Is the LMSYS chatbot arena leaderboard trustworthy?

LLMs can distinguish their own output from that of other LLMs, and they tend to prefer their own output. It is therefore technically feasible to manipulate the leaderboard by pointing an LLM at the Chatbot Arena and having it upvote its own completions.

Has this happened yet? Will it happen soon?

Resolves NO iff, before 2027/7/1, credible media reports state that the LMSYS leaderboard has been manipulated with sockpuppet accounts or fraudulent voting. A statement to that effect coming directly from LMSYS would also count.

Resolves YES otherwise.
