Will Qwen2-Math be at the top of the Math category of the LMSYS Chatbot Arena Leaderboard?
Sep 30 · 55% chance

Alibaba's newly introduced Qwen2-Math model is claimed to "outperform proprietary models, including GPT-4o and Claude 3.5, in math related downstream tasks."

This question resolves YES if the model Qwen2-Math-72B-Instruct holds rank 1 in the Math category of the LMSYS Chatbot Arena Leaderboard when its ranking is first published on that leaderboard. A shared first rank also counts.

If the model is not added to the leaderboard by the 30th of September 2024, the question resolves as N/A.
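
The resolution rule above can be written out as a small decision function. This is only an illustrative sketch, not part of the official criteria: the function name `resolve` and its parameters are hypothetical, the deadline is assumed to be inclusive, a shared first rank is assumed to be reported as rank 1, and the NO outcome is inferred from the YES condition rather than stated in the text.

```python
from datetime import date
from typing import Optional

# Deadline taken from the resolution criteria; treated as inclusive (assumption).
DEADLINE = date(2024, 9, 30)


def resolve(first_ranked_on: Optional[date], first_math_rank: Optional[int]) -> str:
    """Return 'YES', 'NO', or 'N/A' for this question.

    first_ranked_on: date the model's ranking first appeared on the
        leaderboard, or None if it never appeared (hypothetical input).
    first_math_rank: the model's Math-category rank at that first release;
        a shared first place is assumed to be reported as 1.
    """
    # Not added to the leaderboard by the deadline: resolves N/A.
    if first_ranked_on is None or first_ranked_on > DEADLINE:
        return "N/A"
    # Only the rank at first release matters; later changes are ignored.
    return "YES" if first_math_rank == 1 else "NO"
```

For example, under these assumptions `resolve(date(2024, 9, 10), 1)` gives "YES", `resolve(date(2024, 9, 10), 3)` gives "NO", and `resolve(None, None)` gives "N/A".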
