Resolves as a fraction to r/120, where r is the highest raw score achieved by an AI on the 2024 Putnam Exam, as reported by credible AI labs (models/details not necessary). The score must have been achieved and reported before Dec 31. It goes without saying that the model shouldn’t have been trained on the questions themselves.
I will count scores obtained by running the AI for longer than the human test conditions, such as in cases similar to AlphaProof.
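As a rough sketch of the resolution arithmetic described above: the market resolves to r/120, where 120 is the maximum Putnam score (12 problems at 10 points each). The function name `resolution_fraction` and the choice to return 0 when no score has been reported are assumptions made for illustration, not part of the market rules. Deciding which reports count as credible remains a judgment call outside the code.

```python
PUTNAM_MAX_SCORE = 120  # 12 problems, 10 points each


def resolution_fraction(reported_scores):
    """Return r/120, where r is the highest reported raw score.

    `reported_scores` is assumed to hold only scores already judged
    credible and reported before the deadline.
    """
    for score in reported_scores:
        if not 0 <= score <= PUTNAM_MAX_SCORE:
            raise ValueError(f"score out of range: {score}")
    if not reported_scores:
        # Assumption: with no qualifying report, treat r as 0.
        return 0.0
    return max(reported_scores) / PUTNAM_MAX_SCORE
```

For example, if the best qualifying report were a raw score of 60, `resolution_fraction([40, 60])` would give 0.5.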
This tweet claims o1 pro got a raw score of at least 80.
@AdamK Buzzard gives it a ceiling of about 32 here: https://xenaproject.wordpress.com/2024/12/22/can-ai-do-maths-yet-thoughts-from-a-mathematician/
I see two estimates on Twitter so far for o1: scores of 40 and 60. Clearly this market is not going to be decision-relevant to anyone before the answer becomes well known, despite how surprised the market would have been by such high scores had it been up for longer with larger subsidies. At this point it's a statement about the inefficiency of Manifold. The IMO 2025 market has a >100K mana subsidy pool for NO, and yet no one is paying attention to how the AIs are doing on the Putnam. No one else thought to write this question... Mind-boggling.