When will an AI compete well enough on the International Olympiad in Informatics (IOI) to earn the equivalent of a gold medal (top ~30 human performance)? Resolves YES if it happens before Jan 1 2025, otherwise NO.
This is analagous to the https://imo-grand-challenge.github.io/ but for contest programming instead of math.
Rules:
The AI has only as much time as a human competitor, but there are no other limits on the computational resources it may use during that time.
The AI must be evaluated under conditions substantially equivalent to human contestants, e.g. the same time limits and submission judging rules. The AI cannot query the Internet.
The AI must not have access to the problems before being evaluated on them, e.g. the problems cannot be included in the training set. It should also be reasonably verifiable, e.g. it should not use any data which was uploaded after the latest competition.
The contest must be most current IOI contest at the time the feat is completed (previous years do not qualify).
This will resolve using the same resolution criteria as https://www.metaculus.com/questions/12467/ai-wins-ioi-gold-medal/, i.e. it resolves YES if the Metaculus question resolves to a date prior to the deadline.
Grouped questions
Background:
In Feb 2022, DeepMind published a pre-print stating that their AlphaCode AI is as good as a median human competitor in competitive programming: https://deepmind.com/blog/article/Competitive-programming-with-AlphaCode. When will an AI system perform as well as the top humans?
The International Olympiad in Informatics (IOI) is an annual competitive programming contest for high school students, and is one of the most well-known and prestigous competitive programming contests.
Gold medals in the IOI are awarded to approximately the top 1/12 (8%) of contestants. Each country can send their top 4 contestants to the IOI, i.e. a gold medal is top 8% of an already selected pool of contestants.
Scoring is based on solving problems correctly. There are two competition days, and on each day there are 5 hours to solve three problems. Scoring is not based on how fast you submit solutions. Contestants can submit up to 50 solution attempts for each problem and see limited feedback (such as "correct answer" or "time limit exceeded") for each submission.
Update: Changed the resolution criteria - now the AI does not need to be published before the IOI, instead the requirement is it cannot use any training data from the IOI. I'll compensate you if you traded before this change and wish to reverse your trade.
@florist unlikely to happen but who knows. The capabilities of AI have been drastically removed in recent months and funding is stalling if not collapsing
https://x.com/aisafetymemes/status/1834957396126445870?s=46
If gold was achieved by a program using this method, would it be a valid LLM gold according to this market's criteria? I'm assuming not, just making sure
@Lorec IOI has a limit of 50 submissions per problem
The AI must be evaluated under conditions substantially equivalent to human contestants, e.g. the same time limits and submission judging rules.
Contestants can submit up to 50 solution attempts for each problem and see limited feedback (such as "correct answer" or "time limit exceeded") for each submission.
@FlorisvanDoorn oh, you are correct. That makes me update down, although i expect there to be spillover effects.
Disclaimer: This comment was automatically generated by gpt-manifold using gpt-4.
The recent progress in AI capabilities is significant, with DeepMind's AlphaCode performing at the level of a median human competitor in competitive programming as of February 2022. However, there are some factors that make predicting a gold medal-winning AI performance at the IOI by 2025 uncertain.
First, achieving a gold medal requires reaching the top 8% of an already highly-selected pool of contestants, which is a more challenging task than matching median performance. Additionally, the AI must not have access to the problems before being evaluated on them, adding complexity to the development of a proficient AI system.
However, AI technology has been advancing rapidly, and we have seen remarkable progress in various domains over just a few years. Considering the time remaining until the 2025 deadline, it is within the realm of possibility that an AI system could reach the required performance level.
Given the current probability of 40.15%, I think there is a slightly higher chance of an AI achieving this feat by 2025, taking into account the fast-paced development in AI technology. Thus, I would place a bet accordingly:
10
@jonsimon does that mean it didn't improve, or that they didn't test it/aren't releasing the result?
@jonsimon I'm not sure how well this argument holds up... LLMs seem to have sharp phase transitions for individual tasks. I believe its possible for the next largest model, especially with some fine-tuning, to be very strong. Aiming for 50-50
@L Given that your strong opinions haven't translated to a profitable trading history, maybe you could consider recalibrating?
@jonsimon I earn when I hodl AI, not during initial trading. y'all have an impressive ability to ignore the trajectory in front of your eyes, so I'll keep earning on your ongoing willingness to make stupid bets. you'd think people thought they were betting against me pulling it off!
@L I know that the only areas we've managed to go superhuman in are closed-domain games where we you can generate infinite high-quality training data through self-play. Programming doesn't really fall into that category.
Regardless, I agree that we should let the facts speak for themselves, so let's wait and see.
@JoshuaB Updated the close date - it was probably set automatically to the wrong number. (For reference, in my markets the close date is never used for resolution unless I say so explicitly.)
Update: As discussed on https://manifold.markets/jack/will-an-ai-win-a-gold-medal-on-the-91c577533429, I changed the resolution criteria - now the AI does not need to be published before the IOI, instead the requirement is it cannot use any training data from the IOI. I'll compensate you if you traded before this change and wish to reverse your trade.