When will AIs be good at solving complex problems? (read description)
2030: 92%
2027: 92%
2026: 75%
2025: 72%
2024

If you don't want to read the full description, the short version is this: the probability of each year's answer represents the percentage of people who will be worse at problem-solving than an AI that does not use extreme amounts of energy.

Codeforces contests will be used to measure performance.

Competitive Programming and AI

As of 2024, AIs are good at retrieving previously discovered knowledge; however, they lack the ability to navigate complex state spaces to arrive at new solutions.

Competitive programming (CP) problems provide a good framework for testing them in this area, since their solutions can be verified automatically against test cases.
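As an illustration of that automatic verification, here is a minimal sketch that runs a compiled solution on one test case and compares its output to the expected answer. The executable name and test data are hypothetical, and a real judge would additionally enforce memory limits and use checker programs for problems with multiple valid answers.

```python
import subprocess

def check_solution(executable: str, test_input: str, expected_output: str,
                   time_limit: float = 2.0) -> bool:
    """Run a solution binary on one test case and compare outputs.

    `executable`, `test_input`, and `expected_output` are hypothetical
    placeholders for illustration only.
    """
    try:
        result = subprocess.run(
            [executable],
            input=test_input,
            capture_output=True,
            text=True,
            timeout=time_limit,
        )
    except subprocess.TimeoutExpired:
        return False  # time limit exceeded
    # Token-wise comparison ignores trailing-whitespace differences.
    return result.stdout.split() == expected_output.split()

# Hypothetical usage for an A+B problem:
# check_solution("./a_plus_b", "2 3\n", "5\n")  -> True if the answer is 5
```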

As such, their rating in Codeforces will also reflect to some extent their ability to solve complex problems relative to other humans.

AI Qualification Criteria

  • a fully automated agent that uses the Codeforces API to read problem statements and submit solutions (see the sketch after this list)

  • to prevent the abuse of inefficient, massive computation (similar to AlphaCode), which may be equivalent to a large team of humans, the AI is limited to using resources worth at most $80 per contest
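Below is a minimal sketch of such an agent loop. It uses the real contest.standings method of the public Codeforces API to fetch a contest's problem list; the statement-reading, solving, and submission steps are left as comments, since the public API exposes only problem metadata (not full statements) and has no submission endpoint, so those steps would have to go through the website itself.

```python
import requests

API = "https://codeforces.com/api"

def fetch_contest_problems(contest_id: int) -> list[dict]:
    """Return problem metadata (index, name, tags) for a contest.

    contest.standings is an existing Codeforces API method; the problem
    list is returned alongside the standings rows.
    """
    resp = requests.get(
        f"{API}/contest.standings",
        params={"contestId": contest_id, "from": 1, "count": 1},
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()
    if data["status"] != "OK":
        raise RuntimeError(data.get("comment", "Codeforces API error"))
    return data["result"]["problems"]

def run_agent(contest_id: int) -> None:
    """Skeleton of a qualifying agent loop."""
    for problem in fetch_contest_problems(contest_id):
        print(f"{problem['index']}: {problem['name']}")
        # 1. Read the full statement (scraped from the problem page;
        #    the public API only exposes metadata, not statement text).
        # 2. Generate a candidate solution within the $80/contest budget.
        # 3. Submit it through the website, since the public API has
        #    no submission endpoint.

if __name__ == "__main__":
    run_agent(1800)  # hypothetical contest id
```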

Scoring and Resolution

The rating will be evaluated as the median over the agent's past 5 contests, or based on the rating shown on the Codeforces website.

If the rating of the best agent at the end of the given year (or some other time) is x, then that answer resolves to the percentage of rated users whose rating is below x.
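A minimal sketch of this resolution computation is below, assuming the real user.rating and user.ratedList Codeforces API methods; the agent handle in the usage example is hypothetical.

```python
import statistics
import requests

API = "https://codeforces.com/api"

def median_recent_rating(handle: str, n: int = 5) -> float:
    """Median rating after the agent's last n rated contests.

    user.rating is an existing Codeforces API method returning the full
    rating-change history for a handle.
    """
    resp = requests.get(f"{API}/user.rating", params={"handle": handle}, timeout=10)
    resp.raise_for_status()
    changes = resp.json()["result"]
    return statistics.median(c["newRating"] for c in changes[-n:])

def percentile_below(x: float, active_only: bool = True) -> float:
    """Percentage of rated users whose rating is below x.

    user.ratedList is an existing Codeforces API method returning all
    rated users; this percentage is what each answer resolves to.
    """
    resp = requests.get(
        f"{API}/user.ratedList",
        params={"activeOnly": str(active_only).lower()},
        timeout=60,
    )
    resp.raise_for_status()
    users = resp.json()["result"]
    below = sum(1 for u in users if u["rating"] < x)
    return 100.0 * below / len(users)

# Hypothetical usage:
# x = median_recent_rating("some_agent_handle")
# resolution_value = percentile_below(x)
```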


The obvious question is: do you use Codeforces contests created before or after the training cutoff of the AI in question? AIs that get almost perfect scores on old problems often get zero correct on new ones. What's the standard?

@NeoPangloss Ideally, only true (non-virtual) participations should count.
