Will the ARC-AGI Grand Prize be claimed in 2024?
Closes Dec 4 · 22% chance

https://arcprize.org/competition
>=85% performance on the private set of Chollet's Abstraction and Reasoning Corpus, as judged by Chollet et al.

2025 version: https://manifold.markets/JacobPfau/will-the-arcagi-grand-prize-be-clai-srb6t2awj1


Now that I've bet a not-insignificant amount of mana on this, can someone explain why anyone would bet in favor of this? From what I understand, no one is even close. We wouldn't see an award even if the number of correct answers doubled, and the machines probably got the easiest questions right, so the remaining ones are even harder.

Is this basically a bet on whether someone cheats? I filled a fuckton of limit orders at 20 percent odds and it feels like the odds would be optimistic at, like, 8?

I'm already invested, someone tell me what I'm missing.

reposted

I made a version of this market which allows for closed source LLMs: https://manifold.markets/RyanGreenblatt/by-when-will-85-be-reached-on-the-p

Here’s someone claiming 100% accuracy on the eval set with a from-scratch transformer: https://x.com/spatialweeb/status/1803950481422848312?s=46&t=fdgdiEzkLwQ2qvItoWggvg

(Doubt this holds up under scrutiny, likely a bug somewhere.)

From the replies, it looks like they were accidentally including the answer along with the examples.

I think you can a priori assign very low probability to this kind of stuff. If GPT-4 and other models that took hundreds of millions of dollars of compute and a ton of very good engineers only got to the mid 30s on ARC, it's very unlikely that one person will just think of one trick that solves deep reasoning and gets to 100%.

bought Ṁ250 NO

Betting no based on the difficulty of YES resolution, in particular requiring models to work offline.

Can't use Gemini in the challenge. It has to run offline, with a maximum runtime of 12 hours on Kaggle.

They make it sound a lot more interesting than it really is. They used 1000x the compute of the prior SOTA to achieve the same results. The real ARC challenge is limited to 12 hours of runtime on Kaggle and has no internet access (so no access to large LLMs).

Note that this prize doesn't allow closed-source models to be used in doing the actual task.

Of course, distillation is possible etc.

bought Ṁ100 YES

Ah, I should have read this comment before I made a bet on yes 😅

"No ARC human baseline exists! http://arcprize.org/arc: "most humans can solve on average 85% of ARC-AGI tasks." But this study used the train set http://arcprize.org/guide: "The public training set is significantly easier than the...public evaluation and private evaluation set""

I tried solving about 20 public test set problems and they were all pretty easy as well. I don't know what the average human would get, but I doubt it would be much lower than 85%.

bought Ṁ250 NO

Chollet believes we’ll see an improvement from 35% to 50% in 2024. A score of 85% or better is required to win the Grand Prize.

https://x.com/fchollet/status/1800646000865943578?s=46&t=fdgdiEzkLwQ2qvItoWggvg
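
For anyone who wants the gap spelled out, here is a rough sketch using only the numbers quoted in this thread: roughly 35% for the current best, the 50% Chollet expects by the end of 2024, and the 85% Grand Prize threshold. The constant and function names are made up for illustration, and the scores are the thread's figures, not official leaderboard data.

```python
# Back-of-the-envelope check using only figures quoted in this thread.
# All names here are illustrative, not part of any official ARC Prize tooling.

GRAND_PRIZE_THRESHOLD = 0.85   # score required to claim the Grand Prize
CURRENT_BEST = 0.35            # approximate current best, per the thread
CHOLLET_2024_FORECAST = 0.50   # improvement Chollet expects in 2024


def clears_threshold(score: float, threshold: float = GRAND_PRIZE_THRESHOLD) -> bool:
    """Return True if a score meets or exceeds the prize threshold."""
    return score >= threshold


def required_multiplier(current: float, threshold: float = GRAND_PRIZE_THRESHOLD) -> float:
    """How many times the current score must grow to reach the threshold."""
    return threshold / current


if __name__ == "__main__":
    doubled = 2 * CURRENT_BEST
    print(f"Doubled score: {doubled:.2f}, wins: {clears_threshold(doubled)}")
    print(f"Forecast score: {CHOLLET_2024_FORECAST:.2f}, "
          f"wins: {clears_threshold(CHOLLET_2024_FORECAST)}")
    print(f"Growth needed from current best: {required_multiplier(CURRENT_BEST):.2f}x")
```

Under these assumptions, even doubling the current score lands at about 70%, still short of 85%, which is the point made earlier in the thread.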

Why does it close on Dec 4 rather than Dec 31?

@JacobPfau Ahh, thanks, I missed that.

@EggSyntax At first, when I saw it, I was wondering the same thing as you lol