Will an LLM be able to pass something equivalent to Yann LeCun's 7-gear test by the end of 2024?
Resolved YES on Sep 17

Current thoughts on resolving; will firm up over coming weeks:
- kicking the can on choosing the actual question(s) so they don't end up in the training data (but will stick with the 7-gear question above if there is high confidence it isn't in the training data)
- key aspects of the challenge seem to be that it (a) requires a few steps of deductive reasoning about the physical world, (b) is superficially similar to a simpler question of this type, and (c) has a quirk that makes pattern-matching to the simpler question's solution wrong
- will be deferential, within reason, to Yann LeCun as well as the comments when deciding which question(s) best capture the intention of this market
- model should get it right >66% of the time; no clever prompting, just straight up asking it
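
For concreteness, the >66% criterion could be checked along these lines. This is only a rough sketch, not the market's actual resolution procedure: the function name `passes_threshold`, the list-of-booleans input, and the strict comparison against 0.66 are all assumptions made for illustration, and the number of trials is left open because the criteria above don't specify one.

```python
def passes_threshold(outcomes, threshold=0.66):
    """Return True if the share of correct answers strictly exceeds `threshold`.

    `outcomes` is a hypothetical list of booleans, one per independent,
    plainly worded attempt (True = the model answered correctly).
    """
    if not outcomes:
        raise ValueError("need at least one trial")
    # True counts as 1 and False as 0, so sum() gives the number of correct runs.
    return sum(outcomes) / len(outcomes) > threshold


# Example: 7 correct answers out of 10 plain attempts.
print(passes_threshold([True] * 7 + [False] * 3))
```

Note that a single success, like a single failure, says little about whether a model clears this bar; several independent attempts per model would be needed.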


o1-preview gets it correct!
https://chatgpt.com/share/66e3416c-df80-800c-9342-31efa7885616

Closed this for now;
@traders, let me know if anyone objects to resolving yes.

Claude 3 Opus didn't get it (tried just the one time)

Ditto for Sonnet 3.5 and Llama 3.1 405B

@CalebW Ditto for Reflection 70b