Will Q* (Q Star) be a significant breakthrough in AI/ML research or engineering?
34
Ṁ4741
Dec 31
92% chance

As judged by me following general expert consensus (not just claims by OpenAI)

bought Ṁ50 NO

There is no reason that o1 should be considered a “breakthrough”.

@Philip3773733 Well, they did use some kind of new chain-of-thought training paradigm, but it's true that, as far as I can see, we don't have much information on what exactly it is. You could characterise “breakthrough” either by “how large is the performance gain” or by “how clever/novel is the method”.

@Donald This benchmark shows it only marginally improves the score. Sure, it is better, but it also thinks for much longer. Comparing against traditional benchmarks is also misleading, because it uses multi-step thinking, which could be trivially added to e.g. Claude as well using Auto GPT or similar. It would be interesting to see a comparison then.

https://aider.chat/2024/09/12/o1.html

@Philip3773733 I'll take a more comprehensive look at the different benchmarks a bit closer to the resolution date. For example, some of the academic benchmarks provided by OpenAI showed accuracy increases of 30% or so, which is pretty huge, although of course relying on OpenAI to benchmark their own product is not how I will resolve this market.

bought Ṁ700 YES

@Donald Do you agree that this resolves YES? Have you seen the new o1 family of models (which seems to be the new name for Q*)?

@Bayesian from the information we have so far, I’d say it seems likely. I’m not sure we will get more specific information on the o1 model architecture and Q*, which would be nice for a clean decision

@Donald We probably won’t

© Manifold Markets, Inc.