Will OpenAI Release a Chatbot Using the Q* Algorithm in 2024?


Ṁ4233 · Dec 31 · 25% chance


This market is based on recent news that OpenAI is developing a new model called Q* (pronounced 'Q-Star'), believed by some within the company to be a potential breakthrough in the quest for artificial general intelligence (AGI). AGI refers to autonomous systems capable of surpassing humans in most economically valuable tasks. Notably, Q* has demonstrated proficiency in solving certain mathematical problems, albeit currently at a grade-school level. This initial success has generated optimism among researchers about its future capabilities. The question now is whether OpenAI will leverage Q* in a chatbot and release it to the public by the end of 2024. This market resolves positively if OpenAI officially announces or releases a chatbot powered by the Q* algorithm within the year 2024.

This question is managed and resolved by Manifold.


9 Comments

26 Holders

61 Trades


bought Ṁ1,000 YES

@FedorShabashev Resolves YES I think?

@Bayesian No because Q* isn’t mentioned anywhere

@FedorShabashev Since OpenAI itself

- doesn't call the new model Q* but calls it o1, and
- didn't mention anything about Q* in the official o1 blog post,

the market cannot be resolved YES right now.

@FedorShabashev ok they changed the name internally and will never refer to it as Q* ever again, probably. but it's almost definitely the same technology. ig that's fine. Only thing is

> This market resolves positively if OpenAI officially announces or releases a chatbot powered by the Q* algorithm within the year 2024.

This doesn't say they need to say it is powered by Q*, but that it needs to be powered by Q*. Shouldn't we use our judgement since OpenAI rarely shares precise implementation details?

o1 is based on the multistep reasoning approach that was described in a research paper released by OpenAI in May 2023:
https://arxiv.org/abs/2305.20050
This paper never mentions Q*.

Q* was mentioned by journalists, but I don't see OpenAI officially mentioning it anywhere.
Unlike o1, Q* was never described in a paper or in a blog post authored by OpenAI.

The market description specifically mentions "a chatbot powered by the Q* algorithm." If OpenAI does not officially use the term "Q*" or describe a model with that specific name, then strictly based on the wording, the market should not be able to resolve YES.

It can use the Q* algo without mentioning it anywhere. But sure

@Bayesian For the market to resolve YES, there needs to be verifiable evidence that the released chatbot is powered by the Q* algorithm. If OpenAI never publicly mentions or confirms Q* as the underlying technology, then there isn't enough evidence to resolve it to YES.

@FedorShabashev there is a lot of evidence to suggest that o1 uses Q*, but no official confirmation from OpenAI.

Reuters did link "Q*" and "Strawberry": https://www.reuters.com/technology/artificial-intelligence/openai-working-new-reasoning-technology-under-code-name-strawberry-2024-07-12/

Then there was reporting that "Strawberry" is coming out in the next week or two, and then a model with the same capabilities as were rumored for Q* came out in the predicted timeframe under the marketing name o1.

Also, o1 being a combination of Q-learning and STaR would map very well to the way o1 works.

The capabilities of o1 being very close to the capabilities of Q* also shift the odds in favor of them being the same thing - OpenAI inventing a different method with the exact same results is less likely.

All of these things combined have convinced me that o1 uses Q*, but it is ultimately up to you whether you consider this convincing enough for a resolution.

I can see the reasoning behind trying to link o1 with Q* based on external reports, but from what’s publicly available about o1—specifically from OpenAI's blog post on 'Learning to Reason with LLMs'—there’s no direct mention of Q* or Q-learning as part of its development. The blog post focuses on how o1 uses reinforcement learning to enhance multi-step reasoning, but it doesn’t explicitly reference Q-learning, which is a specific reinforcement learning algorithm.

The reinforcement learning used in o1 seems to be geared toward improving the model's ability to reason through problems over time, focusing on a chain of thought rather than action-reward loops typical of Q-learning. Without official confirmation from OpenAI, it’s challenging to conclude that o1 is based on Q* or that it employs Q-learning.

At this point, the available information doesn’t provide enough evidence to resolve the market based on speculation.
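For readers unfamiliar with the term: the "action-reward loop" of Q-learning mentioned above is the textbook tabular update, in which an agent takes an action, observes a reward, and nudges its value estimate for that state-action pair. Below is a minimal sketch of that textbook algorithm on a toy problem. The environment (a short chain of states with a reward at the right end), the function name `q_learning_chain`, and all hyperparameters are illustrative assumptions. Nothing here describes how o1 or Q* actually works, which OpenAI has not published.

```python
import random


def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Textbook tabular Q-learning on a hypothetical toy chain.

    States are 0..n_states-1; the agent starts at 0 and an episode ends
    when it reaches the last state, which pays a reward of 1.
    """
    rng = random.Random(seed)
    moves = (-1, +1)                            # action 0 = left, action 1 = right
    q = [[0.0, 0.0] for _ in range(n_states)]   # Q[state][action], initialised to 0
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action choice (ties are broken at random).
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            # Environment step: move along the chain, clamped at the left edge.
            s_next = min(max(s + moves[a], 0), goal)
            r = 1.0 if s_next == goal else 0.0

            # The Q-learning update: move Q(s, a) toward
            # reward + discounted best value of the next state.
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


if __name__ == "__main__":
    for state, (left, right) in enumerate(q_learning_chain()):
        print(f"state {state}: Q(left)={left:.3f}  Q(right)={right:.3f}")
```

After training, the learned value of moving right exceeds the value of moving left in every non-terminal state. That is the sense in which Q-learning is organised around actions and rewards rather than around a chain of thought.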
