Will the performance jump from GPT4->GPT5 be less than the one from GPT3->GPT4?

1kṀ4637

2027

71% chance

When GPT5 comes out, list its capabilities according to all quantifiable metrics, and compare them with the same metrics for GPT4 and GPT3.

If Capabilities(GPT5-GPT4) < Capabilities(GPT4-GPT3), this market resolves YES.

Otherwise, this market resolves NO.

This holds regardless of whether the cost of getting from GPT4 to GPT5 shows diminishing returns.

Technical AI Timelines

https://www.theinformation.com/articles/openai-shifts-strategy-as-rate-of-gpt-ai-improvements-slows

https://techcrunch.com/2024/11/09/openai-reportedly-developing-new-strategies-to-deal-with-ai-improvement-slowdown/

Which GPT-4 and which GPT-3? If GPT-4o counts, it's 100 Elo better than some other versions of GPT-4, and GPT-3.5 at 1117 is less than a hundred worse than some versions of GPT-4.

What is the answer to Capabilities(GPT4-GPT3)?

And how did that compare to Capabilities(GPT3-GPT2)?

Ah, subjective market.

I think it's hard to quantify, but I think (GPT3-GPT2) > (GPT4 - GPT-3) would be relatively non-controversial.

What if some metrics had a bigger jump but others didn't? How will this market resolve then?

@VictorLi We’ll take an average of all the top metrics and use that to start with.

If the results are still unclear or ambiguous, we’ll look at what each metric represents and I’ll use my best judgment based on what we can now do that we couldn’t do before.

Silly example: suppose the standard metrics are all pretty middling, but GPT5 also teaches us how to speak to dolphins and turn lead into gold using nothing but Alka-Seltzer and chewing gum, decodes Linear A, and proves P=NP. Even if the metrics were boring, in the face of undeniable new capability jumps like those, which aren’t well captured by standard metrics alone, I would have to call that a massive leap. (You could also devise new quantifiable tests for all these new capabilities, and then test GPT4 and GPT3 against them.)
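For anyone who wants to see the first step ("take an average of all the top metrics") spelled out, here is a rough sketch in Python. Everything in it is an assumption for illustration: the function names `average_jump` and `resolves_yes` are hypothetical, the benchmark names and scores are invented placeholders rather than real results, and it assumes every metric has already been put on a comparable 0-100 scale. It does not capture the judgment call described above for ambiguous cases.

```python
from statistics import mean


def average_jump(older, newer):
    """Mean per-metric gain, over the metrics both models were scored on."""
    shared = sorted(older.keys() & newer.keys())
    if not shared:
        raise ValueError("no shared metrics to compare")
    return mean(newer[m] - older[m] for m in shared)


def resolves_yes(gpt3, gpt4, gpt5):
    """YES if the GPT4->GPT5 average gain is smaller than the GPT3->GPT4 one."""
    return average_jump(gpt4, gpt5) < average_jump(gpt3, gpt4)


# Placeholder scores on a 0-100 scale, invented for illustration only.
gpt3 = {"benchmark_a": 40.0, "benchmark_b": 20.0}
gpt4 = {"benchmark_a": 80.0, "benchmark_b": 60.0}
gpt5 = {"benchmark_a": 90.0, "benchmark_b": 85.0}

print(resolves_yes(gpt3, gpt4, gpt5))
```

With these made-up numbers the GPT3->GPT4 average gain is 40 points and the GPT4->GPT5 average gain is 17.5, so the sketch would say YES. A simple mean of raw score differences is only one possible reading of "average"; metrics with ceilings (for example, accuracy capped at 100%) may need a different normalization.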