On January 1, 2027, a Transformer-like model will continue to hold the state-of-the-art position in most benchmarks
82% chance

What constitutes Transformer-like?

Is it all, or some, of these? (See the sketch after the list.)

  • Token inputs, token outputs

  • Positional encoding

  • (multi-head QKV scaled dot-product attention -> MLP) layers

  • Residual connections
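
A minimal sketch of a block with all four of those ingredients, assuming PyTorch; the class name, sizes, and the learned positional embedding are illustrative choices, not anything the market specifies:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    # One pre-norm Transformer block: multi-head scaled dot-product
    # attention followed by an MLP, each wrapped in a residual connection.
    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # residual around attention
        x = x + self.mlp(self.norm2(x))                    # residual around MLP
        return x

# Token inputs plus a positional encoding in front of a stack of blocks:
tok = nn.Embedding(50_000, 512)          # token embedding
pos = nn.Embedding(2_048, 512)           # learned positional embedding
ids = torch.randint(0, 50_000, (1, 16))  # token inputs
x = tok(ids) + pos(torch.arange(16))
y = Block()(x)                           # (1, 16, 512)
```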

@MalachiteEagle Transformers, yes. Attention, not really:

“…followed by two shared attention blocks.”

https://blog.rwkv.com/p/eagle-7b-soaring-past-transformers
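
To make "attention-free" concrete: RWKV replaces pairwise attention with a recurrence over a fixed-size state. A rough sketch of the RWKV-4-style WKV mixing rule (my paraphrase of the published formula; numerically naive, and the parameter names are illustrative):

```python
import torch

def wkv(k, v, w, u):
    # RWKV-4-style "WKV" token mixing: an exponentially decaying weighted
    # average over past values, computed as an RNN with a fixed-size state.
    # No T x T attention matrix anywhere. (No max-subtraction stability trick.)
    #   k, v : (T, C) per-token keys and values
    #   w, u : (C,) learned decay and current-token bonus
    T, C = k.shape
    num = torch.zeros(C)              # running weighted sum of values
    den = torch.zeros(C)              # running sum of weights
    out = []
    for t in range(T):
        bonus = torch.exp(u + k[t])   # extra weight for the current token
        out.append((num + bonus * v[t]) / (den + bonus))
        decay = torch.exp(-w)         # past state shrinks by e^{-w} each step
        num = decay * num + torch.exp(k[t]) * v[t]
        den = decay * den + torch.exp(k[t])
    return torch.stack(out)

T, C = 8, 4
y = wkv(torch.randn(T, C), torch.randn(T, C), torch.rand(C), torch.zeros(C))
print(y.shape)  # torch.Size([8, 4])
```

The state (num, den) stays constant-size, so generation costs O(T) instead of the O(T²) of full attention.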

It even links the wager page among its bragging points:

All while being an “Attention-Free Transformer”

I thought this wager was made today haha

Does this strictly resolve to the outcome of the bet?

@FranekZak Yep. Whatever details they come up with are in.
