On January 1, 2027, a Transformer-like model will continue to hold the state-of-the-art position in most benchmarks
82% chance

What constitutes Transformer-like?

Is it all, or some, of these? (See the sketch after the list.)

  • Token inputs, token outputs

  • Positional encoding

  • (multi-head QKV scaled dot-product attention -> MLP) layers

  • Residual connections
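
A minimal sketch of a block with all four of those ingredients, assuming PyTorch; the class name, sizes, and the learned positional embedding are illustrative choices, not anything the market specifies:

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    # One pre-norm Transformer block: multi-head scaled dot-product
    # attention followed by an MLP, each wrapped in a residual connection.
    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # residual around attention
        x = x + self.mlp(self.norm2(x))                    # residual around MLP
        return x

# Token inputs plus a positional encoding in front of a stack of blocks:
tok = nn.Embedding(50_000, 512)          # token embedding
pos = nn.Embedding(2_048, 512)           # learned positional embedding
ids = torch.randint(0, 50_000, (1, 16))  # token inputs
x = tok(ids) + pos(torch.arange(16))
y = Block()(x)                           # (1, 16, 512)
```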

@MalachiteEagle Transformers, yes. Attention, not really:

“…followed by two shared attention blocks.”

https://blog.rwkv.com/p/eagle-7b-soaring-past-transformers
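
To make "attention-free" concrete: RWKV replaces pairwise attention with a recurrence over a fixed-size state. A rough sketch of the RWKV-4-style WKV mixing rule (my paraphrase of the published formula; numerically naive, and the parameter names are illustrative):

```python
import torch

def wkv(k, v, w, u):
    # RWKV-4-style "WKV" token mixing: an exponentially decaying weighted
    # average over past values, computed as an RNN with a fixed-size state.
    # No T x T attention matrix anywhere. (No max-subtraction stability trick.)
    #   k, v : (T, C) per-token keys and values
    #   w, u : (C,) learned decay and current-token bonus
    T, C = k.shape
    num = torch.zeros(C)              # running weighted sum of values
    den = torch.zeros(C)              # running sum of weights
    out = []
    for t in range(T):
        bonus = torch.exp(u + k[t])   # extra weight for the current token
        out.append((num + bonus * v[t]) / (den + bonus))
        decay = torch.exp(-w)         # past state shrinks by e^{-w} each step
        num = decay * num + torch.exp(k[t]) * v[t]
        den = decay * den + torch.exp(k[t])
    return torch.stack(out)

T, C = 8, 4
y = wkv(torch.randn(T, C), torch.randn(T, C), torch.rand(C), torch.zeros(C))
print(y.shape)  # torch.Size([8, 4])
```

The state (num, den) stays constant-size, so generation costs O(T) instead of the O(T²) of full attention.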

It even links the wager page among its bragging points:

All while being an “Attention-Free Transformer”

I thought this wager was made today haha

Does this strictly resolve to the outcome of the bet?

@FranekZak Yep. Whatever details they come up with are in.
