Will mechanistic/transformer interpretability [e.g. Neel Nanda] end up affecting p(doom) by more than 5%?
36% chance
This question is managed and resolved by Manifold.
Related questions
Will mechanistic interpretability be essentially solved for GPT-2 before 2030? (29% chance)
Will mechanistic interpretability have more academic impact than representation engineering by the end of 2025? (68% chance)
Will Eliezer Yudkowsky publicly claim to have a P(doom) of less than 50% at any point before 2040? (31% chance)
Will janus/@repligate meaningfully affect p(doom) by more than 5%? (40% chance)
Will MIRI meaningfully affect p(doom) by more than 5%? (47% chance)
Will my p(doom) be above 10% in 20 years (2043)? (31% chance)
What will Manifold's P(doom) be at the end of 2024? (29% chance)
Will mechanistic interpretability be essentially solved for the human brain before 2040? (23% chance)
Will this project in mechanistic interpretability make me happy by the end of 2024? (64% chance)
Will a model costing >$30M be intentionally trained to be more mechanistically interpretable by end of 2027? (see desc) (57% chance)