When will Anthropic first train an AI system that they claim qualifies as ASL-3?
2023 Q4: 4%
2024 Q1: 7%
2024 Q2: 10%
2024 Q3: 14%
2024 Q4: 25%
2025 Q1: 10%
Never: 12%
Other: 19%

This will be evaluated according to the AI Safety Levels (ASL) v1.0 standard defined by Anthropic here.

This resolves based on the first clear public disclosure by Anthropic indicating that they have trained a model and found it to qualify as ASL-3. If Anthropic announces a policy that would prevent this information from being disclosed, this will resolve N/A one year after that announcement. If Anthropic ceases to operate or announces that it has permanently ceased developing new AI systems, this will resolve "Never" after one year.

Note that the date in question is the date that the model that first reached ASL-3 finished training, not when the ASL-3 determination was made or reported.

Feel free to add new answer choices. Valid choices (besides “Never”) must be in the format YYYY QQ.


For coarser-grained dates, see this and associated markets.

Will any LLM released by EOY 2024 be dangerously ASL-3 as defined by Anthropic?
20% chance.

As per Anthropic's responsible scaling policy (https://www.anthropic.com/index/anthropics-responsible-scaling-policy), ASL-3 is defined as: "ASL-3 refers to systems that substantially increase the risk of catastrophic misuse compared to non-AI baselines (e.g. search engines or textbooks) OR that show low-level autonomous capabilities." Anthropic commits to not deploying ASL-3 models "if they show any meaningful catastrophic misuse risk under adversarial testing by world-class red-teamers (this is in contrast to merely a commitment to perform red-teaming)."

Resolves to YES if in my best judgment this happened. I will put large weight on Anthropic's statements on this question, and on general consensus including polls, but will go my own way if I feel sufficiently strongly about it. Resolves to NO if in my best judgment this does not happen. (Resolves to a percentage if there is genuine uncertainty, but the bar for doing so is high and I find this highly unlikely.) If a model is created but not released to at least a substantial outside beta-testing group by the deadline, it does not count.

For now, I interpret "low-level autonomous capabilities" as something that would tempt reasonable people to give the model real-world, actual-stakes autonomous tasks for mundane utility purposes, with the expectation that this was economically wise, or the ability to otherwise make money on its own, or similar. If Anthropic clarifies, I will use their definition. No currently released system counts, including GPT-4, Claude-2, and Llama-2, barring very unexpected advancements in autonomous capability scaffolding on top of them, though in theory that could also do it. I reserve the right to modify the resolution details for clarity and intent.