Will a major AI lab claim to use activation steering in its main chat assistant by EOY 2025? | Manifold



30% chance


Also includes methods inspired by activation steering, as long as they don't use any gradient descent step.

Only includes announcements about the main chat assistant (e.g. Claude, ChatGPT, Bard, ...) of a major AI lab (OpenAI, Google DeepMind, Anthropic, Meta, Inflection or Mistral).

Does not include fine-tuning API endpoints.
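For readers unfamiliar with the term, here is a minimal sketch of what a gradient-free steering step can look like. It is an illustration only, not any lab's actual method: the activations are random NumPy arrays standing in for a model's hidden states, and the helper names `steering_vector` and `steer`, the difference-of-means construction, and the `scale` parameter are all assumptions made for this example.

```python
import numpy as np

def steering_vector(pos_acts, neg_acts):
    """Difference of mean activations between two prompt sets (no gradient step)."""
    return pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def steer(hidden, vector, scale=1.0):
    """Add the scaled steering vector to the activation at every token position."""
    return hidden + scale * vector

# Toy stand-ins for hidden states; a real setup would read these from a model layer.
rng = np.random.default_rng(0)
d_model = 8
pos_acts = rng.normal(size=(16, d_model)) + 1.0  # activations on "concept" prompts
neg_acts = rng.normal(size=(16, d_model))        # activations on neutral prompts

v = steering_vector(pos_acts, neg_acts)
hidden = rng.normal(size=(5, d_model))           # 5 token positions at inference time
steered = steer(hidden, v, scale=4.0)
```

The only operations are averaging and addition at inference time, which is why an approach of this kind would fall under the "no gradient descent step" condition above, whereas fine-tuning would not.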

Technical AI Timelines


Anthropic found two features (auto-labeled "Neutrality and impartiality" and "Multiple perspectives and balance") that improve BBQ benchmark scores.

According to Nathan Labenz on the Future of Life Institute Podcast, Anthropic is piloting custom activation steering in limited beta (make-your-own Golden-Gate-Claude).

Anthropic is running a demo of an activation-steered Claude obsessed with the Golden Gate Bridge: https://www.reddit.com/r/singularity/comments/1cz7kuh/claude_golden_gate_bridge_is_now_available_bridge/ (Context: https://www.anthropic.com/research/mapping-mind-language-model )


