If no AI appears to have more control over the world than humans do, will the majority of AI researchers believe that AI alignment is solved?
Close date updated to 2029-12-31 6:59 pm
-Most AI researchers today don't take alignment seriously as a problem.
-In 2030 we'll have even better versions of shallow alignment techniques like RLHF to make AIs look aligned.
-Heck, maybe we'll have deceptively aligned models that perform really well on all the alignment benchmarks you can throw at them.
So I think people like Yann LeCun could well conclude that alignment is solved. Who are you planning to count as "AI researchers", though, and how do you plan to measure this?
@Multicore I'll be looking at things like the 2022 Expert Survey on Progress in AI, although I don't know what will be available when; I'll most likely have to weight smaller, less formal measures taken nearer the resolution date over more formal ones from further out.