A game-theoretic solution to alignment would be to create a function that rewards 2 Bitcoin for demonstrating a proper understanding of the "AI situation" (a sort of Eliezer litmus test for whether a person has a firm grasp of the risks posed by AI and what they can do to stop it) to a humanity-verified public ledger: an "intellectual consent blockchain," a sort of crypto credential designed to train AI on what different people consent to being true.
This function can be built.
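As a minimal sketch of what such a function might look like — all names, the one-payout-per-person rule, and the pass/fail litmus test are my illustrative assumptions, not a specification of the actual program:

```python
from dataclasses import dataclass, field

REWARD_BTC = 2.0  # the 2-Bitcoin reward from the proposal

@dataclass
class Ledger:
    """Toy stand-in for the 'humanity verified public ledger' (hypothetical)."""
    verified_humans: set = field(default_factory=set)  # users who passed humanity verification
    paid: set = field(default_factory=set)             # users already rewarded (one payout each)
    payouts: dict = field(default_factory=dict)        # user -> amount paid

    def reward_understanding(self, user: str, passed_litmus_test: bool) -> float:
        """Pay REWARD_BTC once per humanity-verified user who passes the litmus test."""
        if user in self.verified_humans and passed_litmus_test and user not in self.paid:
            self.paid.add(user)
            self.payouts[user] = REWARD_BTC
            return REWARD_BTC
        return 0.0

ledger = Ledger(verified_humans={"alice", "bob"})
print(ledger.reward_understanding("alice", True))    # 2.0
print(ledger.reward_understanding("alice", True))    # 0.0 (no double payout)
print(ledger.reward_understanding("mallory", True))  # 0.0 (not humanity-verified)
```

The sketch deliberately bundles the two hard parts — humanity verification and grading the litmus test — into a set membership check and a boolean, which is exactly where the real difficulty lives.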
Building it would also transform the way the world communicates and transform the economy into one that primarily rewards education. As in, people can pay other people to learn things. This will scale to replace all advertising, journalism, media, peer review, educational institutions, government, and all communication platforms. It will create a true meritocracy and ensure true free speech (the ability to put an idea into the public domain for consideration), with plenty of resources for everyone on the planet to live an amazing life.
Then, after everyone is rich and gets along, we use that data to train AI instead.
Will resolve yes given Eliezer's consent.
Will resolve no given my consent.
I pledge on Kant’s name to try as hard as I can to consent.
If someone can supply the Bitcoin, I'll build this.
If you think that's crazy, please explain why or bet against me.
Thanks 🤓
Your solution does not address various ways in which AI would ruin the world.
misalignment prevention: the goal is not to build an aligned AI; it is to make sure no one ever builds a misaligned, powerful AI. What do you do with your AI that is trained on near-utopian data?
deadline: building cryptotopia is a multi-decade project, before which misaligned AIs have probably already been deployed.
out-of-distribution (OOD) misbehavior: once society is being influenced by your AI, it will start to look vastly different from what the AI's training data looked like. Our current systems are known to be untrustworthy when dealing with OOD data and ensuring OOD robustness is an open problem you don't address.
reward hacking: AIs don't look at problems like humans do and might generalize differently from the data than we intuitively would expect.
Me pre-empting some rebuttals:
"You didn't address the crypto part."
Yes, the above problems persist even if the crypto scheme improves society.
The only thing you try to solve for is getting high-quality data, but the alignment problem is by far not only the bad-training-data problem (which I don't claim you solve).
"You speak of preventing AGI-induced catastrophe; I'm just solving the alignment problem."
That's what the alignment problem is.
"I can address your points by [raising something new]."
I'll hear you out, but please resolve your market to NO first. Or, less ideally, edit the market description to say that the "this" in "this is a solution to alignment" does not refer to the market description.
"You ought to invest more into my idea by reading / watching 60+ minutes of [recommended stuff]."
I plausibly spent more time responding to your market than you did on making it. Before asking me to spend more, read Yudkowsky's AGI Ruin post, since it addresses a variety of ways in which alignment plans like yours tend to fail.
Nonetheless, thanks for at least trying and good luck.
Great question. Humanity verification is critical.
Users creating multiple accounts to earn extra profit plays a large role in why my work is potentially infohazardous.
As for the same 'humanity verified user' making duplicate claims, that's not really how the program works.
It's similar to asking what prevents a given user from 'liking' a comment on X multiple times.
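The "one like per user" analogy can be made concrete with a short sketch. The registry class, user IDs, and claim IDs here are hypothetical illustrations, not the actual program's design:

```python
class ClaimRegistry:
    """Toy sketch: at most one acknowledgement per (verified user, claim) pair,
    analogous to one 'like' per user per post on X."""

    def __init__(self):
        self._acknowledged = set()  # (user_id, claim_id) pairs already recorded

    def acknowledge(self, user_id: str, claim_id: int) -> bool:
        """Record an acknowledgement; reject duplicates by the same user."""
        key = (user_id, claim_id)
        if key in self._acknowledged:
            return False  # duplicate claim by the same verified human
        self._acknowledged.add(key)
        return True

registry = ClaimRegistry()
print(registry.acknowledge("alice", 42))  # True  (first acknowledgement)
print(registry.acknowledge("alice", 42))  # False (duplicate is rejected)
print(registry.acknowledge("bob", 42))    # True  (a different verified user)
```

Note that this only handles duplicate claims by the *same* account; the harder Sybil problem (one person, many accounts) is exactly the humanity-verification issue discussed above.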
The basic design of the system?
It's the 'humanity grows up' possible future.
Not claiming it's guaranteed, just that creating this program drastically increases the odds.
Could work. $40 would be a good start to make everyone just watch a short video and demonstrate understanding. I think a lot of relevant information could be packed into one super high quality video. The idea would then obviously be that enough people will make governments slow down or stop AI capabilities research such that sufficient safeguards can be put in place.
I'm not sure there aren't less expensive solutions, however.
The distinction you are not recognizing is which set of people gets to choose which incentive mechanisms (e.g. Pigouvian taxes) to create.
Sure, once you have control of the government, the media and everyone's attention, you could pick some safe topics for individuals to earn some credit for exploring.
What I'm talking about is the ability for any person on the planet to offer monetary incentives to any set of individuals they select to look at whatever information they choose without needing any governing body to oversee it.
The power lies not in the revenue gained from learning the curriculum.
The power lies in being able to define the curriculum.
@Krantz what keeps this from settling into the equilibrium where e.g. coca-cola pays many times more for young people's eyeball time than Khan Academy's entire operating budget?
Thank you for the legitimate question.
It isn't about 'eyeball time'; it's about acknowledging steps of inference.
If Verizon aims to 'teach' consumers that their new phone is waterproof, that requires individuals to understand that their phone is waterproof. That's Verizon's objective. Currently, they need to blast a ton of advertisements because they do not know which consumers actually saw or paid attention to their ad.
This is different with Khan Academy. There are vastly more points of verifiable information in Khan Academy's 'advertising campaign'.
If I were to ask both Verizon and Khan Academy, "What does the constitution of facts that you would like to get into people's heads look like?", although Verizon's budget for investment will be significantly higher, their constitution will not be nearly as large or profitable in the machine I'm aiming to build.
What it boils down to is 'How much is Verizon willing to pay you to acknowledge the fact that you understand their phone is waterproof?' (This takes a couple of seconds) vs 'How much is Khan Academy (though I'm sure your neighbors will want to contribute) willing to pay you to acknowledge all the facts required to demonstrate you have a good general education?'.
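To illustrate that comparison with made-up numbers — the per-fact price and the fact counts below are purely hypothetical, chosen only to show why a larger 'constitution of facts' can outbid a larger advertising budget per person:

```python
def campaign_value(price_per_fact_usd: float, num_facts: int) -> float:
    """Total a sponsor would pay one person to acknowledge every fact
    in its 'constitution' (hypothetical pricing model)."""
    return price_per_fact_usd * num_facts

# Verizon's constitution: one verifiable fact ('the phone is waterproof').
verizon_offer = campaign_value(0.25, 1)
# Khan Academy's constitution: a whole general-education curriculum of facts.
khan_offer = campaign_value(0.25, 100_000)

print(verizon_offer, khan_offer)  # 0.25 25000.0
```

Even if Verizon pays a much higher price per fact, the point of L48's comparison is that the number of acknowledgeable facts differs by orders of magnitude, so the education campaign dominates per learner.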
I agree. I'd much rather defer resolution to @EliezerYudkowsky.
Too bad that isn't an option on the platform.
He's too busy to listen to me though.
Maybe bet against my proposal on his prediction instead?
We create a 'truth economy'.
https://manifold.markets/EliezerYudkowsky/if-artificial-general-intelligence-539844cd3ba1?r=S3JhbnR6
I've been trying to get his attention (or anyone with 1/3 of his domain knowledge) for many years to charitably look at my work.
If I had that already, I wouldn't need to post predictions like this.
https://manifold.markets/Krantz/if-eliezer-charitably-reviewed-my-w?r=S3JhbnR6
Pick some powerful entity (e.g. the Chinese government, Microsoft, or the Catholic Church).
Do you think your solution could align that entity to humanity, or at least to its own citizens/customers/followers? If not, why would ASI be different? If so, how would that go, and how could one bootstrap the process?
Yes, I believe they can be aligned. You align a powerful entity like that by aligning the individual members that make up the group to perform the game-theoretic actions that cause alignment.
For example, there might be an event 'E' that a corrupt government wants to occur while the vast majority of the citizenry do not want it to occur (this could be a particular bill, issue, or task like sealing a physical border). One way to ensure this happens is to achieve a verifiable public record that the following individuals agree with the following propositions.
Vast majority of citizens:
#1 - I understand how Krantz works.
#467952 - Event 'E' is significant and requires immediate action from the government.
#3497829 - This proposition is intended to provide record of my intention to support the specific legislation (insert bill here that defines action to be taken on 'E'; could also be another proposition in the ledger).
Official congressmen:
#34875 - A majority of the citizens you represent support #3497829.
#68723 - As a congressman, you have pledged to uphold the verifiable requests of a majority of members of your district.
I could continue, but hopefully this is enough to get the point.
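The chain of attestations above could be sketched roughly like this. The strict-majority rule, the district size, and the use of proposition #3497829 as the trigger are illustrative assumptions on my part:

```python
from collections import defaultdict

class PropositionLedger:
    """Toy sketch of a verifiable public record of who attests to which
    proposition (hypothetical; not the actual system's design)."""

    def __init__(self, district_size: int):
        self.district_size = district_size
        self.attestations = defaultdict(set)  # proposition_id -> attesting citizens

    def attest(self, citizen: str, proposition_id: int) -> None:
        """Publicly record that a citizen agrees with a proposition."""
        self.attestations[proposition_id].add(citizen)

    def majority_supports(self, proposition_id: int) -> bool:
        """Does a strict majority of the district attest to this proposition?
        If so, the congressman's pledge (#68723) is triggered."""
        return 2 * len(self.attestations[proposition_id]) > self.district_size

ledger = PropositionLedger(district_size=5)
for citizen in ("ann", "ben", "cal"):
    ledger.attest(citizen, 3497829)  # support for the legislation proposition
print(ledger.majority_supports(3497829))  # True: 3 of 5 is a strict majority
```

The interesting property is that the congressman's obligation becomes a mechanically checkable predicate over the public record, rather than a claim anyone has to take on trust.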
I'm talking about a fundamental change in the way we communicate in the public square.
It's a change that I think will accelerate 'communication about complicated ideas over the internet' so fast that it will render obsolete traditional media, education, government, companies, and essentially every other mechanism whose primary job, in the end, is to control the sharing of ideas.
That is the primary function of government. We give them tax dollars and they figure out the good stuff that everybody wants done, figure out who's qualified to do those things, and then give the money to those people. I think we can do all that on the internet. Surely this is where some people see AI and blockchain headed.
How that transition happens, is a much longer conversation.
'The thing that's in charge' is a complicated thing now. It used to be people. Kings, presidents.
Now its ideas, its technology, its infrastructure.
If (Bitcoin) wanted to make Donald Trump say the words 'Skibidi toilet', it could.
That's interesting to me.
The only thing actually required to bootstrap the process is to get the idea to the right people, privately. That's nearly impossible. It's complicated and requires knowledge from several domains. I'd put my odds at around 20%.