A game-theoretic solution to alignment would be to create a function that rewards 2 Bitcoin for demonstrating a proper understanding of the "AI situation" (a sort of Eliezer litmus test for whether a person has a firm grasp of the risks posed by AI and what they can do to stop it) to a humanity-verified public ledger: an intellectual-consent blockchain, i.e. a sort of crypto credential designed to train AI on what different people consent to being true.
This function can be built.
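A minimal sketch of what such a function could look like, assuming a proof-of-personhood identity check and a simple append-only ledger; every name here (Attestation, ConsentLedger, the one-payout-per-person rule) is a placeholder of mine, not a spec:

```python
# Hypothetical sketch only. The identity check, the litmus test, and the payout
# hook are placeholders for whatever proof-of-personhood and verification
# mechanism would actually be used.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Attestation:
    user_id: str                     # humanity-verified identity
    claims_acknowledged: List[str]   # propositions the person consents to as true
    passed_litmus_test: bool         # demonstrated understanding of the "AI situation"

@dataclass
class ConsentLedger:
    entries: List[Attestation] = field(default_factory=list)
    reward_btc: float = 2.0

    def submit(self, attestation: Attestation) -> float:
        """Record a verified attestation and return the BTC reward it earns."""
        if not attestation.passed_litmus_test:
            return 0.0
        if any(e.user_id == attestation.user_id for e in self.entries):
            return 0.0   # assume one payout per verified human
        self.entries.append(attestation)
        return self.reward_btc
```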
Building it would also transform the way the world communicates and turn the economy into one that primarily rewards education. As in, people can pay other people to learn things. This will scale to replace advertising, journalism, media, peer review, educational institutions, government, and all communication platforms. It will create a true meritocracy and ensure true free speech (the ability to put an idea into the public domain for consideration), with plenty of resources for everyone on the planet to live an amazing life.
Then, after everyone is rich and gets along, we use that data to train AI instead.
Will resolve yes given Eliezer's consent.
Will resolve no given my consent.
I pledge on Kant’s name to try as hard as I can to consent.
If someone can supply the Bitcoin, I'll build this.
If you think that's crazy, please explain why or bet against me.
Thanks 🤓
"Then, after everyone is rich and gets along, we use that data to train AI instead."
Your solution does not address various ways in which AI would ruin the world.
Misalignment prevention: the goal is not to build an aligned AI; it is to make sure no one ever builds a misaligned, powerful AI. What do you do with your AI that is trained on near-utopian data?
Deadline: building cryptotopia is a multi-decade project, and misaligned AIs will probably already have been deployed before it is finished.
Out-of-distribution (OOD) misbehavior: once society is being influenced by your AI, it will start to look vastly different from what the AI's training data looked like. Our current systems are known to be untrustworthy when dealing with OOD data, and ensuring OOD robustness is an open problem you don't address.
Reward hacking: AIs don't look at problems the way humans do and might generalize from the data differently than we would intuitively expect.
Me pre-empting some rebuttals:
"You didn't address the crypto part."
Yes, the above problems persist even if the crypto scheme improves society.
The only thing you try to solve for is getting high-quality data, but the alignment problem is far from being only a bad-training-data problem (and I don't claim you solve even that one).
"You speak of preventing AGI-induced catastrophe; I'm just solving the alignment problem."
That's what the alignment problem is.
"I can address your points by [raising something new]."
I'll hear you out; please resolve your market to NO first, though. Or, less ideally, edit the market description to say that the "this" in "this is a solution to alignment" does not refer to the market description.
"You ought to invest more in my idea by reading/watching 60+ minutes of [recommended stuff]."
I plausibly spent more time responding to your market than you did making it. Before you ask me to spend more, read Yudkowsky's AGI Ruin post, since it addresses a variety of ways in which alignment plans like yours tend to fail.
Nonetheless, thanks for at least trying and good luck.
@Jono3h You don't seem to understand my objectives.
Would you like to point out where my argument fails?
https://manifold.markets/Krantz/which-proposition-will-be-denied
@Krantz you linked me a list that did not contain the statement
"Then, after everyone is rich and gets along, we use that data to train AI instead."
which was the one I was addressing.
If you no longer stand by that statement, then your market should reflect that (resolve NO, or edit the resolution criteria).
@Jono3h The system I am building game-theoretically prevents the building of AI until we have proven a consensus on how to build it safely (which would require individuals like Eliezer to consent), after which we could use that data to build safe AI.
It is entirely possible, and in my opinion likely, that we would not reach such consensus and thus wouldn't build AI.
What you are asking for are the details of such a consensus, if it were to occur.
I cannot speculate on that any more accurately than I can speculate on how an AI would beat you in chess.
Great question. Humanity verification is critical.
The risk of users creating multiple accounts to earn extra profit is a large part of why my work is potentially infohazardous.
As for the same 'humanity verified user' making duplicate claims, that's not really how the program works.
It's similar to asking what prevents a given user from 'liking' a comment on X multiple times.
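For what it's worth, the 'like once' analogy maps onto a very simple invariant. A toy sketch of that invariant, under my own assumptions about the design:

```python
# Toy illustration of one-acknowledgement-per-user, analogous to only being
# able to 'like' a post once. This is an assumed design, not the actual program.
acknowledged: set[tuple[str, str]] = set()   # (user_id, claim_id) pairs already counted

def acknowledge(user_id: str, claim_id: str) -> bool:
    """Count an acknowledgement only the first time a given user makes it."""
    key = (user_id, claim_id)
    if key in acknowledged:
        return False   # duplicate: no extra credit
    acknowledged.add(key)
    return True
```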
The basic design of the system?
It's the 'humanity grows up' possible future.
Not claiming it's guaranteed, just that creating this program drastically increases the odds.
Could work. $40 would be a good start to make everyone just watch a short video and demonstrate understanding. I think a lot of relevant information could be packed into one super high quality video. The idea would then obviously be that enough people will make governments slow down or stop AI capabilities research such that sufficient safeguards can be put in place.
I'm not sure whether there aren't less expensive solutions, however.
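Rough cost arithmetic behind that hesitation, assuming the $40 is a per-person payment:

```python
# Back-of-the-envelope scale of the proposal, assuming the $40 is per person.
payment_per_person = 40          # USD
people = 8_000_000_000           # roughly the world population
print(f"${payment_per_person * people:,}")   # $320,000,000,000
```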
The distinction you are not recognizing is who gets to choose which (Pigouvian taxes / incentive mechanisms) to create.
Sure, once you have control of the government, the media and everyone's attention, you could pick some safe topics for individuals to earn some credit for exploring.
What I'm talking about is the ability for any person on the planet to offer monetary incentives to any set of individuals they select to look at whatever information they choose without needing any governing body to oversee it.
The power lies not in the revenue gained from learning the curriculum.
The power lies in being able to define the curriculum.
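A rough sketch of the primitive being described, a permissionless bounty on a sponsor-chosen curriculum; the data model (CurriculumBounty and its fields) is an assumption of mine, not Krantz's actual design:

```python
# Hypothetical data model for a permissionless "curriculum bounty": anyone can
# post one, with no governing body approving sponsor, audience, or content.
from dataclasses import dataclass
from typing import Set

@dataclass
class CurriculumBounty:
    sponsor: str              # any person or organization
    claims: Set[str]          # the curriculum: propositions to be acknowledged
    reward_per_claim: float   # payment offered per acknowledged claim
    eligible_users: Set[str]  # the audience the sponsor chooses to incentivize

    def payout_for(self, user_id: str, acknowledged: Set[str]) -> float:
        """Reward an eligible user for the part of the curriculum they acknowledged."""
        if user_id not in self.eligible_users:
            return 0.0
        return self.reward_per_claim * len(acknowledged & self.claims)
```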
@Krantz what keeps this from settling into the equilibrium where e.g. coca-cola pays many times more for young people's eyeball time than Khan Academy's entire operating budget?
Thank you for the legitimate question.
It isn't about 'eyeball time'; it's about acknowledging steps of inference.
If Verizon's aim is to 'teach' its consumers that their new phone is waterproof, then that requires individuals to understand that their phone is waterproof. That's Verizon's objective. Currently, they need to blast a ton of advertisements because they do not know which consumers actually saw or paid attention to their ad.
This is different with Khan Academy. There are vastly more points of verifiable information in Khan Academy's 'advertising campaign'.
If I were to ask both Verizon and Khan Academy, "What does the constitution of facts you would like to get into people's heads look like?", Verizon's investment budget would be significantly higher, but their constitution would not be nearly as large or as profitable in the machine I'm aiming to build.
What it boils down to is 'How much is Verizon willing to pay you to acknowledge the fact that you understand their phone is waterproof?' (This takes a couple of seconds) vs 'How much is Khan Academy (though I'm sure your neighbors will want to contribute) willing to pay you to acknowledge all the facts required to demonstrate you have a good general education?'.
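A toy comparison with made-up numbers, just to illustrate why the size of the 'constitution' could matter more than the size of the ad budget:

```python
# Made-up numbers, purely to illustrate why a large verifiable "constitution
# of facts" can be worth more to the learner than a bigger ad budget.
verizon_facts, verizon_price_per_fact = 1, 0.50       # "your phone is waterproof"
khan_facts, khan_price_per_fact = 2_000, 0.05         # a general-education curriculum

print(verizon_facts * verizon_price_per_fact)   # 0.5   earned from Verizon
print(khan_facts * khan_price_per_fact)         # 100.0 earned from Khan Academy
```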
I agree. I'd much rather defer resolution to @EliezerYudkowsky.
Too bad that isn't an option on the platform.
He's too busy to listen to me though.
Maybe bet against my proposal on his prediction instead?
We create a 'truth economy'.
https://manifold.markets/EliezerYudkowsky/if-artificial-general-intelligence-539844cd3ba1?r=S3JhbnR6
I've been trying for many years to get him (or anyone with a third of his domain knowledge) to charitably look at my work.
If I had that already, I wouldn't need to post predictions like this.
https://manifold.markets/Krantz/if-eliezer-charitably-reviewed-my-w?r=S3JhbnR6