Duplicate of https://manifold.markets/EliezerYudkowsky/if-artificial-general-intelligence with user-submitted answers. An outcome is "okay" if it gets at least 20% of the maximum attainable cosmopolitan value that could've been attained by a positive Singularity (a la full Coherent Extrapolated Volition done correctly), and existing humans don't suffer death or any other awful fates.
Here are the steps.
Step 1. Eliezer (or anyone with actual influence in the community) listens to Krantz.
https://manifold.markets/Krantz/if-eliezer-charitably-reviewed-my-w?r=S3JhbnR6
https://manifold.markets/Krantz/this-is-a-solution-to-alignment?r=S3JhbnR6
Step 2. Krantz shows Eliezer how simple proposition constitutions can be used to interpretably align other constitutions and earn points in the process, thus providing an option to pivot away from machine learning and back to GOFAI.
Step 3. We create a decentralized mechanism where people maintain their own constitutions privately, with the intention that they will be used to compete over aligning a general constitution to earn 'points' (turning 'alignment work' into something everyone can earn crypto for doing).
Step 4. Every person understands that the primary mechanism for getting an education, hearing the important news, and voting on the social contract is maintaining a detailed constitution of the truth.
Step 5. Our economy is transformed into a competition for who can have the most extensive, accepted and beneficial constitution for aligning the truth.
https://manifold.markets/Krantz/is-establishing-a-truth-economy-tha?r=S3JhbnR6
@CalebW In your opinion, what would be the right problem, methods, world model, and thinking? The vagueness of this option seems to turn it into a grab bag akin to "because of a reason"
@TheAllMemeingEye when AGI goes right, there will be many reasons for that and there likely won't be a consensus opinion on which one was the most important. This market can be resolved to only one option. This unifying option is simply "Eliezer is wrong in many ways". Also it's a meme, which makes me want to buy it up.
@ThothHermes I just bought $M 1 of a ~0% answer and it jumped to 18%. This doesn't feel right given there's ~$M 11k in the market. Maybe the $M 5.5k subsidy does weird things?
Every single thing that computers have done over the past 20 years seemed incredibly difficult until suddenly people looked back at how simple it was to solve.
Why would people think that "human values" are any different, and that they are some extremely complex thing that is impossible to represent concisely?
@SteveSokolowski So much of this depends on timescale, and how "humanlike" you need this to be. Near-term, I'm sure technical methods can improve such representations. But the far future's alien politics will have little regard for whatever lobby is "human values".
Especially the more distinctively-21st-century-human values. They are contextual and incoherent. Though we do have some core wants, like kinship, avoiding danger, exploration, resource gathering, etc. Things that persist because they are functional, and selected for. But those hardly cement anything humanlike into the future.
What plausible action is there, to make black-hole-farmers respect our wishes? It would be like Ardipithecus stopping us from paving roads. They could fantasize we'll secretly be like them, in some deep way. In some ways, yes. But of all the life that will ever be, almost none of it has much to do with humans. And what actually drives them shouldn't be described as "human values".
I seem to be, simultaneously, way more optimistic about alignment than many EAs (in the short-term), yet also way more pessimistic about that in the long-term. I don't know why some think "our values" will have greater relevance.
Personally I'm at peace with this. Though I do wish humans only "go extinct" as a side-effect of choosing a different design for themselves. And I do forecast that the majority of outcomes will look roughly like that. But those who feel more personally invested in "human values" than me will lament this outcome, until they get that fixed.
@SteveSokolowski Should it make us live longer? Should it eliminate diseases? Which ones? Should it prevent crimes? Redistribute wealth? Weigh in on social policy or norms? I doubt satisfactory answers exist.
@AdamAlexander It is true that it's difficult for people to figure out what to program the software to do, but isn't that what humans have always done? Humans have always had different values and competing values continue to wax and wane.
The "foom" arguments people are putting forth are just unrealistic - there isn't enough power generation capacity in the world to do that. In the meantime, we'll see a slow buildup that looks sort of like things do now, as electricity shortages limit the ability of any one person to impose his or her values on the world.
@SteveSokolowski I don't think making an AI capable of acting on any of the difficult questions I listed would require prohibitively much electricity. Certainly very little in comparison to how much people might like it to take one side or another and act. The prospect that within a decade or two, an AI could use less electricity than a hospital and effect much more life-lengthening seems extremely likely to me, and it's certainly not the limit case of difficult bioethical questions people want to take action on. Taking the disruption to essays in schools as an example, I expect many disruptive and ethically thorny decisions to be made with little forethought, and I expect the magnitude of potential consequences to increase dramatically.
@ScroogeMcDuck
> Personally I'm at peace with this. Though I do wish humans only "go extinct" as a side-effect of choosing a different design for themselves. And I do forecast that the majority of outcomes will look roughly like that. But those who feel more personally invested in "human values" than me will lament this outcome, until they get that fixed.
I think it is acceptable that humans go extinct, or biological life, or even individuality or life altogether. But I think it is strange to not care about human values.
From my point of view, we should avoid hell-like futures at all costs, and this is certainly a human value.
And I don't feel like the current probability that we avoid it is high enough.
After that, I think it would still be a waste to lose some other things for eternity, just because we were impatient and unwise, and we didn’t want to wait even a hundred years (which is mostly nothing on these scales), to get a better grasp on security and on what we were doing and wanted to do.
@dionisos Sure, some things are worth fighting for. But lots of our values don't seem like avoiding the unambiguously hell-like futures. I won't try elaborating on them here.
Probably lots of sacred things won't survive, and were specific to our time/place. Though I don't expect most people to stop feeling anxious about it. I'm sure their assertiveness is even adaptive, at some dose. But to me it's a bit like people in year 1100 trying to "advise" us today. I could cherry pick some things to agree with Year 1100 people on. But I don't really expect them to have good advice for us. That's similar to how I feel about big interventions we might try on the Year Million culture.
/Shrug
...And with that, I wish you a Happy New Year!
@LordWilmgaddark Why is this at 19%, and 2nd place behind the meme answer? Just because the market isn't serious in general? Or are people legitimately thinking an AI smart enough to conceive of and attempt a takeover, smarter than any individual human, would be dumb enough to try without a ridiculously massive clear advantage?
They'll know and understand things like comparative advantage, tail-end risk management, game theory, etc., far better and more easily than people because they won't have the human-specific cognitive biases that make many important modern considerations like these unintuitive.
Not sure if I agree.
Bear in mind that this market is conditional on a miracle so all answers should seem miraculous.
@DavidHiggs I mean, I didn't actually think any of my answers were serious, but it doesn't seem impossible for something along the lines of a failed takeover to happen. Something less intelligent than a human, and having been exposed to a lot of things that mention harming humans, might go out and harm a bunch of humans, even if it doesn't have a great shot at success.
I kind of expect there to be some "warning shots" before the end that AI can be dangerous, although I don't know if it'll actually take the form of a takeover per se - it could just be things like AI-engineered viruses, or ramped up misinformation that makes it even more difficult to trust anything at all on the Internet, or even just humans using AI assistance in their own harmful-to-other-people schemes.
The part of my answer that felt rather unlikely to me, though, was the part where everyone wakes the hell up and starts doing something about it. On my mainline, some misuse of AI causes a few minor catastrophes every couple years, and people either shrug and say that the benefits outweigh the harms (and maybe they even do, in the short term), or take some action that looks like it's restricting AI development but isn't actually nearly enough to address the actual problems. And then they forget all about it.
Unless, like, it's actually something on the caliber of some weak AI getting access to nuclear weapons and using them to kill a billion people (but not the other seven billion) - that would wake people up, I think. But I doubt anything that serious would happen; between being able to kill more than a couple thousand people and being able to kill everyone is a narrow range of capability.
@LordWilmgaddark There's also the question of whether a nuclear exchange that kills billions prevents us from achieving 20%+ of maximum score.