If Artificial General Intelligence has an okay outcome, what will be the reason?

21%

Eliezer finally listens to Krantz.

17%

We create a truth economy. https://manifold.markets/Krantz/is-establishing-a-truth-economy-tha?r=S3JhbnR6

15%

Yudkowsky is trying to solve the wrong problem using the wrong methods based on a wrong model of the world derived from poor thinking and fortunately all of his mistakes have failed to cancel out

11%

Other

11%

Alignment is not properly solved, but core human values are simple enough that partial alignment techniques can impart these robustly. Despite caring about other things, it is relatively cheap for AGI to satisfy human values.

Humans become transhuman through other means before AGI happens

AIs will not have utility functions (in the same sense that humans do not), their goals such as they are will be relatively humanlike, and they will be "computerish" and generally weakly motivated compared to humans.

Power dynamics stay multi-polar. Partly easy copying of SotA performance, bigger projects need high coordination, and moderate takeoff speed. And "military strike on all society" remains an abysmal strategy for practically all entities.

"Corrigibility" is a bit more mathematically straightforward than was initially presumed, in the sense that we can expect it to occur, and is relatively easy to predict, even under less-than-ideal conditions.

1.2%

Aligned AI is more economically valuable than unaligned AI. The size of this gap and the robustness of alignment techniques required to achieve it scale up with intelligence, so economics naturally encourages solving alignment.

A lot of humans participate in a slow scalable oversight-style system, which is pivotally used/solves alignment enough

Duplicate of https://manifold.markets/EliezerYudkowsky/if-artificial-general-intelligence with user-submitted answers. An outcome is "okay" if it gets at least 20% of the maximum attainable cosmopolitan value that could've been attained by a positive Singularity (a la full Coherent Extrapolated Volition done correctly), and existing humans don't suffer death or any other awful fates.
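The resolution criterion above can be read as a two-part check. Here is a minimal sketch, assuming the fraction of maximum attainable value could somehow be estimated as a number (the market does not say how). The function and parameter names are hypothetical and only illustrate the stated threshold.

```python
def is_okay_outcome(value_fraction: float, humans_suffer_awful_fate: bool) -> bool:
    """Return True if an outcome counts as "okay" under the stated criterion.

    value_fraction: hypothetical estimate of attained value divided by the
        maximum attainable cosmopolitan value (0.0 to 1.0).
    humans_suffer_awful_fate: True if existing humans die or suffer any
        other awful fate.
    """
    # Both conditions must hold: at least 20% of the maximum value,
    # and no death or other awful fate for existing humans.
    return value_fraction >= 0.20 and not humans_suffer_awful_fate


if __name__ == "__main__":
    print(is_okay_outcome(0.25, False))
    print(is_okay_outcome(0.90, True))
```

Note that a very high value fraction still fails the check if existing humans suffer an awful fate; the two conditions are joined by "and" in the description.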

@Krantz https://manifold.markets/Krantz/krantz-mechanism-demonstration?r=S3JhbnR6

Slightly off topic but I find very amusing the mental image of blankspace737 / stardust / Tsar Nicholas coming across this market, seeing the "god is real" option at 0% and repeatedly trying, with growing desperation, to bet it up for infinite winnings, only to be met each time with the server error message

bought Ṁ40 NO

Even in good cases, 20% of max attainable CEV seems unlikely. I expect that outcomes are extremely heavy-tailed such that even if alignment is basically solved, we rarely get anything close to 20% of maximum. There’s a lot of room at the top. May also be that maximum is unbounded!

not to pathologize, and I could very well be projecting, but seeing krantz's writing for the first time took me back Ratatouille-style to the types of ideas my hypomanic episodes used to revolve around before I knew how to ground myself. it's not I think that any of it is delusional, but from what I've seen of the general frantic vibe of his writing and of his responses to criticism it seems like he has such deeply held idealistic notions of what the future holds that it starts to get (in my experience, at least) difficult and almost painful to directly engage with and account for others' thoughts on your ideas at anything more than a superficial level if they threaten to disrupt said notions. maybe that's normal, idk

Here are the steps.

Step 1. Eliezer (or anyone with actual influence in the community) listens to Krantz.

https://manifold.markets/Krantz/if-eliezer-charitably-reviewed-my-w?r=S3JhbnR6

https://manifold.markets/Krantz/this-is-a-solution-to-alignment?r=S3JhbnR6

Step 2. Krantz shows Eliezer how simple proposition constitutions can be used to interpretably align other constitutions and earn points in the process, thus providing an option to pivot away from machine learning and back to GOFAI.

Step 3. We create a decentralized mechanism where people maintain their own constitutions privately with the intention that they will be used to compete over aligning a general constitution to earn 'points' (turning 'alignment work' into something everyone can earn crypto for doing).

Step 4. Every person understands that the primary mechanism for getting an education, hearing the important news, and voting on the social contract is maintaining a detailed constitution of the truth.

Step 5. Our economy is transformed into a competition for who can have the most extensive, accepted and beneficial constitution for aligning the truth.

https://manifold.markets/Krantz/is-establishing-a-truth-economy-tha?r=S3JhbnR6

bought Ṁ10 YES

anyone know what's going on with the unbettable 0% options returning NaN?

It seems this market is heavily suffering from being linked when many of the options are not mutually exclusive

yeah

big benefit of all possibilities being written by the same person is there's less of that

bought Ṁ10 NO

@CalebW In your opinion, what would be the right problem, methods, world model, and thinking? The vagueness of this option seems to turn it into a grab bag akin to "because of a reason"

@TheAllMemeingEye when AGI goes right, there will be many reasons for that and there likely won't be a consensus opinion on which one was the most important. This market can be resolved to only one option. This unifying option is simply "Eliezer is wrong in many ways". Also it's a meme, which makes me want to buy it up.

I couldn't have phrased this better myself

Just to be clear, it's an unfortunate situation.

bought Ṁ10 NO

@Krantz wdym?

I just submitted this answer, and it is now at 23% and in second place. May I possibly have caused a bug to happen?

@ThothHermes No, that always happens when Other has a high probability.

@Multicore The probabilities don't match what I'd intuitively expect.

bought Ṁ1 YES

@ThothHermes I just bought Ṁ1 of a ~0% answer and it jumped to 18%. This doesn't feel right given there's ~Ṁ11k in the market. Maybe the Ṁ5.5k subsidy does weird things?

bought Ṁ1 YES

Ah, this was a market in the old DPM style I think. They recently ran a script to update them all to the new multiple choice format with an "other" option but their liquidity is low so they'll behave strangely.
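The low-liquidity effect described above can be illustrated with a generic constant-product market maker. This is a simplified sketch, not Manifold's actual implementation: the pool sizes are made up, the function names are hypothetical, and a single binary answer stands in for one option of the multiple-choice market. It shows how the same Ṁ1 bet moves a near-0% price a lot when the pool is shallow and barely at all when it is deep.

```python
def probability(yes_pool: float, no_pool: float) -> float:
    """Implied YES probability in a constant-product pool."""
    return no_pool / (yes_pool + no_pool)


def buy_yes(yes_pool: float, no_pool: float, bet: float) -> tuple[float, float]:
    """Buy YES with `bet` mana; return (new probability, YES shares received).

    The bet mints `bet` YES and `bet` NO shares into the pool, then enough
    YES shares are paid out to restore the invariant yes_pool * no_pool = k.
    """
    k = yes_pool * no_pool
    new_no = no_pool + bet
    new_yes = k / new_no
    shares = yes_pool + bet - new_yes
    return probability(new_yes, new_no), shares


if __name__ == "__main__":
    # Both pools start at roughly 0.1% but differ in depth (assumed numbers).
    for label, yes_pool, no_pool in [("shallow", 10.0, 0.01), ("deep", 10_000.0, 10.0)]:
        before = probability(yes_pool, no_pool)
        after, shares = buy_yes(yes_pool, no_pool, bet=1.0)
        print(f"{label}: {before:.4%} -> {after:.4%} ({shares:.2f} shares)")
```

In this toy model the jump depends on the liquidity sitting in the specific answer's pool, not on the total volume traded in the market, which is consistent with a tiny bet causing a large swing.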

bought Ṁ1 YES

I suggest trading in the version of this question that Eliezer made with that new format when it came out:

But to make things less weird here I'll add a small subsidy.

Every single thing that computers have done over the past 20 years seems incredibly difficult until suddenly people look back at how simple that was to solve.

Why would people think that "human values" are any different, and that they are some extremely complex thing that is impossible to represent concisely?

@SteveSokolowski So much of this depends on timescale, and how "humanlike" you need this to be. Near-term, I'm sure technical methods can improve such representations. But the far-future's alien politics will have little regard for whatever lobby is "human values".

Especially the more distinctively-21st-century-human values. They are contextual and incoherent. Though we do have some core wants, like kinship, avoiding danger, exploration, resource gathering, etc. Things that persist because they are functional, and selected for. But those hardly cement anything humanlike into the future.

What plausible action is there, to make black-hole-farmers respect our wishes? It would be like Ardipithecus stopping us from paving roads. They could fantasize we'll secretly be like them, in some deep way. In some ways, yes. But of all the life that will ever be, almost none of it has much to do with humans. And what actually drives them shouldn't be described as "human values".

I seem to be, simultaneously, way more optimistic about alignment than many EAs (in the short-term), yet also way more pessimistic about that in the long-term. I don't know why some think "our values" will have greater relevance.

Personally I'm at peace with this. Though I do wish humans only "go extinct" as a side-effect of choosing a different design for themselves. And I do forecast that the majority of outcomes will look roughly like that. But those who feel more personally invested in "human values" than me will lament this outcome, until they get that fixed.

@SteveSokolowski Should it make us live longer? Should it eliminate diseases? Which ones? Should it prevent crimes? Redistribute wealth? Weigh in on social policy or norms? I doubt satisfactory answers exist.

@AdamAlexander It is true that it's difficult for people to figure out what to program the software to do, but isn't that what humans have always done? Humans have always had different values and competing values continue to wax and wane.

The "foom" arguments people are putting forth are just unrealistic - there isn't enough power generation capacity in the world to do that. In the meantime, we'll see a slow buildup that looks sort of like things do now, as electricity shortages limit the ability of any one person to impose his or her values on the world.

@SteveSokolowski I don't think making an AI capable of acting on any of the difficult questions I listed would require prohibitively much electricity. Certainly very little in comparison to how much people might like it to take one side or another and act. The prospect that within a decade or two, an AI could use less electricity than a hospital and effect much more life-lengthening seems extremely likely to me, and it's certainly not the limit case of difficult bioethical questions people want to take action on. Taking the disruption to essays in schools as an example, I expect many disruptive and ethically thorny decisions to be made with little forethought, and I expect the magnitude of potential consequences to increase dramatically.

@ScroogeMcDuck

> Personally I'm at peace with this. Though I do wish humans only "go extinct" as a side-effect of choosing a different design for themselves. And I do forecast that the majority of outcomes will look roughly like that. But those who feel more personally invested in "human values" than me will lament this outcome, until they get that fixed.

I think it is acceptable that humans go extinct, or biological life, or even individuality or life altogether. But I think it is strange to not care about human values.

From my point of view, we should avoid hell-like futures at all costs, and this is certainly a human value.

And I don't feel like the current probability that we avoid it is high enough.

After that, I think it would still be a waste to lose some other things for eternity, just because we were impatient and unwise, and we didn’t want to wait even a hundred years (which is mostly nothing on these scales) to get a better grasp on security and on what we were doing and wanted to do.

@dionisos Sure, some things are worth fighting for. But lots of our values don't seem like avoiding the unambiguously hell-like futures. I won't try elaborating on them here.

Probably lots of sacred things won't survive, and were specific to our time/place. Though I don't expect most people to stop feeling anxious about it. I'm sure their assertiveness is even adaptive, at some dose. But to me it's a bit like people in year 1100 trying to "advise" us today. I could cherry pick some things to agree with Year 1100 people on. But I don't really expect them to have good advice for us. That's similar to how I feel about big interventions we might try on the Year Million culture.

/Shrug

...And with that, I wish you a Happy New Year!