If Artificial General Intelligence has an okay outcome, what will be the reason?

21%

J. Something 'just works' on the order of eg: train a predictive/imitative/generative AI on a human-generated dataset, and RLHF her to be unfailingly nice, generous to weaker entities, and determined to make the cosmos a lovely place.

12%

K. Somebody discovers a new AI paradigm that's powerful enough and matures fast enough to beat deep learning to the punch, and the new paradigm is much much more alignable than giant inscrutable matrices of floating-point numbers.

10%

Something wonderful happens that isn't well-described by any option listed. (The semantics of this option may change if other options are added.)

I. The tech path to AGI superintelligence is naturally slow enough and gradual enough, that world-destroyingly-critical alignment problems never appear faster than previous discoveries generalize to allow safe further experimentation.

C. Solving prosaic alignment on the first critical try is not as difficult, nor as dangerous, nor taking as much extra time, as Yudkowsky predicts; whatever effort is put forth by the leading coalition works inside of their lead time.

B. Humanity puts forth a tremendous effort, and delays AI for long enough, and puts enough desperate work into alignment, that alignment gets solved first.

O. Early applications of AI/AGI drastically increase human civilization's sanity and coordination ability; enabling humanity to solve alignment, or slow down further descent into AGI, etc. (Not in principle mutex with all other answers.)

M. "We'll make the AI do our AI alignment homework" just works as a plan. (Eg the helping AI doesn't need to be smart enough to be deadly; the alignment proposals that most impress human judges are honest and truthful and successful.)

H. Many competing AGIs form an equilibrium whereby no faction is allowed to get too powerful, and humanity is part of this equilibrium and survives and gets a big chunk of cosmic pie.

A. Humanity successfully coordinates worldwide to prevent the creation of powerful AGIs for long enough to develop human intelligence augmentation, uploading, or some other pathway into transcending humanity's window of fragility.

G. It's impossible/improbable for something sufficiently smarter and more capable than modern humanity to be created, that it can just do whatever without needing humans to cooperate; nor does it successfully cheat/trick us.

1.8%

D. Early powerful AGIs realize that they wouldn't be able to align their own future selves/successors if their intelligence got raised further, and work honestly with humans on solving the problem in a way acceptable to both factions.

1.4%

E. Whatever strange motivations end up inside an unalignable AGI, or the internal slice through that AGI which codes its successor, they max out at a universe full of cheerful qualia-bearing life and an okay outcome for existing humans.

1.3%

If you write an argument that breaks down the 'okay outcomes' into lots of distinct categories, without breaking down internal conjuncts and so on, Reality is very impressed with how disjunctive this sounds and allocates more probability.

1.2%

L. Earth's present civilization crashes before powerful AGI, and the next civilization that rises is wiser and better at ops. (Exception to 'okay' as defined originally, will be said to count as 'okay' even if many current humans die.)

N. A crash project at augmenting human intelligence via neurotech, training mentats via neurofeedback, etc, produces people who can solve alignment before it's too late, despite Earth civ not slowing AI down much.

An outcome is "okay" if it gets at least 20% of the maximum attainable cosmopolitan value that could've been attained by a positive Singularity (a la full Coherent Extrapolated Volition done correctly), and existing humans don't suffer death or any other awful fates.

This market is a duplicate of https://manifold.markets/IsaacKing/if-we-survive-general-artificial-in with different options. https://manifold.markets/EliezerYudkowsky/if-artificial-general-intelligence-539844cd3ba1?r=RWxpZXplcll1ZGtvd3NreQ is this same question but with user-submitted answers.

(Please note: It's a known cognitive bias that you can make people assign more probability to one bucket over another, by unpacking one bucket into lots of subcategories, but not the other bucket, and asking people to assign probabilities to everything listed. This is the disjunctive dual of the Multiple Stage Fallacy, whereby you can unpack any outcome into a big list of supposedly necessary conjuncts that you ask people to assign probabilities to, and make the final outcome seem very improbable.

So: That famed fiction writer Eliezer Yudkowsky can rationalize at least 15 different stories (options 'A' through 'O') about how things could maybe possibly turn out okay; and that the option texts don't have enough room to list out all the reasons each story is unlikely; and that you get 15 different chances to be mistaken about how plausible each story sounds; does not mean that Reality will be terribly impressed with how disjunctive the okay outcome bucket has been made to sound. Reality need not actually allocate more total probability into all the okayness disjuncts listed, from out of all the disjunctive bad ends and intervening difficulties not detailed here.)
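The arithmetic behind both effects can be sketched in a few lines. This is only an illustration: the stage and story counts and the per-item probabilities below are made-up numbers, not figures from this market, and the function names are hypothetical.

```python
import math

def conjunctive_estimate(stage_probs):
    """Multiply the probabilities assigned to each supposedly necessary stage."""
    return math.prod(stage_probs)

def disjunctive_estimate(story_probs):
    """Add the probabilities assigned to each separately listed story,
    treating the stories as mutually exclusive."""
    return sum(story_probs)

# Illustrative numbers only; none of these come from the market.
stages = [0.8] * 10    # ten stages, each judged "80% likely"
stories = [0.05] * 15  # fifteen stories, each judged "5% plausible"

print(f"unpacked into conjuncts: {conjunctive_estimate(stages):.3f}")
print(f"unpacked into disjuncts: {disjunctive_estimate(stories):.3f}")
```

Ten individually likely-sounding stages multiply down to a small number, and fifteen individually unlikely-sounding stories add up to a large one. In both cases the total is driven by how finely the outcome was sliced, which is the point of the note above.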

@EliezerYudkowsky People will be arguing which one of these outcomes has occurred after supposed AGI is invented, which will be at a time when none of the outcomes has actually occurred because what we have is not exactly AGI and is not exactly aligned either. People will think wrongly that the critical window has passed and that the threat was overblown.

@Krantz want to exit our positions here? i put up a limit right next to market price. you should probably use that money on things that resolve sooner!

I appreciate the offer, but I feel pretty confident that this could resolve soon.

I'd feel more confident with this phrasing though:

Someone discovers a new paradigm in intelligence (collective, not artificial) that's powerful enough and matures fast enough to beat deep learning to the punch, and the new paradigm is much more alignable than giant inscrutable matrices of floating-point numbers.

What I'm hoping we build isn't actually AI, so I might lose for that technicality, but I am confident that it will be the reason for an outcome we might consider 'ok' and that's worth putting money on to me.

In general, I value the survival of my family and friends above winning any of these wagers.

The primary objective for wagering in these markets, for me, is to convey information to intelligent people that have an influence over research.

i think you should sell anyway!

I try not to do things unless I have reasons to do them.

Fun fact. If my system existed today, you could add the proposition 'Krantz should sell his position in K.' and the CI would then contrast and compare each of our reasons that support or deny that proposition, map them against each other and provide me with the optimal proposition (that exists on your ledger and not mine) that would be most effective in closing the inferential distance between our cruxes.

bought Ṁ1 YES

Isn't non-epistemic betting harmful to the reputability of prediction markets as a whole?

I bought E from 1% to 8% because maybe the CEV is natural — like, maybe the CEV is roughly hedonium (where "hedonium" is natural/simple and not related to quirks about homo sapiens) and a broad class of superintelligences would prioritize roughly hedonium. Maybe reflective paperclippers actually decide that qualia matter a ton and pursuing-hedonium is convergent. (This is mostly not decision-relevant.)

This seems likely to be a poorer predictor than usual, as you can't collect if you're dead.

I am trying to do this differently here: https://manifold.markets/dionisos/if-we-survive-general-artificial-in-z3suausl60

bought Ṁ50 NO

Which options would be the most popular among ML researchers who aren't concerned about AI risk? From what I've read, the top choices would be "solving alignment is easy" (C or J) or "alignment isn't even a problem" (E?).

Yes, and G too I think.

@dionisos good point, G is probably the position of the "there's no such thing as intelligence" crowd

bought Ṁ1 YES

Two questions:

1. Why is this market suddenly insanely erratic this past week?
2. Why are so many semi-plausible-sounding options being repeatedly bought down to ludicrously low odds (<0.5%), when you'd expect almost all of them to be within an order of magnitude of the uniform base rate across options of ~6%?
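For reference, the ~6% figure in the second question is just one divided by the number of options. A minimal sketch, assuming 17 options (the 15 lettered stories 'A' through 'O' mentioned in the description plus the two unlettered ones; the exact count is an assumption, and the helper names are hypothetical):

```python
def uniform_base_rate(n_options):
    """Probability each option gets if all are treated as equally likely."""
    return 1.0 / n_options

def within_order_of_magnitude(p, base):
    """True if p lies between one tenth of the base rate and ten times it."""
    return base / 10 <= p <= base * 10

N_OPTIONS = 17  # assumption: 15 lettered options plus 2 unlettered ones
base = uniform_base_rate(N_OPTIONS)

print(f"uniform base rate: {base:.1%}")
for p in (0.005, 0.018, 0.21):
    print(f"{p:.1%} within an order of magnitude: {within_order_of_magnitude(p, base)}")
```

Under that assumption, an option priced just below 0.6% falls outside the order-of-magnitude band the question describes.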

opened a Ṁ25 NO at 24% order

Some of the options like H seem... logically possible yes, but a bit out there.

As for #2, that's why I bet up E a bit. I find it the most plausible contingent on a slightly superhuman AGI happening within the next 100 years, since it's the only real "it didn't work, but nothing that bad happened" option.

The probabilities are currently very easy to move.

this is way too high but it's a keynesian beauty contest

I thought this market was going to be resolved by Eliezer after AGI happens?

opened a Ṁ10,000 NO at 80% order

@Krantz take my orders on that option

@jacksonpolack Why did this go to 80% lmao

@benshindel krantz has strong beliefs about it

This is verified.

Enough humans survive to rationalize whatever the outcome is as good, actually, and regardless of cost 90%+ of society treats anyone who points to visible alignment failure as a crazy person

At the time of writing this comment, the interface tells me that if I spend M3 on this answer, it will move it from 0% to 10% (4th place).

I just wanted to check that this is right given that the mana pool is currently around M150k, subsidy pool ~M20k.

(The other version of this question seemed to have a similar issue.)

Why is this market structured with mutually-exclusive options? These don't seem remotely mutually exclusive. Indeed I would be pretty surprised if, conditional on survival, we didn't get some mixture of several of these options.

@MugaSofer Per the inspiration market: 'It resolves to the option that seems closest to the explanation of why we didn't all die. If multiple reasons seem like they all significantly contributed, I may resolve to a mix among them.'