"Where I agree and disagree with Eliezer" by Paul Christiano


by paulfchristiano, 20th Jun 2022.

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

(Partially in response to AGI Ruin: A List of Lethalities. Written in the same rambling style. Not exhaustive.)

  1. Powerful AI systems have a good chance of deliberately and irreversibly disempowering humanity. This is a much easier failure mode than killing everyone with destructive physical technologies.
  2. Catastrophically risky AI systems could plausibly exist soon, and there likely won’t be a strong consensus about this fact until such systems pose a meaningful existential risk per year. There is not necessarily any “fire alarm.”
  3. Even if there were consensus about a risk from powerful AI systems, there is a good chance that the world would respond in a totally unproductive way. It’s wishful thinking to look at possible stories of doom and say “we wouldn’t let that happen”; humanity is fully capable of messing up even very basic challenges, especially if they are novel.

