Published: April 25, 2023

Possible but hardly inevitable. It becomes moderately more likely as people call it absurd and fail to take precautions against it, like checking for sudden drops in the loss function and suspending training. Mostly, though, this is not a necessary postulate of a doom story.

...it appears that Metzger has appointed himself the new arbiter of what constitutes my position, above myself. I dub this strange new doctrine as "Metzgerism" after its creator. https://x.com/perrymetzger/sta...

Rapid capability gains, combined with a total civilizational inability to slow down to the level actually required, form half of my concern. The other half is how observations from weak AIs will predictably fail to generalize to more powerful AIs.

The capability gains do not need to take place over hours, and do not need to go undetected, for the scenario to go on wandering down convergent pathways to everyone being dead. That element of the Metzgerian doctrine is a Metzgerian invention.

[Image in tweet by Eliezer Yudkowsky]

@perrymetzger I'd consider that scenario less probable now than when I was younger, but still not quite rule it out entirely. "Can't rule it out, can't rely on it" seems a reasonable epistemic position to have about something you're unsure about, as a kid?

@ESYudkowsky @perrymetzger "can't rule it out entirely" as in >.01%? this seems incredibly naive as to the amount of predictive or manipulative power that even arbitrary intelligence with unlimited information could wield

@SHL0MS @perrymetzger Our model of physics has changed a lot over the last 200 years. It's presently begun to saturate a bit, but even that was less obviously true at the point I wrote that email message (I think as a teenager?). Seems a bit hubristic to be *that sure* there's no big errors left.

@ESYudkowsky @perrymetzger are you equally uncertain whether a sufficiently advanced AI might be infinitely compassionate and ethical?

@SHL0MS @perrymetzger That takes work and doesn't happen spontaneously. 1 in 10,000 for it happening spontaneously seems high. 1 in 10,000 for "actually you can just do that with RLHF"... actually seems low, if anything? But people would have to actually bother, and that's not a certain step.

@ESYudkowsky @SHL0MS That’s quite some spin on suggesting that we should take the possibility that AGIs might perform magic seriously. I get that it’s kind of embarrassing to have said it of course.

@perrymetzger @SHL0MS Ain't embarrassed by nothin' I actually said. I suppose Metzgerism has some much stranger take on what you hallucinated to be "implied" in the text.
