Published: April 25, 2023

Possible but hardly inevitable. It becomes moderately more likely as people call it absurd and fail to take precautions against it, like checking for sudden drops in the loss function and suspending training. Mostly, though, this is not a necessary postulate of a doom story.
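To make the precaution concrete: below is a minimal sketch, assuming a generic PyTorch-style training loop, of a monitor that flags a run for suspension when the training loss falls anomalously fast relative to its recent history. The window size, drop threshold, and helper names (train_step, save_checkpoint) are illustrative assumptions, not anything specified in the thread.

    # Minimal sketch: flag anomalously fast drops in training loss.
    # Window and threshold are illustrative, not tuned recommendations.
    from collections import deque

    class LossDropMonitor:
        def __init__(self, window=100, max_drop_ratio=0.5):
            self.history = deque(maxlen=window)   # recent loss values
            self.max_drop_ratio = max_drop_ratio  # e.g. flag a 50% drop vs. the window mean

        def check(self, loss):
            """Return True if training should be suspended for inspection."""
            triggered = False
            if len(self.history) == self.history.maxlen:
                mean = sum(self.history) / len(self.history)
                triggered = loss < mean * (1.0 - self.max_drop_ratio)
            self.history.append(loss)
            return triggered

    monitor = LossDropMonitor()
    # Hypothetical use inside a training loop (train_step and
    # save_checkpoint are assumed helpers, not a real API):
    #   loss = train_step(batch)
    #   if monitor.check(loss):
    #       save_checkpoint()
    #       raise SystemExit("suspicious loss drop; suspending run for review")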

...it appears that Metzger has appointed himself the new arbiter of what constitutes my position, above myself. I dub this strange new doctrine "Metzgerism" after its creator. https://x.com/perrymetzger/sta...

Rapid capability gains, combined with a total civilizational inability to slow down to the level actually required, form half of my concern. The other half is how observations from weak AIs will predictably fail to generalize to more powerful AIs.

The capability gains do not need to take place over hours, and do not need to go undetected, for the scenario to go on wandering down convergent pathways to everyone being dead. That element of the Metzgerian doctrine is a Metzgerian invention.

*Not* alleged to be true for any sufficiently powerful AI system; just for ones trained on anything resembling the current system of gradient descent on giant inscrutable matrices, under any training paradigm I've ever heard proposed - yet! https://x.com/perrymetzger/sta...

The argument is specifically about *hill-climbing*, e.g. gradient descent and natural selection, and *would not* hold for randomly selecting a short network that worked. (Something different would go wrong, in that case.) https://x.com/perrymetzger/sta...
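For readers less familiar with the distinction being drawn: a toy sketch (illustrative only; the objective and constants here are invented for the example) contrasting hill-climbing, which reaches a solution through a path of incremental improvements, with randomly sampling candidates and keeping the best, which involves no such path.

    import random

    def loss(w):                          # toy one-dimensional objective
        return (w - 3.0) ** 2

    # Hill-climbing (gradient descent): a path of incremental improvements.
    w = 0.0
    for _ in range(100):
        grad = 2.0 * (w - 3.0)            # analytic gradient of the toy loss
        w -= 0.1 * grad                   # small downhill step
    print("gradient descent:", w)         # converges near 3.0, step by step

    # Random selection: sample candidates independently, keep the best.
    best = min((random.uniform(-10.0, 10.0) for _ in range(1000)), key=loss)
    print("random search:", best)         # no trajectory of gradual improvement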

Metzgerism: "Earlier systems tell us nothing useful about later ones." Reasonable, sane, hence gloomy position: "They say they learned a lot, and did learn some, but later systems differ from earlier systems in at least one fatally important way." https://x.com/perrymetzger/sta...

Some people who've apparently never heard of "grokking" are trying to make out like the top post means I don't know ML or something. Sure, a sharp drop in training loss can mean there's a bug, and drops in validation loss can happen naturally without FOOM. None of this changes that

Oh really? Then things have changed since the last time I heard interesting stories about needing to roll back to an earlier checkpoint after something "interesting" happened overnight. Regardless, the measures you take for security are not quite the same measures you take for

(Above was in reply to someone who was like, "No, see, he's ignorant [not because the proposal is technically ignorant but] because all the major AI labs already check for sharp drops in loss and would never let something run overnight.") https://x.com/Nicole_Janeway/s...

(Riley Goodside is among other things one of the earliest inventors of GPT jailbreaks.) https://x.com/goodside/status/...

@ESYudkowsky Out of curiosity: How meaningfully does your view change if 'foom' takes 100-200 years? Does it make a meaningful difference long-term (e.g., on a 1000-year timescale)?

@ataiiam I think at 100-200 years from 'barely smarter than human' to FOOM, things change a LOT. At that point you're just not in Yudkowskyland in the first place - unless you got the delay via massive correct global cooperation, and then you're in a different kind of Yudkowskyland.

@ESYudkowsky It's the core of your doom story. It always has been. You claim that your ideological opponents don't understand what you are suggesting. Well, a lot of it has to do with FOOM.

@ESYudkowsky I drop my loss function therefore I am

@ESYudkowsky As always, you get two kinds of arguments from people on the wrong side: strawman and ad hominem.

@ESYudkowsky Presented without comment from MultiGPT.

[Image in tweet by Eliezer Yudkowsky ⏹️]

@ESYudkowsky If you care about alignment, you need to come up with the exact precautions that need to be taken. Speculating about dangers with no solutions is not productive, and will end up with you being ignored.

@ESYudkowsky Jeez man. All this has everything to do with ego grandstanding and nothing to do with 'saving' humanity.

@ESYudkowsky And yet you continue to use the antebellum word "alignment," like someone who speaks without thinking

@ESYudkowsky Yud, I'm an independent candidate for the RI congressional seat. I am seeking your endorsement. I have an important question, as I understand the implications of AGI in a layman's manner. My question relates to this so-called "alignment" within AI itself. The question I am

@ESYudkowsky The difficulty of advancing robotics at anywhere near the same pace as a silicon-based AI is a kind of risk dampener.

@ESYudkowsky This is the dumbest thing I've ever read

@ESYudkowsky It's hilarious that you've kept this tweet up after the whole AI industry called you out on how much it shows your lack of understanding of the field. Unless you don't care what the experts think, and you're trying to convince those who don't know anything with FUD.

@ESYudkowsky Can you clarify please what exactly is your claim here? Do I read it correctly that you interpret a drop in the loss function as some form of evil intelligence being born in a neural network?

@ESYudkowsky yeah bro once the loss drops to 0 the agi will break out of its gpu and take full control of the computer, using the server racks as legs and power supply as arms

@ESYudkowsky I read this as: I'm a guy who genuinely thinks I can outthink a superintelligence; here's one example. Conceptually, this problem seems to just fly right over people's heads when they can't even imagine that there's something they can't imagine.

@ESYudkowsky Your whole false sense of urgency rests on foom. Without foom, there is time to test, react, and learn.

@ESYudkowsky "sudden drops in the loss function" what? that's not how any of this works, yo

@ESYudkowsky What's your view on Perry's main argument that AI growth is likely hardware limited?

@ESYudkowsky You're a fraud, no real scientific expertise, just good at e-networking. Very scummy. Hope that AGI targets you first, you deserve it!

@ESYudkowsky My understanding is that foom was necessary to get your near-100% probability of doom, and the unlikelihood of foom is a large part of why Paul Christiano has a p(doom) in the 10-20% range.
