Hi! I don’t usually send my podcast conversations out to newsletter subscribers. I figure that people who like listening to them can check their NonZero podcast feed or the NonZero YouTube channel—and that one thing the world doesn’t need more of is inbox clutter.
But I think the conversation we’re releasing today, about the “AI race” between China and the US, is a particularly important one, and I wanted to explain why. You’ll find that explanation below this video. (As usual with our podcasts, the first part of the conversation—about an hour in this case—is available to the public and the rest to paid NZN subscribers, who can access the complete audio on their special podcast feeds and the complete video by using their magical powers to penetrate the paywall at the bottom of this post.)
The conversation is with Dan Hendrycks, a big figure in AI. It’s about a paper he just co-authored (and in fact published today) with some other notables: Eric Schmidt, former CEO of Google and a big figure in tech policy and national security policy circles, and Alexandr Wang, CEO of Scale AI. The paper, called “Superintelligence Strategy,” puts forth some very big ideas about tech and national security, including the one this conversation focuses on: what the authors call “mutually assured AI malfunction,” or MAIM.
In principle, the dynamic this term describes could apply to any two adversaries that are trying to beat each other to so-called artificial general intelligence (AGI) and then to so-called superintelligence. But by far the most obvious and important application is to the US-China relationship.
AGI is considered by some to be a major threshold that leads in very short order to superintelligence, which in theory could allow one country to utterly dominate countries that are just months behind it in AI progress. So, as Dan and his co-authors point out, if one country seems to be approaching AGI, its adversary could feel so threatened that it takes extreme measures, like cybersabotage of AI infrastructure or even more aggressive moves.
The “Superintelligence Strategy” paper argues that awareness of this possibility of sabotage or other aggression could keep countries from seeking the kind of clear-cut AI dominance that might trigger such countermeasures by an adversary. This theoretically stabilizing dynamic is what they call mutually assured AI malfunction. That is of course a reference to mutually assured destruction, or MAD, the stabilizing dynamic that helps keep nuclear powers from attacking each other (since a would-be attacker knows that destruction would likely be mutual).
I thought Dan and I had a great conversation, with lots of healthy disagreement, but I realized afterwards that I had failed to make explicit a big point that was implicit in some of the pushback I gave him. Namely: I don’t find it very plausible that the dynamic he and his co-authors are calling mutually assured AI malfunction will be a stabilizing force, even if you implement some of their ideas for making it more stabilizing.
And the reason is that I think in critical respects this dynamic is not analogous to mutually assured destruction. The rules of the game are fuzzier than with mutually assured destruction, and the actions of the two sides are more ambiguous, more prone to differing interpretations.
For example: I suspect China finds current US restrictions on its microchip imports—restrictions that all three authors of this paper support—more deeply threatening than American policy makers realize. And these kinds of asymmetrical perceptions—one country not realizing how threatening its actions seem to another country—are very common in international relations (and in fact have been known to start wars). Yet mutually assured malfunction, if it is to be a truly stabilizing dynamic, would seem to require that such asymmetrical perceptions are rare and that a country like the US is consistently pretty good at understanding how its rivals view things and when they do and don’t feel threatened. And, in my view, the US is historically very bad at exercising this kind of cognitive empathy.
Indeed, as regular NZN podcast listeners know, I’ve marveled (and despaired) at the failure of the US foreign policy establishment to recognize how US chip restrictions have made war in Taiwan more likely—even though the logic is pretty simple. Namely:
The world’s most advanced AI chips are made in the TSMC factories in Taiwan, and the US chip restrictions mean that China can no longer get those chips (except through a kind of black market), whereas the US and its allies get lots of them. So what used to be a deterrent to Chinese invasion—the likelihood that war would disable factories whose most precious output China shared—is much less of a deterrent. What’s more: According to the logic of the Hendrycks paper, if the US gets closer to AGI, and seems appreciably closer to it than China is, disabling those factories could become a goal for China in and of itself (as opposed to just a not-so-regrettable byproduct of an invasion motivated mainly by other concerns).
So, in the end, thinking about the theoretically stabilizing effect of mutually assured AI malfunction only deepened my sense that this whole AI race with China is destabilizing and dangerous. And it only strengthened my belief that the US and China urgently need to open a serious dialogue about AI, agree on some rules of the road, and together try to steer the world toward some degree of cooperative guidance of a technology that has unprecedented potential benefits and unprecedented potential dangers.
But your mileage may vary. So I hope you’ll listen to my conversation with Dan and see what you think. And I hope that, if you like the conversation, you’ll click—I mean smash—the YouTube like button and/or rate and review the podcast and/or share this post with friends.
Overtime video: