Gradual Empowerment?
The subtitle is “Systemic Existential Risks from Incremental AI Development,” and the authors are Jan Kulveit et al. Several of you have asked me for comments on this paper. Here is the abstract:
This paper examines the systemic risks posed by incremental advancements in artificial intelligence, developing the concept of ‘gradual disempowerment’, in contrast to the abrupt takeover scenarios commonly discussed in AI safety. We analyze how even incremental improvements in AI capabilities can undermine human influence over large-scale systems that society depends on, including the economy, culture, and nation-states. As AI increasingly replaces human labor and cognition in these domains, it can weaken both explicit human control mechanisms (like voting and consumer choice) and the implicit alignments with human interests that often arise from societal systems’ reliance on human participation to function. Furthermore, to the extent that these systems incentivise outcomes that do not line up with human preferences, AIs may optimize for those outcomes more aggressively. These effects may be mutually reinforcing across different domains: economic power shapes cultural narratives and political decisions, while cultural shifts alter economic and political behavior. We argue that this dynamic could lead to an effectively irreversible loss of human influence over crucial societal systems, precipitating an existential catastrophe through the permanent disempowerment of humanity. This suggests the need for both technical research and governance approaches that specifically address the risk of incremental erosion of human influence across interconnected societal systems.
This is one of the smarter arguments I have seen, but I am very far from convinced. When were humans ever in control to begin with? (Robin Hanson realized this a few years ago and is still worried about it, as I suppose he should be. There is not exactly a reliable competitive process for cultural evolution — boo hoo!)
Note the argument here is not that a few rich people will own all the AI. Rather, the claim is that humans lose power altogether. But aren’t people cloning DeepSeek for ridiculously small sums of money? Why won’t our AI future be fairly decentralized, with lots of checks and balances, and plenty of human ownership to boot?
Rather than focusing on “humans in general,” I say look at the marginal individual human being. That individual, forever as far as I can tell, has had near-zero bargaining power against a coordinating, cartelized society aligned against him, with or without AI. Yet such coordinated extraction hardly ever happens, extreme criminals being one exception. There simply isn’t enough collusion to extract much from (non-criminal) potentially vulnerable lone individuals.
I do not see in this paper a real argument that a critical mass of AIs is going to collude against humans. Already it seems that “AIs in China” and “AIs in America” are unlikely to collude much with each other. Similarly, “the evil rich people” do not collude with each other all that much either, much less across borders.
I suspect that if the paper made a serious attempt to model the likelihood of worldwide AI collusion, the results would come out in the opposite direction. So, to my eye, “checks and balances forever” is by far the more likely equilibrium.