Are smarter entities more coherent?

There is an assumption behind this misalignment fear: that a superintelligent AI will also be supercoherent in its behavior. An AI could be misaligned because it narrowly pursues the wrong goal (supercoherence). An AI could also be misaligned because it acts in ways that don't pursue any consistent goal (incoherence). Humans — apparently the smartest creatures on the planet — are often incoherent. We are a hot mess of inconsistent, self-undermining, irrational behavior, with objectives that change over time. Most work on AGI misalignment risk assumes that, unlike us, smart AI will not be a hot mess.

In this post, I experimentally probe the relationship between intelligence and coherence in animals, people, human organizations, and machine learning models. The results suggest that as entities become smarter, they tend to become less, rather than more, coherent. If so, superhuman pursuit of a misaligned goal is not a likely outcome of creating AGI.

That is from a new essay by Jascha Sohl-Dickstein, speculative but interesting. Via N.
