Eight Things to Know about LLMs

A good overview from computer scientist Samuel R. Bowman of NYU, currently at Anthropic:

1. LLMs predictably get more capable with increasing investment, even without targeted innovation.
2. Many important LLM behaviors emerge unpredictably as a byproduct of increasing investment.
3. LLMs often appear to learn and use representations of the outside world.
4. There are no reliable techniques for steering the behavior of LLMs.
5. Experts are not yet able to interpret the inner workings of LLMs.
6. Human performance on a task isn’t an upper bound on LLM performance.
7. LLMs need not express the values of their creators nor the values encoded in web text.
8. Brief interactions with LLMs are often misleading.

Bowman doesn’t put it this way, but there are two ways of framing AI risk. The first envisions an alien superintelligence that annihilates the world. The second is that humans will deploy AIs before their capabilities, weaknesses, and failure modes are well understood. Framed the second way, problems seem inevitable: AI capability is increasing faster than our AI understanding, so AIs will be widely used long before they are widely understood. You don’t have to believe in “foom” to worry that capability and control are rapidly diverging. More generally, AIs are a tail-risk technology, and historically we have not been good at managing tail risks.
