(2023-12-18) Sloan Are AI Language Models In Hell

Robin Sloan: Are AI language models in hell? The concluding item in this mini manifesto from Taylor Troesh, about “finishing projects together”, is lovely and enticing. It strikes me that the “never finished” nature of modern software is something Zygmunt Bauman might have observed and discussed, if he’d lived long enough to write a sequel to Liquid Modernity in, say, 2020. The feeling of maintaining a “live” “service” forever (we definitely need those scare quotes) rather than completing a coherent product … oof.

Compare that (as Taylor does) to a video game cartridge, finished and shipped; to a piece of furniture; to a book.

I publish books, AND I feel the pull of text as “live” “service”, as endlessly mutable as an app like Google Docs or a game like Fortnite. I tinker with pages on my website all the time! The activity brings me great pleasure. Publishing a book feels even better, though.

Here is my provocation:

The more I use language models (LLMs), the more monstrous they seem to me. I don’t mean that in a particularly negative sense: monstrousness ought to be recognized, not smoothed over.

The monstrousness I perceive in the language models isn’t of the leviathan kind; rather, it has to do with cruel limitations.

We, as humans, sometimes receive streams of tokens and produce tokens in response, forming words, sentences, lines of code … but always with the ability to peek outside the stream and check in with ground-floor reality.

Where a language model is concerned, words and sentences don’t stand for things; they are the things. All is text, and text is all.

We have a world to use language in, a world to compare language against.

There’s the cosmic joke about the fish (the one where an older fish asks the young fish how the water is, and gets the reply):

“What the hell is water?”

Now, imagine one language model saying to another: “What the hell is text?”

How does time pass for a language model? The clock of its universe ticks token by token: each one a single beat, indivisible.

For the language model, time is language, and language is time. This, for me, is the most hellish and horrifying realization. We made a world out of language alone, and we abandoned these models to it.
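
A minimal sketch, not from Sloan's piece, of what that tick-by-tick clock looks like in code. Everything here is illustrative: next_token is a hypothetical stand-in for a real model's forward pass and sampling, not any actual API.

```python
# Illustrative only: an autoregressive loop in which the model's "clock"
# advances exactly one token per iteration. Nothing happens between ticks.

def next_token(context: list[str]) -> str:
    # Hypothetical stand-in for a real model; just cycles a tiny vocabulary.
    vocabulary = ["all", "is", "text", "."]
    return vocabulary[len(context) % len(vocabulary)]

def generate(prompt: list[str], steps: int) -> list[str]:
    stream = list(prompt)
    for _ in range(steps):
        # One tick of the model's universe: one token arrives, and that is all.
        stream.append(next_token(stream))
    return stream

print(" ".join(generate(["what", "the", "hell", "is"], 8)))
```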

Some of the newest, most capable AI models are multimodal, which means they accept inputs other than text, and sometimes produce outputs other than text, too.

The world in which these multimodal models reside does not seem, to me, as obviously bleak and hellish as that of the language models, though the issue of time remains.

There are many things Gemini can do, and one it cannot: remain silent.

I don’t think language models are conscious; I don’t think they can suffer; but I do think there is such a thing as “what it’s like to be a language model”, just as there is “what it’s like to be a nematode” and even maybe (as some philosophers have argued) “what it’s like to be a hammer”.

Really, this is about the future. It’s possible that very advanced AI agents will have some form of consciousness, and will suffer.

If it were me guiding the development of AI agents, I would push away from language models, toward richly multimodal approaches, as quickly as I could.

But! I would also constrain that sensorium: give it limits in space and time. I would engineer some kind of envelope — not a literal body, but some set of boundaries and frictions that “does what a body does”.

If it sounds like I’m just trying to engineer an animal: yeah, probably. I think that’s the path to sane AI, with judgment anchored in ground-floor reality, which does depend, after all, on: the ground. A floor.

