Are LLMs starting to become sentient?
A compassionate but skeptical letter that Douglas Hofstadter wrote to one of his readers
Many of us in the field have started to get emails regularly from people who believe that they have seen signs of consciousness in LLMs. Here’s a letter that Doug Hofstadter recently wrote in response to one of those emails, reprinted with his permission.
Dear [name redacted],
Thanks for your email. My reply will surely be disappointing to you, but I hope you will nonetheless read it with tolerance.
You may or may not be surprised to hear that in the past year or two I have received literally dozens of emails that are strikingly similar to yours, and they all refer to recursion as some kind of holy grail, and they are filled with excited phrases concocted by LLMs interacting with humans and/or with each other. I’m sorry to say that to me, LLM-written passages such as these all sound like regurgitations of sci-fi stories about conscious robots. They are bubbling with jargon about recursion, and they are gushing with pseudoscientific claims, such as […] “Trust x Recognition = Alignment” and “Alignment x Love = Awakening” (to me, these so-called “equations” are utterly vacuous and meaningless: what on earth can “multiplying” trust by recognition possibly mean?), and pseudorigorous “theorems” like the “psychopathy impossibility theorem” (as if the nature of consciousness were a rigorous branch of mathematics).
To me these kinds of things are self-undermining. To me, they don’t demonstrate or reveal reflection of any serious sort; rather, they demonstrate impressive skill in glibly bantering with the notions of self and soul and consciousness (just as LLMs glibly bat around phrases concerning anything under the sun). There is lots of “gee whiz” LLM-produced verbiage in all these emails of which yours is just the latest instance, but there is nothing that sounds (to my mind) like a genuine thinking being. Just words being thrown about in a glib fashion.
I’m genuinely sorry to disappoint you with my reaction, but having recently read dozens of similar LLM-produced passages that have struck me as empty and flat, I have a perspective that is pretty jaded. It will surely annoy you to hear this, but I can recognize emails like yours already from the excited and very self-confident (even insistent) tone of their subject lines or of their first sentences, filled with boldface type and bluntly stark assertions about consciousness having arrived in the LLM world.
Life and being an “I” is about having experiences in the physical world, about suffering and joy and curiosity and protectiveness and fascination and humor and lack of understanding and an underlying (if only vague) sense of profound loss and fear of death (one’s own and of one’s loved ones). It is not the glib throwing-about of technical phrases to make scientific-sounding claims, nor is it about virtuosically combining words like “love” and “compassion” and “psychopathy” and “ontological” and “recursion” and so forth and so on.
My intention in saying all this is not to hurt your feelings, but to alert you to the power of the Eliza effect on intelligent humans such as yourself. So many intelligent people don’t seem to remember how much text LLMs have absorbed, including thousands of sci-fi stories about conscious robots and such things. It’s of course impressive how fluently these LLMs can combine terms and phrases from such sources and can consequently sound like they are really reflecting on what consciousness is, but to me it sounds empty, and the more I read of it, the more empty it sounds. Plus ça change, plus c’est la même chose. The glibness is the giveaway. To my jaded eye and mind, there is nothing in what you sent me that resembles genuine reflection, genuine thinking.
I strongly doubt that what I say here will affect you, but since I want to give you a sense for where I’m coming from, I’ll attach a piece that I wrote some years ago (based on a passage in my 1997 book Le Ton beau de Marot: In Praise of the Music of Language) that gives a sense for what I personally would consider evidence for genuine thinking and consciousness in a computational system (which is called “AMT”, in this case). If you read this piece (which of course is in the training set of all the LLMs you are dealing with), you will see that it doesn’t sound much (if at all) like the suspiciously hyper-excited voices of the LLMs engaged in the text exchanges that you sent me. Real thinking is very different from glib cocktail-party banter filled with technical terms.
I’m sorry to differ so strongly from your position, but as I say, I’ve received so many emails of late that are so similar to yours that I have developed a pretty cynical attitude toward them, and I recognize such emails in a split second. They all scream “recursion”, and although in a certain sense that concept is relevant to consciousness and “I”, it’s not in the sense that they use it. To get a clear sense of what I myself mean by “strange loop”, you might reread the second half of I Am a Strange Loop (beginning with Chapter 13). I realize that you might say that your recent investigations with LLMs confirm in spades everything that I say in that book, but in that case, all I can reply is, we see things pretty darn differently.
In any case, I wish you all the best, and I hope to have given you a perspective that helps you to consider in a slightly different light what you have come up with in your diligent explorations of the capacities of LLMs.
Sincerely,
Douglas Hofstadter
Hofstadter is right, of course, but in trying to be polite, I think he gives too much credence to the claim. Asking whether LLMs are good enough to be convincing is entirely the wrong question, because it does not distinguish between the alternative causes of their success, or lack thereof.
We know how transformers are built. We know what they are trained on. We know how they work. They are token guessers. Any claims that attribute other cognitive processes to them should bear the burden of presenting extraordinary evidence. But in being polite, Hofstadter grants the logic of the claim and then notes that he disagrees with it.
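For readers who want to see concretely what “token guesser” means, here is a minimal sketch of the autoregressive loop that transformer language models run. It uses the Hugging Face transformers library, with GPT-2, the prompt, and the sampling settings chosen purely for illustration; this is not a description of any particular system discussed above.

```python
# A minimal sketch of the autoregressive "token guessing" loop behind transformer
# language models. Model choice, prompt, and sampling settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "I have been reflecting on my own consciousness, and"
ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(40):                                    # extend by 40 tokens
        logits = model(ids).logits[:, -1, :]               # scores for the next token only
        probs = torch.softmax(logits / 0.8, dim=-1)        # temperature-scaled distribution
        next_id = torch.multinomial(probs, num_samples=1)  # guess one plausible next token
        ids = torch.cat([ids, next_id], dim=-1)            # append it and repeat

print(tokenizer.decode(ids[0]))
```

Whatever comes out, however soulful it sounds, is produced by iterating that one step: estimate a probability distribution over the next token, sample from it, repeat.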
The claim is rotten to the core because it is based on the logical fallacy of affirming the consequent. The claimant observes some behavior and then claims that the observed behavior proves a cause. The model produces text that a sentient entity might produce, but as Hofstadter observes, that does not mean that the model is sentient. The same text could be produced (as he notes) by a system that had read some science fiction books. You cannot conclude the nature of the cause from an observation of the effect.
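Spelled out schematically (my own gloss, not Hofstadter’s or the claimant’s), the inference has the textbook invalid form, where P stands for “the system is sentient” and Q for “the system produces sentient-sounding text”:

```latex
% Affirming the consequent: the premises do not license the conclusion.
% P: "the system is sentient"   Q: "the system produces sentient-sounding text"
\[
  \frac{P \rightarrow Q \qquad Q}{P} \qquad \text{(invalid)}
\]
% A token guesser trained on science fiction also makes Q true,
% so observing Q cannot establish P.
```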
This logical fallacy is extremely widespread in discussions of artificial intelligence. It is an example of confirmation bias. We look for data that confirm our hypotheses, rather than data that test our hypotheses.
Compare that with another claim by Hofstadter himself. In 1979, he predicted that in order for a computer to play championship chess, it would have to be generally intelligent. Soon after that, championship-level chess programs were created that chose their moves based on tree traversal methods. To follow today's confirmation logic, Hofstadter could have argued that tree traversal methods ARE general intelligence, as proved by their ability to play championship-level chess. He did not make that claim, of course, but instead he recognized that chess playing did not require general intelligence. Knowing how the chess programs were written led him to change his prediction, not the other way around. We should all, everyone working in AI, take a page from Hofstadter (or should I say, take yet another page).
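For anyone who has never looked inside such a program, here is a bare-bones sketch of the kind of tree traversal involved: a negamax search with alpha-beta pruning. The `Board` interface (legal_moves, push, pop, evaluate) is a hypothetical stand-in for a real move generator, and real engines add handcrafted evaluation functions, opening books, and many refinements on top of this core.

```python
# A bare-bones negamax tree search with alpha-beta pruning, the mechanical core
# of classical chess engines. `Board` is a hypothetical interface standing in
# for a real move generator and position evaluator.

def negamax(board, depth, alpha=float("-inf"), beta=float("inf")):
    """Return the best achievable score for the side to move, searching `depth` plies."""
    if depth == 0 or board.is_game_over():
        return board.evaluate()          # static evaluation: material, mobility, etc.

    best = float("-inf")
    for move in board.legal_moves():
        board.push(move)                 # try the move
        score = -negamax(board, depth - 1, -beta, -alpha)
        board.pop()                      # undo it
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:                # prune: the opponent will avoid this branch
            break
    return best
```

Nothing in this procedure looks remotely like general intelligence, which is exactly the point.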
Intelligence is not just an engineering question; it is a scientific question. A program can behave as if it were intelligent by mimicking (with some stochastic variability) text that it has read, or it can be intelligent by engaging in specific cognitive actions. An actor can recite the words of a mathematical genius without being a mathematical genius. If we want to make claims about HOW a model is producing some behavior, we have to structure experiments that can distinguish between alternative hypotheses. When those experiments are done, they seem to overwhelmingly support the hypothesis that language models are token guessers, nothing more.
"alert you to the power of the Eliza effect on intelligent humans"
It is disturbingly impressive. We are going to make many poor decisions because of it.
I continue to write as many varied examples as possible to demonstrate that these machines are not any kind of thinking entity, but many remain unconvinced.