81 Comments

As someone who works with LLMs every day trying to get them to work consistently at scale (that is, applying the same prompt template thousands of times), I can confidently say that LLMs don't understand anything. Something that understood the directions would not give a completely different answer when the prompt is amended to say "Format your answer using JSON". Why would or should that change anything? There is even a paper where they simply changed the *format* of the prompt and got different results: "Quantifying Language Models’ Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting". That's not understanding.
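
To make that concrete, here is a minimal sketch of the kind of template test described above. call_llm is a hypothetical stand-in for whatever chat-completion API is in use; the point is only that the requested output format should not change the substance of the answer.

```python
# Minimal sketch: the same prompt template applied many times, with and
# without a formatting instruction appended.

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion API call."""
    raise NotImplementedError

TEMPLATE = "Is the following review positive or negative? Review: {review}"
FORMAT_SUFFIX = "\nFormat your answer using JSON."

def count_flipped_answers(reviews: list[str]) -> int:
    """Count how often adding the formatting instruction changes the label."""
    flips = 0
    for review in reviews:
        base = call_llm(TEMPLATE.format(review=review))
        formatted = call_llm(TEMPLATE.format(review=review) + FORMAT_SUFFIX)
        # A system that understood the task would change only the packaging,
        # not the verdict, when asked for JSON.
        if ("positive" in base.lower()) != ("positive" in formatted.lower()):
            flips += 1
    return flips
```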

There are certainly some semantics that come along for the ride when you have trained on massive amounts of text, but that doesn't mean the system itself has semantics. We know this from Eliza.

They are...transformers. We are doing some very clever things with LLMs, but we have to be clever precisely because of the LACK of understanding.


Exactly. I have said that so many times I’ve lost count, and stopped trying to make the point: LLMs have ZERO understanding. It struck me suddenly and became crystal clear when I used ChatGPT to help me edit a philosophy article. It’s really a kind of joke now to think these machines would have real understanding.


Truth


AI terminology is at best misleading. We assume these machines must be behaving like humans because... Well, we've always assumed these machines must be behaving like humans. And so we allow ourselves to use terms like "hallucination". Dijkstra pointed out long ago the problems with anthropomorphisms in the software industry. But anthropomorphisms in AI are pathological.


I think we've recently crossed the line between anthropomorphism as an accidental function of our language and anthropomorphism as a deliberate technique of pro-AI propaganda. I see more and more content attacking people who refuse to partake in this pathology.


The solution to resolving the initially-mentioned hallucination issue is obvious: you must purchase a hen and name her Henrietta.

(Speaking from experience, they make excellent pets.)


🤦‍♂️Hinton’s entire argument is a prima facie demonstration that if you are a deep learning engineer or modeler who is *not* also cross-trained in any/all of neuroscience, cognition, or philosophy, then you end up saying really stupid things based on the AGI fallacy.

I don’t expect Hinton to know better, as he has no idea (as in “zero” ability to know) what he is talking about with respect to comparisons between automata and living creatures, especially humans. But I *do not* accept Hinton feeling he can attack a fellow academic, author, and mainstream voice of reason such as Gary.

Me? Sure - I called him a simpleton. I meant it, and he is free to call me anything he wants to (not that he is likely aware of my work). But Gary? Bullshit.


The problem extends well beyond Hinton, though this particular bit of behavior is exceptionally egregious.

The fact that someone is expert in deep learning does not imply that they are also expert in human perception and/or cognition, and so forth. The analogy I came up with to make this deficiency vivid is that of the captain of a whaling ship who is an expert in the ship's construction and operation, but knows little about long distance navigation in the open sea and little about whales. Two months ago I published an article based on the analogy: https://3quarksdaily.com/3quarksdaily/2023/12/aye-aye-capn-investing-in-ai-is-like-buying-shares-in-a-whaling-voyage-captained-by-a-man-who-knows-all-about-ships-and-little-about-whales.html


People. Want. To. Believe.

That is why, IMO, https://www.mattball.org/2022/09/robots-wont-be-conscious.html


I stopped listening to Hinton quite a while ago, but I do wonder: how can someone so intelligent be so wrong?


Powerful people get isolated from reality over and over again…


You are right he is, well, fetishizing LLMs. He is an LLM idolater. This is a common human behavior. I want to think X loves me, so I see that X does love me in everything X does. Each of those things X does symbolizes the love I want to believe exists. We also readily confuse symbols with what they symbolize. "I love money" but money is only the symbol for what money buys. That too is a fetish. Humans are innately fetishistic. We fall in love with icons, but icons are only the symbols of the divine. That is idolatry; so yes, we are readily predisposed to become LLM idolaters.


There's so much BS in the AI world these days I've largely tuned out, or at least I'm not hanging on every new paper like some people seem to. Five years ago, the 50,000 ft view was "NN, NN, NN, NN, NN"; now it's "LLM, LLM, LLM, LLM, LLM". It's hard to see any real imagination or innovation.


I'm always frustrated that some people can have such an uncharitable assumption of others' knowledge. Thinking an A.I. researcher doesn't understand that neural nets have weights is about as uncharitable as it gets.


I find it bizarre that Hinton would not understand that LLMs can "memorize" training data. I would think overfitting to at least some sequences of training data is expected, or even desirable. Obviously there's no particular place in the neural net that contains verbatim sequences of text, but the number of parameters in LLMs is on the order of the size of their training data, and IIUC they're trained on the same examples many times. The token encoding is literally taken from a compression algorithm (byte pair encoding) designed to efficiently encode text, and LLMs are specifically designed and trained on the task of predicting the next token in their training data. If they performed perfectly according to their loss function, they would have memorized all the text in the training data pretty much by definition. And he can't have it both ways: if an LLM *didn't* exactly memorize some sequences of text (e.g., the Lord's Prayer in the King James Version, or Reagan's "tear down this wall" speech), it would lack understanding of English literature or history.
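
A toy illustration of that last point (not how a transformer actually stores text; the lookup-table "model" below is purely hypothetical): a next-token predictor that achieves zero loss on its training text necessarily reproduces that text verbatim when prompted with its opening words.

```python
# Toy next-token "model": a lookup table built from one training sentence.
# Zero training loss here just means the table is exact, i.e. memorization.

training_text = "our father which art in heaven hallowed be thy name".split()

# "Training": record the single next token observed after each prefix.
next_token = {
    tuple(training_text[: i + 1]): training_text[i + 1]
    for i in range(len(training_text) - 1)
}

def generate(prompt: list[str], max_steps: int = 50) -> list[str]:
    """Greedy generation from the memorized table."""
    out = list(prompt)
    for _ in range(max_steps):
        nxt = next_token.get(tuple(out))
        if nxt is None:
            break
        out.append(nxt)
    return out

print(" ".join(generate(["our", "father"])))
# -> the training sentence, reproduced word for word
```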


A little chicken whispered in my ear that the Hinton ego machinery is telling him he’s responsible for a coming super-intelligent AI, inspiring fear and guilt, so he is on the attack in many ways. Unfortunately, he’s hallucinating both about AI and about his own importance in the grand scheme of things, and sadly, he outputs a shallow understanding from his brain’s neural net regarding true understanding (that's a joke, folks).

I think he should accept how little we know about what intelligence is, and join forces with Gary and lend a hand to a more productive path forward. That would be a better way to make use of his time in retirement, in my humble opinion.


What a great piece! I guess some may argue that hallucinations are signs that the models are capturing some patterns, or “generalising”. But cats can also learn patterns, and we wouldn't exactly call that understanding. Can hallucinations be thought of as overfitting?


LLM "hallucinations" aren't what human beings do, at all. When people hallucinate, there's still a referent; the hallucinations are about something, right? How do people have hallucinations about nothing at all? "Describe what you saw and heard when you were tripping..." "Oh, nothing."

???

LLMs don't deal with referents. They're "about" nothing specific; they match corresponding signal patterns spread across their whole "neural" nets. The whole reason those things "hallucinate"/confabulate is that they don't deal with anything specific.


Hinton has a very shallow understanding of AI and neural networks (despite being touted as one of the godfathers of deep learning). I don't know who takes him seriously.


"Understand" and "know" mean something when applied to human conciousness. Isn't that the point of Searle's Chinese Room thought experiment? A digital computer does not "know" or "understand" anything at all.


Yes, Searle's argument is a good one. Meaning and understanding don't come from mechanically putting parts together; they come as a whole, in an instant. Explain that! That's hard for an engineering or science-as-materialism mind to accept. We may have lots of data on how minds behave (behaviour through time), but no clue whatsoever what consciousness is: that which is aware of all the mind's happenings and has the attributes of intelligence and an understanding of truth and meaning.

Comment removed

If someone (say, a magpie) were able to understand the difference between 5, 6, and 7, but not able to do the same with 105, 106, and 107, we would not say it understands 'numbers'. It does not have what Dehaene has called 'The Number Sense' (which is where the magpie example comes from).

Once you *understand* numbers, you understand them all, regardless of size. You can make errors when doing arithmetic, but not because you do not understand. But LLMs do not understand numbers at all. For an LLM, numbers are token sequences, not numbers. Even when it comes up with the right answer to a sum, it doesn't understand, because it arrives at that correct answer for the wrong reason. And that is why the chatbot people build 'dumb' filters that detect potential arithmetic in a prompt, send it to an LLM to create Python code, execute that Python code, silently add the result to your prompt, and then let an LLM generate fluent text with the correct answer. They engineer these tricks/solutions *because LLMs are unable to understand*.
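
A minimal sketch of that routing trick, assuming a hypothetical call_llm helper and a deliberately crude regex detector; real deployments use more elaborate tool-calling, but the shape is the same.

```python
import re

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion API call."""
    raise NotImplementedError

LOOKS_LIKE_ARITHMETIC = re.compile(r"\d+(\s*[-+*/]\s*\d+)+")

def answer(user_prompt: str) -> str:
    if LOOKS_LIKE_ARITHMETIC.search(user_prompt):
        # 1. Ask the LLM to turn the question into Python...
        code = call_llm(
            "Write a single Python expression that computes the answer to: "
            + user_prompt
            + "\nReply with the expression only."
        )
        # 2. ...execute that code outside the model (sketch only: never
        #    eval untrusted model output in production)...
        result = eval(code)
        # 3. ...and silently splice the numeric result back into the prompt.
        user_prompt += f"\n(The computed result is {result}.)"
    # 4. Let the LLM phrase a fluent answer around a number it did not compute.
    return call_llm(user_prompt)
```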

Comment removed

No, it is not a dumb trick; it is actually inspired engineering (even if the detection/filtering is 'dumb').

I do not agree with you about the people. I agree that the only way we learn (especially 'get skilled') is by doing. The point is the 5, 6, 7 vs. 105, 106, 107 example.

Comment removed

What do you mean by “run an experiment”? Can you give some examples?

Comment removed

We could certainly build a robot that adapts and develops better adaptations over time, but I don’t see what bearing that has on Searle’s argument. He is pointing more to what we don’t know, such as what meaning is and where it resides. It's obviously not held in behavior or in objects.

Comment removed

Bingo!!!!


What is it with the pathological need to use biological and psychological metaphors in Computer Science? It rarely has anything to do with what people mean by these terms. Understanding is reduced to something so shallow that apparently I can now say my calculator understands arithmetic. Just cleaning up the language to use more precise technical terms would go a long way toward making these sorts of discussions much more productive.
