27 Comments
Feb 3, 2023 · Liked by Gary Marcus

The core issue here is that human cognition is indeed compositional and systematic. We form and understand a sentence like "Sue eats pizza" by combining its words in an agent-action-theme sentence structure. This ability is systematic, because with the same words we can also form and understand "pizza eats Sue". E.g., we know that this sentence is absurd precisely because we identify "pizza" as the agent.

A cognitive architecture can achieve this only if it has 'logistics of access'.

Newell analyzed this in detail in his Unified Theories of Cognition (1990, e.g., p. 74-77). In short, his analysis is:

1. Local storage of information is always limited. So, with more information to deal with, the system needs 'distal access' to that information.

2. Then it needs to 'retrieve' that information to affect processing.

For example, we can form arbitrary sentences with our lexicon of around 60,000 words or more (e.g., "pizza eats Sue"). Trying to do this by chaining words from other (learned) sentences will not work, if only because the number of sentences that can be formed is simply too large for that.
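As a back-of-the-envelope sketch of that combinatorial point (the five-word sentence length is purely an illustrative assumption; the 60,000-word lexicon is the figure above):

```python
# Why chaining memorized sentences cannot cover the space of possible sentences.
# Assumes a 60,000-word lexicon and, purely for illustration, five-word sentences.
lexicon_size = 60_000
sentence_length = 5

possible_strings = lexicon_size ** sentence_length
print(f"{possible_strings:.2e}")        # ~7.78e+23 five-word strings

# Even if only one string in a billion were a grammatical sentence, that still
# leaves ~7.8e+14 sentences -- far more than any speaker could have memorized.
print(f"{possible_strings / 1e9:.2e}")
```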

Instead, this requires an architecture that provides distal access to arbitrary words in the lexicon, and can combine them in arbitrary sentence structures.

The architecture that Newell analyzed uses symbols to achieve distal access and to retrieve that information for processing (as in the digital computer). It is interesting to note that his use of symbols and symbol manipulation thus derives from the more fundamental requirement of logistics of access.

This opens up a new possibility: to achieve logistics of access without using symbols. For example, with an architecture that achieves distal access but does not rely on retrieval.

An architecture of this kind is a small-world network structure. An example is the road network we use for traveling. It is productive and compositional because it makes it possible to travel (basically) from any house to any other, not through direct connections between them, but via dense local roads and sparse hubs. Also, a new house gains access to any other simply by being connected to its nearest local road.
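A minimal sketch of that road-network picture (the networkx library and the Watts-Strogatz generator are my own illustrative choices, not part of the proposal above):

```python
# Small-world structure: dense local links plus a few long-range shortcuts give
# short paths between arbitrary nodes, with no central lookup or retrieval step.
import networkx as nx

n_nodes = 60_000        # one node per "house" (or word)
local_links = 10        # each node connects to its nearest neighbours
rewire_prob = 0.05      # a small fraction of links become long-range shortcuts

G = nx.connected_watts_strogatz_graph(n_nodes, local_links, rewire_prob, seed=42)

# Any two nodes are reachable via a short chain of hops.
path = nx.shortest_path(G, source=0, target=30_000)
print(len(path) - 1)    # typically only a handful of hops

# "Connecting a new house to its nearest local road" is a single local edge.
G.add_edge("new_node", 17)
print(nx.has_path(G, "new_node", 30_000))   # True
```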

Neural networks can achieve this for language as well. An interesting consequence is that 'words' are themselves network structures that remain where they are; a sentence is just a connection path that (temporarily) interconnects these word structures. Words are therefore always content-addressable, which is not the case with symbols or vector operations.

(For a more detailed analysis, see e.g., arxiv.org/abs/2210.10543)


That paper describing 'linguistic inputs' in children as if that's actually how we make sense of the world is such a great illustration of the head-banging problem at the heart of this. How difficult is it to understand? We make sense of the world and navigate it with our bodies. The Stochastic Parrots term is great (LLMs will only ever be able to output a sort of empty meta-language), but it still suggests an organic being that uses that meta-language to communicate. Even if the 'words' a parrot says are mimicry, the fact that it is using its vocal cords, tongue, and beak to make sounds that attract other animals (us) likely to give it food and attention is not meaningless.

But an LLM is always virtual, never 'needing' anything, never caring about anything, never feeling the physical agitations of the nervous system that signal anything about the environment. So what's the point? We got to the stage where chatbots can run mundane tasks that are, at best, boring for humans and, at worst, draw abuse from angry customers. That's useful, if limited. But trying to 'solve' the problem of meaning? Surely that's a category error; understanding the world is not what a chatbot does or is even supposed to do. Nor is it a problem to be solved, or one that a machine can ever solve unless we literally learn how to give them a human body. And automating creativity? What is WRONG with these people? Automation was supposed to free us up to do the things we love. If creativity isn't exactly that, then what's left? It's all just so weird to me.

Feb 3, 2023·edited Feb 3, 2023

Agree. I don't understand how people think they can get machines to think and understand the world from pure linguistic input. The physical input is missing here: sight, sound, taste, smell, touch. That's how humans and animals come to understand the world around them; it's how they BEGIN to understand. You wouldn't say that a 6-month-old child with no sophisticated linguistic capability has no understanding of the world around them. Babies put keys and other objects in their mouths for a reason: they're "tasting" the physical world before they gain the ability to encode it. The encoding of this understanding of the world comes later, in the form of a language. And the idea that it's a fully reversible process, that you can extract an accurate model of the world purely from its lossy encoding in language, seems nonsense to me. That's why we need a thousand words to describe a single picture, and even then it's a poor substitute. The understanding horse pulls the linguistic cart, not the reverse.


Precisely.

Comment deleted

Right, and this is why language can change over time while the things it represents remain, generally, the same shared phenomena, or the words can stay the same while their meanings change. It's a necessarily fluid system. The fundamentals of semiotics: the signifiers and the signifieds... The fact that an LLM will always be behind on this curve is concerning when you consider the necessity for language to echo social change, and even help accelerate it. LLMs quickly 'becoming' racist is a good example of this: not only is it structurally regressive, it naturally reproduces the internet's bias towards extremism.

Feb 11, 2023·edited Feb 11, 2023

That's one aspect. Another is breaking the rules of language for effect: Lewis Carroll's "Jabberwocky", spoonerisms, malapropisms. How well can an LLM handle these things? Language is not just about connected words; it is also about how we choose to use it for expression rather than for strict communication, and for that there are no rules. I tried to get ChatGPT to "spoonerise" a small piece of text. It was technically correct but failed, because it lacked the humorous element. It just doesn't get it.
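To make that concrete, the purely mechanical part of spoonerising is a few lines of code (the onset-swapping rule below is my own toy assumption); what no such rule supplies is the judgment about which swaps land as funny:

```python
# A mechanical spoonerizer: swap the initial consonant clusters of a word pair.
# "Technically correct" in the sense above, but with no notion of humour.
import re

def onset_and_rest(word: str) -> tuple[str, str]:
    """Split a word into its leading consonant cluster and the remainder."""
    match = re.match(r"^([^aeiouAEIOU]+)(.*)$", word)
    return (match.group(1), match.group(2)) if match else ("", word)

def spoonerize_pair(w1: str, w2: str) -> tuple[str, str]:
    o1, r1 = onset_and_rest(w1)
    o2, r2 = onset_and_rest(w2)
    return o2 + r1, o1 + r2

print(spoonerize_pair("dear", "queen"))       # ('qear', 'dueen')
print(spoonerize_pair("loving", "shepherd"))  # ('shoving', 'lepherd')
```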


You simply can't boil down the human experience of the world, in all of its senses and essence, to data and nets and algorithms. There will always be something missing. Every person alive today carries with them an entire world of lived experience, memory, and meaning—this is why we have so many emotional or felt expressions and reactions to the words we read, the things we see, the conversations we have, our own inner thoughts, etc. This is why some people click instantly, while others can't stand each other for no apparent reason. This is (probably) one of the reasons why we still do not fully understand the workings of the brain—or the mind.

It's almost like we're trying to play god. Not suggesting we drop all this AI stuff and go back to the campfire—just that there's a difference between trying to replicate an organic, biochemical and probably quantum entity, the human mind, and creating tools and systems to address our real-world problems. Which a lot of AI is already doing of course.


Back in the early 1980s David Hays and I decided that the entire AI enterprise was intellectually bankrupt. So we decided to look at several technical literatures – cognitive psychology, linguistics and psycholinguistics, neuroscience, developmental psychology, comparative psychology – and see what we could come up with. The resulting paper: Principles and Development of Natural Intelligence. Nothing on the current scene comes close. Here's the abstract:

The phenomena of natural intelligence can be grouped into five classes, and a specific principle of information processing, implemented in neural tissue, produces each class of phenomena. (1) The modal principle subserves feeling and is implemented in the reticular formation. (2) The diagonalization principle subserves coherence and is the basic principle, implemented in neocortex. (3) Action is subserved by the decision principle, which involves interlinked positive and negative feedback loops, and resides in modally differentiated cortex. (4) The problem of finitization resolves into a figural principle, implemented in secondary cortical areas; figurality resolves the conflict between propositional and Gestalt accounts of mental representations. (5) Finally, the phenomena of analysis reflect the action of the indexing principle, which is implemented through the neural mechanisms of language. These principles have an intrinsic ordering (as given above) such that implementation of each principle presupposes the prior implementation of its predecessor. This ordering is preserved in phylogeny: (1) mode, vertebrates; (2) diagonalization, reptiles; (3) decision, mammals; (4) figural, primates; (5) indexing, Homo sapiens sapiens. The same ordering appears in human ontogeny and corresponds to Piaget's stages of intellectual development, and to stages of language acquisition.

You can download it here: https://www.academia.edu/235116/Principles_and_Development_of_Natural_Intelligence

While you're at it, take a look at a paper that mathematician Miriam Yevick published in 1975: Holographic or Fourier logic, Pattern Recognition, Volume 7, Issue 4, December 1975, Pages 197-213, https://doi.org/10.1016/0031-3203(75)90005-9. As far as I can tell, that paper has dropped off the face of the earth, which is a sign of the intellectual myopia that characterizes the academy. The paper is, in effect, a mathematical argument for why both symbolic and distributed neural networks are necessary to make sense of the world. Here's the abstract:

A tentative model of a system whose objects are patterns on transparencies and whose primitive operations are those of holography is presented. A formalism is developed in which a variety of operations is expressed in terms of two primitives: recording the hologram and filtering. Some elements of a holographic algebra of sets are given. Some distinctive concepts of a holographic logic are examined, such as holographic identity, equality, containment and "association". It is argued that a logic in which objects are defined by their "associations" is more akin to visual apprehension than description in terms of sequential strings of symbols.

Here's a short commentary on that paper, Miriam Yevick on why both symbols and networks are necessary for artificial minds, https://new-savanna.blogspot.com/2022/06/miriam-yevick-on-why-both-symbols-and.html


You have said that LLMs fail at abstraction. On the contrary, they are astoundingly good at abstraction. They perform generalization over examples, they treat slots and fillers correctly, and they exhibit convincing knowledge of a great wealth of everyday concepts and relations. They very effectively apply context derived from combinations of the prompt and prior discourse. Certainly, this is all within the confines of linguistic competence, but that is nothing to sneeze at! As the Mahowald et al. preprint points out, they fall short on cognition grounded in meaning connected to commonsense knowledge; formal reasoning; running situational awareness; theory of mind; agency; and goals.

The nonetheless remarkable abilities that LLMs do have require some means to mix and match elements representing objects, events, attributes, and relations. (Albeit, oftentimes illogically and sometimes incoherently.) How do they do it, in view of your citations?

The question for Connectionist models has always been: how do you get combinatorial mixing in a statically wired network? The answer, it seems, is through gating. The LSTM or GRU gates of RNNs and, more recently, transformer attention heads are the secret sauce the NN folks never had before. Oh, and don't overlook the magical representational power of vector embeddings.
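For anyone who wants the mechanics spelled out, here is a minimal NumPy sketch of both kinds of gating (toy dimensions and random weights of my own choosing; a sketch of the idea, not any particular model's implementation):

```python
# Minimal NumPy sketches of the two "gating" mechanisms mentioned above:
# a GRU-style update gate and one scaled dot-product attention head.
import numpy as np

rng = np.random.default_rng(0)
d = 8                                   # toy hidden / model dimension

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# --- GRU-style update gate: blend old state and a candidate state ----------
W_z, U_z = rng.normal(size=(d, d)), rng.normal(size=(d, d))
def gated_update(h_prev, x, h_candidate):
    z = sigmoid(x @ W_z + h_prev @ U_z)         # how much to overwrite
    return (1 - z) * h_prev + z * h_candidate   # input-dependent mixing

# --- One attention head: mix value vectors with input-dependent weights ----
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
def attention_head(X):                          # X: (seq_len, d)
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    weights = softmax(Q @ K.T / np.sqrt(d))     # (seq_len, seq_len)
    return weights @ V                          # each output mixes all values

X = rng.normal(size=(5, d))                     # a 5-token toy "sentence"
print(attention_head(X).shape)                  # (5, 8)
print(gated_update(X[0], X[1], X[2]).shape)     # (8,)
```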

What LLMs are not good at is the rest of executive function, because their cognitive architectures are primitive. In current form, they lack structured knowledge representations and access; medium-term memory storage and retrieval; sequential step-wise processing; a context stack; and procedural search with backtracking. Some of these functions seem to have been kind of kludged out of the straight transformer architecture, but as you repeatedly point out they are at this moment very crude. These architectural deficiencies are glaringly obvious, and it would be extremely risky to believe they are being overlooked by the research community and will remain unaddressed for long.

This is to say: while I really appreciate your outspoken, justified criticisms of LLMs, history is moving very fast now. Please don't get on the wrong side of it by thinking we're still back in 1990.


Abstraction is conceptualising physical phenomena. This is precisely what LLMs cannot do! The data IS the abstraction, and we humans have already done that work. Recognising patterns within said data is only ever going to output content that's contextualised by relationships between data that we humans have already agreed on. The only yardstick an LLM has for whether its output is likely to be deemed correct is not whether it thinks the output makes sense (it has NO idea whether it makes sense or not), but whether our training of it told it that it appeared to make sense. Again: WE do the abstraction at the point of input, and WE determine whether the abstraction appears correct at the point of output.

LLMs exhibit zero knowledge of anything; whether they do it convincingly is just a matter of how much that fact is obscured.


Not quite. Abstraction is the ability to generalize over data that behaves in constrained ways in order to formulate appropriate responses. Whether the domain is physical or social or mathematical is irrelevant. Once sensors have done their jobs, at the level of mind, it's all information. Sometimes, an effective way to perform abstract generalization is through the construction of rules and structured knowledge that align with the generative processes of the domain. Sometimes, very complex and mysterious "statistical" mechanisms effectively reconstruct semblances of such rules. What is so surprising about LLMs is that they are able to construct amazingly powerful representations that perform abstraction about the real world from purely textual data. Do they use the representations correctly all the time? No. That is the rub about their not really understanding, and making egregious mistakes. However, the abstract representations that LLMs do construct for linguistic syntax and semantics are their own, built through induction, without deliberate instruction. And they closely mirror not only humans' linguistic knowledge, but pragmatics. This is the surprising development that AI researchers are all taking note of.

"Making sense" means different things for different domains. In physical domains, it means aligning with the behavior of the physical world. LLMs are weak on this because, to our embodied selves, much of this knowledge is either obvious or ill-suited to pure linguistic expression, so there is not a lot of useful text to be found. That's why transformers and Reinforcement Learning and other ML approaches are being applied to visual and robotic sensor data. Some of these architectures mimic LLM learning by training to predict masked or future data.

Most of what people write and converse about is socially constructed domains. Things like, how much a cup of coffee costs in Nebraska. In these areas of discourse, the expressivity of LLMs is very high, including correct use of types and tokens, quantifiers, and analogies. In the fields of computational intelligence, these are all forms of abstraction. On the other hand, LLMs' abilities to reason using abstract concepts are shallow for the reasons I outlined.

Ultimately, "making sense" means delivering appropriate outputs. Certainly, people can be more or less gullible about accepting superficially sensible responses from these things. It is important to amass examples of the limits of LLM's understanding of the world that humans inhabit both textually and otherwise, and to systematically map them. But the fact that the limits are non-trivial must not be under-appreciated.

Feb 5, 2023·edited Feb 5, 2023

"However, the abstract representations that LLMs do construct for linguistic syntax and semantics are their own, built through induction, without deliberate instruction. And they closely mirror not only humans' linguistic knowledge, but pragmatics."

Yes, yes, yes! I have been systematically studying ChatGPT since it was put online. I've got example after example of ChatGPT providing explicit definitions of concepts, definitions consistent with what you'd find in reference books. For all I know, it cribbed them from reference books. But it can also work with those definitions in interaction sequences that could not have been cribbed. I've put some of this work online, https://new-savanna.blogspot.com/search/label/ChatGPT%20MTL

"Most of what people write and converse about is socially constructed domains."

Yes. And concepts like truth, justice, nation, charity and a whole lot more, they aren't defined by physical characteristics. They are defined in terms of actions and behavior that can be expressed in terms of narrative, story, and explanation.


Very much appreciate this elegantly laid-out correction. Thank you.


Abstraction is a lot of things. Charity and justice are abstract ideas. ChatGPT can define them and work with them. See, Abstract concepts and metalingual definition: Does ChatGPT understand justice and charity? https://new-savanna.blogspot.com/2022/12/abstract-concepts-and-metalingual.html


I'm not referring to abstract concepts here so much as to language being a system of abstraction. So in terms of language, abstraction is literally everything. Words are abstract representations of concepts. It doesn't matter what subject we're dealing with, whether concrete or abstract concepts: none of it is *understood* by LLMs, because LLMs have no way to understand or reason. ChatGPT can only parse examples of text that deal with these concepts and rearrange them to output text that follows contextually from the input text. That is not 'understanding'; it is organising data with a view to maintaining an illusion of understanding.

Feb 3, 2023·edited Feb 3, 2023

When you can tell me, in detail, what ChatGPT is doing, I'll accept your answer. Until then, as far as I'm concerned, you're just espousing a party line. It's a line I happen to agree with, more or less by default. But it's still a party line.

No one knows what ChatGPT is doing. It's not a stochastic parrot except perhaps in a technical sense, which needs to be spelled out carefully. Otherwise the phrase is misleading. Gary's alternative idea, pastiche, is not much better. Do I know what ChatGPT is doing? No, of course not. Does Yann LeCun? No, of course not. Does anyone? No.

This whole field of debate is a colossal waste of time. Why? Because it forces people to pretend they know more than they do in order to score debate points. This is a 20th century squabble about 21st century technology.


That's a fair point, and I'm really not here to score debate points - in fact this is the first time I've waded in at all, because I know how little I know (I certainly know less about this tech than you). I'm just using it, really, as a way to organise my thoughts and put a pin in them.

I'm not espousing a party line, though - I've arrived here through learning about the underlying structure of the tech, thinking about it, and landing on scholarly articles that chime with how I understand it. To say that just because it's black-boxed it's pointless even to discuss it seems dangerously defeatist, considering the stakes.

Anyway, I respect your position here and I'll happily leave it there.

Feb 3, 2023·edited Feb 3, 2023

"scholarly articles that chime with how I understand it." What about scholarly articles that don't chime with your understanding?

Sure, we have to discuss it. But we need to be careful how we do it.

Comment deleted
Feb 3, 2023·edited Feb 3, 2023

"Do children learn the definitions of words by being told other words?" To some extent, yes. For that matter, most of what I know about physics I know from having read words, lots of words, but also looked at pictures and some math as well. I certainly don't know it from laboratory experimentation or computer simulation.

In that post I report a sequence of prompts and replies. If those had come from a human, you'd say they were reasoning about justice. Since they come from a machine, you're sure it can't be reasoning, because machines can't reason. In either case we don't know what's going on.

Saying one case involves thinking while the other does not doesn't seem to tell us much about what's going on in either case. We're just slapping labels on the behavior that are based, not on the behavior itself, but on information about the thing that's exhibiting the behavior. That was OK back in, say, 1980, when Searle first published his Chinese Room argument. But now, it begs too many questions to do much serious intellectual work.

Comment deleted

Then you should have said that. How else was I to know? You were responding to a comment in which I linked to a post. That post was not about first language acquisition. I assumed you were responding to that. I can't read minds.


Thanks for an informative post. I will definitely read it. I am not an infant or a stochastic parrot but a weary human adult (female). I cannot be sure, but somehow "farmed out to sweatshops" definitely sounds apropos in some way.

I find it depressing too that asinine stuff, whether anything Musk (histrionics about Skynet) or the less crazy but still industry-funded shilling (EFF, public-private partnerships), stands in the path of serious regulation of a lot of this drivel that has real-world impacts on the environment and on any sort of democracy.

A lot of performative gatekeeping charlatanry.


Groundhog Day 1: We invented one power of existential scale which puts everything at risk.

Groundhog Day 2: We're inventing another power of existential scale which might put everything at risk.

Groundhog Day 3: Our goal is to repeat this process on every following day, preferably at an ever faster pace.

Groundhog Day 4: We finally figure out that we're dumber than groundhogs, on the day when it becomes too late to do anything about it.


This post illustrates beautifully how we're still in the same boat that von Neumann and Minsky built for us nearly a century ago.


What if the neuron is simply the best method that evolution discovered to deal with the complexity of the world?

Neural networks, while powerful for dealing with the patterns of the universe, face the same problems that evolution was dealing with: the need to reduce complexity to a manageable size, and the need to discover rules that generalize the massive amounts of information they can't retain.

And no matter how many neurons you throw at it, no matter how much data you train it with - you get the same results.

Silly errors, hallucinations, and created memories to name a few.


As someone who grew up on Hubert Dreyfus critiques of GOFAI, this is fascinating...

Would it help or hinder to describe it this way:

GOFAI presumed that human intelligence could be codified in terms of _deduction_

Connectionist / ML strategies presume that human intelligence can be replicated with sufficiently powerful (probabilistic) _induction_

Neither acknowledges what infants are also expert at: _abduction_ = generating and generalising patterns (and rules) from a small set of experiences.

?


(I'll check out Two Distant Strangers and recommend Hulu's Palm Springs—starring the normally-execrable Andy Samberg—for a comedic, post Groundhog Day time-loop movie.)
