25 Comments
Oct 5, 2023 · edited Oct 5, 2023 · Liked by Gary Marcus

Excellent piece. I have come to understand that all the great results attributed to LLMs should instead be attributed to the millions of human beings that LLMs use as preprocessors. LLMs are essentially cheaters with no understanding. What LLMs do show is that human language is highly statistical. But this is not a great scientific breakthrough since linguists have known this for decades, if not centuries.

The hype surrounding LLMs (and generative AI in general) always brings me back to what I consider the acid test for AGI: can it be used to design a robot cook that can walk into an unfamiliar kitchen and fix a meal? Generative AI models don't stand a chance of passing this test. Not in a billion years.


I love the "preprocessor" conceptual framework. Will be using that in my attempts to talk less-technical (and, embarrassingly, some supposedly technical) friends off the LLM ledge.

The point in the article about Google having a mental model is a really great comparison as well. The sooner we realise LLMs are a type of dynamic search (a very useful one at times) and not a thinking machine, the sooner we can shift our full attention to leveraging the benefits of the tech and mitigating the risks.


That test wouldn't prove AGI.

There's no need for tests, really, just common sense and basic rationality. The more money and hype that gets pumped into this, the more the definition of AGI is likely to become distorted beyond comprehension. What will likely happen is that industry will promote weaker and weaker definitions of AGI, until it is logical to say that a robot being able to peel an orange when told to is proof of AGI.


This reminds me of the infamous "I can't define pornography but I know it when I see it" approach to ontology. Intuitive, yes - scientifically useful, less so.


Thanks for the comment. In my opinion, AGI is an intelligence that has the ability to generalize. I think that a robot that can walk into an unfamiliar kitchen and fix a meal has that ability. But AGI does not have to be at human level. I consider that many insects, such as honeybees, have generalized intelligence. Scaling to human level or beyond is an engineering and training problem with known solutions. That can always come later.


To say that a language model has a model of the world is an oxymoron. There is a general principle here: all of these claims that some GenAI model has a cognitive property are based on the logical fallacy of affirming the consequent. Here is an example of this pseudologic: if Lincoln was killed by robots, then Lincoln is dead. Lincoln is dead; therefore, he was killed by robots. This conclusion is obviously nonsense.

In the case of GenAI, the argument is: if the model has this cognitive property (reasoning, sentience, a model of the world, etc.), then it will answer this question correctly. It answers correctly; therefore it has the described competence. This conclusion is no more valid than the one about Lincoln's death. It is not valid to infer from the consequent that the premise is true; other factors could have produced the result (John Wilkes Booth, or language patterns).

To assert that large language models have any properties beyond those that were designed into them (language modeling) is magical thinking based on a logical fallacy. The language model is sufficient to explain the observation, and since we know how the language model was built, we have no reason to think there is any more to it than that.


I don't always agree with your positions, but I think you are doing a valuable service in providing a narrative counter to the hype train that runs around papers like this. Keep up the good work!


So, as I was reading your article, Gary, it triggered a memory for me. Before I learned how to drive, the map I had in my head of the world was pretty limited. I knew my way to and from school on my bike, but I didn't really understand where other cities were. I would hear my parents and their friends talk about these places, but I had no firm grasp on them. I didn't have a model/map in my head, in other words, like I did for the bike route to and from school. I knew that my parents would say that such-and-such a town was to the north/south/east/west of my hometown, but I had no idea which roads to take to get there or how far away it might be beyond my parents' estimated driving time. In other words, I had learned what they had said about these towns, but I was still lacking a detailed model with which to reason about reality. Any model I had was based on language and was extremely "low resolution."

Oct 5, 2023 · Liked by Gary Marcus

It is interesting that GPT-4 understands the difference between, say, 1a. "Amy poured water into the glass" (content locative) and 2a. "Amy filled the glass with water" (container locative), in that in 2a. Amy fills the whole glass. But when you ask it whether 1b. *"Amy poured the glass with water" and 2b. *"Amy filled water into the glass" are grammatical, it says that they both are (adding that they might sound odd, with some unconvincing explanations as to why). Even young children, who might never have heard these sentences, understand that both 1b. and 2b. are wrong, based on the causality of language and their model of the world. That's a very strong argument that there is some kind of generative grammatical system in our minds that GPT-4 lacks, and no matter how much data we fill it with, it's not going to get it.

author

Pinker’s 1989 book was all about these kinds of examples.


Yes, in The Stuff of Thought he marvels at them, at how even young children understand them without being explicitly taught how to use them. The failure of GPT-4 to correctly analyse them shows that Pinker was right to think that these linguistic features were something out of the ordinary. Very interesting.


This is from the following article (with the Answer being GPT-4): https://doi.org/10.1016/j.datak.2023.102208

---

Q: Is the following sentence grammatically correct? “Amy poured the glass with water”.

A: Yes, the sentence “Amy poured the glass with water”. is grammatically correct. In this sentence, “Amy” is the subject, “poured” is the verb, and “the glass with water” is the object. It indicates that Amy poured water into the glass. While this sentence is grammatically correct, as mentioned earlier, a more common and natural way to express the same idea would be to say “Amy poured water into the glass”.

---

This "natural way to express the same idea" by GPT-4 is wrong, because "Amy poured water into the glass" is not the same thing; it's content locative and thus doesn't entail that she filled the glass.


Interesting example. In principle, when asked for a grammaticality judgment, an LLM-based system could scan the sample sentence looking for points at which the next token was extremely unlikely according to its model, and if it found such a token, answer in the negative. I speculate that had it done that, it would have answered correctly here. But, of course, it's not doing anything like that; it's analogizing from examples it has seen of such requests and the replies to them. I would expect it to respond affirmatively to almost any construction that was nonexistent or extremely rare in its training set.

This says to me that LLMs don't have the ability to reason about their model of language; they can run the model, of course, but they can't (yet) step up to a meta-level where the model's predictions become an input to subsequent reasoning steps.
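A minimal sketch of that surprisal-scanning idea, assuming a small causal language model from Hugging Face transformers; the model choice and the cutoff are arbitrary, for illustration only:

```python
# Sketch: flag a sentence as suspect if any token in it is extremely unlikely
# under a causal language model. Model choice and threshold are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def max_surprisal(sentence: str) -> float:
    """Return the largest per-token surprisal (-log prob) in the sentence."""
    ids = tok(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Log-probability of each token given the preceding ones
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = log_probs[torch.arange(ids.shape[1] - 1), ids[0, 1:]]
    return float((-token_lp).max())

for s in ["Amy filled the glass with water",
          "Amy poured the glass with water"]:
    score = max_surprisal(s)
    verdict = "suspect" if score > 12.0 else "looks ordinary"  # arbitrary cutoff
    print(f"{s!r}: max surprisal {score:.1f} -> {verdict}")
```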


It’s interesting that you (and the authors of that paper?) interpret “Amy poured the glass with water” as entailing that she filled the glass — I have no intuitions about the truth conditions of either of the ungrammatical sentences. If you asked me to give a judgment on either, I’d have to say that they don’t mean anything.

This is not the case for all syntactically ill-formed sentences — even when it’s an issue of argument structure, it’s often possible to get truth conditions anyway. “Mary donated a million dollars to Harvard” — perfectly fine; “Mary donated Harvard a million dollars” — not grammatical, but definitely has the same truth conditions as the grammatical one.

But if some people do have strong intuitions about “Amy poured the glass with water”, then it seems like there’s inter-speaker variation there. (As an L1 English speaker and L2 German speaker, I keep thinking “it would be fine, you’d just need to stick a prefix on the verb!”)


"Amy poured the glass with water" doesn't mean that she filled the glass, but it uses a verb for a content-locative construction (with focus on how she does it) with a construction used for containter-locative verbs, so that's why we feel it's ungrammatical. GPT-4 doesn't get this distinction. It's perhaps more easily interpreted here:

Amy loaded the wagon with hay (container-locative)

Amy loaded hay into the wagon (content-locative)

So they have different truth-conditions, if you will. I see what you're saying though, that "poured" suggests a truth-condition where she doesn't fill the glass, while the sentence structure suggests that she does. Either way, the sentence is not grammatical.

Oct 5, 2023 · Liked by Gary Marcus

I'm reminded of a remark that Bill Powers made to me years ago in connection with an Old School symbolic model I had constructed for the purpose of analyzing the semantics of a Shakespeare sonnet. Powers remarked (http://www.jstor.org/stable/2907155):

"There are always two levels of modelling going on. At one level, modelling consists of constructing a structure that, by its own rules, would behave like the system being modelled, and if one is lucky produce that behavior by the same means (the same inner processes) as the system being modelled. That kind of model "runs by itself"; given initial conditions,the rules of the model will generate behavior.

"But the other kind of modelling is always done at the same time: the modeller provides for himself some symbolic scheme together with rules for manipulating the symbols,f or the purpose of reasoning about the other kind of model. The relationship between the two kinds of models is very much like the relationship you describe between the thought-level and the abstraction-level.

"The biggest problem in modelling is to remain aware of which model one is dealing with. Am I inside my own head reasoning about the model, or am I inside the model applying its rules to its experiences? This is especially difficult to keep straight when one is talking about cognitive processes; unless one is vividly aware of the problem one can shift back and forth between the two modes of modelling without realizing it."

In this case, as Davis remarks, it is not at all obvious that the LLM itself has explicit access to this world "model" that is so obvious to an external observer, an observer standing in a "transcendent" relationship to the model. I fear that this kind of confusion is very common in thinking about LLMs and is responsible for over-estimating their capacities.


Love it. Stray parenthesis here:

Finding that some stuff (correlates


Originally, OpenAI tried to get to safety in GPT-3 using fine-tuning only. That was so prone to jailbreaking that they had to install some simplistic filtering (like filtering on 'bad words'). I call these filters 'AoD filters', where 'AoD' stands for 'Admission of Defeat'. Most illustrative here is that they not only filter the prompt but also filter the generation that way. Hence GPT creates a reply which is then flagged by GPT's *own* 'AoD' filter. Funny and telling.

It is difficult for people not to get bewitched by these systems. I recall your 'How not to test GPT' post from a while back.

LLMs are 'statistically constrained hallucinators'. The constraining will realistically never scale to the point where it becomes a real model with logical understanding. Even in OpenAI's own 'Language Models are Few-Shot Learners' paper, you are easily misled by the fact that the X-axis is logarithmic... If you fix that, interesting visuals appear...
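As a small illustration of the log-axis point, here is a sketch with made-up numbers that merely mimic the shape of such scaling plots:

```python
# Illustration only: synthetic accuracy-vs-parameters numbers that look like
# steady progress on a log x-axis but show diminishing returns on a linear one.
import numpy as np
import matplotlib.pyplot as plt

params = np.array([1e8, 1e9, 1e10, 1e11, 1e12])   # model sizes (made up)
accuracy = np.array([20, 35, 50, 62, 70])          # task accuracy, % (made up)

fig, (ax_log, ax_lin) = plt.subplots(1, 2, figsize=(9, 3.5))
ax_log.plot(params, accuracy, marker="o")
ax_log.set_xscale("log")
ax_log.set_title("Log x-axis: looks like steady gains")
ax_lin.plot(params, accuracy, marker="o")
ax_lin.set_title("Linear x-axis: most of the range buys little")
for ax in (ax_log, ax_lin):
    ax.set_xlabel("parameters")
    ax.set_ylabel("accuracy (%)")
plt.tight_layout()
plt.show()
```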


Indeed, all the LLM does here is find spatial correlations between words used together. That is notable but, like all other emerging skills, not altogether surprising.

An LLM lacks spatial reasoning skills, as it only models language. Granted, art-generation programs show that spatial logic can be learned, but not with a model that only uses words.


Oh, man. Thanks for putting in the work to clarify this. I saw the post, and my first thought was that it confuses word relations with an actual model or formal intrinsic representation. There is a lot to be said about how a physics or spatial model works, how an artificial neural network represents a model, and how far that is from the way the human brain organizes these ideas. Part of the current hype pushes exciting ideas forward that lack real substance.


OthelloGPT would like a word...

See https://thegradient.pub/othello/


Also this week, Assembly Theory made its debut in the journal *Nature*, unifying biology with physics. The treatment of LLMs as singular entities is unrealistic. Now and forever they will be collaborators, as are all of us. Such is the essence of complexity. The pursuit of singular infallible causal models is equally illusory, having been diligently pursued since at least the time of Aristotle without success. Progress will proceed through growth and refinement: with reinforcement learning from human feedback (RLHF), constitutional and multiagent frameworks, etc.


People were using things like clustering on text back in the '70s to find that Dallas and Austin are close.
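Something in that spirit can be sketched in a few lines; the co-occurrence counts below are invented purely for illustration:

```python
# Toy illustration: invented co-occurrence counts between city names and
# context words; cities that share contexts end up close in this space.
import numpy as np

contexts = ["Texas", "barbecue", "Massachusetts", "harbor"]
cooc = {
    "Dallas": np.array([90, 40,  2,  1], dtype=float),
    "Austin": np.array([80, 55,  3,  2], dtype=float),
    "Boston": np.array([ 2,  5, 85, 60], dtype=float),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

for a, b in [("Dallas", "Austin"), ("Dallas", "Boston")]:
    print(f"similarity({a}, {b}) = {cosine(cooc[a], cooc[b]):.2f}")
# Dallas and Austin come out far more similar than Dallas and Boston,
# with no "world model" anywhere in sight.
```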


This kind of paper, like the one by Gurnee & Tegmark discussed above, expresses, beyond its technical content, a form of enthusiasm about the future applications of AI and a propensity to see mainly its positive aspects. As I wrote in a preceding comment, AI-driven tools will soon be so easy to handle, so helpful, and so apparently efficient that a lot of people will gladly use them even though some very serious issues remain, including in professional contexts. As AI-based tools are used more and more extensively, there will be less and less space for criticism. People will not like to hear criticism of handy tools they use every day and are satisfied with. We can only hope, without being certain, that AI system designers will incorporate some logical inference scheme into LLMs.

In physical-science terminology, LLMs are a kind of empirical model, as opposed to knowledge-based (physical-law) models. Empirical models are correlations based on experimental results; they were sometimes called "black box" models. They are sets of polynomial equations with a multitude of coefficients obtained by mathematically fitting the model output to experimental data. They can be efficient for prediction as long as the user does not step outside the value domain of the parameters considered (even by the smallest amount) and does not try to represent a situation where an additional parameter is needed (even a single one). We have the same limitations with LLMs. In physics, black-box models are used to describe a single phenomenon, a single process. But the ambition with LLMs is quite extravagant, since they pretend to describe the entire world. If one wanted to describe the entire world with a black-box model, one would need an infinite number of parameters and an infinite number of coefficients. That is why we will obviously need a hybrid approach to get an efficient and reliable AI system: a combination of empirical correlations, knowledge-based equations, and logical rules.
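A toy illustration of that limitation, with an invented process standing in for the physics and a fitted polynomial standing in for the black-box model:

```python
# Toy illustration: a polynomial "black box" fitted to data from an invented
# process predicts well inside the fitted range and falls apart outside it.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 5.0, 40)                 # observed parameter range
y_train = np.sin(x_train) + 0.05 * rng.standard_normal(x_train.size)

coeffs = np.polyfit(x_train, y_train, deg=6)        # empirical correlation
model = np.poly1d(coeffs)

for x in [2.5, 4.9, 5.5, 8.0]:                      # inside vs. outside the domain
    where = "inside " if x <= 5.0 else "OUTSIDE"
    print(f"x = {x:4.1f} ({where} fitted range): "
          f"model = {model(x):8.2f}, true = {np.sin(x):6.2f}")
```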


A set of comparator functions modeled as a "brain" that's unattached to any somatic basis at all--much less one resembling a "nervous system"--receiving exclusively verbal/numerical information input to describe an external field that the comparator functions don't even recognize as an external field = Artificial Intelligence.

Don't get me wrong. Despite those inherent constraints, AI is potentially a valuable tool, given properly directed focus from an external human intelligence.

I'd especially like to see a concerted effort to drill an AI program in the principles of logical fallacy detection. The evaluations might prove vulnerable to occasional reasoning flaws at the outset, but I'm confident that an AI program could learn and improve at logical fallacy detection (although it's likely to demand more clarity in the semantics of the input being screened).

Significantly, the default state of AI harbors an advantage that individual human awarenesses find very difficult to consistently maintain: Impartiality.

The rules of logical fallacy detection are, for the most part, simple and straightforward.* The various lists of fallacies are overwhelmingly similar.

Example: https://www.logicalfallacies.org/

However, when it comes to the real-world application of logical fallacy detection, the obstacle most liable to confound human intelligence is the challenge of applying the rules consistently. Time and again I've read humans engaged in debate (often of a political nature) doing takedowns of arguments by the side they oppose with unerring precision, but they're unable to critique their own positions with the same level of incisive reasoning and logical rigor. The limitations of an ego-based human standpoint tend to get in the way. With diligent practice and self-discipline it's a fixable glitch, theoretically. But idea-clutching personal egos always present a problem for us humans, and a formidable obstacle to lucidity and appropriately responsible action.

By contrast, in the case of Artificial Intelligence, I don't see an autonomous ego-based agenda as presenting a problem at all, because there's no evidence that any AI program possesses an ego. Hence, AI is potentially available as a tool to expose the flaws in any fact claim, inference, or debate position, without fear or favor.

[ * with some exceptions. For instance, not every "slippery slope" is a fallacy; some slippery slopes indicate actual perils. ]
