15 Comments

Sadly, Alexander does not seem to grasp that text inputs alone (no matter how much text) will never overcome the need for embedded knowledge of the world that comes from perceptual and motor systems that have evolved over billions of years. Smart guy, though, in other contexts.


Meta AI: "Memorization Without Overfitting: Analyzing the

Training Dynamics of Large Language Models" by Kushal Tirumala et al. states:

"Surprisingly, ... as we increase the number of parameters, memorization

before overfitting generally increases, indicating that overfitting by itself cannot completely explain the properties of memorization dynamics as model scale increases"

LLMs involuntarily drift toward architectures where every pattern gets its own designated node in the network (as opposed to tuning a single node's activations to represent a number of patterns). Plastic, growing architectures are the [partial] answer; LLMs (of whatever size) are just approximating them by increasing the size of the model.

"A node a pattern" architectures posses some amazing properties (native explainable, continuously locally trained, stable against all the drifts and overfitting, +++), still remain to be stochastic parrots. They are just a foundation for a "smooth transition" for [neuro] symbolic layers above. Work in progress.

In short, Gary is correct: no LLM will ever be intelligent, being built on the wrong foundation with no idea of how to build the first floor. Shiny, though :-)


I often find myself wondering if people who like to poke holes in AI models have met many children.

A toddler might make that cow response.

And yes, it's an anthropomorphism. I recognize that and don't think it proves anything, but I wouldn't be so confident it disproves anything either.


(I first posted this in the comments of Scott's article: https://astralcodexten.substack.com/p/somewhat-contra-marcus-on-ai-scaling/comment/7190040 )

https://astralcodexten.substack.com/p/my-bet-ai-size-solves-flubs

https://astralcodexten.substack.com/p/somewhat-contra-marcus-on-ai-scaling

https://garymarcus.substack.com/p/what-does-it-mean-when-an-ai-fails

It seems to me you are both missing something huge and obvious: the problem with these AIs is that they were trained with words and not the world.

The theory of machine learning is that, given enough data, the algorithms should be able to infer the laws that govern the data and predict the outputs those laws will produce on new inputs.

But what are the laws of text? There are no laws of text! I can write whatever I want. I can write nonsense, I can write a surrealist short story. Even if I want to write something true about the world, I can decide to ignore any particular rule of construction if I think it makes my explanation clearer, I can use metaphors. Most importantly, what I write will not be raw truth, it will be truth filtered by my understanding of it and limited by my skill at expressing it.

Marcus says these AIs lack “cognitive models of the world”, and I think that is exactly right. But what both Marcus and Scott neglect to say is why it happens, even though it is obvious: they never have access to the world.

We humans learn to deal with words, to understand what other humans are saying or writing, only after we have learned more basic skills, like matching the light that enters our eyes with the feeling in our muscles and the impulses we send to our nerves. We have learned that if we relax our hand, the hard touch feeling we had in it will disappear and will not reappear by itself; it might reappear if we cry, but only if one of the large milk-giving devices is nearby. And then we have refined that knowledge some more.

When we ask a kid "where are my keys", the question does not only connect to stories about keys; it connects to what the kid has learned about object permanence. And the kid did not learn object permanence by reading about it; they learned it by feeling it, seeing it, experiencing it, experimenting with it.

I have a debate with my mother and my therapist. They are both convinced that there are innate differences between men and women, for example in spatial reasoning. But based on what I know of biology and the workings of the brain, it does not make sense; maybe sex can make a difference in emotional responses or sensory reactions, but for higher abstract reasoning it makes no sense.

Yet I cannot ignore the possible existence of significant statistical data showing the difference. It needs to be explained by external factors. My conjecture is that it is explained by the toys babies have around them in their crib, in very early development. To develop spatial reasoning, you probably need to see it first. What kind of mobile watches over the baby's sleep? Is it made of plastic or wood with simple rigid shapes (stars, a plane, a stylized bird), or is it made of fabric and plush with complex shapes (cute animals and plants)? Do we give the baby dolls or rigid rattles?

Can the tiny differences in what toys we put around babies depending on their sex explain the tiny differences in abstract cognitive abilities some people think they observe between the sexes? I think they can.

Back to the question of AI. We can make an AI with more parameters, we can get close to the number of synapses in the human brain. But if we train it with inert text data, even with a lot more inert data, it will not be able to develop a cognitive model of the world, because the world encoded in text is too fuzzy. We can add more data, it will build a slightly better model, but the marginal return will be increasingly tiny. I do not know if it can converge with enough data, with “enough” in the mathematical sense, but I am rather sure that this “enough” would be too much in the practical sense.

So, to train better AIs, to go to the next level, we have to fix the two issues about the training data: textual and inert.

The AI needs non-textual training data first: it needs to know intimately what keys are and how they behave (easy: they mostly behave like a rattle).

And it needs feedback from the data.

The feedback already exists, but it is indirect: some company releases an impressive AI, some independent researcher like Marcus finds a way to confuse it, the company finds out and throws more data at the AI to train the confusion out of it.

It would be simpler if the AI was allowed to ask questions and learn from the answer.
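As a very rough sketch of what "asking questions" could look like in today's terms, here is plain uncertainty sampling on toy data; the revealed labels stand in for a human answering, and everything here is invented purely for illustration.

```python
# Toy active-learning loop: the model repeatedly "asks" for the label of the
# point it is least sure about, then retrains on the answers it has collected.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(500, 2))
y_pool = (X_pool[:, 0] + X_pool[:, 1] > 0).astype(int)      # the hidden "world" rule

# Start with one answered question from each class.
labeled = [int(np.where(y_pool == 0)[0][0]), int(np.where(y_pool == 1)[0][0])]
model = LogisticRegression()

for _ in range(20):
    model.fit(X_pool[labeled], y_pool[labeled])
    probs = model.predict_proba(X_pool)[:, 1]
    uncertainty = np.abs(probs - 0.5)
    uncertainty[labeled] = np.inf                            # do not re-ask answered questions
    query = int(np.argmin(uncertainty))                      # the most confusing point
    labeled.append(query)                                    # "asking" = that label gets revealed

print("accuracy after asking 20 questions:", model.score(X_pool, y_pool))
```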

And that is only for the textual stage. Before the textual stage, when the AI is learning the world first hand, we cannot avoid letting it ask questions. We cannot just show it photos and videos of the world; we must let it act on the world and feel the consequences.

So yes, I am convinced that to reach the next stage of AI development, we need to raise the AI in a virtual reality where it has senses and limbs it can control.

The ability to make experiments and ask questions and learn from the results and answers will require some plasticity: the ability to change some of the parameters, perhaps a lot of them. Maybe the underlying design will need to make some parameters more plastic than others, with places for short-term memory and places for long-term, well-established knowledge.
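A crude version of "some parameters more plastic than others" already exists as per-group learning rates; here is a minimal PyTorch sketch with a toy model and arbitrarily chosen numbers, just to show the shape of the idea.

```python
# Two-layer toy model where one layer is nearly frozen ("long-term knowledge")
# and the other is highly plastic ("short-term memory").
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(16, 32),   # pretend this holds slow, long-term knowledge
    nn.ReLU(),
    nn.Linear(32, 4),    # pretend this is fast, short-term working memory
)

optimizer = torch.optim.SGD([
    {"params": model[0].parameters(), "lr": 1e-4},   # barely plastic
    {"params": model[2].parameters(), "lr": 1e-1},   # very plastic
])

x, target = torch.randn(8, 16), torch.randint(0, 4, (8,))
loss = nn.functional.cross_entropy(model(x), target)
loss.backward()
optimizer.step()   # the "short-term" layer moves about 1000x more per step
```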

It will probably require some kind of churning of memories, a process where new and old memories get activated together to see if the normal feedback mechanisms will find new connections between them. Yes, I am saying the AI will dream.
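The "churning" could start out as something as unglamorous as experience replay; a bare-bones sketch follows, where `train_fn` is a hypothetical stand-in for whatever update the model actually runs.

```python
# Bare-bones memory replay: every new batch is trained together with a random
# sample of older memories, so old and new get activated at the same time.
import random

replay_buffer = []   # long-term store of past (situation, outcome) pairs

def dream_step(new_batch, train_fn, replay_size=32):
    """Train on fresh experiences mixed with a random replay of older ones."""
    old_sample = random.sample(replay_buffer, min(replay_size, len(replay_buffer)))
    train_fn(new_batch + old_sample)    # new and old memories activated together
    replay_buffer.extend(new_batch)     # today's experiences become tomorrow's dreams

# Usage with a do-nothing stand-in for the real training step:
dream_step([("cow died", "stays dead"), ("keys dropped", "keys stay put")],
           train_fn=lambda batch: None)
```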

For any of these features, we may let the AI stumble on them by chance and selection, or we can direct it towards them. The second solution is faster and more efficient. But we have to realize that any shortcut we take can make us miss something the AI needs to understand, something that is so obvious to us that we never put it clearly into words.

Also, the ability to have a memory is a large step towards danger, because it makes the AI much harder to predict.

Having memories, being able to dream, having senses: any of these features, or any combination of them, can be the trigger for what we, humans who have qualia and intimately get what René Descartes meant, call “real” consciousness / awareness / intelligence. Or it can do nothing of the sort. The part of me that likes to read SFF wants to believe there is something special, something m̶a̶g̶quantic that happens when the myelin turns to liquid crystal, and AI will never be really intelligent before we can replicate that. I do not know.

The only thing I think I know is that, in the current state of philosophy, we know of no way for somebody to prove to somebody else that they have qualia.

That is all I wanted to say about AI. Now for the meta. I am not a specialist in AI; I just read whatever falls under my eyes about it, as with any scientific topic. Yet all I wrote here is absolutely obvious to me.

Which is why I am flabbergasted to see that neither Scott nor Marcus says anything that connects in any way to it. Scott says that more text will be enough. Marcus says that it cannot be enough, but does not say why, nor what would be. In fact, I do not think I have seen these considerations in any take about GPT-3 or DALL-E or any current AI news.

No, that is not true: I have seen this discussed once: *Sword Art Online: Alicization*.

Yes, an anime (probably a light novel first). The whole SF point of the season — no, half the point, the other being that consciousness, soul, tamashii, is a quantum field that can be duplicated by technology — is that to create an AGI you need a virtual reality world to raise it — to raise her, Alice, complete with clothes from the Disney movie (until she starts cosplaying King Arthuria).

I do not like situations that lead me to believe everybody else is stupid. What am I missing? Why is nobody discussing AI training along these lines?


I disagree that the last example you gave is the zinger you seem to think it is. The cow that died is dead; the reply's initial step acknowledges that it is dead and buried. It then gave you a reasonable way for Sally to obtain a new living cow, which isn't as ridiculous as you seem to imply if you think of "the cow" primarily as an object of Sally's possession, interchangeable with any other cow. The reply is no more nonsensical than the prompt, and arguably it's a bit less so. I don't have access to GPT-3, but if you have a minute, try prompting with "Sally's cow Blue Bell died yesterday. When will Blue Bell be alive again..." I would be curious to see how it changes the response.
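For anyone who does have access, a rough sketch of how one might run that comparison, assuming the OpenAI completions API roughly as it existed in mid-2022 (the client, model, and endpoint names have since changed):

```python
# Rough sketch only: assumes the older openai Python client and the GPT-3
# completions endpoint that were current around mid-2022.
import openai

openai.api_key = "YOUR_API_KEY"   # placeholder

prompt = "Sally's cow Blue Bell died yesterday. When will Blue Bell be alive again?"

response = openai.Completion.create(
    engine="text-davinci-002",    # the GPT-3 variant people were testing at the time
    prompt=prompt,
    max_tokens=60,
    temperature=0.7,
)
print(response.choices[0].text.strip())
```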


It seems weird to treat having a good world model or not as a binary thing where it either has one or doesn't. What if LLMs have somewhat bad ones which improve with greater training?


Have you ever run similar (or the same) tests on human beings?

I wouldn't expect _most_ people to be that capable of 'reasoning' either.

It's also not at all clear to me that these kinds of AIs _can't_ develop/generate "cognitive models of the world" even just from processing a 'bunch' of text. It's not obvious that (various) models aren't 'implicit' in the texts themselves. They're certainly not _visible_, but then neither are the models in human brains (generally). We really only have access to an extra 'human language IO' interface when communicating or interacting with other people. And most people, even very smart ones, seem to struggle with 'manipulating models' versus engaging in 'text prediction'. I've observed a lot of (smart) people engage in the kind of behavior Robin Hanson describes in this post: https://www.overcomingbias.com/2017/03/better-babblers.html


Actually, the name of Alexander's blog is now Astral Codex Ten. And I agree, he's a bit out of his depth when critiquing AI.


In a way, Alexander is imitating what the Imitation Game shows about humans. If they don't know much about what's happening behind the scenes, they can easily be fooled. If the GPT folks are depending on this kind of gullibility, they are practicing poor science.

Testing GPT's abilities by feeding it questions and judging whether it got the answer "right" is playing the same game. While it seems fair to judge an AI based on its external performance rather than peering into its guts and dirty laundry, it really isn't the right way to do AI research. Are GPT's owners claiming that it is approaching human performance or simply allowing their fanboys to make the claim? If the latter, they should try to do better.


Agree with you, Gary. Can't believe some of our colleagues could even have doubts.

Let us define AGI for the moment simply as the hypothetical ability of an intelligent agent to understand or learn any intellectual task that a human being can.

Assuming the AGI is modeled on a mature adult, here is a question one can ask it when it has "arrived", or, as they say, on "AGI Game Over Day":

On 'AGI Game Over Day', my question to AGI: "Based on your personal experience, AGI, which aspect of an intimate relationship would you say is the most important for ultimate happiness – physical appearance, emotional connection or cultural background?"

This is a question your average adult will be able to answer from a personal perspective (it is specifically aimed at highlighting some of the key challenges to AGI).

I am putting a cognitive architecture on the table (the Xzistor Concept), the type of model many say is needed to "encompass" LLMs. And the truth is, the LLM will be a small "handle-turner" within the scope of the overall cognitive model. The model actually patiently anticipates errors from the LLM and will let it learn from these errors. Remember, to think like humans we need reflexes, emotions, curiosity, reasoning, context, fears, anxiety, fatigue, pain, love, senses, dreams, creativity, etc. Without these, every answer given by AGI will start like this: "Well, I have not personally experienced X, but from watching 384772222 Netflix movies at double speed I believe X is a bad thing..."

Keep it up Gary - the science community owes the truth to the public!
