Feb 13 · Liked by Gary Marcus

No surprise that the lack of understanding is on display in two modes - text and image - and will likely be in others as well (audio, video, etc.).

'Multimodal' can't fix 'clueless'.

'The Emperor has no clothes', in every language and modality :)

Data isn't a substitute for direct understanding; that is the heart of the matter. Fixing things piecemeal after they are shown to be broken isn't scalable. Reality isn't amenable to ongoing and perpetual dot-release upgrades; there is no magic crossover point at which the machine will suddenly 'get it'.

author

Love this; 'Multimodal' can't fix 'clueless'


Lol, thanks Gary :)

Feb 13 · edited Feb 13 · Liked by Gary Marcus

OpenAI: "We use the term “hallucinations,” though we recognize ways this framing may suggest anthropomorphization, which in turn can lead to harms or incorrect mental models of how the model learns." — GPT-4 System Card, an addendum to the GPT-4 Technical Report.

At least some people at OpenAI understand the 'bewitchment by language' well enough to have had this footnote added. Too bad they did not add the same footnote in all caps regarding the word 'understanding'... (or 'learning', for that matter)

The use of the term 'hallucination' or 'error' triggers the assumption in our minds that the system's 'default' is 'understanding' or 'correct'. In an extreme example, someone who says "For me, Jews are people too" is an antisemite, because by saying it they implicitly treat this as a valid question in the first place (cf. Godfried Bomans). The opposite of what we say is often implicitly said as well.

I seriously think we should refrain from calling these errors or hallucinations. We might call them 'failed approximations', to signal that the correct ones are also 'approximations'.

https://ea.rna.nl/2023/11/01/the-hidden-meaning-of-the-errors-of-chatgpt-and-friends/


True. Words matter.

We say 'created by AI', even though today's AI merely computes; it doesn't create. But that ship has already sailed, lol.


Actually, I think 'creation' could be OK to use. The underlying system is unpredictable (it uses randomness, even if it is probably pseudorandom), so we could call what comes out of it the creation of something that was not calculated. Calculation, I think, evokes an image of predictability.

This is going to play a role in the legal wrangling. OpenAI and Friends hand responsibility for the output over to the user (Disney: please go after our users when they create trademarked imagery, not after us), but it is unclear whether that will hold up to legal scrutiny. If the 'robot' of OpenAI called GPT is deemed to be a 'creator', they will be responsible for infringement.


Good points, true - esp. about this relating to the coming IP challenges. Given that all AI-derived generation amounts to weighted interpolation in embedding space, wouldn't it all still be computation though? :) We humans create, for sure influenced by others' creations, but not by numerically blending numeric representations of styles in our brains - no? Our creations are influenced by our own experiences as well, which aren't stored symbolically in our brains and are therefore non-computational. In other words, creation needs a creator - absent which, it's all calculations from a calculator [lol]. The calculations do look/sound/... amazing though - to us humans, who can perceive them as creations.
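To make 'weighted interpolation in embedding space' concrete, here is a toy sketch; the vectors and the blend weight are invented for illustration, not taken from any real model:

```python
import numpy as np

# Toy "style embeddings" -- stand-ins for the learned vectors a real
# model would produce; the values here are invented for illustration.
style_a = np.array([0.9, 0.1, 0.4])   # e.g. "Van Gogh-ish"
style_b = np.array([0.2, 0.8, 0.5])   # e.g. "photorealistic"

w = 0.7  # blend weight
blended = w * style_a + (1 - w) * style_b  # weighted interpolation
print(blended)  # ~[0.69 0.31 0.43] -- a calculation, however creative it looks
```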


Humans do not 'compute' the way computers do, but what neurons do is a form of 'computation'. It is just computation with much more powerful 'numbers' (reals instead of the integers of digital computing; even chaotic elements probably play a role). The 'number space' (complexity) of the brain is infinitely larger than what *digital* technology can ever reach. With a vast amount of digital resources, we can create 'dumb creativity' (which is a pretty good term for what we have now).

The first AI hype (1960s-1990s, with a dip in the 1970s-1980s) was based on the idea that what we use (or what intelligence is) is, in the end, symbol manipulation, an idea with deep roots in our conviction that intelligence comes from handling logic ("2500 years of footnotes to Plato"). This failed miserably. (Even if being able to do a little bit of logic on top of 'estimation' is a really powerful combination.)

The pure neural-net approach of the GenAI crowd harks back to very early ideas (even from before most digital computing), namely that if we can build artificial neurons we can build artificial brains and thus artificial intelligence. Logically, this is necessarily true (religiously not, of course): we are living proof that such machines can exist, at least in our biological form. We are intelligent (not overly much) and conscious biological machines (or 'machine collections', as each human is not a single entity — biome and all). The question is whether the architectures we use to get there 'artificially' are correct. I am pretty convinced they are not.

Current GenAI has two serious architectural limitations: (1) 'understanding' ink/pixel patterns is a fundamentally limited approximation of 'understanding' (the meaning of) text/images, and (2) digital computation is infinitely less powerful than real computation (all the assumptions that we can approximate it well enough rest on the idea that signals are in some way regular, which probably isn't the case everywhere in the brain). Especially the second one spells doom for any purely digital approach. In the end, any logic can be translated into other logic, so whatever you do at a more abstract level, in the end you are doing massive amounts of classical logic on true/false states (a Turing machine). A Turing machine will never be intelligent, I think, regardless of how large you make it. Useful machines, yes, especially if we solve (1). Real intelligence, no, unless we move away from — purely — (2).


One word: *wow* :) Agree with all you said, including NN/genAI architectures etc.

One fundamental point/difference/question - it's about neurons doing computation using reals... I see them as not *explicitly* computing with numbers at all. E.g., when asked which of two pairs of points on a piece of paper is closer, we don't compute Euclidean distances and output the min(); same with myriad other real-world actions and behaviors. For sure, we can computationally model neuron firings etc., and the models agree with observation - but that doesn't mean the neurons themselves compute. On top of an inherently non-computational architecture, we do compute, but only in limited ways. E.g., we can answer what sqrt(144) is, but not sqrt(144.31421718); we can reverse a list of 5 colors in our heads, but not 500. We can't solve differential equations in our heads, so we externalize that using code/computers.
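For concreteness, here is a toy sketch of the explicit procedure we demonstrably don't run in our heads when we just *see* which pair is closer; the point coordinates are invented for illustration:

```python
import math

# Two pairs of points on a page; coordinates are made up for illustration.
pair_1 = ((0.0, 0.0), (3.0, 4.0))
pair_2 = ((1.0, 1.0), (2.0, 2.0))

def distance(pair):
    (x1, y1), (x2, y2) = pair
    return math.hypot(x2 - x1, y2 - y1)  # Euclidean distance

# The "compute both distances, output the min" procedure.
closer = min(pair_1, pair_2, key=distance)
print(closer, distance(closer))  # pair_2 is closer: distance ~1.414
```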

Not refuting you using concrete proof/evidence, just expressing what I believe after having pondered this for 30 years :)


No, agreed, what we do is not calculating (certainly not in the conscious way you describe with Euclidean distances — that is more the 1960s-AI idea of intelligence). I am using the word 'computing' because I have no better word. If five neurons produce a strong enough signal that a sixth fires (which itself is not a discrete on-off affair), then what happens can be seen as a 'computation' on those inputs. We mostly do 'analog estimation', and I called that 'computation'. Some of that may very well be, in effect, 'doing' differential equations.
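A crude caricature of that reading of 'computation' (all weights and the threshold are invented, and real neurons are nothing like this discrete):

```python
import numpy as np

# McCulloch-Pitts-style caricature: five input neurons drive a sixth.
# Weights and threshold are invented; real firing is not a clean on/off.
inputs = np.array([0.8, 0.3, 0.9, 0.1, 0.7])   # firing strengths of five neurons
weights = np.array([0.5, 0.2, 0.6, 0.1, 0.4])  # synaptic weights
threshold = 1.2

drive = float(inputs @ weights)  # weighted sum = the 'computation'
fires = drive > threshold        # sixth neuron fires if the drive is strong enough
print(drive, fires)              # ~1.29 True
```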

Your list example is discrete (there we are really limited), not reals. The same is true for sqrt(144). That is the puny power of discreteness we have, where (see The Number Sense by Dehaene) we understand that 5-6-7 and 105-106-107 are the same 'thing' (we're unique on this planet in that).


The human brain merely computes


Sure, why not

Feb 14 · Liked by Gary Marcus

The reaction you get from folks who just want AI to be better than it is - fascinating. If it walks like a duck but can't do anything else a duck does, it doesn't matter: they will still call it a duck. That's the point I keep making about AI therapy. If you think it understands you, and you really want it to understand you, it will appear to understand you, and that's all some folks need. But the rest of us will still have a critical eye. Cheers Gary! Keep up the good work.


I have to say I do like that fantastical rhinelephant in one picture.


“…the problem was with its *language understanding*, rather than with illustration per se.” Yup.

Seems to be a lot of resistance to this idea out there… or misunderstanding of it.


Hope you are well and enjoying fall. Recently the AI bot in my iPhone 14 showed me a collage it had made from two of my paintings (in response to the "this day in 2017" feature in the phone?). It waited for a response.

As the creator I was appalled that it assumed it was okay to do a task that was not requested. To respond to the waiting bot, I stated that my art was "personal" to me, like a family, a child, and that I did not want interference with my work.

Was any of this appropriate for a "helper" not taught the boundaries of its program and of human endeavor? Is this the HAL issue?

I saw no way to remove the AI from my iPhone 14. Who might I contact, FB or Apple? If that fails, any ideas how to remove the AI? I must take down all my images from FB. Here: Fall Windy Day and ocean. New.


While some variants of a horse riding an astronaut failed when they shouldn't have, the original formulation was flawed. Google shows 45,000 hits for "a horse riding girl", and all the images have the girl on the horse, not the other way around. "A horse riding ON an astronaut" would be better. Again, your post points to some real shortcomings, but be careful with ambiguous English. Admittedly, these programs should not ignore the "an" in "a horse riding an astronaut".

author

The problem is too robust across too many cases for anyone to think that is a major factor here


Let's talk about understanding. I have to come at this from the entire framework of ancient Vedic philosophy. "Intelligence" simply does not exist. I've been vehemently against the use of the concept of "intelligence" for 40 years. The word "intelligence" was coined from Latin and started gaining use as recently as the end of the Middle Ages in Europe, 500 years ago... and 500 years in Homo sapiens evolution is nothing!

I speak Tamil, my mother tongue from one state in South India, and a bit of Sanskrit. Tamil and Sanskrit are 5000 years old, and they are the two oldest and most highly literate languages in the world. Even Arabic, Cantonese, Farsi, and Urdu, which are much, much older than the Latin-Germanic languages of Europe, do not support an equivalent word for "intelligence". All of these languages support concepts describing logic, reason, and understanding, yes. But never intelligence.

Hence I've never used the word "intelligence", because I can't use a word if I don't fully understand what it means. If the Western world claims humans are "intelligent", then the current state of our planet, run by humans, equals "intelligence".

As far as I can tell, nobody in the Latin Western hemisphere is able to define "intelligence". So why put $1 trillion, or even $7 trillion, into creating a concept that doesn't exist, and on machines?

"There is perhaps no better a folly of human conceits than this distant image of our tiny world" - Carl Sagan


Your voice is very important in this conversation about our AI-dominated future. I agree that human understanding involves a whole, compositional view of whatever we are thinking about. For example, a songwriter hears the whole song in his/her head, then writes it down. It's not just a succession of one arbitrary note after another. A mathematician discovers a proof all at once, not just a list of one symbol and Greek letter after another. Perhaps human consciousness involves something like quantum coherence in the microtubules, as suggested by Penrose and Hameroff.

But on the other hand, these examples look like the same kinds of mistakes humans make, especially kids. We have prepositional phrases like "on the back of" precisely to emphasize the position and order of the things we are talking about. People don't speak in plain strings of commands and objects. Also, we chunk common phrases together and sometimes mix them up, as in a Freudian slip.

AI is not conscious, but it’s starting to look like it is our collective unconscious.


This is a solid essay, and provides an in-depth look at what the problem is, rather than offering uncharitable and unproductive criticism. This is a very welcome approach.

Current algorithms do not truly know how things fit together, and what they really mean.

Yet these methods are not a dead end, nor are they "sucking oxygen" from potentially better things. They are first-cut approaches to solving what are impossibly hard problems.

Rigorous approaches would model anatomy and spatial relationships, or would at least start with a rough but correct skeleton and let the algorithm fill in the details (that part it can do well). Given how far we've come, I think these issues will be solved in a year or two.

Feb 13 · edited Feb 13 · Liked by Gary Marcus

LLMs (and what is currently called GenAI) are like the dirigibles of the early 20th century. Both use a fairly simple, easy-to-grasp concept (buoyancy / next-token prediction) to achieve results that look impressive at first glance. Both were hyped up and carried high expectations; the spire at the top of the Empire State Building was even originally planned as a mooring mast for trans-Atlantic zeppelins. However, both have fundamental problems that can only be fixed by a paradigm shift on the scale of inventing the airplane. And just like the dirigibles, LLMs and GenAI will fade into history, supplanted by the tech that really solves intelligence. And since LLMs are based on neural networks, by extension, neural networks are like hot air balloons :)
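(And next-token prediction really is that graspable. A toy bigram version, with a made-up corpus, just to show the shape of the idea; real LLMs do this with transformers over trillions of tokens, but the objective is the same:)

```python
from collections import Counter, defaultdict

# Toy corpus and bigram counts -- a caricature of what LLMs do at scale.
corpus = "the horse rides the astronaut and the horse rides on".split()
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(token):
    """Return the most frequent continuation seen in training."""
    return counts[token].most_common(1)[0][0]

print(predict_next("horse"))  # 'rides' -- pattern continuation, not understanding
```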

author

100% agree

author

and posted a screen grab on X


People such as yourself are looking for some elegant, human-like intelligence, but fail to understand that human intelligence looks neat only in retrospect.

We are what we are because of a massive amount of knowledge and experience, out of which we build representations of the world that we continuously refine.

This hard fact just cannot be avoided. If it isn't neural nets, it has to be some other architecture that can learn from a massive and not always coherent amount of experience.


That is precisely what I said: it will be some other algorithm that solves intelligence, not neural nets. And for the record, I believe there is only one intelligence, whether human or machine, and it's not going to be mathematically elegant at all. You see, most problems from the real world are impossible to solve exactly by computation, so the only way around that is to use approximations, and approximations are certainly not elegant. Although I would argue that, for example, Newton's method for finding roots is just beautiful in its simplicity.
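(Newton's method in full, for anyone who hasn't seen it; the example function is just sqrt(2) as a root of x^2 - 2:)

```python
def newton(f, df, x, tol=1e-10, max_iter=50):
    """Newton's method: repeatedly follow the tangent line toward a root."""
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Example: root of x^2 - 2 (i.e. sqrt(2)), starting from a rough guess.
print(newton(lambda x: x * x - 2, lambda x: 2 * x, x=1.0))  # ~1.4142135623...
```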


So there is agreement that the solution to intelligence won't be elegant. Some folks idealize our thinking process.

I would not bet against neural nets, because they are very powerful machinery. On their own they are surely not enough, because long-term planning and reasoning don't map neatly onto the paradigm of information flowing one way through a network, modulated by weights.

I think neural nets can be some of the nodes in an AGI, but there are going to be other modules as well, including simulators, verifiers, maybe even symbolic logic for some things, and also a large database of known tricks and knowledge.


Neural nets have a fundamental problem of unreliability, and another of uninterpretability/unexplainability. Both are inherent to the neural-net algorithm. The unreliability comes from the fact that NNs cannot guarantee a smooth approximation manifold. Regularization helps to some extent but doesn't solve the problem entirely. The result is unpredictable behaviour, as in adversarial examples, where a tiny, humanly imperceptible amount of noise added to an image completely changes the output of the NN, and with high confidence. The reason the approximation manifold cannot be guaranteed to be smooth is that NNs are highly non-linear functions of very many parameters, which are practically impossible to analyse; hence their output is also practically impossible to interpret or explain. This is why I don't see NNs as any part of a true machine intelligence.
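(A toy version of that fragility, using a bare linear scorer in place of a deep net and an FGSM-style perturbation; all numbers are random stand-ins, not a real model. In high dimensions, a tiny per-coordinate nudge against the gradient adds up to a decisive shift in the output:)

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "classifier": score = w . x; sign(score) is the label.
# Weights and input are random stand-ins for a trained net and an image.
w = rng.normal(size=1000)
x = rng.normal(size=1000)

eps = 0.05  # per-coordinate nudge, tiny relative to unit-scale features
x_adv = x - eps * np.sign(w)  # FGSM-style step against the score

print(np.dot(w, x))      # original score
print(np.dot(w, x_adv))  # shifted by eps * sum(|w|): large enough to flip the sign
```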


People said the exact same thing about the previous versions of these models last year.
