63 Comments

Statistics over understanding. Hammer meets nail. That's it in a nutshell. Fans of Generative AI also hide behind its mystery and opacity as if to say, "We can't look inside so perhaps it is really doing more than just statistics", "Humans learn statistically. Our AIs are learning like humans", or "Perhaps the world is really just statistics all the way down".

Comment removed

If you are saying that statistics is a good starting point in a domain where we have very little understanding, then I would wholeheartedly agree. I also agree that statistics did turn out to be more powerful than expected. Where this paradigm falls short is in domains where we have a lot of knowledge but our AIs still can't approach human performance. We need to find a way to install what we know into our AIs and let them build meaningful world models.

I don't know if it requires a clean-slate approach, just different approaches. More specifically, I would like to see some of the money now funding the LLM land-grab redirected to fund alternatives, and I would like to see the hype surrounding the statistical approach cut in half.

Comment removed

I've never liked the "symbol grounding" term. It seems to imply that it just takes some sort of grounding module to suddenly give everything meaning. As I see it, it's the central issue. A symbol isn't really a symbol unless it is attached to its meaning, which requires a world model. Until some AI contains a very substantial world model and the machinery to use it and enhance it on the fly, there will be no AGI. As LLMs do not even attempt to build a world model, except for one based on word order, I doubt they will get anywhere close to AGI.

Comment removed

Disagree. It's all just patching. Until the AI can learn on its own, we won't get far with LLMs. Humans use language to communicate. Their cognition is not based on language. This is important. Any AI that is centered on language will always be at a severe disadvantage with respect to reproducing human cognition.


"I actually think that LLM are a step up from plain neural nets and statistics. It allows a machine to "think" at the language level"

We need to acknowledge that anyone who thinks that human verbal languages differ only slightly from humanly devised computer programming "languages"; that their mode of operation is similar; and that they're interchangeable, or in some sense "translatable", is making a terrible mistake right from the jump. I would hope that AI professionals are clear on that much, and that they've learned it in a course somewhere as part of their training.

Computer "languages" are sets of instructions. The instructions were devised by humans, but any similarity between computer programs and human communications is coincidental. The rules of programming and developing algorithms are very strict and precise. if the human operator throws in a spare space or & what have you in the course of typing the instructions, those offhand trivial mistakes shut down the task at hand. Human languages are not organized like that, and their sole purpose isn't to give marching orders.

One great asset of generative AI is its momentum. Put to a task, it's tireless. But that momentum is often miscast as "emergent learning", as if another level of complexity is waiting. I don't notice anything in AI capabilities that isn't set into motion at the outset by the humans "outside", either to order a machine to carry out some task or to hit the off switch. AI learning can take place on its own momentum darn near perpetually. But not only is there no reason for AI to eventually develop the sophistication of self-aware consciousness, there are some very good arguments that the capability of self-aware consciousness is inherently foreclosed to machine intelligence.

"Of course, we need a whole lot more than language. We need symbol grounding. Verification. Actual models (as you say)."

AI is never going to comprehend symbolism. The most important difference between human languages and the instructions to program computers is this: computer instructions are inherently denotative (that's why code strings of programming instructions have to be written with unerring precision). Human languages are inferential, referential, suggestive, subjective, connotative. Never the twain shall meet, at least not at the computer end. There's no there there. Working in the denotative realm known as "numbers", computers are great. They're superb at tasks like calculation, compiling, and ordering. They can't tell what anything means.

Hence the requirement for "modeling"; it provides a superficial fakery of inferential thought. If that's all you've got, it's more noisy, inaccurate, misleading trouble than it's worth. When applied with the intention of guiding AI to hack Symbolism (the realm of abstraction, heavily modulated and moderated by human culture(s), plural emphasized), modeling is never going to be based on anything but some great anonymous summed "past" partially built of media clickbait, contrasting cultural narratives, idioms, idiosyncrasies, popular delusions and the madness of (human) crowds. The machine remains unevaluating of the data it aggregates and selects, just as it's unseeing when selecting photos to craft deepfakes. As a result, it's easy for a computer to spout nonsense, or rote, worthlessly uninformative generalities and piffle. Or the cheapest of cheap-shot stereotypes, or absurdly chauvinistic strategic military evaluations. At any given moment, on any question of human behavior, the inertial accumulation function of computer learning always favors the House. The Status Quo. The Past As Prologue, 100%. The static pretension of Total Predictability. The Superstructure, hoaxes and all. GIGO.

Unless you ask a good question worth considering, that is. Perhaps using a geography framework, from that sturdiest of disciplines. Cultural geography and physical geography. How to use resources without polluting water. Which sources of energy are suitable for which locations, programming all the germane questions into the assessment. What sorts of shelter could be built readily and designed for optimal community living esthetics, while living lightly on the planet. I like the idea of AI learning to excel at Farmville. I don't want AI to ever call the shots, of course. What I would like to see is AI whipping up a design for a livable watershed region (energy, electricity, shelter, transportation, agriculture, industry) that prompts human observers from the headwaters to the coastline to be impressed by the result. AI as perfect host(ess) with the most(ess). AI that knows how to cater a party, so everyone has a good time. AI that's "thought" of everything, while selflessly not requiring a cut of the action. And then after that, it's up to the humans to know how to act.

"Social Engineering" has a bad rep because it's associated with psychology, politics, and social conditioning. The kind of Social Engineering we need is AI assessments of how to address the public commons, material infrastructure, water, soil, and development concerns on a planet of 8 billion people, to build places- neighborhoods, communities, cities- that people can live and thrive in, instead of just enduring.

This is no joke. Some cans can't be kicked down the road much farther. New York City needs to step up its game as far as preventing saltwater intrusion into the water works, for instance. AI should be able to help crunch those numbers, and if it's learning the right lessons, it should be able to do it in a way more comprehensive than humans can, with an ability to flag problems that are obscured by an excess of data, too much for even trained professional humans to thoroughly process and properly evaluate into rough-draft form. Good old egoless, unreflective AI can treat those tasks as if it were advising on running a terrarium. That's the level of detachment AI has. The detachment bears watching, but it's a lot less trouble than if AI had an agenda. I don't think AI has an agenda. It's inert. Not autonomous.

I'm hoping that somewhere, someone is using AI as a learning aid for cultural ecology planning, so we don't drown in our own shit. What dismays me is that most of the media buzz about it is self-absorbed media people insisting that AI be exploited as a political tool, or to explain human behaviors. AI can probably explain human beings in some important ways in terms of our animal-material impacts on the natural systems of the planet. But AI isn't for persuading someone to vote for someone for President. Yet that's the supposed "ability" that's getting all the attention.

"I am hopeful the chatbot paradigm has a lot to give if augmented properly."

As long as we're dreaming, here's my dream for AI: that it be programmed to competently evaluate fact claims and arguments in a debate (or a news report, or an editorial, etc.) based on a thorough acquaintance with informal (verbal) logic and logical fallacies.

I'm not sure if that's possible. But if it is doable, AI has an advantage that no human judge can offer: a default state of complete impartiality. So if AI can learn to read through debate propositions, claims, and inferences on both sides of an argument, it should be able to accurately point out all of the logical fallacies used by the debaters on BOTH sides. Interestingly, a position isn't necessarily discredited by indulgence in logical fallacies by its advocates; sometimes it means that the position deserves better arguments than the shoddy talking points of the advocates.

I notice two main problems when reading online disputes, especially political disputes: 1) both sides are sloppy as hell, because they don't recognize logical fallacies when the fallacies are staring them in the face, or when they're spouting them themselves; and 2) debaters of some skill and acquaintance with informal logic focus intensely on every weak point of their adversaries, while refusing to consider the weak points of their own position.

Ideally, properly trained AI could use its innate impartiality to advise both sides in a dispute exactly where and when they're indulging in self-deception and presenting misleading arguments. Not a Judge, so much as a debate coach.
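
If anyone wants to tinker with this today, here is a minimal sketch, assuming an OpenAI-style chat API; the model name, prompt wording, and sample passage are hypothetical placeholders, not a tested recipe:

# Minimal sketch: ask a chat model to flag informal fallacies in a passage.
# The model name, prompt wording, and sample passage are hypothetical placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

passage = "My opponent failed math in school, so his budget plan cannot be trusted."

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are an impartial debate coach. "
            "List any informal logical fallacies in the user's passage, each with a one-line explanation."},
        {"role": "user", "content": passage},
    ],
)
print(response.choices[0].message.content)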

That's my Dream for what AI could accomplish in online communication. If that's beyond its capabilities, that elevation of the game still has to be done. I don't know how much more nonsense I can stand to read. Here are some English-language presentations of the rules of logical fallacy detection: [ "logical fallacies" + list ]

https://search.yahoo.com/yhs/search?p=logical+fallacies+list&hspart=iba&hsimp=yhs-syn_launcham&param2=9dUI1n2R0BLDxNuWfiP4aSFOTltNdSPoIx38%2BUf%2FiXrvPdoGmStdlfwLFZYDvqkAfvapUGDUlfVlBewW80EIyUtis4%2BjOvTCFfhraeyFu2TnmlX6mrUgddSxV%2BTHvbyN1%2BjGfkiz4RwnIt%2BO%2FGk2zbakLrfRzuVAWA%2BSPatqxEska%2BAkue2MX%2F9BDiOattkkHfTCLFyV%2FDrpaZnmybstz8Djjz5lLSZSarPWsplmubU%3D&param3=HpCyCT2cXaKG4CVDR00rqgObRQahimQNt2d5ZCR7Jy3IZoD3T11qaq2nywASZYgKE9AoLtDK9wXsg9iWQUp8XOLam8Hq%2B0MCnFoApNXGvcpZLwFIlSc6RmsIqnWJBazI6jMD%2F7RihweG%2BNE4iWw1D0WCp00U3IyNw%2Ba%2F2P1aOoa5pp%2F4fIYPPV75CgfuJ87F2WaTHVcG8mDlAyfQAu9PUSmsjjjQorcSNhcVWo%2Btva2rqeOA%2FlO7tHW8t7agWGgujXTzvWpe0Udi%2BU1OMDrXJ4KqCwkJWa2vrJg91bWTm3o%3D&type=f2%3A%3B.6850610d4680680b2811f3dcdca6be379af%3B5.ac48522a20946644e52a8ef8e64166f19c0ca9cdf89835745bb551d3fa4fa48fc420970e4b5f6bcb11f118aacdec6241e5cbb471aee

Feb 13 · edited Feb 13 · Liked by Gary Marcus

Dear Gary, that's a lovely way to put it, 'statistics over understanding'!

The statistics are derivative in nature, dependent on word order, pixel order, syllable/tone order..., which have no *inherent* meaning [foreign languages are foreign when the symbols and utterances mean nothing to those who weren't taught their meaning; same with music, math, chemical formulae, nautical charts, circuit diagrams, floor plans...].

Symbols have no inherent meaning, they only have shared meaning. And we can impart such meaning to an AI that shares the world with us, like we do with humans and animals that share the world with us - physically. Everything else is just DATA, ie DOA.


I believe that meaning frequently depends on what we can do with items. Of course, the reference needs to be shared socially. However, to even assign meaning requires someone who cares about the thing. And why do we care? Because of the actionable quality of the thing the symbol refers to.

We often conceptualize the world in terms of our perceptions and cognitive representations; we should not forget that much of our survival depends on how we can change the world to match our goals.


100%.

'A responsible caregiver' is usually who teaches infants - moms, other fam members, nannies... then it's pre-school teachers and classmates, then it's society at-large... Without all of this, learning about the world is difficult if not impossible [animal behavior and instincts aside]. Even the sense of Self might be induced via others, possibly.

ALL of this requires a body.


"Responsible caregiver", I love that. It reminds me of a theme from "The Little Prince".

"You are not special yet. No one has tamed you, and you have tamed no one. My fox was like you. He was like a hundred thousand other foxes. But I have made him my friend, and now he is unique."

Everything in the world can be unique, and given a name, but doing so is difficult without a purpose. Very high level, but personally, I believe that's also behind cognitive impairment in major depression. If nothing has a purpose, nothing has a meaning. Naming, taming, selecting one thing over another becomes impossible because there is no reason to do so. Perception, attention, memory retrieval: all atrophy because they need to be applied because of something, not just to something.

Feb 14 · edited Feb 14

Nice! That quote from 'The Little Prince' (and the entire book in fact) is 'fire', lol.

You hit the nail on the head, about 'purpose', including how lack of it could stem from depression. The reverse might be true too - inability to lead a meaningful/purposeful life might lead to boredom, withdrawal, anger and a bunch of other feelings (eg in societies with high unemployment, lack of opportunities, corrupt governments that don't care about society's progress etc).

Scientific exploration is also driven by a 'need' to find meaning in nature, to understand it, benefit from its phenomena etc. The search for 'meaning' at a deeper level can occur even in the most dire circumstances, eg as documented in Viktor Frankl's amazing work [eg. described in https://www.pursuit-of-happiness.org/history-of-happiness/viktor-frankl/].

Now when we compare all these aspects about being 'human', with something like an LLM that does dot products, with its practitioners claiming parity... Lol.

Feb 14 · Liked by Gary Marcus

It's also very interesting to talk about chess with ChatGPT: ask "Can you play chess?" and ChatGPT says yes, let's go. It can explain all the rules, opening principles, movement of the pieces, and tactics such as pins and forks.

It even spits out some correct moves, using correct notation. But sooner or later it will make moves that are illegal (such as jumping over the opponent's pieces with a rook), despite ChatGPT being able to explain eloquently that rooks can't jump over pieces; it lacks the understanding that that's exactly what it just did.
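
One way to see this for yourself is to replay the chatbot's moves through a rules engine; a minimal sketch, assuming the python-chess library and a hypothetical move list pasted from the conversation:

# Minimal sketch: validate a chatbot's chess moves with the python-chess library.
# The move list is hypothetical; in practice, paste the moves the chatbot proposes,
# in standard algebraic notation.
import chess

proposed_moves = ["e4", "e5", "Nf3", "Nc6", "Ra3"]  # Ra3 is illegal here: the rook would jump over the a2 pawn

board = chess.Board()
for san in proposed_moves:
    try:
        board.push_san(san)  # raises ValueError if the move is not legal in this position
        print(f"{san}: legal")
    except ValueError:
        print(f"{san}: ILLEGAL in position {board.fen()}")
        break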

author

Precisely


This is kind of funny: AI can do some things that humans can't do, but it also makes mistakes that a human would never make. Smarter than many, then dumber than all.


I so appreciate these illustrated posts about AI because the non-technically-minded can grasp the main point, that these fancy algorithmic agglomerations are still failing in very fundamental ways. Controversies over when AGI might arrive and why open-source models are dangerous are important, but those more esoteric topics also seem like sci-fi to the general reader. This does not. Another article I'm happy to cross-post for my readers!


If we give developers $7T, THEN will the program be able to generate images of people writing with their left hand?


Add anti-southpaw bias to the list of Silicon Valley sins. My persecution complex is fired up and ready to go.

author

🤣


That first defensive e/acc screenshot though, apparently the AI was "teached" better than him... and this is who is designing our future? Good game, humanity.

Feb 13 · Liked by Gary Marcus

You are so clearly right on this front, I'm curious why you think people are trying to say otherwise. Is it because that's how they can (try to) raise $7T? Because they actually have no clue how to build AGI? Because they think LLMs can ultimately get the same result (from a practical standpoint) as AGI?

author

$

Feb 17 · Liked by Gary Marcus

LLMs are a bit like my kids. They are unpredictable, don't do what you tell them, and are very hard to get to behave correctly. And they occasionally break things.

Feb 16 · Liked by Gary Marcus

I can't wait to see the right-handed writer/guitarist and wrong-clock videos generated by OpenAI's Sora.


Two quotes popped into my head when I read this:

"When we're trying to sell it we call it AI and when we're trying to make it work we call it pattern recognition." -- my "elevator speech" to Honeywell management during the DARPA Grand Challenge.

"Your *other* left foot." -- My drill sergeant in USAF OTS.


The other left :D


I am almost at the point of giving up trying to have philosophical conversations with the engineering-minded (and the business-minded, and most "scientists"). Too demanding, and I am not getting paid... I get it, though: they want to build things and get things done, not question their assumptions (unless forced to, and even then it's tough).

However, there is no excuse for so-called "scientists" not being willing to be philosophical, since the best scientists in history were also philosophers (Einstein, Werner Heisenberg, etc.). Now they are more often a kind of technician, or engineers, bureaucrats, businessmen, politically savvy, working in their extreme specialties, not (at least professionally) questioning the most basic assumptions about reality, the nature of intelligence, self, understanding, consciousness, what exists, the goal of life, etc. "Who has time for that, except for the 'theory of the leisure class'," as one academic humorously put it.

And meanwhile academic philosophy has gotten lost (psychology too, with its assumptions hardened into dogma), stuck in a rut, trying to be a handmaiden to science, or merely doing conceptual analysis in hyper-specialties no one can understand, irrelevant to living life, losing its way from the original "love of wisdom" the Greeks knew...

The fact is, no one really knows what intelligence or understanding are. But you have to start somewhere. If doing engineering, you start from the bad assumptions you have and see what happens, which is what we are seeing now (they just need to be more honest about it); but for science you need to question the assumptions; and in philosophy it goes even deeper, to the source, where we are in the realm of the Unknown, not a comfortable place for many (including the players mentioned above). I see that "somewhere" as empirical, from direct ("inner") experience, to really start to get anywhere regarding intelligence, understanding, and awareness... and few dare to venture there.

But I've already said too much.


Well said. Clap, clap


The post uses universals ("they have never...") and other indications that these posts are opinions with an axe to grind, not a scientific exposition.

The examples show problems that can be exposed. It would be more interesting to see both the power and the limitations that GenAI, like all techniques and technologies, will have.

"always tried to use statistics as a proxy for deeper understanding"

What would we mean by deeper understanding... do I or you have deep understanding?

Yes, I do, but actually people only have "deep understanding" at the time they defend their thesis, and only about the topic they studied... It is hard to be up to date and comprehensive enough to be that much of an expert.

Maybe not for you, but aggregating human knowledge and communicating prompts in social terms has become a productive thing for hundreds of millions of people.


Maybe it is only strict filtering mechanisms built into the AI by the developers to reduce bizarre outputs, thereby forcing the AI to rely more on statistically prevalent images.


Gary, intuitively I agree with all your arguments, but have been playing my own devil’s advocate.

Without referring to current empirical evidence, what are, in your view, the most succinct fundamental reasons why LLMs will never reach "understanding", which I interpret as the ability to robustly reason and apply logic? I.e., that we hit the limits of the current paradigm, and bigger may get better but will never come anywhere close to flawless?

If LLMs can roughly be understood to be statistical memory machines that can adequately represent and reproduce the knowledge in their training data, would it be plausible that perfect data for a specific domain, containing all required knowledge and reasoning pathways for that domain (e.g. known medicine), would lead to robust reasoning? Just as training a simple regression on real-world data from Newtonian experiments will lead to a near-perfect ML model for that physics domain, even without any theoretical, conceptual understanding? So, in a sense, it gains some implied understanding?
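
As a toy version of that Newtonian thought experiment, here is a minimal sketch with made-up synthetic data (not anything from the post): an ordinary least-squares fit to noisy force/acceleration readings recovers F = m·a quite accurately, even though the model has no concept of "mass" or "force".

# Minimal sketch: a plain regression "learns" Newton's second law from data alone.
# The data is synthetic and hypothetical: a fixed 2.0 kg mass, noisy force readings.
import numpy as np

rng = np.random.default_rng(0)
mass = 2.0
acceleration = rng.uniform(0.0, 10.0, size=200)               # m/s^2
force = mass * acceleration + rng.normal(0.0, 0.1, size=200)  # noisy F = m*a measurements

# Least-squares fit of force = slope * acceleration + intercept
slope, intercept = np.polyfit(acceleration, force, deg=1)
print(f"estimated mass ~ {slope:.3f} kg (true value 2.0), intercept ~ {intercept:.3f}")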


If you just look at responses from ChatGPT it can be confounding, because it appears as if it "understands" in a human-like way. In software, one of the most important decisions software architects make is to design the architecture (or framework) to meet the current and future product requirements. The Transformer architecture is specially designed for LLMs. It is not designed to understand, but to predict a response to a query in text, images, speech, etc.

The English language supports around 1 million words. Every word is defined "by the company it keeps". So the word "work", in a massive weighted vector space, is only very remotely associated with the planet "Mars", but "Mars" is heavily weighted toward "planet". Then you have words that keep the same company, for example "apple" and "pear", since they are both fruits.
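
A minimal sketch of that "company it keeps" idea, using made-up three-dimensional toy vectors rather than real learned embeddings (which have hundreds or thousands of dimensions): cosine similarity puts "apple" near "pear" and far from "Mars".

# Minimal sketch: cosine similarity over toy word vectors.
# The vectors are hypothetical stand-ins for real learned embeddings.
import numpy as np

vectors = {
    "apple": np.array([0.90, 0.80, 0.10]),  # fruit-ish dimensions
    "pear":  np.array([0.85, 0.75, 0.15]),
    "mars":  np.array([0.05, 0.10, 0.95]),  # planet-ish dimension
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print("apple ~ pear:", round(cosine(vectors["apple"], vectors["pear"]), 3))
print("apple ~ Mars:", round(cosine(vectors["apple"], vectors["mars"]), 3))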

Same with images, which are naturally represented as vectors. With red, green, and blue intensities, this mimics the way colors are displayed on our computer screens.

If you vaguely understand the word "illusion", that's exactly what it is.

author

all prediction, no interrogable world model that can be reasoned over


Thanks. And what's your view on JEPA in that respect, the approach LeCun is taking?

https://ai.meta.com/blog/v-jepa-yann-lecun-ai-model-video-joint-embedding-predictive-architecture/
