Professor, I don't think the oligarchs are limiting themselves to imposing their beliefs on science; they seem to be doing a good job of imposing their beliefs on all of us via a coup of the Federal government, as well.
Anyone else think they are trying to defund the NIH so that consumers have no alternative to TikTok snake oil? Anyone can sell a vitamin with no evidence (Dr Jen Gunter has a nice breakdown of how easy this is). Vaccines and pharmaceuticals are well regulated and expensive, and they have to work better than placebos. Venture capitalists want faster returns, so why not buy the President and eliminate the data and the competition?
They could rename it TicTac and sell those in place of drugs
Happy birthday. Great read
Deep Research is unfortunately the upgrade from mass-produced hallucinated blogs to mass-produced hallucinated research papers. The errors are just more difficult to perceive because they are obfuscated by eloquent, convincing, professional-sounding language.
Hallucinated research papers are what you get when you give computers LSD (Lysergic Acid Di-LMide)
Happy bday! Thanks for continuing to share your thoughts on all this.
FWIW, this recent podcast convo between Aza from the Center for Humane Tech and Yuval Harari was fascinating in its scope and wisdom of “build AI, but slow down.” I’m sure readers here will dig it. https://podcasts.apple.com/us/podcast/your-undivided-attention/id1460030305?i=1000672033990
"there is no sweeter gift than vindication"
This is why Gary's arguments get ridiculed so much.
The core assertion, that neural nets provide, at most, educated statistical prediction, is correct.
I fully agree with Gary that the current methods don't have what it takes, and they will need either very honest modeling on top, or even a whole new approach.
It is the self-congratulatory and polemical takes that really distract from the message.
Dude. On my birthday no less. Please unsubscribe.
Gary, sooner or later you'll have me banned, because people don't put up with constant criticism.
The message is still there. Sharpen the blade. It will cut better.
I am impressed that this paper was so early in the cycle. A great analysis!
Gary, I keep wondering what might precipitate the crash/correction/bubble bursting. DeepSeek can (comparatively, we don’t know the actual numbers) cheaply replicate what OpenAI builds, but no one can cheaply create reliable-ish models - that’s why all the mechanical Turks at ScaleAI are working away behind the curtain. If there’s no way to build new capabilities without that (outside of knowledge areas where the ground truth is known and well documented) does that actually break this? (I am not an AI expert, obviously; I’m a librarian who works in areas where data/truth is hard to find - otherwise they would not ask me).
You may not be an AI expert, but you obviously possess something that very few of the AI “experts” possess: concern for truth.
The field could use more librarians and fewer AI-brainians
Awww, thanks! I actually have a master of science in information and my program was a mix of people studying libraries and archives and UX. So I got to see where a librarian’s approach to search (source first) contrasted with an IT person’s (in 2004, that was Google Google Google, wait, why can’t I google this?). They could do amazing things that I couldn’t (and didn’t want) to do, but we had skills, too.
LLMs actually have something basic in common with the original Google search.
Both are based on frequency (of token occurrence for LLMs and of web page links for search) — as opposed to relevance, accuracy and truth.
And both are useful in some cases, useless in others, and best when you know the difference 🥲 A serenity prayer for knowledge workers….
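To make the frequency analogy above concrete, here is a toy sketch (neither real PageRank nor a real LLM; the link data and the corpus are invented): both halves simply count what is frequent and return that, with no notion of accuracy or truth.

```python
# Toy sketch of the frequency analogy: not real PageRank, not a real
# language model; link data and corpus are made up for illustration.
from collections import Counter

# Early-web-style ranking: score a page by how many pages link to it.
links = {
    "home":  ["popular-site", "niche-site"],
    "blog":  ["popular-site"],
    "forum": ["popular-site", "niche-site"],
}
inbound = Counter(target for targets in links.values() for target in targets)
print(inbound.most_common())   # "popular-site" ranks first, accurate or not

# Bigram-style "next token": emit whatever most often followed the last word.
corpus = "the cat sat on the mat and the cat slept".split()
bigrams = Counter(zip(corpus, corpus[1:]))

def next_word(prev):
    followers = {b: n for (a, b), n in bigrams.items() if a == prev}
    return max(followers, key=followers.get)

print(next_word("the"))        # "cat": the most frequent continuation wins
```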
Back in the early days of the web, I developed a web site for an artist friend.
But I had the damnedest time getting it to appear in a Google search (even though I sent the link to Google) because, as a new site, no one knew about it and hence no one was linking to it!
And since the Google algorithm back then ranked sites based on the number of links to a site, my friend’s site would not even appear.
Quite the quandary.
I learned then and there what a crazy algorithm it was — basically a popularity contest.
I finally did manage to get my friend’s site to appear near the top of a Google ranking, but only by submitting information about the site to an organization Google was working with, where people manually reviewed sites and decided whether they merited a high ranking. The whole thing took several months!
The problem is that frequency need not (and often does not) have any relationship with usefulness.
Computer science seems to be in a rut — on a scratched record — that keeps repeating the same damned thing over and over.
"If there’s no way to build new capabilities without that (outside of knowledge areas where the ground truth is known and well documented) does that actually break this?"
Yes, indeed. It's been reiterated (*at least*) since BERT2 and GPT-3 that OOD (out-of-distribution) sampling simply cannot work with transformer-based architectures, but it's actually worse than that: every time you do RLHF (reinforcement learning from human feedback), the model diminishes something else it learned before. So it *is* possible to train a model on, say, ARC-AGI, but *not* without reducing efficacy on other tasks. Now, I can see a world where this is actually a useful insight (for both edge models and super-specific applications), but it severely hinders AGI progress. Many people, Gary amongst them, have pointed out that this observation alone is enough to show that transformers are *not* the way to AGI.
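A toy sketch of that forgetting effect, for the curious: this is not RLHF or a transformer, just a tiny linear model fine-tuned sequentially on two invented tasks, but it shows the same pattern of new training overwriting old competence.

```python
# Toy illustration of "fine-tuning forgets": not RLHF, just a linear
# classifier trained on task A, then fine-tuned only on task B.
import numpy as np

rng = np.random.default_rng(0)

def make_task(true_w):
    """A made-up, linearly separable 2-D task: label = sign(x . true_w)."""
    X = rng.normal(size=(500, 2))
    y = (X @ true_w > 0).astype(float)
    return X, y

def train(X, y, w, lr=0.5, epochs=500):
    """Plain full-batch gradient descent on the logistic loss."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w = w - lr * (X.T @ (p - y)) / len(y)
    return w

def accuracy(X, y, w):
    return float((((X @ w) > 0).astype(float) == y).mean())

# The two tasks pull the weights in incompatible directions.
XA, yA = make_task(np.array([1.0, 1.0]))
XB, yB = make_task(np.array([1.0, -1.0]))

w = train(XA, yA, np.zeros(2))
print("after task A:  acc on A =", accuracy(XA, yA, w))

w = train(XB, yB, w)  # "fine-tune" on task B only
print("after task B:  acc on A =", accuracy(XA, yA, w),
      "  acc on B =", accuracy(XB, yB, w))
```

The first score is typically near 100%; after fine-tuning on task B, accuracy on task A usually collapses toward chance.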
"Every time you do RLHF (reinforcement learning [from] human feedback), the model will diminish something else it learned before. So it *is* possible to train a model on, say, ARC-AGI, but *not* without reducing efficacy on other tasks."
This assumes a monolithic model for which "being nice" overrides the rest.
In reality, we will likely see many specialized models. Some will do math, without any human feedback. Others will do code. They will be trained completely separately, even by different teams.
Out-of-distribution work is hard for people too. We sweat mightily doing unfamiliar stuff. An AI will have to diligently search and try things, and at that point human feedback is simply not relevant; the current situation guides the work.
So there are plenty of directions for AI to branch into, without abandoning the premise that a lot of data and neural nets are part of the solution.
But ARC-AGI was specifically created to test models on OOD..?
Vindication requires an inordinate amount of patience. Good work and Happy Birthday!
Deep Learning Hitting a Wall (DLHW) from 2022 is a great piece.
But going against unreasonable hype never gets you a warm welcome. What happens to you comes from a basic part of the human intellectual makeup: tribalism. Your (correct) critique is treated the way a tribe treats a wayward member whose position undermines the effectiveness of the 'believed' approach. If one of 20 hunters doesn't follow the standard approach, the hunt fails. Evolutionary pressure requires our convictions to be stable and conformist:
If the herd believes the hype, you're not heard when you are right.
There is more than a shallow likeness between Gary's position and that of Hubert Dreyfus in the 1960s-1980s, with his "Alchemy and Artificial Intelligence" RAND paper playing the role of DLHW. It might take decades, not years, before the value of DLHW is accepted. The likeness to what happened to Dreyfus (he was mercilessly attacked and ridiculed by the Hintons and LeCuns of that age) is of course slightly ironic.
So, if you are not among the herd, you are not among the heard?
People like Gary who are not among the AI herd are not only not among the AI heard, but are AI-stracized because they cause cognitive dissonance in members of the herd.
And Lord LeCun knows, there is nothing worse for herd morale.
Note that Dreyfus was ridiculed for saying symbolic approaches are not the way.
Now maybe the pendulum swung too much in the direction of "non-symbolic is all you need".
It is important to point out that reasoning models, code generation, knowledge graphs, and external tools are already attempts at processing at a higher conceptual level than just "weights in, weights out".
Fundamentally, there's nothing wrong with symbols, as long as the problem is cleaned up enough that they can be used. That's what Ben Goertzel seems to be doing: he uses an LLM to convert the problem into something his Bayesian reasoning engine understands, then does the rigorous processing (a rough sketch of that hand-off pattern follows below).
In many ways, Dreyfus' "fight" is very much alive.
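A minimal sketch of that general LLM-to-symbolic hand-off (not Goertzel's actual OpenCog stack; the extract_facts step stands in for the LLM and is hard-coded here, and the "rigorous" step is plain exact Bayes' rule):

```python
# Sketch of the LLM-to-symbolic hand-off: extract_facts stands in for the
# LLM step (hard-coded here); the "rigorous" step is exact Bayes' rule,
# not Goertzel's actual reasoning engine.

def extract_facts(problem_text: str) -> dict:
    """Hypothetical LLM step: map free text to a structured query."""
    return {"prior": 0.01, "sensitivity": 0.9, "false_positive_rate": 0.05}

def posterior(prior, sensitivity, false_positive_rate):
    """Rigorous step: exact Bayes' rule, no sampling, no hallucination."""
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence

facts = extract_facts(
    "The test is positive; the disease affects 1% of people; "
    "the test catches 90% of cases with a 5% false-positive rate."
)
print(round(posterior(**facts), 3))   # 0.154 -- the classic base-rate answer
```

The division of labor is the point: the fuzzy front end only translates, while the back end does the part that must not be wrong.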
You were completely right in your assessment then, and you still are now. These are not random arguments but well-documented facts.
It takes time and effort to fully grasp these ideas because they are not driven by profit but by intellectual honesty.
The conclusion remains just as valid today: we need to recognize when a paradigm has reached its limits and be willing to reexamine the original assumptions that led us here if we want to achieve real breakthroughs.
Unfortunately, this perspective will not become mainstream because it would mean a significant loss of credibility, an end to funding, and ultimately, the collapse of certain business models.
For the research community, however, the situation is different. We have a moral and ethical responsibility to acknowledge the facts and accept that this particular path of deep learning, at least in its current form, may have run its course.
Happy Birthday and keep up the good work. What you do has value that others do not always see, nor do they always appreciate your efforts. Fight the good fight. You are like one of the biblical prophets forced to wander in the desert for trying to speak truth to power: the power of the tech bro oligarchy. A kakistocracy with kleptocratic tendencies if ever there was one.
Gary is wandering in the AI desert looking for an O-AI-sis
But all he keeps seeing on the horizon are mirages
But he recognizes the mirages as mirages. (Which is why I like your note.)
The tech bros just see the mirages.
Yes, Gary does recognize them as mirages.
But I think many others do as well. They just won’t admit it
And the Son of God
With $40 billion in his hand
Yeet, a big brick one
I agree with your book's assertions and this article. I had argued from 11/2022 that LLMs require ontologies and knowledge graphs to add specified knowledge to their speculative knowledge, for fidelity in their processing and output.
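A minimal sketch of that grounding idea (the triples and claims below are invented examples; a real system would use a proper ontology or knowledge graph and an actual LLM): check the model's speculative output against curated knowledge before passing it on.

```python
# Sketch of grounding: accept an LLM's claim only if a curated knowledge
# graph supports it (triples and claims below are invented examples).

knowledge_graph = {
    ("aspirin", "treats", "headache"),
    ("aspirin", "is_a", "nsaid"),
}

def grounded(claim):
    """Verify a (subject, relation, object) triple against the graph."""
    return "verified" if claim in knowledge_graph else "flag for review"

llm_claims = [
    ("aspirin", "treats", "headache"),    # supported by the graph
    ("aspirin", "treats", "influenza"),   # speculative -> flagged
]
for claim in llm_claims:
    print(claim, "->", grounded(claim))
```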
It won't matter to the tech bros if their models don't work as advertised if they can seize the levers of power and funding.
https://america2.news/content/files/2025/02/Musk-NRx-Memo-Feb-5-2025.pdf
https://www.bettedangerous.com/p/please-share-executive-summary-of
Happy birthday :)
Happy birthday, Gary! Thank you for your brilliant and inspiring work.