I find two things interesting about the thinking of AI fanbois (for want of a better word).

1. They are "asymmetrically surprised". When an AI does something amazing and clever, they are rightly excited, but they downplay or ignore the same AI doing something stunningly stupid. Yet errors are surely at least as important as successes, especially if you want to figure out where all this is going.

2. They misunderstand understanding, either underestimating what a general intelligence actually is, or overestimating what can be achieved simply by using larger and larger training datasets. Do they think understanding is just a statistical artefact? Or do they suppose it's an emergent property of a sufficiently large model?

These things interrelate, because if you're not paying attention to the sheer insanity of AI's mistakes, you won't notice that it's not progressing towards general intelligence at all.

Where it's headed is perhaps more like a general *search* capability.

I appreciate the speed of your replies, but there are many confusions here. Symbols precede modern cognitive science by a century; the algorithm that performs Monte Carlo Tree Search uses symbols to track a state in a tree, and trees are pretty much the most canonical symbolic structure there is. (Standard neural networks don't take them as inputs, but a great many symbolic algorithms do.) It doesn't matter whether cognitive scientists appeal to MCTS or not; you are conflating cognitive science with a hypothesis and set of tools that are foundational to computer science. And again, it doesn't matter what AlphaFold 2 *cites*; what matters is that the representations it takes in are handcrafted symbolic representations. Poring through citation lists is not the right way to think about this.

Furthermore, I never said that "classic models of cognitive science" had any impact on those specific architectures (Alpha* and Google Search) at all; I am not sure where you are even getting that. Again I urge you to separate the engineering question from the cognitive modeling question. Here I was talking about the engineering questions; I said that these systems are hybrids of deep learning and symbols. (You are also wrong on Google Search; as far as I know, they now use LLMs as one cue among many.)

You are also playing games by switching between current foundation models (somewhat narrow) and neural networks in general (neurosymbolic is older than foundation models and open to a variety of neural approaches); and certainly Google has been using neural networks as a component in search since at least 2016. (And Google Search, the most economically successful piece of AI in history, has used symbols from the beginning; PageRank, for example, is a symbolic algorithm.)
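
For readers who want the tree-tracking claim above made concrete, here is a minimal, hypothetical sketch (not AlphaGo's actual code) of the bookkeeping at the heart of Monte Carlo Tree Search: an explicit tree of nodes, each holding a discrete game state, visit counts, and pointers to child nodes. The generic game object with legal_moves(state) and apply(state, move) methods is an assumption introduced only for illustration.

    import math

    class Node:
        """One node in the search tree: a discrete game state plus statistics."""
        def __init__(self, state, parent=None):
            self.state = state        # a discrete board position
            self.parent = parent
            self.children = {}        # move -> Node
            self.visits = 0
            self.value = 0.0

        def best_child(self, c=1.4):
            # UCB1 selection: trade off exploitation (value) against exploration (visits).
            return max(
                self.children.values(),
                key=lambda n: n.value / (n.visits + 1e-9)
                + c * math.sqrt(math.log(self.visits + 1) / (n.visits + 1e-9)),
            )

        def expand(self, game):
            # Add one child per legal move; each child stores the resulting state.
            for move in game.legal_moves(self.state):
                if move not in self.children:
                    self.children[move] = Node(game.apply(self.state, move), parent=self)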

The intelligent mind that is impressed with AI is, well, perhaps not so intelligent.

Two trends in particular coexist in the AI discourse, I feel, leading to a lot of talking past each other.

1. Looking at behaviours that AIs exhibit and announcing they do understand the world

2. Looking at the inputs to AIs today, and the methodology used, to say they do not understand the world

If you believe the former, or lean towards it, then arguments similar to the latter will feel like moving the goalposts. If you believe the latter, you might be called a naysayer for not accepting the "amazing things" AI can already do. In the AI-backed companies I fund, I find that the majority of the problem is in making the system deal with the complexities of the real world, and alas this is pretty hard. If an AI is sufficiently dissimilar to us, it becomes harder to decipher whether it actually understands the world or just has a very different mental model, often inadequate or wrong because it's not embodied.

The distinction, in effect, is an argument over whether or not a particular thing is a p-zombie, and over the relevance of its outward behaviour. This is counterproductive because it's unprovable until better tests are created than "can this LLM reply coherently to a question".

(Also, Van den Broeck's paper was really interesting, thanks!)

a lot of it can; we tend to muddle by with "System 1" a lot. but without System 2 we wouldn't be human, and AI still hasn't captured that part

Yes, precisely what I was getting at. We've seen instinct being programmed in, though not "System 2", which is what leads to what we'd refer to as flexible intelligence. I do wonder whether reinforcement learning approaches will be a bit more helpful on this particular point, though I think embodiment is essential to be more congruent with reality.

Coming to the subject late, two years or so ago (GPT-3), and as a humanist, I have learned more from the critiques than from the praise. I mean critiques rather than strictly criticism or teardown -- you have offered smart critiques (I follow you on Twitter) and Hoel (whom I also follow) offers both. You both show modesty in different ways -- Hoel by including more people (including humanists like me) in his orbit, and you by asking questions. I think you might see how his capaciousness is truly helpful to the endeavor, particularly as a kind of humanist himself. 4200 words is not unfair; it is attention! Ultimately this is good and brings more foot traffic to the AI store.

All for humanism, and I don't mind the length; it's the distortion that I object to. Derrida famously joked (reported in his NYT obituary) that he read only one book, but very carefully. Hoel read a lot of my work, but carelessly.

The same attacks happened to Dreyfus 50 years ago. I read Hoel's article first and immediately was put off by the 'attacking the person' instead of 'attacking the message', but I read on.

What intrigues me is the argument about 'driverless cars'. Driverless cars seem to be somewhere at the boundary of what can and cannot be solved by digital neural nets. "Now, cars can definitely drive themselves. I've ridden in them, and it's nigh miraculous," writes Hoel, and that reminds me of how progress on chess was reported in Dreyfus' time, in the 1960s. Chess finally succumbed to that wave of AI in the 1990s, and car driving may succumb to this wave (though I doubt it will in full). But even when chess succumbed, it did not lead to AGI, and the same — I estimate — is true for every digital attempt.

I think the people from DeepMind who created that protein-folding solution should be nominated for a Nobel Prize in Medicine. But that doesn't mean they (and their colleagues) are on the road to AGI.

Can more in-depth information on Cruise's and Waymo's approaches be read somewhere? What are the limits/boundaries they accept? What are their fail-safes?

(I've responded over at the original article as well, comparing it to what happened in the 1960s)

I don't yet understand how anyone in these conversations could be described as an "AI critic". If anyone would like to explain how that phrase is being used in this context, please do.

We're never going to agree about the big points in this dispute, obviously, but here are three brief defenses against your criticisms concerning me and/or the article itself:

(a) You open by supporting Melanie Mitchell saying that it’s not helpful to brand scholars as “AI critics.” And yet, you’ve referred to yourself as an “AI critic” or variations thereof. Here’s you on a podcast in 2019: “I’m widely known as a critic of AI.” https://gigaom.com/2019/09/19/voices-in-ai-episode-96-a-conversation-with-gary-marcus/. So it seems entirely appropriate to describe you as an "AI critic" in an article.

(b) In the part about goalposts, you point to a post you wrote about an art AI's failure to sketch a bicycle (this example was popularized the day before your post was published, in a viral tweet by Alexey Guzey: https://twitter.com/alexeyguzey/status/1571186653145743361). Humans are also notoriously terrible at drawing bicycles, btw. Regardless, in my paragraph I make it clear I was talking about the set of current models, various of which have moved past all these goalposts in various ways, not DALL-E alone, which is not a good substitute for, e.g., PaLM. This is again merely finding individual bad prompts, or using the wrong AIs to make certain claims, or shifting from "can't do" to "can't do reliably."

(c) The personal charge. I never called you a dilettante. That's your word. I also mentioned that you had a company that was acquired by Uber, which you left out of your summary of my brief biography, since it would undermine the point you're making that I minimized your accomplishments. However, at the same time, as I said, I went to the same School of Cognitive Science. I can see how studying child language acquisition and connectionism in the 90s might, in principle, be relevant for AI, but in practice it is quite distant from current deep learning, in ways people outside the field might not understand. So in this respect your approach is very much not the mainstream of the field. Maybe this new "neurosymbolic AI" that is being brought back by other authors will be impactful years from now (it doesn't seem to have been so far), but cognitive science has had almost nothing to say about the major successes of deep learning (in a manner that should probably cast some doubt on cognitive science itself, tbh). I myself wrote an AI poetry generator at Hampshire (I wonder if it was under the same professor?) and it had absolutely nothing to do with current deep learning approaches. And I would never ever claim that my writing that program *doesn't* make me a dilettante - it has no bearing on the matter. So I don't think it counts as "mangling" your biography.

Come now, it is certainly mangling my biography to omit all my dissertation work, the fact that I was a professor for 30 years, and the fact that at the company (which you did mention) I was CEO and co-founder, and that the company was an AI company. You didn't use the word dilettante, but you certainly painted me that way, and the whole point of that opening was ad hominem, not substantive. Bender and Mitchell sensitized me to how "AI critic" is being used to frame rhetoric; I've taken their notes and linked them.

Neurosymbolic AI is already impactful; AlphaGo and AlphaFold are neurosymbolic, and so is Google Search. Juergen Schmidhuber's company is organized around neurosymbolic AI, and even LeCun is acknowledging the importance of symbols nowadays (even if he, unlike me, thinks they are likely to be learned). CACM devoted a whole article to neurosymbolic AI literally yesterday. It is ridiculous, meanwhile, to say that the neural networks of the 1990s have no relevance; all the problems of distribution shift that I emphasized then are exactly what Bengio talks about at the beginning of every talk he has given for the last several years. That stuff is super relevant, and it is why I was able to instantly make a series of predictions in December 2012 that have stood the test of a decade. (Aside on the bicycles: I am well aware of the human difficulties, but people can still color the wheels and label the pedals; go through the examples I gave on Sunday and try them.)
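
To make the Google Search point concrete: the PageRank algorithm mentioned in an earlier comment is easy to sketch. What follows is a minimal, hypothetical illustration of the textbook algorithm as power iteration over an explicitly represented link graph; it is not Google's production code, and the page names are made up. The graph, the page identities, and the update rule are all hand-specified, discrete structures with no learned components.

    def pagerank(links, damping=0.85, iterations=50):
        # links: dict mapping each page id to the list of page ids it links to.
        pages = list(links)
        rank = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iterations):
            new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
            for page, outlinks in links.items():
                if not outlinks:
                    # Dangling page: spread its rank evenly over all pages.
                    for p in pages:
                        new_rank[p] += damping * rank[page] / len(pages)
                else:
                    for target in outlinks:
                        new_rank[target] += damping * rank[page] / len(outlinks)
            rank = new_rank
        return rank

    # Tiny made-up three-page web: page "a" ends up ranked highest.
    print(pagerank({"a": ["b"], "b": ["a", "c"], "c": ["a"]}))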

Okay, so I'll expect you not to use the term "AI critic" or "critic of AI" in the future, if you've now been "sensitized" to these terms.

As for the idea that "neurosymbolic AI is already impactful": you list three examples, one of which is irrelevant and the other two of which are, as far as I can tell, untrue. First, the irrelevant one, Google Search (which isn't exactly an accomplishment these days; it just links to big news websites first), has almost nothing to do with the current foundation models. For the other two, let's consider your claim that AlphaGo and AlphaFold are neurosymbolic.

First, AlphaGo: in your debate with Bengio, you claim that AlphaGo is a hybrid system based on symbolic logic. Bengio immediately asks "In what way is it a symbolic system?" and you respond "the Monte Carlo tree search," and Bengio immediately replies "It's a search but there are no symbols." Which is correct: the general idea of a Monte Carlo search has nothing to do with cognitive science. (Link here: https://www.youtube.com/watch?v=EeqwFjqFvJA, timestamp 56:00.)

Second, let's consider AlphaFold. The system is introduced here: https://www.nature.com/articles/s41586-021-03819-2. The word "symbol" does not appear, nor does "symbolic," let alone "neurosymbolic." Of the 84 citations, none seems to be from cognitive science, as far as I can tell.

So I think you’re radically overstating the impact of the classic models of cognitive science on these models - as far as I can tell, all the evidence shows it’s basically nonexistent.

Gary, the best way to prove your point is to come up with a working alternative that outperforms the current dispensation; otherwise they are not going to heed what you have to say...

alas that may be a 10-year project, 15 if nobody listens. there is so much complexity required to do these things well… see my "Next Decade in AI"

Godspeed, Gary. Was it Steve who said that the best way to predict the future is to build it?
