The Road to AI We Can Trust


The New Science of Alt Intelligence

AI has lost its way. Let’s take a step back.

Gary Marcus
May 14

For many decades, part of the premise behind AI was that artificial intelligence should take inspiration from natural intelligence. John McCarthy, one of the co-founders of AI, wrote groundbreaking papers on why AI needed common sense; Marvin Minsky, another of the field’s co-founders, wrote a book scouring the human mind for inspiration and clues for how to build a better AI. Herb Simon won a Nobel Prize for his work in behavioral economics. One of his key books was called Models of Thought, which aimed to explain how “Newly developed computer languages express theories of mental processes, so that computers can then simulate the predicted human behavior.”

A large fraction of current AI researchers, or at least those now in power, don’t (so far as I can tell) give a damn about any of this. Instead, the current focus is on what I will call (with thanks to Naveen Rao for the term) Alt Intelligence.

Alt Intelligence isn’t about building machines that solve problems in ways that have to do with human intelligence. It’s about using massive amounts of data – often derived from human behavior – as a substitute for intelligence. Right now, the predominant strand of work within Alt Intelligence is the idea of scaling: the notion that the bigger the system, the closer we come to true intelligence, maybe even consciousness.

There is nothing new, per se, about studying Alt Intelligence, but the hubris associated with it is.

I’ve seen signs for a while, in the dismissiveness with which the current AI superstars, and indeed vast segments of the whole field of AI, treat human cognition, ignoring and even ridiculing scholars in such fields as linguistics, cognitive psychology, anthropology, and philosophy.

But this morning I woke to a new reification, a Twitter thread that expresses, out loud, the Alt Intelligence creed, from Nando de Freitas, a brilliant high-level executive at DeepMind, Alphabet’s rightly-venerated AI wing, in a declaration that AI is “all about scale now.” Indeed, in his mind (perhaps deliberately expressed with vigor to be provocative), the harder challenges in AI are already solved. “The Game is Over!”, he declares:

Nando de Freitas (@NandoDF), May 14, 2022:

“Someone’s opinion article. My opinion: It’s all about scale now! The Game is Over! It’s about making these models bigger, safer, compute efficient, faster at sampling, smarter memory, more modalities, INNOVATIVE DATA, on/offline, … 1/N”

[The “opinion article” he links to: “DeepMind’s new Gato AI makes me fear humans will never achieve AGI,” thenextweb.com.]

There’s nothing wrong, per se, with pursuing Alt Intelligence.

Alt Intelligence represents an intuition (or more properly, a family of intuitions) about how to build intelligent systems, and since nobody yet knows how to build any kind of system that matches the flexibility and resourcefulness of human intelligence, it’s certainly fair game for people to pursue multiple different hypotheses about how to get there. Nando de Freitas is about as in-your-face as possible about defending that hypothesis, which I will refer to as Scaling-Über-Alles.

Of course, that name, Scaling-Über-Alles, is not entirely fair. De Freitas knows full well (as I will discuss below) that you can’t just make the models bigger and hope for success. People have been doing a lot of scaling lately, and have achieved some great successes, but have also run into some roadblocks. Before going further, let’s take a dose of reality and look at how de Freitas faces the facts.

A Dose of Reality

Systems like DALL-E 2, GPT-3, Flamingo, and Gato appear to be mind-blowing, but nobody who has looked at them carefully would confuse them for human intelligence.

DALL-E 2, for example, can often create fantastic artwork from verbal descriptions such as “an astronaut riding a horse in a photorealistic style”:

But it’s also pretty prone to surprising stumbles, like this uncomprehending reply to “a red cube on top of a blue cube”. [DALL-E’s responses, created by a network called unCLIP, are on the left of the figure; interestingly, a predecessor model, shown on the right, actually does better on this particular item.]

When Ernest Davis, Scott Aaronson, and I probed in a bit more detail, we found a number of similar examples:

Similarly, the ostensibly amazing Flamingo has (not unrelated) bugs of its own, captured in a candid and important thread by DeepMind’s Murray Shanahan, with further examples from Flamingo’s lead author Jean-Baptiste Alayrac. For example, Shanahan confronted Flamingo with this image:

and wound up having this discomprehending dialogue:

Murray Shanahan (@mpshanahan), May 10, 2022:

“And this was the conversation we had about it:
Flamingo: This is a traffic light. It is red.
User: Can you see three circles?
Flamingo: Yes, I can see three circles.
User: What colours are they?
Flamingo: They are red, blue and green.
User: Where is the green circle?
Flamingo: The green circle is on the right.
User: Really? On the right of what?
Flamingo: The green circle is on the right of the blue circle.”

Um, no.

DeepMind’s newest star, the just-unveiled Gato, is capable of cross-modal feats never seen before in AI, but still, when you look at the fine print, it remains stuck in the same land of unreliability, moments of brilliance coupled with absolute discomprehension:

Of course, it’s not uncommon for defenders of deep learning to make the reasonable point that humans make errors, too.

But anyone who is candid will recognize that these kinds of errors reveal that something is, for now, deeply amiss. If either of my children routinely made errors like these, I would, no exaggeration, drop everything else I am doing, and bring them to the neurologist, immediately.

So let’s be honest: scaling hasn’t worked yet. But it might, or so de Freitas’s theory—a clear expression of the Zeitgeist—goes.

Scaling-Über-Alles

So how does de Freitas reconcile reality with ambition? Literally billions of dollars have been poured into Transformers, the underlying technology that drives GPT-3, Gato, and so many others; training data sets have expanded from megabytes to gigabytes, parameter counts from millions to trillions. And yet the discomprehending errors, well-documented in numerous works since 1988, remain.

To some (such as myself), the abiding problem of discomprehension might—despite the immense progress—signal the need for a fundamental rethink, such as the one that Davis and I offered in our book Rebooting AI.

But not to de Freitas (nor to many others in the field—I don’t mean to single him out; I just think he has given prominent and explicit voice to what many are thinking).

In the thread’s opening tweet, he elaborates much of his view about reconciling reality with current troubles: “It’s about making these models bigger, safer, compute efficient, faster at sampling, smarter memory, more modalities, INNOVATIVE DATA, on/offline”. Critically, not a word (except, perhaps, “smarter memory”) is given to any idea from cognitive psychology, linguistics, or philosophy.

The rest of de Freitas’s thread follows suit:

Nando de Freitas (@NandoDF), May 14, 2022:

“Solving these scaling challenges is what will deliver AGI. Research focused on these problems, eg S4 for greater memory, is needed. Philosophy about symbols isn’t. Symbols are tools in the world and big nets have no issue creating them and manipulating them”

This is, again, a pure statement of Scaling-Über-Alles, and marks its target: the ambition here is not just better AI, but AGI.

AGI, artificial general intelligence, is the community’s shorthand for AI that is at least as good, and resourceful, and wide-ranging, as human intelligence. The signature successes of narrow artificial intelligence, and indeed of Alt Intelligence writ large, have been games like chess (Deep Blue owed nothing to human intelligence) and Go (AlphaGo similarly owed little to human intelligence). De Freitas has far more ambitious goals in mind, and, to his credit, he’s upfront about them.

The means to his end? Again, de Freitas’s emphasis is mainly on technical tools for accommodating bigger data sets. The idea that other ideas, e.g., from philosophy or cognitive science, might be important is dismissed. (“Philosophy about symbols isn’t [needed]” is perhaps a rebuttal to my long-standing campaign to integrate symbol-manipulation into cognitive science and AI, recently resumed in Nautilus Magazine, though the argument is not fully spelled out. Responding briefly: his statement that “[neural] nets have no issue creating [symbols] and manipulating them” misses both history and reality. The history missed is the fact that many neural net enthusiasts have argued against symbols for decades; the reality missed is the above-documented fact that symbolic descriptions like “red cube on blue cube” still elude the state of the art in 2022.)
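
To make concrete what is at stake in that last sentence, here is a minimal illustration, my own and purely hypothetical rather than anyone’s actual system, of the kind of explicit compositional structure that a classical symbolic approach assigns to “a red cube on top of a blue cube” (written in Python for concreteness):

```python
# Toy symbolic scene representation: objects with properties, plus an explicit,
# ordered relation between them. Purely illustrative; not any real system.

from dataclasses import dataclass

@dataclass(frozen=True)
class Obj:
    shape: str
    colour: str

@dataclass(frozen=True)
class On:
    above: Obj  # the object resting on top
    below: Obj  # the object underneath

# "a red cube on top of a blue cube"
scene = On(above=Obj("cube", "red"), below=Obj("cube", "blue"))

def matches(scene: On, above_colour: str, below_colour: str) -> bool:
    """Does the scene satisfy '<above_colour> cube on top of <below_colour> cube'?"""
    return (scene.above == Obj("cube", above_colour)
            and scene.below == Obj("cube", below_colour))

print(matches(scene, "red", "blue"))  # True
print(matches(scene, "blue", "red"))  # False: the relation is ordered
```

The toy is trivial by design; the point is that On(red cube, blue cube) is explicitly distinct from On(blue cube, red cube), which is exactly the distinction that the scaled-up systems surveyed above still routinely blur.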

De Freitas’s Twitter manifesto closes with an approving reference to Rich Sutton’s famous white paper, The Bitter Lesson:

Nando de Freitas (@NandoDF), May 14, 2022:

“Rich Sutton is right too, but the AI lesson ain’t bitter but rather sweet. I learned it from @geoffreyhinton a decade ago. Geoff predicted what was predictable with uncanny clarity.”

Sutton’s argument is that the only thing that has led to advances in AI is more data, computed over more effectively. Sutton is, in my view, only half right: almost correct in his account of the past, but dubious in his inductive prediction about the future.

So far, in most domains (not all, to be sure), Big Data has triumphed (for now) over careful knowledge engineering. But,

  • Virtually all of the world’s software, from web browsers to spreadsheets to word processors, still depends on knowledge engineering, and Sutton sells that short. To take one example, Sumit Gulwani’s brilliant Flash Fill feature is a one-trial learning system that is staggeringly useful, and not at all founded on the premise of large data, but rather on classical programming techniques (a toy sketch of this style of one-example generalisation follows this list). I don’t think any pure deep learning/big data system can match it.

  • Virtually none of the critical problems for AI that cognitive scientists such as Steve Pinker, Judea Pearl, the late Jerry Fodor, and I have been pointing to for decades has actually been solved yet. Yes, machines can play games really well, and deep learning has made massive contributions to domains like speech recognition. But no current AI is remotely close to being able to read an arbitrary text with enough comprehension to build a model of what a speaker is saying and intends to accomplish, nor is any able (à la the Star Trek computer) to reason about an arbitrary problem and produce a coherent response. We are in the early days of AI.
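
For readers who haven’t seen programming-by-example up close, here is a deliberately tiny sketch of the flavour of technique behind tools like Flash Fill. To be clear, this is my own illustration in Python, not Gulwani’s actual synthesis algorithm, and the candidate operations are made up for the example: the point is only that a single worked example, plus search over a small hand-engineered space of symbolic programs, can yield a generalisation with no large training corpus anywhere in sight.

```python
# Toy programming-by-example sketch in the spirit of Flash Fill.
# NOT Gulwani's actual algorithm; just an illustration of generalising
# from one example by searching a tiny space of symbolic string programs.

from typing import Callable, List, Optional

Program = Callable[[str], str]

CANDIDATES: List[Program] = [
    lambda s: s.upper(),
    lambda s: s.lower(),
    lambda s: s.split()[0],                              # first word
    lambda s: s.split()[-1],                             # last word
    lambda s: "".join(w[0].upper() for w in s.split()),  # initials
    lambda s: s.split()[-1] + ", " + s.split()[0],       # "Last, First"
]

def synthesize(example_in: str, example_out: str) -> Optional[Program]:
    """Return the first candidate program consistent with the single example."""
    for prog in CANDIDATES:
        try:
            if prog(example_in) == example_out:
                return prog
        except IndexError:  # e.g. word-based programs applied to an empty string
            continue
    return None

# One worked example is enough to pick out the intended transformation...
prog = synthesize("Ada Lovelace", "Lovelace, Ada")
assert prog is not None
# ...and it then generalises to unseen inputs, with no training corpus at all.
print(prog("Alan Turing"))  # -> "Turing, Alan"
```

The interesting engineering in real systems lies, of course, in the design of the program space and the search; the sketch is only meant to show what one-trial learning founded on classical programming techniques looks like in miniature.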

Success on a few problems with a particular strategy in no way guarantees that we can solve all problems in a similar way. It is sheer inductive folly to imagine otherwise, particularly when the failure modes (unreliability, bizarre errors, failures in compositionality and discomprehension) have not changed since Fodor and Pinker pointed them out (in separate articles) in 1988.

In closing

Let me close by saying that I am heartened to see that, thankfully, Scaling-Über-Alles isn’t yet a full consensus, even at DeepMind:

Murray Shanahan (@mpshanahan), May 14, 2022, quote-tweeting de Freitas’s opening tweet:

“My opinion: Maybe scaling is enough. Maybe. And we definitely need to do all the things @NandoDF lists. But I see very little in Gato to suggest scaling alone will get us to human-level generalisation. It falls so far short. Thankfully we're working in multiple directions”

I am fully with Murray Shanahan when he writes “I see very little in Gato to suggest that scaling alone will get us to human-level generalisation.”

Let us all encourage a field that is open-minded enough to work in multiple directions, without prematurely dismissing ideas that happen to be not yet fully developed. It may just be that the best path to artificial (general) intelligence isn’t through Alt Intelligence, after all.

As I have written, I am fine with thinking of Gato as an “Alt Intelligence”— an interesting exploration in alternative ways to build intelligence—but we need to take it in context: it doesn’t work like the brain, it doesn’t learn like a child, it doesn’t understand language, it doesn’t align with human values, and it can’t be trusted with mission-critical tasks.

It may well be better than anything else we currently have, but the fact that it still doesn’t really work, even after all the immense investments that have been made in it, should give us pause.

And really, it should lead us back to where the founders of AI started. AI should certainly not be a slavish replica of human intelligence (which, after all, is flawed in its own ways, saddled with lousy memory and cognitive bias). But it should look to human (and animal) cognition for clues. No, the Wright Brothers didn’t mimic birds, but they learned something from avian flight control. Knowing what to borrow and what not to is likely to be more than half the battle.

The bottom line, something that AI once cherished but has now forgotten, is this: if we are to build AGI, we are going to need to learn something from humans, namely how they reason and understand the physical world, and how they represent and acquire language and complex concepts.

It is sheer hubris to believe otherwise.

P.S. Please subscribe if you’d like to read more no-bullshit analysis of the current state of AI.

21 Comments


Scott Wagner
May 14

As usual, thank you...A specific bone to pick, a general point about AGI talk, and a positive suggestion.

You often start out with a discussion about the gross errors of ML, to segue into how much more is needed. The problem with this method of attack is that in key creative and selection-related tasks (security, reproduction, mutation), lots and lots of gross errors is not only ok, it's sometimes the only way that type of task can be approached, often by definition. So to my ear, you're tilting at windmills and protesting too much all at once by highlighting this purported weakness of many gross errors, when it's only a weakness from a certain angle and in a certain context. It's fantastic to synthesize a useful protein after 10,000 gross and less gross errors. The argument is an inappropriate "opposite" to your values of consistency, kindness, common sense (big subject), understanding (bigger subject), legally circumscribable limits, transparency, contingency/flexibility- in fact, it should be added to that desired list as a contingent potential tool for some of those desired skill sets in creativity and selection-oriented contexts.

My larger point: you're making the mistake you often accuse others of by limiting dimensionality of intelligence to a narrow aspect of it, in this case accuracy. Others focus like the blind men with the elephant on consistency, context-sensitivity, brilliance, sensitivity, safety, creativity, knowledge, embodiment, or algorithmic coherence.

Thanks to the advantages of machinery, accuracy is such a tiny part of the gig it isn't funny (not so for its cousins, replication and reproduction- but that's a separate tale.) The main value of symbols in the fight for useful AGI is simply the mostly-human work of instantiating collective human intelligence about what we want to mandate that machines do and not do, and how. The question of ML solving AGI is obviated when one includes in AGI the utterly foundational need for a wildly diverse set of complex agreements, political enactments that must arise via humans to achieve the ubiquitous power, interdependency, safety, and stability required of such tools. Many are about protocols, but the most important are quasi-moral, agreed-upon frameworks, culturally-derived artifacts of quite subject-specific notions that amount to an extremely ambitious regulatory challenge. All of which simply must be instantiated within and surround whatever calculating engines we use with transparent, standardized, symbolic code. Regulation around basic context-sensitivity, kindness, legal versions of honesty, consistency, safety, (nested) transparency, and other desired characteristics.

(to be clear: I have no patience for arguments about singularities or personhood for machines, in a world where we have many virtual singularities occurring constantly, all of which involve humans doing fraud and manipulation. All my thinking about AGI assumes machines as tools for humans, with no future that imbues them with personhood except narrowly/precisely as natural proxies as appropriate, like when controlling a dialysis machine. Ironically, we are already experiencing many of the effects we want to avoid from AI independent of human values. In fact, assuming that and realizing it's a difficult criterion already is an important part of rising to the ethical and algorithmic task. This touches on a related, ignored point, that all the problems we scare ourselves with in AI are already here to some degree because of the already-inculcated technical agents and AI in our lives; we shouldn't feel so stunned about what to do, because we are already ignoring many of the problems and obvious available solution sets already.)

My positive suggestion follows from the above: focus in our arguments less on the limitations of ML and more on the breadth and non-technical heart of AGI, in order to make clear how much symbolic work is needed and not getting done. AHI and by extension AGI are writ large, god damn it. It doesn't magically start and end at compilation. In a way, this is the standard mistake people make when they assume a coding project is about coding, when coding is typically about 30% of it. I'd put technology and tech people at that same 30% of the AGI gig. Your training and orientation outside of tech is what has you standing here with your fingers in the dyke. There is no coherent argument whatsoever against mandatory symbolic manipulation as priors, embedded feedbacks, and posts in any actual, complex AGI-risk-level machine that touches the incredible interdependency of modern life. We are simply unable to allocate any of those complex tasks to black boxes on the fly, no matter what they're fed as input- ML can do lots of the really hard stuff in the middle of the job. The most important part of empirical AGI must be transparent symbolic instantiation of rules; coding and protocol-ing up the messy, negotiated, endlessly-complex, culture-embedded versions of standards we need (which I believe to be surprisingly tied in to design considerations.) This vision of AGI amounts to including integrally within it the budgetary and political and algorithmic progress needed, because otherwise we are allowing our market-centric cultural strengths in ML to ride roughshod, with dangerously siloed aspects of intelligence more and more entrenched as ad hoc, de facto standards across the legal, moral, and functional landscapes.

Saty Chary
May 14 (edited)

"The bottom line is this; something that AI once cherished but has now forgotten: If we are to build AGI, we are going to need learn something from humans, and how they reason and understand the physical world and represent and acquire language and complex concepts."

Via a BODY.

The Original Sin of AI has been to replace 'DPIC' - Direct, Physical, Interactive, Continuous - *experience*, with digital computation that involves human-created rules ('symbolic AI'), human-gathered (and labeled, for supervised ML) data ('connectionist AI'), and human-created goals ('reinforcement learning AI'). While these three major branches of AI have achieved deep+narrow wins, none come anywhere close to what a butterfly, baby or kitten knows.

Biological intelligence occurs on account of directly dealing with the environment, via embodiment - which is why there is no inherent need for rules, data, goals - intelligence just 'happens' [for sure, via use of bodily and 'brainily' structures that are the result of millions of years of evolution].

The body is not simply, input/output for brain computations. Treating it as such, imo, is why AI has failed (in leading to robust, natural-like intelligence).
