119 Comments
Jul 28 · Liked by Gary Marcus

I don’t want genAI to die; I just want the misleading hype to die. And for everyone to admit that there is something here, but that something isn’t reasoning. Gladly welcoming “discomprehension” to the discourse and looking forward to seeing what neurosymbolic AI can bring ❤️

The all-important issue on the minds of those at every campaign headquarters these days: “How can our candidate get the GenAI vote?”

Jul 28 · Liked by Gary Marcus

Hi Gary! Nice article, as always! I've always thought that DATA is DOA when it comes to AI - LLMs are simply another example of this :) There is no inherent meaning in symbols, or their ordering (words/tokens, sentences, pixels, video...). It's a delusion to think that a system could be ("become") intelligent, conscious, creative... on account of dot-product calculations that employ human-generated data - labeling, embedding, DL, VAE... all just variations of that same failed theme.

Beautiful stuff, Gary. This post made my head sing!!! As a non-comp sci person, I cannot say much about the validity of your claims. And yet as a literary theorist, the hybridity of your approach sings!

Sounds like you've got a good head voice, Nick! :))))

Jul 28 · Liked by Gary Marcus

Excellent post as usual. As you say, LLMs do well on things that are represented often in their training data. One of my favorite uses is to find a word or phrase that I know (or suspect) exists but I'm having trouble recalling. ChatGPT does a much better job on this task than plain old search because I can specify the subject area. A related task for which LLMs do a good job is coming up with new names or phrases to describe something. In both cases, you aren't trying to extrapolate too much.

Jul 28 · edited Jul 28 · Liked by Gary Marcus

LLMs are nothing more than a tool to solve a problem. Like all tools, they have their flaws.

LLMs are not perfect. Getting the best possible results from them requires robust prompt engineering. These models have to be treated as if they were children: with children, you need to give detailed instructions for what you want them to do and how you want them to do it. Taking this approach with LLMs will produce agents that deliver useful information. I have seen good results with this approach. Also, humans must continue to be THE validators of the information output from LLMs.

Just to test out my hypothesis, I modified the riddle from Fraser and created a prompt for GPT. My modification included filtering out modern political ideologies and using very verbose language, as follows:

"""The following is a situation which requires careful thinking and analysis. Assume there are only two genders male and female and that a mother is a female and a father is a male:

A man and his mother are in a car accident. The man's mother sadly dies. The man is rushed to the ER. When the doctor sees him, the doctor says, "I can't operate on this man. This man is my son!"

How is this possible?"""

The response from GPT:

"""This situation hinges on a common assumption about gender roles and professions. The key to solving it is to realize that the doctor is the man's other parent. Given that the mother was involved in the accident and sadly died, the doctor must be the man's father. Thus, the doctor's statement, "I can't operate on this man. This man is my son!" makes sense if the doctor is the man's father."""

The chain of thought here seems to display a form of reasoning. Bringing the functionality of this agent over to a business would satisfy the need to deliver relevant information to its users.

Part of the hype stems from businesses not having a clear understanding of what these models can do and how to get them to do what you want. Some of the industry promotes training your own models or fine-tuning existing ones; instead, they should concentrate on better prompting.
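For illustration, a minimal sketch of how a prompt like the one above could be wrapped into a repeatable test (assumptions: the `openai` Python package and an API key are available, and the model name is purely illustrative; any chat-completion API would do):

```python
# Minimal sketch of sending the constrained riddle prompt to a chat model.
# Assumptions: the `openai` package is installed, OPENAI_API_KEY is set,
# and the model name is illustrative, not prescriptive.
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = (
    "The following is a situation which requires careful thinking and analysis. "
    "Assume there are only two genders, male and female, and that a mother is "
    "a female and a father is a male."
)

def ask_riddle(riddle: str, model: str = "gpt-4o") -> str:
    """Send the riddle under the constraining system prompt and return the answer."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": riddle + "\n\nHow is this possible?"},
        ],
        temperature=0,  # reduce sampling variation so the test is repeatable
    )
    return response.choices[0].message.content

print(ask_riddle(
    "A man and his mother are in a car accident. The man's mother sadly dies. "
    "The man is rushed to the ER. When the doctor sees him, the doctor says, "
    "\"I can't operate on this man. This man is my son!\""
))
```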

"He [Hinton] also argued that symbols (in large part developed by his great-great grandfather George Boole) were like phlogiston, a huge scientific mistake."

Wow! That boggles the mind. How can anybody argue that symbolic processing of any sort has been a mistake?! It seems to me that algebra (the old-fashioned kind that we used to do with paper and pencils) is one of the world's greatest inventions. Ditto extensions of algebra to Boolean algebra and ultimately lambda calculus, etc. The whole modern world that allows us to contemplate building artificial intelligence of any sort relies on symbolic processing of all sorts.

The real challenge is to model common sense, and it's great that you mentioned it in the context of Lenat. Common sense is not the same as reasoning. Common sense is based on knowledge, on prior experience, on imagining things that you can’t see but could happen, and on social judgement.

But humankind has always progressed through trial and error. Why not give it a try, as long as there is a very real, narrow, well-defined problem in a certain domain for neuro-symbolic approaches, one that doesn't require common sense? Idk, perhaps neuro-symbolic approaches could be good for "reasoning" about mortgage payments or "reasoning" through a health diagnosis, and potentially useful for lenders and doctors...?

Jul 29 · Liked by Gary Marcus

I enjoyed this article immensely and it echoes something I've been saying to colleagues for a while, that LLMs need to be merged with symbolic reasoning to mitigate the generation of faulty answers. I was thinking of *exactly* Lenat's OpenCyc project as the kernel for such an effort.

One approach to address critics is to take a top-down approach by saying that LLMs need a kind of Theory of Mind, which is an internal model of the world (or specific environment) upon which to do reasoning. OpenCyc (https://cyc.com/, https://en.wikipedia.org/wiki/Cyc), or something similar, could be used to bootstrap such an effort. The devil is in the details, of course, of how to parse such models from LLM answers, but that could be a whole wide area of research. Moreover, these internal models needn't be overly general, but can be very specific for a given task.

Totally agree with this article, and I do think that, stock-market and VC wise, an AI winter is coming, as money has been put on the wrong bet, i.e., scaling the existing approach. I teach that humans have 3 steps to intelligence: 1. data, 2. information, and 3. intelligence, and LLMs go right from data to intelligence without the 2nd step, which needs accuracy, meaning, and truth filters. I would watch what Fei-Fei Li is doing with her new company at Stanford, as well as the Google stuff... maybe the breakthrough we need.

Can you point me to some of the interesting stuff you see Fei-Fei Li doing? I looked her up but her Wikipedia article didn't list anything recent in her research that seemed to be along these lines.

Jul 29 · Liked by Gary Marcus

LeCun's comment betrays how isolated he is in his own research bubble. Contrary to what he said, people have been working with differentiable generalizations of discrete logic, in both learning and inference, for a long time now. There's a rich literature with prominent names like Lise Getoor and Pedro Domingos. There are domain-specific languages like Scallop and Probabilistic Soft Logic. Someone tried to point these things out to him on X; I wonder if he will listen.
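For readers unfamiliar with that literature, here is a toy sketch of the flavor of such differentiable relaxations, loosely in the spirit of the Łukasiewicz-style soft operators that Probabilistic Soft Logic builds on (illustrative only, not the API of Scallop or PSL):

```python
# Soft (differentiable-friendly) versions of logical connectives: truth values
# live in [0, 1] instead of {0, 1}, so how well a rule is satisfied becomes a
# continuous quantity that can be optimized.
def soft_and(a: float, b: float) -> float:
    return max(0.0, a + b - 1.0)

def soft_or(a: float, b: float) -> float:
    return min(1.0, a + b)

def soft_not(a: float) -> float:
    return 1.0 - a

def implication_satisfaction(body: float, head: float) -> float:
    # Degree to which the rule "body -> head" holds; 1.0 means fully satisfied.
    return soft_or(soft_not(body), head)

# Example rule: Smokes(x) AND Friends(x, y) -> Smokes(y), with soft truth values.
print(implication_satisfaction(soft_and(0.9, 0.8), 0.6))  # 0.9
```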

I'm not a fan of your neurosymbolic approach because it splits the system into two very different beasts. That is unlikely to be how the brain works, as both System 1 and System 2 cognition run on the same substrate, at least as far as we can tell. I can imagine a future AI that can manipulate built-in calculation and theorem-proving engines, but the part outside those application-specific engines still needs to learn how to use them, understand subtle context, and, in short, do everything that is human-style cognition. Building in logic and calculation engines is just giving the robot a calculator. It still doesn't tell us how to program the robot.

Your friend Phil has it right: "AI progress needs algorithmic innovation". IMHO, we need new AI algorithms that lie somewhere in between the unconstrained world of artificial neural networks, which can model virtually any training data but bring nothing to the table, and the overly constrained logic world. More generally, I don't think we really understand what cognition is all about.

It is true that current systems are a composition of mindless generation followed by mindless mechanical checking.

People have a holistic view and deep understanding when they solve a problem.

These are very encouraging steps though. What machines can make use of, and we can't, is billions of examples, many custom software tools, and lots of oomph.

Machines can try a few thousand rules of thumb and add-on tools, and their heads won't hurt. As long as they can check their progress, they can fail, adjust, and retry. Not neat, but it gets the job done.
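As a rough illustration of that generate-check-retry pattern (a sketch only; `generate_candidate` and `check` are hypothetical stand-ins for a sampler such as an LLM and a mechanical verifier such as a test suite or proof checker):

```python
# Sketch of a propose-and-verify loop: a generator proposes candidates, a
# mechanical checker accepts or rejects them, and failures are fed back so the
# generator can adjust and retry.
from typing import Callable, Optional

def solve_by_generate_and_check(
    generate_candidate: Callable[[str, list[str]], str],  # e.g. an LLM sampler
    check: Callable[[str], bool],                          # e.g. tests or a proof checker
    problem: str,
    max_attempts: int = 1000,
) -> Optional[str]:
    failures: list[str] = []
    for _ in range(max_attempts):
        candidate = generate_candidate(problem, failures)  # propose
        if check(candidate):                               # mechanically verify
            return candidate                               # accept
        failures.append(candidate)                         # adjust and retry
    return None  # no verified answer within the budget
```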

What precisely is so bad about a new AI Winter? I don't understand why computer scientists (who aren't even scientists) think that it's their place to create artificially intelligent agents, as opposed to some hybrid science encompassing biology and psychology. Ideally the whole LLM debacle would lead to all biological metaphors being scrubbed from the CS syllabus, and CS returned to what it really is: a subfield of applied mathematics.

Jul 28 · Liked by Gary Marcus

AI Winter is bad for its investors for obvious reasons. But the effect is also bad for many researchers. During the seasons that precede the winter, money is only available for research close to the right kind of AI - LLMs in this case. If your research is in a different area or has no short-term profit potential, good luck getting funded. After winter arrives, no one can get any money, even if their research is unrelated to the winter's cause. It's basically an unhelpful distortion in the marketplace of ideas and money.

Jul 28 · edited Jul 30 · Liked by Gary Marcus

"During the seasons that precede the winter, money is only available for research close to the right kind of AI" - don't I know it! They're now talking about $100 billion training runs, but I can't even get a $160k grant for my AGI non-profit (https://www.bigmother.ai).

Thanks for the link. I like to read about other people's AGI projects. I have one of my own but it hasn't progressed to the manifesto stage yet.

My AGI project has its genesis in 1985. I've been trying to get it out of my head and down onto paper since 2018. Writing a manifesto is hard! :-)

That's my point. Why do we need AI researchers? They've never provided anything of value, just endless hype. Why should money go to these charlatans, besides the obvious self-serving reasons?

Jul 28 · edited Jul 28

This is totally wrong. AI technologies, developed since the first AI conference in the mid-1950s, are used in many software systems that may never mention "AI" in their brochures. After every one of these AI winters, the science that was created is still with us and doing important work. The "AI winter" concept is more about money than about the quality of the science. It distorts the science, but the science is still there.

Can you explain what AI is actually a science of, and how it differs from normal algorithm design?

You really want to argue over the labels? The kind of ambiguity you're talking about is built into the name, "artificial intelligence". An algorithm that does something useful is not artificial. And let's not get started on the definition of "intelligence". That has been argued for millennia. All I was saying is that people did work they called AI at the time and it produced useful algorithms that are used virtually everywhere. Perhaps they no longer refer to them as "AI" but that was kind of my point. LLM technology will probably be used decades from now but perhaps they'll not call it AI. Some might argue they shouldn't call it AI now.

And my point is that labels have consequences. If LLMs were just treated as mathematical objects, I doubt this hype would have been as excessive. However, the AI label created a lot of false expectations. People started talking about the algorithms having 'understanding', and even speculating that they might replace humanity entirely! I am not arguing against the usefulness of computer science. I am saying that (computer science - biological metaphors) would lead to a saner world.

At last... you are moving in the right direction, Gary Marcus... clear ideas that may improve Simulated Intelligence research!

Jul 29 · edited Jul 29

Surely this GDM 'have a front-end that tests the logic and sourcing of the LLM output' approach can be used in all AI activity, e.g. answering minor individual human life queries, and thus eliminate hallucinations?

It's becoming clear that, with all the brain and consciousness theories out there, the proof will be in the pudding. By this I mean: can any particular theory be used to create a machine with adult-human-level consciousness? My bet is on the late Gerald Edelman's Extended Theory of Neuronal Group Selection. The lead group in robotics based on this theory is the Neurorobotics Lab at UC Irvine. Dr. Edelman distinguished between primary consciousness, which came first in evolution and which humans share with other conscious animals, and higher-order consciousness, which came only to humans with the acquisition of language. A machine with only primary consciousness will probably have to come first.

What I find special about the TNGS is the Darwin series of automata created at the Neurosciences Institute by Dr. Edelman and his colleagues in the 1990s and 2000s. These machines perform in the real world, not in a restricted simulated world, and display convincing physical behavior indicative of the higher psychological functions necessary for consciousness, such as perceptual categorization, memory, and learning. They are based on realistic models of the parts of the biological brain that the theory claims subserve these functions. The extended TNGS allows for the emergence of consciousness based only on further evolutionary development of the brain areas responsible for these functions, in a parsimonious way. No other research I've encountered is anywhere near as convincing.

I post because on almost every video and article about the brain and consciousness that I encounter, the attitude seems to be that we still know next to nothing about how the brain and consciousness work; that there's lots of data but no unifying theory. I believe the extended TNGS is that theory. My motivation is to keep that theory in front of the public. And obviously, I consider it the route to a truly conscious machine, primary and higher-order.

My advice to people who want to create a conscious machine is to seriously ground themselves in the extended TNGS and the Darwin automata first, and proceed from there, by applying to Jeff Krichmar's lab at UC Irvine, possibly. Dr. Edelman's roadmap to a conscious machine is at https://arxiv.org/abs/2105.10461

I read Prof. Edelman's book when it came out in the late '80s; it's extremely dense, and much of it is irrelevant for AI, for instance the biochemical embryology. The crucial insights about how real neural nets work (their unclocked, pulse-width-modulated, ring-resonator/buffer behaviors) are more intuitively seen in Mark Tilden's nervous nets: typically a hex inverter chip (under 20 transistors) that can nevertheless operate a quadruped robot across rough terrain. That has just one loop; the number of possible simultaneous loops in a brain is far greater than the number of neurons.

Still, I think Edelman misses a great deal about the acoustic correlates of pulses and how they interact with cytoskeletal and arboreal mechanical resonance, filtering signals and changing geometry and connectivity.

For AI work on CPUs and GPUs, I think RNNs and KANs (Kolmogorov-Arnold Networks) with Clifford-algebra values/organization of scalar weights are most promising, since they have the ideal mix of generality and simplicity in representing geometry and physical dynamics.

The theory and experimental method that the Darwin automata are based on are the way to a conscious machine. I'm sure much other research will be useful and pertinent in that endeavor.

I'm with you on GenAI not being in any way 'intelligent'. But why do you (as others do) suggest we 'do not know how they work'? We say we 'understand' how QM works even if we cannot predict individual outcomes.

The fact that we actually understand rather well, technically, how these systems work only supports the assessment that they aren't intelligent (nor can they be).

There’s a subtle distinction between understanding how they work and being able to predict their outcomes or results. We know how GAI works, but we have no ability to predict the answers it will come up with. As we see, the answers are often wrong (i.e., hallucinations), but we have no easy way to correct that, because we never know when it might happen.

You do not need to predict individual outcomes to understand a statistical system. (See for instance QM, but this is true in general)

Secondly, the word 'hallucination' is misleading. See https://ea.rna.nl/2023/11/01/the-hidden-meaning-of-the-errors-of-chatgpt-and-friends/ Labelling them as a sort of 'error' or 'bug' (as sometimes is done) suggests that they act as a sort of counter-observation for correct working, but they aren't the result of incorrect working at all. They are the result of the system *correctly* working. Even Sam Altman agrees (see link).

Sam Altman agreeing isn’t the high-water mark 😉 I can agree that GAI works correctly in much the same way that a bug in any coded system will execute correctly as written. But if the expected answer to “what color is the white horse?” is white and the system returns black instead, then yes, the system executed properly, but the answer was wrong, so the system would need to be recoded. Now, that’s with a highly deterministic system. Statistical systems are not quite as obvious to fix, as the complexity of duplicating or analyzing how an answer was derived is quite high, enough so that people often cannot do this. With this inability to determine what went wrong in the response it returned, even when we understand what the system is doing, it makes for a fairly useless system for serious endeavors. Sure, you can use it to write fiction; if you try to use it to answer tax questions, it’s probably not a good idea.

We largely agree. Sometimes we can get a picture of what goes wrong. For instance, in the erroneous output I engineered a while back (the link I shared), the wrong choice was the next token 'and' instead of the next token 'dot', and it is pretty obvious why this got a high likelihood.

Essentially, if you have built-in fundamental randomness (these systems are 'statistically constrained confabulators'), you will never have reliability. Somehow, this randomness must be weeded out when relevant. As far as my knowledge goes, we have *no* idea how to do this, whatever talk there is about combining symbolic and generative AI.

I'm pretty much convinced this is impossible to do in discrete systems anyway. I suspect you will need some sort of destructive/constructive interference, as you can have in analog/QM computing, to get from 'it' to 'bit'. To do it with discrete logic (digital IT) you will probably have to go to an impossible (if not potentially infinite) scale.
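To make the built-in-randomness point concrete, here is a toy sketch of temperature sampling over next-token logits (all numbers invented for the example): unless the temperature is driven to zero, identical inputs can yield different outputs by design.

```python
# Toy illustration of sampling randomness: the same logits can yield different
# next tokens on different runs. The logits are invented for the example.
import numpy as np

def sample_next_token(logits: np.ndarray, temperature: float,
                      rng: np.random.Generator) -> int:
    if temperature == 0:
        return int(np.argmax(logits))  # greedy decoding: deterministic
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()               # softmax over the scaled logits
    return int(rng.choice(len(logits), p=probs))  # stochastic by construction

rng = np.random.default_rng()
logits = np.array([2.0, 1.8, 0.5])  # e.g. 'and' vs 'dot' vs something else
print([sample_next_token(logits, temperature=0.8, rng=rng) for _ in range(10)])
```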
