
I'm all for updating priors, but in March 2023 Bill Gates predicted that we'd have AI tutors that would be better than human teachers within 18 months -- by October of this year. That claim and related versions of it continue to echo in the education space; a keynote speaker at one of the major ed-tech conferences claimed just days ago that AI will soon be "a billion times smarter" than us. This isn't just frustrating; it's doing real harm to kids.


Back in the day there was a regular radio program in SF by two guys who explained in detail how Gates stole his system from a guy in Texas. These were guys who would get excited by the fact that synchronous switching would work at such high speeds: they really knew their stuff from the ground up, and they thought Gates was an idiot. When I saw him in front of Congress trying to defend himself, I had to agree.

author

Gates is not an idiot (I have spoken with him; he's very smart). But I do think OpenAI played him by training on AP Biology exams, and that it took Gates a while to figure it out.


Bill Gates is a very intelligent person.


Re the statement by the keynote ed-tech speaker that AI will soon be a billion times smarter than us:

If by "us" he means himself and other ed technicians, that was probably an accurate statement some time ago.


Flight 209 now arriving at Gates 8…9…10…14…20

Jun 30Liked by Gary Marcus

Full video: "Bill Gates Reveals Superhuman AI Prediction" (https://www.youtube.com/watch?v=jrTYdOEaiy0) @ 30:00: "It understands exactly what you're saying" -- no, it does not! This is an illusion. The currently available evidence indicates that LLMs have at best an extremely shallow internal world model, which means an extremely shallow "understanding" of the meaning of any prompt, and correspondingly weak mechanisms for reasoning about that extremely shallow understanding. Accordingly, they are able to retrieve, and convincingly re-present, vast amounts of memorised information, but the amount of actual cognition going on is minimal. It's ELIZA^infinity.

Jun 30·edited Jun 30

Not extremely shallow understanding, no understanding (and thus, no cognition). Understanding is not a matter of degree, except within the human species, the only context within which it has any meaning. Behaving as if there is understanding is all that can be said of machines.

Jun 30·edited Jun 30

Don't agree with this, sorry. There are two senses in which an entity can "understand" something. The first is to understand the physical universe. In order to achieve such an understanding, the entity in question must construct an internal model that faithfully captures the structure of the physical universe (i.e. its many features, and all the complex and nuanced relationships between these features). Given such an internal model (which is necessarily an ever-evolving guess, even in humans), the entity in question is then able to reason about the physical universe by reasoning about its internal model (e.g. deductively, to derive necessary conclusions, or abductively, to derive possible explanations).

The second sense of understanding is to understand a natural language sentence. This also requires an internal model of the universe, which provides the "meaning" onto which the sentence in question may be mapped.

There is some evidence (determined by mechanistic interpretability researchers) that LLMs are able to construct at least extremely simple internal models (e.g. of the game of Othello), but as yet no convincing evidence that this extends to anything near fully fledged models of the larger physical universe that faithfully capture its structure. That doesn't necessarily mean, however, that such internal models are beyond the reach of non-human machines.

Jun 30·edited Jun 30

No need to be sorry. I agree, internal models of the world are not a priori beyond the reach of machines. But understanding, when attributed to machines, is meaningless unless it is the same as our understanding, which is defined by what it is like (for us) to understand (what we experience). The two senses of understanding to which you refer offload everything onto the internal world model, since what you are calling understanding depends on having such a model. But now you have to determine the nature of the model: how it represents the world and how it is processed, since presumably not just any old model will do. The world model humans possess represents the world and processes it in a way that results in understanding. Unless you can show something equivalent in machines, all you can say at this point is that they behave as if they understand. The latter at least seems to be what you're implying by placing scare quotes around understand and meaning.


"They behave as if they understand." Reminds me of the so-called Turing Test.

Jun 30Liked by Gary Marcus

Ontology is so back!

author

No symbols, no ontology.

No ontology, no metacognition.


Where can I read up on what you mean by symbols in this context?


No metacognition, no trillion$


But does anyone know how to code up "ontology"?


I figure that some of the recent studies indicating that the brain thinks with symbols and merely uses language to communicate will prove insightful. We're building "thinking machines" that actually don't think.


Brains think in gradients of signal reinforced selectively based on chemical combinations.

But symbols may be the best way to understand this.


I don’t think, therefore AI’m


"I'm pink, therefore I'm spam" - an old joke from the old AI days :-)


Or, stated with a different “temperature” setting:

“I don’t think, therefore I Sam”


I believe the current GenAI craze is valuable. However, its value does not lie in the technical architecture of GenAI (Transformer, etc.) evolving into some form of AGI but rather in the unprecedented way it has captured the attention of the general public and "big (venture) capitals". Under such circumstances, AI investments will create a huge bubble (as seen, for example, in NVIDIA's stock price on the secondary market). Amid this enormous bubble, some capital will inevitably spill over to support some marginal, initially overlooked technical models (possibly not even on our current radar), and these models may eventually evolve into the final AGI and ASI. It's somewhat akin to the internet bubble around 2000, which, after bursting, left behind genuinely commercially valuable internet companies like Google and Meta, which have continued to grow to this day. I believe AI will follow a similar path. Just like animal evolution, small mammals cautiously foraged under the feet of dinosaurs, but after the decline of dinosaurs, mammals eventually dominated the Earth.

As for which technical architecture will ultimately prevail, I think the current frontier of AI hasn't even reached the intelligence of cats and dogs. This point may be contentious, but at least one thing I believe can be speculated and confirmed: current AI lacks an "inner mental world", whereas cats and dogs have one. Further, cats and dogs do not possess the reflective self-awareness that humans have, which is unique to humans (and possibly not even Neanderthals or Denisovans possessed such reflective self-awareness). I think Fei-Fei Li's approach to visual AI is a reasonable choice.

Some of Julian Jaynes's views in his book "The Origin of Consciousness" may be too radical, but I believe one point is correct: the source of this reflective self-awareness in humans is our language. Furthermore, human-level intelligence ("intelligent consciousness") is a "recent and last software-level upgrade" that occurred after the emergence of abstract words like "I" and "me" and "but, therefore, because" in language, rather than a physical change in the human brain itself. From this perspective, I believe that to achieve AGI, what we need first is a philosophy, not a technical solution (either LLM or traditional AI models).

Jun 30Liked by Gary Marcus

Something about getting rid of matrix math in LLMs came out. Ars did an article on it; the paper is on arXiv and is not peer reviewed. If it turns out to be a viable alternative for the whole loci map resolver, or whatever the nerds renamed the memory castle, it could be one of those cranks. Might even be useful for those symbol memes. 😀

https://arxiv.org/abs/2310.11453


Whoops, wrong link! That was for BitNet; I meant to give this one for the matmul thing: https://arxiv.org/abs/2406.02528
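
For anyone curious what "getting rid of matrix math" amounts to, here is a rough sketch of the core trick as I understand it: if a layer's weights are restricted to {-1, 0, +1}, the usual dot product collapses into additions and subtractions. This is only a toy numpy illustration of that idea, not the paper's actual kernel, and the function and variable names are made up.

```python
import numpy as np

def ternary_linear(x, w_ternary):
    """y[j] = (sum of x[i] where w[i, j] == +1) - (sum of x[i] where w[i, j] == -1)."""
    out = np.zeros(w_ternary.shape[1])
    for j in range(w_ternary.shape[1]):
        plus = x[w_ternary[:, j] == 1].sum()
        minus = x[w_ternary[:, j] == -1].sum()
        out[j] = plus - minus  # additions and subtractions only, no multiplies
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=8)
w = rng.integers(-1, 2, size=(8, 4))  # ternary weights in {-1, 0, +1}

print(ternary_linear(x, w))  # "matmul-free" result
print(x @ w)                 # same numbers via an ordinary matrix multiply
```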


Metacognition is largely a matter of logic systems extending from facts (RDF triples) to an implementation of 1st-, 2nd- or 3rd-order logic on top of a knowledge graph, where knowledge is in a "stands under" relationship, i.e. physics stands under science, tree-of-knowledge kind of thing. I asked an LLM how this would work using the following prompt:

A knowledge graph of concepts is a category logic with sets of sets as an interpretation of a knowledge tree. For example, Science is a set of sets that includes Physics. Physics is a set of sets that includes Nuclear Physics. Suppose there is a 2nd-order logic engine and rules regarding relationships between concepts. Would that be a foundation for metacognition?

LLM Answer:

"Yes, you could consider a knowledge graph of concepts, modeled as a category logic with sets of sets representing a knowledge tree, to be a foundation for meta-cognition. In this context, meta-cognition refers to the ability to think about and analyze one's own thinking processes.

By incorporating a second-order logic engine and defining rules regarding the relationships between concepts, you enable the representation and manipulation of complex relationships and higher-order reasoning within the knowledge graph. This, in turn, supports the development of meta-cognitive capabilities, such as introspection, self-evaluation, and adaptive learning.

The knowledge graph, enhanced with logical rules, serves as a structured representation of concepts and their interconnections. This allows for the representation of concepts like Science, Physics, and Nuclear Physics, as well as the hierarchical relationships between them. In this structure, the second-order logic engine can be used to infer new relationships, detect inconsistencies, and adapt the knowledge graph based on new information or rules.

Meta-cognition emerges from the ability to analyze and modify the knowledge graph's structure, allowing the system to adapt its understanding of the relationships between concepts and, consequently, improve its reasoning capabilities. This adaptive nature is essential for advanced artificial intelligence systems that can learn from their experiences and refine their problem-solving strategies.

In summary, a knowledge graph of concepts, combined with a second-order logic engine and a set of relationship rules, could indeed serve as a foundation for meta-cognition, enabling AI systems to analyze their own reasoning processes and adapt their understanding of various concepts and their interrelationships."

Now I asked this:

Can an LLM accomplish the same thing just by increasing the number of tokens, with $200 billion to pay for compute and data?

LLM Answer:

"No, increasing tokens and compute resources for LLMs alone is unlikely to achieve meta-cognition or structured reasoning comparable to knowledge graphs and second-order logic engines."

This should convince the LLM industry, or at least investors, that there is no need to waste $200 billion, since their LLMs are working about as well now as they can.
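
For what it's worth, here is a minimal sketch of the "stands under" idea in plain Python: facts as triples, one transitivity rule chained over them, and a crude meta-level consistency check. The predicate name and the toy rule engine are made up for illustration; a real system would use a proper ontology language and logic engine.

```python
# Facts as triples, plus one rule (transitivity) applied on top of them.
# This is a toy forward-chaining step, not a real second-order logic engine.

facts = {
    ("Physics", "stands_under", "Science"),
    ("NuclearPhysics", "stands_under", "Physics"),
    ("Biology", "stands_under", "Science"),
}

def apply_transitivity(triples):
    """If A stands_under B and B stands_under C, infer A stands_under C."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, _, b) in list(inferred):
            for (b2, _, c) in list(inferred):
                if b == b2 and (a, "stands_under", c) not in inferred:
                    inferred.add((a, "stands_under", c))
                    changed = True
    return inferred

closure = apply_transitivity(facts)
print(("NuclearPhysics", "stands_under", "Science") in closure)  # True

# A very rough stand-in for a meta-level check: is the graph's own structure
# consistent, e.g. free of "stands under" cycles?
def has_cycle(triples):
    return any((c, "stands_under", a) in triples
               for (a, _, c) in triples if a != c)

print(has_cycle(closure))  # False for this toy graph
```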


That's hilarious :) I like what it said about metacognition, though. It may not be the most cogent analysis of thinking on the matter, but it's not bad for a system that has no ability to reflect at all!


"there is no need to waste $200 billion since their LLMs are working about as well now as it can"

Money spent on chatbots won't be wasted. An outrageously large amount of work and data is needed to even roughly map the world and the human problem-solving strategies.

"Metacognition" is the icing on the top, and we won't get there till we solve the simpler problems first.

Nobody has any ideas how to attack the hardest problem anyway, so we might as well "lay the groundwork".


I completely agree. But the icing is the difference between safe, aligned information and chaos. Quality data is the goal, not just well-written falsehoods.


Von Neumann computing architecture (including GPU) cannot scale indefinitely for AI because there will be a power ceiling beyond which society is unwilling or unable to provide the necessary electrical power. Some kind of neuromorphic computing architecture will be needed to operate efficiently at scale (or at the edge in very power constrained environments).


Whatever you do, don't mention von Neumann's The Computer and the Brain; it might put a lot of people out of business.


Ontologies are great as long as one recognises their weakness outside of structured data. (How many of us have filing systems where the categories don't partition, or a folder called 'general'? 😉)

The issue becomes one of deciding whether a cluster of data fits into a category.

Social or institutional facts are fuzzy.

Some context is always assumed.

I'm not an AI expert - has much progress been made on the following?

Binding problem;

Schema problem;

Frame problem;

Classification problem.
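
To make the classification worry concrete, here is a toy sketch of why deciding whether a cluster of data fits a category is graded rather than clean. The categories, keywords, and threshold are invented purely for illustration.

```python
# Deciding whether a cluster of tokens "fits" a category is a graded
# judgement, not a clean partition; anything below the threshold ends up
# in the catch-all 'general' folder.

categories = {
    "invoices":  {"amount", "due", "payment", "vat", "total"},
    "contracts": {"party", "term", "clause", "liability", "signature"},
}

def best_fit(tokens, categories, threshold=0.4):
    tokens = set(tokens)
    scores = {name: len(tokens & kw) / len(kw) for name, kw in categories.items()}
    name, score = max(scores.items(), key=lambda kv: kv[1])
    return (name if score >= threshold else "general"), scores

print(best_fit(["payment", "due", "total", "meeting"], categories))
# ('invoices', ...)  -- a clear enough fit
print(best_fit(["meeting", "agenda", "minutes"], categories))
# ('general', ...)   -- nothing fits; falls into the catch-all
```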


As a retired software ontologist (I used to call myself a “combat ontologist” because software development in Silicon Valley has similarities to combat, but that’s a discussion for another beer), I’m glad to see you talking about the limitations of ontologies. We either recognize the fuzziness of any ontology or we end up constantly revising our ontological boundaries.

My personal take on the nature of "AGI" (I'd rather call it "human-like intelligence") is that it was catalyzed by the invention of language and complex society, and still depends on those things to work properly. In a very real sense there is no such thing as an individual human being. Until work on artificial cognition takes that into account, I'm afraid it will be unsuccessful in duplicating what evolution has done for us. But it's also true we don't need to duplicate human-like intelligence to automate society sufficiently to produce a decent living for everyone, which is the goal I personally see for AI, though I know there are many working in the field who have no interest in that goal.


Totally agree


The symbols do not need to be human-understandable ones for intelligence, I think. But they do have to be for that intelligence to be intelligible to humans. You know: "If a lion could talk, we would not be able to understand it".


"Neurosymbolic AI has long been an underdog; in the end I expect it to come from behind and be essential." "Neuro-Symbolic Artificial Intelligenceis built on two main pillars: neural networks, which learn patterns from data" What to do when it is a new problem, with no or little data? It is a dead end. It is a disgrace that people think calling something a "neural network" makes it equivalent to a real neural network, with its feedback, feed forward, self-triggering, self-inhibition, ability to make new connections - its ability to make and extend models. The blather about what AGI will do - first, accept the very strong limit on what your Conscious Mind can do, then wonder how the machine is goig to bypass that, while you understand what it is doing - then wonder how you can build a machine beyond your understanding. Making a machine learn English is a way - at least you will understand bits of what it is doing.


Of course they were originally called associative or connectionist models until Grossberg decided to call them neural nets, and the nonsense began.


Thanks. I always wondered who was at fault. Still, it wouldn't have been successful without a huge well of gullibility - for that we need to blame the computer science professors - treating it as a cult unconnected to science or engineering. A 10-minute exposition on what real neurons are capable of - controlling a heartbeat, saccades - would have been sufficient; it would have introduced a little bit of wonder.


Neurosymbolic approaches do not do so well when the problems are very vague, where LLM-based approaches using imitation will likely do better, as a vast number of heuristics can be compiled.

Any eventual architecture will likely be hybrid, and we are getting there. Use lots of data where there are no models. Use models, and logic, where applicable.

Much work will go towards pairing problems with the specialized agent that is skilled enough to solve them by whichever method makes most sense.
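
As a rough illustration of that hybrid routing, here is a minimal sketch: dispatch to a symbolic solver when a formal model of the domain exists, otherwise fall back to a heuristic stub standing in for an LLM. All the names here are hypothetical; it shows the shape of the idea, not a real system.

```python
# Route each problem to a symbolic solver if one is registered for its
# domain; otherwise fall back to a heuristic (a stub standing in for an
# LLM / imitation-based answer).

def solve_arithmetic(problem):
    # Symbolic path: exact evaluation of a known, well-modelled domain.
    a, op, b = problem["a"], problem["op"], problem["b"]
    return {"+": a + b, "*": a * b}[op]

def llm_guess(problem):
    # Heuristic path: placeholder for an LLM call.
    return f"best guess for: {problem['text']}"

SYMBOLIC_SOLVERS = {"arithmetic": solve_arithmetic}

def route(problem):
    solver = SYMBOLIC_SOLVERS.get(problem["domain"])
    return solver(problem) if solver else llm_guess(problem)

print(route({"domain": "arithmetic", "a": 6, "op": "*", "b": 7}))   # 42
print(route({"domain": "open_ended", "text": "summarize this memo"}))
```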

Jun 30Liked by Gary Marcus

Sorry, but you are assuming magical properties for LLMs. They have no idea what the words mean, and neither does a person's Conscious Mind. Words describe objects and relations, and they are clumped by verbs and prepositions - all courtesy of the Unconscious Mind. John put the money from the bank on the table in his office. (change of external state) Fred saw the money on the table. (no change of external state) Herman put the money on the table into his bank account. (change of external state).

A plane is defined as "an imaginary flat surface of unlimited extent" - is the "unlimited extent" imaginary or not? There is a huge amount of detail that the Unconscious Mind is handling, with the Conscious You being oblivious. The output of an LLM looks good because it looks like a human wrote it, but if it has cobbled together a few pieces, it is likely to be wrong, because it doesn't know what anything means.


There is no magic, either in LLM or neurosymbolic computing. What is needed is lots of relevant data, which will have to include text, images, and video.

Even data alone is not enough. The machines will get a good guess that way, but then likely fail in many ways. Those lessons will have to be incorporated back.

We are talking about a multi-decade effort to catalog and model the world. All methods, including formal ones and heuristic ones, will get used.


More magical thinking, I am afraid. This is a situation where we need the machine before we can do any of this. Remember Covid, and how economists and epidemiologists had the ear of our leaders - and the winner was - bleach! The two scientific specialities had no common vocabulary. Similarly with Climate Change: someone comes out with a highly localised model - which runs much faster than they anticipated - and the effects are mostly cumulative, but understanding all the effects - oceanographic, meteorological, reflective - is way beyond any single person; we need a machine that is already capable of containing the answer before we start. It would be pointless to spend decades and be continually behind the curve. I am assuming a machine capable of handling natural language would be able to change itself fast enough to keep up. Without a machine capable of tying everything together, all you will get is a dirty mess when you "model the world".

Jul 1·edited Jul 1

"Without a machine capable of tying everything together, all you will get is a dirty mess when you "model the world"."

The world is not one single phenomenon. And we don't need a single master machine to make sense of it all.

Humanity has a very lengthy history of incremental automation and ever better tools. It is like laying one level of bricks when making a building, then building the next layer upon that.

Machines will be gradually doing more and more. It will take time.


"It is laying one level of bricks when making a building" - bricks is a very bad analogy - it assumes there is an unchngeable foundation. There isn't. Cities with tens of millions of inhabitants are sinking and will be flooded - there is the possibility of a humn infectious cancer or a virus that can move quickly from birds to cows to people - the "ring of fire" might live up to its name - California is overdue - heat will drive millions of refugees from their homes. No bricks, instead changeability at every level. "It will take time" - we don't have time.


What's needed is the fairy dust to connect data with algorithms.

All technology has its limits, which is something that venture capitalists don't understand.


It's amazing to me that Gates does not make better use of people who are quite able to teach him what current AI systems are about. But he certainly does not take advantage of such people, as evidenced by his remarks about the current technology. If scaling two times more is the only hope we have of improving the current technology, we are headed for a financial crash. Here, Mr Gates, is something which might help you:

https://sites.google.com/site/tomspagesthirdmillennium/home/a-demonstration-of-artificial-intelligence-for-beginners

What also amazes me is his using the term "metacognition" as if it's a real player in the current reality, without talking about who is currently working on the technology.


In an interview he said his experience was in BASIC and a bit of C, which is like a linguist saying he understood English and a bit of Latin.


Gary, it seems to me that another important piece of the solution for more useful and trustworthy AI is likely to be composable causal models with uncertainty quantification, used in combination with model-based methods such as model-based optimization. I believe that's more or less in line with comments you've made in the past, including in your book Rebooting AI. Would you be willing to comment on the nature of the role, and the importance, you see for composable causal modeling and model-based decision-support approaches going forward?
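
To make the question concrete, here is a toy sketch of what composable causal models with uncertainty quantification might look like: two small structural models chained together, Monte Carlo to propagate their noise, and a grid search standing in for model-based optimization. The equations and parameters are invented for illustration only.

```python
import random

# Two made-up structural models are composed, noise is propagated by Monte
# Carlo, and a grid search stands in for model-based optimization.

def demand_model(price, rng):
    # Structural model 1: demand falls with price, with additive noise.
    return max(0.0, 100 - 8 * price + rng.gauss(0, 5))

def profit_model(price, demand, rng):
    # Structural model 2: profit from demand, with its own noise.
    unit_cost = 2.0
    return (price - unit_cost) * demand + rng.gauss(0, 10)

def expected_profit(price, n=2000, seed=0):
    rng = random.Random(seed)
    samples = [profit_model(price, demand_model(price, rng), rng) for _ in range(n)]
    mean = sum(samples) / n
    std = (sum((s - mean) ** 2 for s in samples) / n) ** 0.5
    return mean, std

# Model-based decision: pick the price with the best expected profit,
# while keeping an eye on the spread (the uncertainty estimate).
candidates = [round(2 + 0.5 * i, 1) for i in range(17)]   # prices 2.0 .. 10.0
best = max(candidates, key=lambda p: expected_profit(p)[0])
print(best, expected_profit(best))
```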


Perhaps a naive question, but are there actual or hypothetical procedures for implementing metacognition?

After doing a fair amount of work with ontologies and knowledge graphs, I observe that these seem to kind of collapse under their own weight. That, or they require continuous active human updates.

Is there a better mousetrap now or on the horizon?


Yes indeed!

And for comparison to Gates's 'breath of fresh air', look at The Information's profile of Scale.ai and you realize just how transient that particular GenAI bubble is.

Also, Rodney Brooks in TechCrunch points out that Moore's Law is not a given in any technology.


Scaling is multidimensional, of course. Arguably that's why symbolic AI has struggled more than ML -- its inherent domain is vastly lower dimensional. In high dimensionality more ratcheting is possible -- more opportunities for algorithmic improvements.
