Marcus on AI

Oleg Alexandrov

Language and neural nets first, and symbols and physics models later, is likely the better approach than vice-versa. So, it will be a hybrid approach either way, the question is how to go about it.

Expand full comment

If you are building upon something von Neumann contributed to, that is a good heuristic for saying you are on the right path

Expand full comment

khimru

Yes. And the other thing that we should drop in that process is AGI term. It's useless and pointless and, ultimately, NOT INTERESTING. As in: if and when we would have something that could be called AGI – we would accept it as a tiny meaningless footnote.

Why? It's obvious, if you'll think about it: AGI is 100% about marketing in the “scale is all you need” world. Because in THAT world and ONLY in that world it can act as beacon: add a little more scale, and a little more compute, do the same thing you did for years, just bigger… and you match human, finally! Hurraugh, you won, can fire all these pesky humans with all these unions and take all the profit.

But if we admit that to match human we need something more than scale… then AGI stops being a worthwhile goal!

Because if our AI system is composed from different parts… then suddenly it gets superhuman capabilities immediately! Even calculators from XIX century were able to do some things better that human!

In a world where “scale is all you need” doesn't work AGI is still achievable – but somewhere in XXII or XXIII century where last 0.01% of something that human was better than pre-AGI system is conquered.

By that time our AI is so far beyond human in almost everything it's not even funny… the moment where AGI is reached is not remarkable at all, more of “oh yeah, finally… at arrived” rather then something investors may salivate about.

Expand full comment

Sunil Malhotra

Jul 16

I agree. See what I wrote. https://sunilmalhotra.medium.com/the-folly-of-anthromimicry-9eb34b182da5

Expand full comment

https://en.wikipedia.org/wiki/John_von_Neumann

Oh yeah, what did he know?

An hour of reading later ...

Expand full comment

Sometimes it feels easier to say what he didn’t do than what he did do

Expand full comment

E. Syla

That's easy: he never did anything profound. Flashy talents but linearly superficial mind. No wonder he believed in the singularity, and no wonder weird internet pseudogeeks who also believe in the singularity put him on a singularly high pedestal (Einsten must stand on his toes to lick the soles of his shoes, apparently.)

Expand full comment

Just to make sure I understand what you are saying, you don’t think von Neumann did anything profound?

Expand full comment

E. Syla

Aug 11Edited

He applied math within different fields in already established systems (admittedly quite brilliantly - not enough to be convinced by the weirdos who claim he was a whole other level from other geniuses of the time, however). I don't consider that profound. Maybe you do. One thing that is certain, however, is that whenever he stepped outside of math/comp sci he sounded like the average redditor.

Expand full comment

So creating new fields is something you find profound, yes?

Expand full comment

I'm excited about Pentti Kanerva "Sparse Distributed Memory" and hyperdimensional computing for neurosymbolic.

It builds on Von Neumans "The computer and the brain" lectures (post homously published).

I think it is the direction Von Neuman would have went, had he lived.

Expand full comment

Adam Saltiel

I met Pentti oh so long ago now, in SA.

Must read this up.

Expand full comment

Graham Lovelace

Brilliantly written Gary. I’m no computer scientist but I understood all of this.

Expand full comment

Sunil Malhotra

Jul 16

Ditto.

Expand full comment

Peter Dorman

I have no expertise whatever in the subject of this post, but the sociology/political economy of it is familiar. There is a big literature in the history of science about the mechanisms GM describes, policing the boundaries, cutting out skeptics, etc. I don't have much to add, except that this is apparently one more example.

The political economy part is more vexing. Real money, or perhaps I should say really real money, is being funneled into investments in this field, and to outsiders like myself it looks like they are fragile with respect to not only the underlying technology but monetizable end uses. To put it bluntly, the business case doesn't appear to support them. And that's before we consider the broader social impacts, ranging from increased energy demand, land use, and the potential for the abusive implementation of AI-directed systems. Why the flood of money? Looking at the PE toolkit, I would turn to bubble dynamics, especially where cross-valuation effects are in play -- where my bet increases the market estimation of the value of your bet, etc. If this is correct, we should worry, because bubbles at this scale don't end well. As a side note, could it be that acceptance of a neurosymbolic approach by some actors might, if it were made publicly known, reduce the valuation of existing assets?

Incidentally, it is especially at the early stages of innovation, where many potential paths need to be explored, and where only a few are likely to pass the relevant tests, that transparency is crucial, rather than the opaque regime of private IP. This is exactly the opposite of the often-stated position of the tech libertarians (e.g. Andreessen) that we should gut publicly funded research and send all those scientists to labs run by private, for-profit firms.

Expand full comment

Adam Saltiel

Moved on from Mozilla.

Seriously, I don't think it is as straightforward as "Incidentally, it is especially at the early stages of innovation, where many potential paths need to be explored, and where only a few are likely to pass the relevant tests, that transparency is crucial..."

But I can believe that acceptance of neurosymbolic approaches (there are many) could undermine the way in which current assets are valued.

I am not sure it is just bubble dynamics; the test of that would be in the vexing question of economic benefit.

Expand full comment

Joe

Aug 17

I very much would like to know if the addition of symbolic techniques will save this massive bubble from bursting.

The symbolic approach doesn't need GPUs! Doesn't need all the massive data centers and power plants, etc.

Expand full comment

Herbert Roitblat

Near as I can tell, when the brain implements symbolic processes it emulates them in biological neural networks. Symbolic processes are critical for much advanced thought, particularly science and logic. Many experiments demonstrate that symbols can play a role in foundational cognitive processes (e.g., Carmichael, Hogan, & Walker, 1932. See also https://www.researchgate.net/publication/49803778_Comprehension_and_memory_for_pictures).

People have to study to learn logic. What, I wonder, is the mechanism by which the brain comes to emulate a symbolic system as a result of education? If we could answer that question, we could then unify symbolic and neural processes. Obviously, symbolic processes are biological processes because our brains do engage them, but we have yet to understand how. We each have only one brain that somehow does both continuous and discrete representation.

AI has a special and fraught relation with logic. On one side, AI is a symbolic system (a digital computer) emulating a neural system. So it would be a symbolic system emulating a neural one, emulating a symbolic one. On another side, the current paradigm compels AI researchers to build based exclusively on neural systems with the hope that if we just have enough connections, magic will happen and advanced intelligence will emerge. On yet another side, most of the logic of large language models is based on fallacious logical reasoning by the researchers. The models produce text that would be produced by a reasoning individual, therefore they are reasoning. It is as if Descartes said, I think, therefore, I am. I see that I am, therefore, I think.

Another bit of scientific irony here is that language models are actually symbolic models implemented using continuous processes (deep learning). Their stock in trade is the token. They are based on the assumption that having the right sequence of tokens is both necessary and sufficient for intelligence (Compare Newell and Simon's physical symbol system hypothesis that symbols are both necessary and sufficient for intelligence). But neural network, connectionist, models emerged as a counter to Newell and Simon's hypothesis.

My conclusion is that intelligence depends on both continuous and discrete (neural and symbolic) processes, but I predict that they are not separate systems but a kind of duality, like light waves and particles (photons). Maybe that is the direction we should be heading.

Expand full comment

Babies and young children have developmental stages with learning. LLMs are like those children that have reach language fluency, but have yet to have much formal education. But children are embedded in the world and so can acquire knowledge by doing. LLMs cannot. I suspect AGI will require either embodying AIs in the real world or within a realistic simulation to gain the knowledge that embodied intelligence has. It is analogous to Brooks' early ideas for his "insects" like Attila to operate in the world quickly.

Expand full comment

Jon Aarbakke

We don't think in language (well, I don't). Language came late in evolution. We think in images.

If I say "Pete is the father of Thelma and Louise. Who is Louise's sister"? - what happens in your mind? You draw a picture, right? And you "see" Thelma.

Re the irony; neural nets aren't symbolic, they're numerical.

Expand full comment

Herbert Roitblat

From what we know about cognitive science, thought is probably more abstract than either images or words. In your example, a picture of a man and 3 women would not be enough to answer the question. There is no direct information in the picture about the relationships among the people. What is an image for "sister of?" What is an image for "father of?"

Here is an experiment that looks at the abstract nature of thought. https://www.researchgate.net/publication/244467910_Long-term_memory_for_a_common_object1

People are poor at drawing even common objects (they were much more common then), so they cannot be remembering them by image.

My bigger point is that there is a hundred years of research on thought, expertise, intelligence, and related topics. Those who claim that they understand intelligence well enough to predict that we have conquered it would do well to gain some familiarity with that history (note that this not directed at you, Jon).

Neural networks are subsymbolic, but to say that their memorization of language patterns is sufficient for intelligence requires that language (symbolic) patterns be sufficient for intelligence. No matter how implemented, they are not.

Expand full comment

Jon Aarbakke

Thanks for thoughtful reply; I suppose I should have said "models", rather than "images". The model contains the requisite relationships.

The next, and far deeper question, is this: what's all the fuss? Why do we want machines that "think"?

Isn't it sufficient that humans think? And feel, just as importantly.

What we're seeing today is LLMs rewriting texts. How useful is that, really?

What would really useful AI look like? I've literally no idea.

Expand full comment

Bruce Cohen

Right now I suspect the most fruitful areas of research for AI and perhaps the most interesting use cases are in robotics. Clearly connectionist architectures are needed for sense acquisition and fusion, and for low level effector control, and symbolic processing is necessary for goal planning and contingency evaluation, among other tasks.

But there’s an aspect of cognition that robots require that doesn’t fit either architecture well: creating new low-level responses to events and adding or replacing them into what probably would be non-symbolic and also might not be neural net systems. In humans this capability is sometimes called “muscle memory”, and it’s crucial in reducing the requirement for attention on the part of the symbolic systems that control and monitor the progression towards a goal.

The work that’s been done on autonomous vehicles has been in a small part of the space of possible robotic use cases because the only effectors are wheels, signal lights, doors, etc.; cars don’t need to pick up small or large objects, or plan how to pull a cable through a conduit. And the demonstrations of humanoid robots have been marketing gimmicks, the human body plan isn’t well designed for specialized jobs like picking fruit or spraying insulating foam in tight quarters.

Expand full comment

Jim Hartman

Language is for communication, not thought: https://www.nature.com/articles/s41586-024-07522-w

Expand full comment

Eric Ellsworth

Jul 28

I am highly skeptical of this thesis.

There is a great body of research in treatment of mental health disorders that demonstrates that communication activities such as writing actually reshape the thought process of those who are engaged in them.

Of course the mechanisms of action are not totally clear, especially at the neural level, but it seems that writing is especially helpful in integrating experiences, feelings, and thoughts into one's self-conception and memories that support that.

Even when one crudely "self-instruments" and tries to observe this process it is evident that attempting to put words around thoughts is not a one-way process. In fact, I have yet to see anyone who can produce synthesized concepts and thoughts without language. That's not to say that the concepts are well or fully described by language, but that language and other forms of expression play an integral role in thinking and cognition.

Expand full comment

Mañana

Jul 25

As Wittgenstein said in Philosophical Investigations

Expand full comment

Martin Reznick

Many of us have led interesting and rewarding lives without ever having a mental visualization of any kind whatsoever. I can't even conceive of what it must be like to have pictures in one's mind. You can get awfully far thinking in language.

Expand full comment

Jon Aarbakke

So when you think of a cat, what do you think of?

Expand full comment

Martin Reznick

A cat, obviously. In any event disclosing my subjective experience of thought would be far less informative than looking into some contemporary research on aphantasia (https://aphantasia.com/).

Others have been beguiled by the idea that the essential unit of thought is an image, perhaps most notably Wittgenstein. Thankfully for the other aphantasics out there, he changed his mind in the end.

Expand full comment

George Burch

http://aicyc.org/2025/04/30/sam-implementation-of-a-belief-system/

I use semantic AI model (SAM) to build very specific meaning into the data structure. This paper explains

Expand full comment

RMC

Jul 13Edited

>>>Core problems like “symbol-grounding” have not been systematically solved, for example

That's the thing. Symbols might be grounded because we're alive. I always thought this, and belong to a tradition that sees it that way, and hence switched field from AI to biology actually. But it is fascinating to see how far we can get without grounding them, with just ML prediction of text. Far further than I would have thought, we were a bit smug 20 years ago. Adding in a python interpreter does make the models a lot better.

But I still think disembodied "AGI" is a fundamentally misconceived notion. Whether some other approach, perhaps within a body (but a dead body of course) will produce Isaac Asimov style robots is something we will eventually find out, maybe sooner than later.

My money is still on the idea that being alive is what grounds symbols. I might not put quite as much money on it as I used to however.

Expand full comment

Callum Hackett

Completely agree. I actually find it astonishing that this is not more obvious to people and I wonder if it's a particularly Western oversight. I think it's because we're all still very much in thrall to a (wrong) dualistic account of consciousness and intelligence. I doubt we'll make progress on this until we have a cognitive science that does not distinguish the mind from the body.

Expand full comment

E. Syla

Merleau-Ponty (Western) already made the best case that the mind can't be distinguished from the body in like 1950. The problem is American philosophers and idiot savant computer scientists still thinking in terms of Descartes.

Expand full comment

RMC

Yes, we read a lot of Francisco Varela back in the day. It's a popular approach but it doesn't really produce very good AI sadly. I guess it wouldn't by its own argument. Maybe it would be different with a trillion dollars in funding, and maybe it will eventually get that.

Expand full comment

Lots of things are "obvious" to careless thinkers.

Expand full comment

Vitalism is a bust. Being embodied does not require being "alive", which is an ill-defined concept.

Here's an interesting story: https://thereader.mitpress.mit.edu/daniel-dennett-where-am-i/

Expand full comment

RMC

It's not vitalism and although life is not entirely well defined philosophically it is very clearly a thing, as indeed we've been discussing here all day in a very productive and much richer, yet it seems dislocated scientific world to that of cognitive science, philosophy, computing and AI - and one which is, I daresay, much more empirically grounded since we are studying things that most definitely do cognition. I suppose the thing is to do what we do requires money, serious training and a lab, not just talk and/or a computer.

Expand full comment

Think Mr A

Gary, your article on o3 and Grok 4’s ARC-AGI-2 results makes a great case for neurosymbolic AI. I’ve been thinking about benchmark training, since I’ve heard models can sometimes be exposed to test data, boosting scores artificially. Could Grok 4’s 16% ARC-AGI-2 jump be partly due to training on similar puzzles, or is its neurosymbolic design the real driver? As a non-tech AI researcher, I’m curious how this affects the path to Large Reasoning Models (LRMs) with true reasoning skills.

Expand full comment

Tom

I see hybrid models as the best approach to achieving AI that understands the physical world as well as biological organisms do but has all of the advantages that AI has over biological organisms in terms of processing speed and memory—an AI that makes the current SOTA look primitive and brittle retrospectively. My personal favorite AGI architecture is a hypothetical hybrid LLM, NSAI, and JEPA with embodiment run on neuromorphic carbon nanotubes chips.

Expand full comment

A bit longish, Gary, but spot on. We were apparently living in a parallel universe on neurosymbolic AI, but at least I wasn't aware of it until a few years ago. It is indeed beginning to catch on, though I agree with you in suboptimal form to date. Some recent VC investment and corporate adoption, for example. I'll post to LI with some comments.

Expand full comment

Antony Van der Mude

You've got this right! I wrote a Machine Learning paper back in 1978 that inferred state machines to symbolically represent time series data such as a patient's diagnosis and treatment. A couple of years ago, I combined BERT vectors with an expert system to leverage the strengths of both. I have always tried to combine techniques in an eclectic manner. You neatly summarize the reason why this syncretic approach is best in the long run.

Besides the sociological issue, there is also a psychological issue: humans by nature don't like to think too hard.

It is so much easier to throw all of the data into a hopper, turn the crank and serve the sausage that comes out. If the result is not what is intended, then tweak the process. But you just get a different sausage. You will never end up with Beef Wellington that way.

We are willing to expend a lot of energy tweaking things, but it takes more than that to be truly creative. Rock bands that sound like Led Zeppelin, pop music that sounds like Michael Jackson, or jazz that sounds like Joao Gilberto are a dime a dozen. But true originals are extremely rare.

The AI neural nets are good at copying too. But they are not truly creative.

True creativity happens when people look at a problem and don't stop at just one solution. Neural Nets give an illusion of generality, but they are only one model space out of a large set of approaches.

This is why I am so impressed with the work of Lenore and Manuel Blum on Conscious Turing Machines. They are two of the greatest minds in computer science. They have produced an architecture for AI that can be a framework for building neurosymbolic systems.

Expand full comment

Detlef Nauck

"there is also a psychological issue: humans by nature don't like to think too hard"

I fully agree. I say sometimes today's AI researchers don't like putting the mental elbow grease in. Throwing a bunch of data at a model and see what sticks doesn't hurt the head so much.

Of course, like GM writes, this is also much easier to sell to investors instead of "well, we'll have to do a lot of hard research to figure this all out".

Unfortunately, modern AI is not so much science but more engineering experiments. Or alchemy as someone at NEURIPS once wrote.

Expand full comment

Antony Van der Mude

I agree there. I have a similar problem in biotech.

A few years ago in my spare time I wrote a paper on genomics/epigenetics. I realized that it could be a new modality for cancer. Tumor cells are passing messages back and forth. Hijack that message passing by synthesizing an exosome with RNA to command the cells to self-destruct. This is different from the current modalities. So there is no interest from researchers of investors in cancer therapy because it does not fit the current paradigms and because it is new, there is limited data to support it. Essentailly they choose the safety of what we already have and just tweaking it, instead of putting time and effort into trying something new.

Expand full comment

Jonathan Grudin

Jul 13Edited

Nice article, but there is no mystery as to "Why was the industry so quick to rally around a connectionist-only approach?" and why Geoff Hinton had good reason to avoid symbolic reasoning these past ten years. Symbolic reasoning had over half a century of huge hype, huge expenditures, and zero to show. From 1960 to 1970 virtually the entire AI field predicted "the singularity" or ultra-intelligence (beyond AGI) by 1980 or earlier, including Herb Simon (who predicted 1980 in 1960) and Minsky (predicting 1978 in 1970), with some saying it would be 1985 and the most conservatve I found saying "in this century." In 1987 Marvin told me symbolic approaches would get there "within 30 years." An extraordinary amount was spent by government and industry on speech recognition and linguistics-based (symbolic) language understanding, and it went nowhere. I worked with Doug Lenat at MCC, whose CYC (as in 'encyclopedic', now Cyc) was the most ambitious symbolic effort. He predicted in 1989 that by 1994 it would be educating itself and “spark a vastly greater renaissance in [software capability].” Cyc development has continued wth government funding. In a DARPA planning meeting officials were infuriated at symbolic AI's failures to deliver on promises. Yes, predictive modeling is fine for "the reptilian brain" but not the "mammalian brain" much less ours, but the major breakthroughs of the past 10 or 15 years have come from machine learning, deep learning, and predictive modeling. Symbolic reasoning must do more to explain its half century of hype and failure than help with coding, Tower of Hanoi, counting letters, and guardrails, important as they are. Maybe it could identify bad actors using AI. In my opinion, Geoff Hinton could deserve an apology, but I'm biased. I knew him and once lived in his house. He's a nice guy and more modest than may come across.

Expand full comment

Jul 13Edited

I don't understand half of this yet applying what I think of as a fairly well-developed "systems intuition" the main thesis is a no-brainer. For me it goes back to the difference between animal-brain "network thinking", i.e. combining inference with pattern recognition (networks = patterns of cause and effect relationships) and vernacular language, which by comparison is a later overlay and frankly a tool for social coordination more than dealing with physical reality. Seems like the inferences in network thinking correlate with neurosymbolic reasoning. I'd like to say I'm appalled at Hinton's crude misleading metaphors, but you know, he was just trying to make a buck and boost his ego; so what if he set human progress back a few decades.

Expand full comment

Jul 13Edited

The whole debate is ill-defined. An artificial neural network *is* a symbolic system: it manipulates discrete finite numbers (the weights) with discrete arithmetic operations only, and there is a fixed loss function responsible for learning. When the network learns, it learns fixed, mathematical, and thus symbolic, structures and rules: for instance, words are converted to vectors, and then combined together with learned, but fixed mathematical operations. On the other hand, the brain is not a symbolic system, because it works according to laws of quantum physics, and manipulates electromagnetic fields, chemical substances, etc., so analog quantities.

So the debate should in reality be framed as empiricism vs rationalism. Can symbolic rules be learned from external data only or there are some rules that must be “a priori”, so precede experience, as Kant predicated? The answer is that much can be learned, but the brain has an internal a priori architecture, which is not easy to recreate.

Expand full comment

This is hopelessly confused, chock full of category mistakes and conflation of different levels.

Expand full comment

The hardware needn't be symbolic, for example, there are analog neural chips. At one point HP bet on neural chips but that seems to have come to naught and a dead end.

Expand full comment

Jul 13Edited

Of course the hardware need not be symbolic: as I said, the human brain is not! I remember Hinton said once that analog networks are not going to have industrial success, as they cannot be copied. Every single piece of analog hardware has to be trained separately, and when it stops working, is lost, as the human brain. On the contrary, digital networks can be copied, and so trained for orders of magnitude longer periods of time.

Expand full comment

The human brain project is copying the network and synapse weights and transferring them to digital hardware. Analog cannot be copied by reading the network, but it can be copied by matching every connection to another analog device. Slow today, but conceivably fast in the future. Hypothetical and speculative, yes, but not unimaginable.

Expand full comment

Synapses are a tiny part of the brain. You would need to clone every single neuron, perfectly reproduce its metabolism and the chemical substances it is receptive to and generates (dopamine, serotonin, hormones etc. which have a crucial role in learning and cognition). You must also simulate the electromagnetic fields that neurons collectively generate, which are known as brainwaves, and are a crucial mechanism that we still understand poorly. You will thus need a ridiculously powerful quantum computer, at the very least. Even worse, you would need to understand all these details of how the brain works to implement them in the first place. In a distant sci-fi future it will perhaps be possible, but now it is foolish.

Expand full comment

I think you are obfuscating the issue. replicating a small piece of brain in silicon is exactly what the brain project is doing. IOW, all the neural interactions of the piece of brain are being precisely simulated. Therefore, theoretically, a brain could be "copied".

But that is not what you originally claimed. You stated that analog hardware could not be copied, because it would require going through all the training to do the copying. That is false. A 2-layer neural network of connections and weights can be done in analog hardware with a 2D matrix of connections, with, e.g., resistances, as synapses to simulate weights. (high resistance -> low weight.) Conceptually, the 2D layout can be mapped to another matrix, and the resistances changed at each point. This can then be scaled for the entire ANN network. As this can be done in parallel, copying is not time consuming, but uses a different approach than from a digital computer with a file system that can be read. Alternatively, the digital system can be converted to an analog system where the connections and weights matrix can be converted to the analog hardware. This is not so conceptually different than converting a digital program to an FPGA.

Therefore my contention is that we can do analog AI, and that copying such trained analog hardware for mass-produced brains is quite possible. It is neither some impassable technological barrier, nor economically infeasible approach.

Expand full comment

That's completely wrong. A true "ANN" is obviously impossible to copy precisely, because the resistance is an analog quantity. There will always be a measure error. So to make a precise copy of an "ANN" you have to confine the resistances in discrete clusters, which is what turns it into a digital/symbolic system.

Expand full comment

I don't know the fuller context of that Chollet quote but there seems to be a problem of definitions. I agree with him that using a code interpreter qualifies as neurosymbolic but I'd also expect that a lot of AI engineers would say that it doesn't undermine Hinton's outlook (even if Hinton himself might disagree).

Perhaps it's because GOFAI focused a lot on symbolic rules for natural language inference, rather than generic applications for code - it's notable in that respect that all the cited neurosymbolic successes (code interpreter, AlphaX etc.) are not NLI systems. Some people might therefore discount an LLM+ system as neurosymbolic if it is still the LLM in full control of NL interpretation.

I don't mind the definition one way or the other but I think there is more doubt to be had about the potential of neurosymbolic techniques that specifically look to share NLI between the LLM and a rule system. One can point to toy examples that make the idea appealing but scaling problems are exactly what sounded the death knell for GOFAI as well. All techniques will hit a fundamental wall in that words require human bodies for meaning. There's no use trying to rely on statistical or structural relationships to infer a world model because NL simply doesn't encode such models - the idea that it does is a kind of user illusion.

Expand full comment

Martin Machacek

Words, symbols require shared lived experience to make sense. As humans we have hard times to imagine other than an embodied lived experience. That does not mean other experience cannot exist, but it definitely won’t be shared with us humans. So the symbols it will use won’t be (easily) understandable to us. Communication with such intelligence would be likely complicated and it will likely have hard times learning from human knowledge. This makes me believe that such intelligence won’t be useful for humans.

Expand full comment

Sean H

I'm also a believer in the neurosymbolic approach and would posit that it's going to be most useful when we use the neural network learning approach to build the symbolic systems themselves. The current approach is to rely on the neural network approach until we get to a hard problem and then outsource the calculation to a symbolic system (e.g., python app). Those are essentially two different systems bolted together.

What might be more useful is to build the actual symbolic representations from the data during learning. When each new piece of information is encountered, the system tries to generalize if possible and starts to build a symbolic representation with some confidence attached to it. You can then start to generalize and make assumptions from day one of encountering something new, just like humans do. And then as new information comes in, refine the symbolic representation and confidence associated with it.

Expand full comment

Gary Marcus

this really hasn’t worked this far. we need other techniques to make it work

Expand full comment

Ronan

Jul 15

This seems like the right approach. "Symbolic" reasoning should emerge from a NN when it becomes a useful tool for solving particular problems in the training data. So far the problem we have been solving is 'predict the next word'.

Expand full comment

Ken Kahn

Regarding "Hinton’s admonition against using symbolic AI was extraordinarily misguided, and probably delayed discovery of useful tools like Code Interpreter by years." - The Code Interpreter (now called "Advanced Data Analysis") was introduced in July 2023. Earlier versions of GPT could not generate code very well so it wouldn't have made sense to create it earlier.

More generally, nearly all of today's neurosymbolic systems are able to use tools and generate and run code because they were trained to do so. Hand-written symbolic code may show up in guard rails as mentioned and in reward functions but it is a minor part of the effort. Yes, many system do search but that is not the related to your argument that symbols and representations are missing.

Expand full comment

Jul 13Edited

You seem to be missing a couple of critical points. 98% of the funding has been dedicated only to scale, not areas consistent with good science -- like compression (scale should be pursued but only as one among several obvious paths).

Very little of the capital has been invested in better understanding of the human brain, for example, despite its super efficiency. Classic and historic case of Maslow's hammer. Scale is the hammer that Big Tech has, so everything looks like a nail that needs scale. Blind to everything else.

The scale obsession appears to be due primarily to two things. One is what I call the AGI cult, which is a psychological phenomenon strengthened by people like Hinton (and others) who had most of his entire career invested in DL, but the biggest reason is the massive target in Big Tech. They are threatened by AI that doesn't rely on scale, so they were more than happy to provide record subsidies to LLMs. The strategies are pretty clear -- on one side it's skim as much off the top as rapidly as possible (LLMs), and on the other (Big Tech), it's control or kill AI.

One of the best ways to destroy a market is to bury it with capital and force it to an unsustainable path.

Expand full comment

Ken Kahn

This doesn't match my impression of Anthropic and Google. Anthropic puts lots of resources into mechanistic interpretability. Google does many AI projects beyond LLMs - AlphaZero, AlphaFold, AlphaGeometry, AlphaProof, AlphaGeonome, ...

Expand full comment

https://open.substack.com/pub/kyield/p/perverse-incentives-are-driving-systemic

Jul 14Edited

You are actually making part of the same argument Gary is making, and I've been working on for 28 years (accelerating discoveries in life science was my original motivation for the KYield theorem). AlphaFold is arguably the most important contribution for society in AI to date, and it's a neurosymbolic hybrid, not a generalized chatbot.

However, there has been tension all along between DeepMind and Google, and Google has restructured DeepMind so they have more control and it better aligns with the mothership. Almost all the money still comes from search at Google, hence the focus on scale.

Anthropic has a similar challenge and tension between generalized AI available to the public and very narrow and deep functionality for business.

See my recent op-ed:

Perverse Incentives are Driving Systemic Risk in AI

Expand full comment

Ken Kahn

To clarify - I think the pure neural network proponents believe in symbols, representations, compositionality, heuristics, etc. -- it is just they believe a neural network can learn those things without being engineered. That is the crucial question.

Expand full comment