82 Comments
Apr 7 · Liked by Gary Marcus

It certainly seems the negative societal impacts of generative AI are far outpacing any potential benefits, and at a staggering speed. The pollution from LLMs threatens the internet as a useful communication medium, sucking up human-generated content and then turning out a firehose of at best mediocre, unreliable, generated swill.

If someone *wanted* to harm society and the economy with one efficient stroke, I doubt they could have come up with a better move than having OpenAI release ChatGPT in late 2022, with grandiose claims (that continue not to hold up), and set off the rat race that's currently playing out.

Shakespearean tragedy seems too small to describe this. This is like the Iliad, or the Mahabharata. Humankind letting loose its worst instincts and causing mass suffering and harm.


Funny how if you slog through LeCun's most recent appearance on the Lex Fridman podcast, LeCun's now very skeptical of LLMs as the path to AGI or seriously advanced AI. The most dangerous thing about AI development is that it promotes people who are highly technically proficient (which LeCun clearly is) but also unbelievably intellectually dishonest. They repeatedly hype AI's alleged capabilities while disparaging those concerned about safety and reliability. When the safety concerns turn out to be impossible to deny, the AI hype people move on and pretend they knew all along that, for example, LLMs are unreliable. No! You were shouting down people saying that just a few months ago as "doomers!" The people with tech skills AND shameless hype get billions in seed capital, and the people warning about safety concerns get belittled and scorned by people like Marc Andreessen, who claims AI will be the silver bullet for literally every problem humanity has. Meanwhile, LLMs can be hacked by people who know nothing about AI by prompting an LLM with a few sentences the model can't handle! Or a 30-year-old computer graphic!


Don't ignore Hanlon's Razor.


Well observed.

These days I hold it even more dear than Ockham's Razor. Not because I think it is more important (it's becoming ever more clear that scientific rigor is experiencing its biggest crisis since Galileo stood before the Inquisition), but just because it's so readily applicable - and no, I'm neither happy nor delighted by my own cultural pessimism.


Marc Andreessen - ptui!


Spot on. Your list of AI deficiencies and inadequacies echoes my own. The only thing I would add is that hallucinations are not to be imputed to a system which lacks any concern for realness and truth, and has no care whatsoever for anything other than its own feedback loop (feeding into your argument around echo chambers). The most pernicious effect, I agree, is that of contamination. Imagine, therefore, if two LLMs started talking to each other and the fruits of their exchange became the dataset of a third.

author

that latter is what Davis and I called the echo chamber effect (which is in the list)


I find your observation striking. Everybody who has studied GANs even a *little* bit has observed the contamination effect, and instead of concluding that the approach is fundamentally broken, we end up with articles claiming "two AIs have invented their own language" and the like. When I return to my mail inbox I can only facepalm at some of the papers I get (e.g. to peer-review).


All this is driven largely by the vast amount of money being thrown at the field. In that sense, it's a gold rush and no one really cares if it's fairy gold that might simply vanish tomorrow. Not as long as they've stuffed their bank accounts first.

I'm afraid that the only way to derail this train is to come up with something that can outperform it, or perform more or less as well for a lot less investment of money and training data.

Apr 8 · Liked by Gary Marcus

This train will derail when investors start to lose a lot of money. For the moment it is a blind race forward, with the big GenAI companies trying to keep the steam pressure high with forceful declarations and promises. And investors are not ready to withdraw yet. So the solution could be to have new AI companies on the market with a different technology. Or to suffer a major disaster, a major failure due to the present technology, with huge material or financial losses and a very high negative societal impact. A kind of painful awakening.

Apr 7 · Liked by Gary Marcus

The train will derail when sufficient time passes and the investors withdraw their money. Such investors are not likely to suffer the sunk-cost fallacy. As far as the technology is concerned, it seems likely that more applications for it will be invented. We know a few of them already. Coding assistance is useful and will make programmers more productive rather than take away their jobs. LLMs are useful for idea generation. These applications will have a common attribute: the need to keep a human in the loop to provide judgement, filtering, etc.


Gary, I am completely with you on this! Since most LLMs are based on pattern matching, via attention, which is based on cross-correlation, the downsides of LLMs are not surprising.
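(To make the "pattern matching via attention" point concrete: below is a minimal sketch of scaled dot-product attention in plain numpy. The Q·K dot products are the correlation-style similarity scores referred to above; shapes and values are illustrative only, not taken from any particular model.)

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal scaled dot-product attention (illustrative shapes only).

    Q, K, V have shape (seq_len, d). The Q @ K.T term is a matrix of
    dot-product (correlation-like) similarities between every query
    and every key; softmax turns them into mixing weights over V.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                              # similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax
    return weights @ V                                         # weighted mix of values

# Toy usage: 4 tokens with 8-dimensional embeddings
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)                    # shape (4, 8)
```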


Science communicators: correlation is not causation.

LLM advocates: my giant correlation machine is the solution to everything!

🤦🏼‍♀️

author

💯

Apr 7 · Liked by Gary Marcus

“a pile of unreliable systems that are unlikely to live up to the hype.” Every single place I turn, the buzzword of AI is used. When the hype falls flat, I think it will be more than egos that will be bruised. People are investing in AI tools thinking that this is a great product that has vastly more pros than cons.


The idea that the hype will die down is inaccurate. Rapid advances in areas such as deep learning, deep reinforcement learning, self-supervised learning, and hardware are pushing the boundaries every day. It's likely that we'll soon see breakthroughs that allow Large Language Models (LLMs) to be applied in more meaningful ways, solidifying the rationale behind the current enthusiasm. This momentum shows no signs of slowing down (good!). Hype is good for the mental health of researchers, which will help AI advance rapidly over the next few years, just as it has over the last four. There's nothing worse than feeling like you're doing things that don't matter, at least that's my own experience (just being honest), and so hype is actually really good for development.


Greed for compute/electricity as well of course...


The solution is not ideas, it is scale. No matter how in the weeds the LLM industry is, it has figured out the scale problem in human knowledge representation. As you and most researchers know, a semantic representation, whether CYC or Watson, has not scaled despite decades of ontology curating and lexical clustering. Humans hit a wall when they try to scale.

The holy grail has been fully automated construction of knowledge graphs. The data structure for the semantic web is decades old, with Sir Tim getting it fully defined as Web 3.0; RDF and the W3C standards were the result. Filling in that data structure has failed as a large-scale human activity, even when limited to narrow specialized areas like health. Ask Watson.
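(For readers unfamiliar with the data structure being discussed: the semantic web's unit is the RDF triple, subject-predicate-object. A minimal sketch with the rdflib Python library, using made-up example URIs and facts, shows how small each unit of knowledge is relative to the curation effort described above.)

```python
from rdflib import Graph, Literal, Namespace, RDF

# Hypothetical namespace and facts, purely for illustration
EX = Namespace("http://example.org/health/")

g = Graph()
g.bind("ex", EX)

# Each fact is one (subject, predicate, object) triple
g.add((EX.Aspirin, RDF.type, EX.Drug))
g.add((EX.Aspirin, EX.treats, EX.Headache))
g.add((EX.Aspirin, EX.interactsWith, EX.Warfarin))
g.add((EX.Aspirin, EX.label, Literal("acetylsalicylic acid")))

print(g.serialize(format="turtle"))

# Lookups are trivial at toy scale; hand-curating billions of such
# triples is where the scaling wall described above appears.
for s, p, o in g.triples((EX.Aspirin, None, None)):
    print(s, p, o)
```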


It has already been clear for a while that scale is not going to solve LLMs' 'understanding' issue. The problem is the architecture, not the scale. See https://ea.rna.nl/2024/02/13/will-sam-altmans-7-trillion-ai-plan-rescue-ai/

Don't forget that OpenAI already had a 175B-parameter model by 2020. Have a look at that link, which removes the somewhat misleading logarithmic x-axis from the graphs in OpenAI's paper. On a benchmark that tests for actual understanding (Winogrande), the scaling required to get to human level might be as much as a factor of ~3,500,000,000,000,000 in a quick estimate. That number might easily be a factor of one million too high, but even then...
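(To make that kind of back-of-the-envelope estimate concrete: if you assume benchmark error falls as a power law in parameter count, you can fit the trend on a log-log scale and solve for the scale needed to hit a human-level target. The sketch below uses invented data points, not the actual Winogrande numbers.)

```python
import numpy as np

# Invented (parameter count, benchmark accuracy) points, purely illustrative
params = np.array([1e9, 1e10, 1e11, 1e12])
acc    = np.array([0.60, 0.66, 0.71, 0.75])

# Assume error = 1 - acc follows a power law: error = a * params**(-b)
slope, intercept = np.polyfit(np.log(params), np.log(1 - acc), 1)
b, a = -slope, np.exp(intercept)

# Parameter count needed to reach a hypothetical human-level 94% accuracy
target_error = 1 - 0.94
needed = (a / target_error) ** (1 / b)
print(f"fitted exponent b ~ {b:.3f}, estimated parameters needed ~ {needed:.2e}")
```

The point is not the particular numbers but how brutally the required scale blows up when the fitted exponent is small.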


That is a good way to show that the LLM architecture's flaw in understanding is not solved by adding flops. But an LLM is very good at writing text from a grammatical point of view. It just can't understand what it writes. My comment relates to a semantic AI database that represents knowledge. That is a completely different architecture with different scale problems. It is not a replacement for the LLM but provides factual grounding to the LLM, both in training and alignment.


Apart from the analysis reply: the big problem is that we have no idea how to marry the two approaches, though it probably looks like an 'at scale' version of what chatbot vendors are already doing: having a system pattern-recognise something and then pass it on to some other, more symbolic function (as ChatGPT now does when it recognises arithmetic, produces Python code for it, executes that code, and adds the result to the prompt for generation). The problem with the chatbot approach is that these are all one-offs for a specific function (like arithmetic), and that doesn't really scale.
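(The "recognise, hand off to a symbolic tool, splice the result back in" pattern can be sketched in a few lines; the sketch also shows why it is a one-off per function. The regex, helper names, and the stand-in generator below are invented for illustration, not anyone's actual implementation.)

```python
import re

def symbolic_arithmetic(expr: str) -> str:
    """One hand-built symbolic route: exact integer arithmetic."""
    a, op, b = re.match(r"\s*(\d+)\s*([+\-*])\s*(\d+)\s*", expr).groups()
    return str({"+": int(a) + int(b), "-": int(a) - int(b), "*": int(a) * int(b)}[op])

def answer(prompt: str, generate) -> str:
    """Route a prompt: symbolic tool if recognised, otherwise plain generation.

    `generate` stands in for a call to a text generator; in the chatbot
    pattern the tool's exact result is appended to the prompt before
    generation rather than returned directly.
    """
    match = re.search(r"what is (\d+\s*[+\-*]\s*\d+)", prompt.lower())
    if match:  # recognised: dispatch to the one-off symbolic function
        return generate(prompt + f"\n[tool result: {symbolic_arithmetic(match.group(1))}]")
    return generate(prompt)  # not recognised: pure generation

# Toy usage with a stand-in "generator" that just echoes the last line
print(answer("What is 1234 * 5678?", generate=lambda p: p.splitlines()[-1]))
```

Every new capability (dates, units, lookups, ...) needs another such hand-wired route, which is exactly the scaling problem noted above.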


Indeed. And we know it must be possible, as human minds do something like that. We both have a 'discrete sense' (true/false — which may actually boil down to the firing of a single (cluster of) neuron(s)) and an 'estimation' sense, which does almost all of the work. Discreteness is created out of analog machinery by 'removing the in-betweens' (a CPU does that by using a clock). And we have a lot that is 'in-between', like "this cannot be true, can it?", also grounding us. It is a full scale from pure estimation to (admittedly, not that much) discrete reasoning.

In the current state of AI we know we will need this same range. But in GenAI we're working in a single paradigm (analog estimation) approximated by lots of discrete logic. This can actually be seen as completely upside down. We probably cannot generate enough intelligence through analog estimation built from lots of discrete logic (it doesn't scale); we probably need discrete logic coming out of analog machinery (which is of course what digital CPUs also do, it's just that there the analog behaviour is thoroughly put out of reach).

To me, at least, this has been rather obvious (that is my 'estimation machinery' speaking) since I dug into the matter in the 1990s. Which makes it really easy to make judgements about 'AGI being around the corner'.


I think that is profoundly wrong. Most actually useful knowledge graph representations have a fractal structure that is exponentially difficult to traverse: either storage[1] or retrieval (or both!) is NP-hard, and you just cannot solve a von Neumann bottleneck with scale.

[1] Fun addendum: This is the same reason why I posit that quantum computers are likely much less useful than proponents assert: while there are algorithms that show a speedup, they generally ignore that the data representation must be quantum and *that is usually omitted from the question of complexity* - because then the speedup becomes nothing but a tradeoff between representation preparation and actual calculation.


Your mental model is profoundly wrong.

A RESTful API ("Representational State Transfer") is the architecture used to navigate knowledge graphs in distributed systems.
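(As a purely hypothetical illustration of that claim: in a RESTful design each node is a resource and each edge a link, so traversal is a sequence of GETs and the whole graph never has to sit in one machine's memory. The endpoints and response shape below are invented, not any real service's API.)

```python
import requests

BASE = "https://kg.example.com/api"   # hypothetical knowledge-graph service

def neighbors(node_id: str, relation: str = ""):
    """Fetch one node's outgoing edges; the server pages results, so the
    client never needs the billions of other nodes."""
    params = {"relation": relation} if relation else {}
    resp = requests.get(f"{BASE}/nodes/{node_id}/edges", params=params, timeout=10)
    resp.raise_for_status()
    return resp.json()["edges"]        # assumed response shape

def walk(start: str, relation: str, depth: int = 2):
    """Breadth-first traversal by repeated GETs: each hop transfers only
    the representation of the current node (hence 'representational
    state transfer')."""
    frontier, seen = [start], {start}
    for _ in range(depth):
        nxt = []
        for node in frontier:
            for edge in neighbors(node, relation):
                if edge["target"] not in seen:
                    seen.add(edge["target"])
                    nxt.append(edge["target"])
        frontier = nxt
    return seen
```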


i) Intelligence in a human is not a distributed system.

ii) Of course you can transfer any representation you already have with finite effort. The point is that both creating and traversing it are on the more power-law side for actual problems.

iii) Looking at a particular representation and stating that it is not fractal is survivorship bias: You have a problem you can represent (efficiently), but that does not show in any way that all problems are like that.

Don't try to obscure the problem with a technical remark about other systems that have totally different assumptions. The fat tail is real, folks.


I was clearly talking about distributed processing and software engineering in semantic AI models. It is technical, and you are not familiar with the technology. I gave you the answer as to how navigation of very large knowledge graphs - billions of nodes - is implemented in the cloud. You said that was impossible, which was a technology statement.


Okay, let's unpack this.

I did *not* say that what *is* implemented is impossible - that would be rather stupid.

What I *did* say was that the *approach* (hint: that is *not* the same as the implementation) is flawed because the underlying assumptions are, imo, wrong. Yes, fundamentally wrong, not just "a little bit off".

Of course we see systems that build knowledge graphs and look up things. Even rather efficiently, it would seem.

So yes, my point is not really technical in the sense you use the word*: finite knowledge graphs as they are used in SOTA solutions are built with little or no regard for recursion, recurrence, or context sensitivity. But all of these matter and complicate the whole thing massively.

My rule of thumb is this: Whenever something has a state space that grows linearly or log-linearly with the input dimensions, it is too simple to do actual interesting things. Think Zipf's law: When you encode a corpus - *any* corpus - then all you can represent is *that* corpus. What you cannot do is extrapolate from there, and that's exactly what people want to do with knowledge graphs. I disagree that this'd be what humans do when they come up with hypotheses.

* It is technical in the same way that category theory is technical mathematics, although you cannot really use it to do actual calculations.


I am not impressed. I actually run a company that is the leading supplier of a semantic AI stack, and supply knowledge graphs at scales of billions of nodes. I am the holder of one of the most cited patents in semantic AI. None of that is relevant, but my experience and orientation are entrepreneurial.

We really do not speak the same language or have the same viewpoint.


Any time people bring out "quantum" wording when it comes to AI, that is pure charlatanry. Human-like knowledge graphs are neither quantum, nor fractal-like, nor require NP-hard algorithms.

A human-like system is very achievable with current machine architecture, and we've done a great job mapping not just knowledge graphs, but also language. What is next is reasoning.

Another 10-20 years without utterly profound changes, just with improved world models, better architectures than LLM, more compute, and closing the loop between actions and consequences, and we'll have it solved.


While 'quantum fruitloopery' and such are a real nuisance, it is not wrong to note that the value space of integers is rather small compared to that of reals, and that assumptions that we can have enough integer value space to approximate the real thing are just that: 'assumptions' (and suspect ones at that, given what we have already observed in nature with respect to the role of quantum effects in what are, in the end, large-scale results). You're convinced this value space is not a problem; I'm rather convinced it fundamentally is. Either you are in for a disappointment or I am in for a surprise.

That human intelligence isn't 'mostly rational' is — I think — established fact.

We haven't mapped 'language' as far as I know. With LLMs we have mapped 'ink patterns'. (And attempts at mapping language have thoroughly failed; I was briefly part of one such enterprise.) Your 'better architectures' form quite a conditio sine qua non.


There exist phenomena where floating point approximations of real numbers will fail, such as for chaotic systems.

I do not believe there exists any research claiming that what is in our head is anything other than good old massive signal processing. Signals transmitted between and inside neurons are reasonably strong, unambiguous, and not high in frequency.

As such, by the Nyquist–Shannon sampling theorem, sufficiently dense and sufficiently accurate discrete measurements should be enough to figure out what the brain is doing.
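(The sampling-theorem argument can be stated in one line; the firing-rate figure below is only a rough, illustrative order of magnitude, not a measured bound.)

```latex
% Nyquist–Shannon: a signal band-limited to $f_{\max}$ is fully determined
% by uniform samples taken at rate $f_s$ whenever
\[
  f_s > 2 f_{\max}.
\]
% Illustrative numbers: if the relevant neural signals are band-limited to
% roughly $f_{\max} \approx 500\,\mathrm{Hz}$, then sampling each channel at
% $f_s \approx 1\,\mathrm{kHz}$ or so would, in principle, lose nothing.
```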


And just to take things from a 1950s interpretation of 1940s Neuroscience ...

"The past 40 years have witnessed extensive research on fractal structure and scale-free dynamics in the brain. Although considerable progress has been made, a comprehensive picture has yet to emerge, and needs further linking to a mechanistic account of brain function. Here, we review these concepts, connecting observations across different levels of organization, from both a structural and functional perspective. We argue that, paradoxically, the level of cortical circuits is the least understood from a structural point of view and perhaps the best studied from a dynamical one. We further link observations about scale-freeness and fractality with evidence that the environment provides constraints that may explain the usefulness of fractal structure and scale-free dynamics in the brain. Moreover, we discuss evidence that behavior exhibits scale-free properties, likely emerging from similarly organized brain dynamics, enabling an organism to thrive in an environment that shares the same organizational principles. Finally, we review the sparse evidence for and try to speculate on the functional consequences of fractality and scale-freeness for brain computation. These properties may endow the brain with computational capabilities that transcend current models of neural computation and could hold the key to unraveling how the brain constructs percepts and generates behavior. "

Grosu GF, Hopp AV, Moca VV, Bârzan H, Ciuparu A, Ercsey-Ravasz M, Winkel M, Linde H, Mureșan RC. The fractal brain: scale-invariance in structure and dynamics. Cereb Cortex. 2023 Apr 4;33(8):4574-4605. doi: 10.1093/cercor/bhac363. Erratum in: Cereb Cortex. 2023 Sep 26;33(19):10475. PMID: 36156074; PMCID: PMC10110456.


"there systems" -> "their systems" (I'm assuming you can remove this comment and I don't know if you see anything else as quickly)


I would have to disagree with most of this.

Perhaps I'm the only one actually experimenting with LLMs at scale over multiple decades, but the current versions are staggering. Most of the quoted issues are not entirely relevant anymore - the old complex chestnut "Time flies like an arrow" is easily comprehended.

The first few thousand novels I generated in the past had issues with what I called the "physics" of reality, impossible descriptions, but no longer.

The unreliability - I suppose nobody has heard of an "unreliable narrator" - is a matter of naiveté in fact-checking. Grammar, factual statements - show me a human who is 100% accurate in non-fiction, please.

Major American newspapers happily misrepresented certain recent conflicts without correction, no AI needed.

The internet was long, long ago (speaking of the 1980s) polluted by opinion masquerading as fact, which grew exponentially without AI intervention. Tantalizing misinformation resonated in a manner akin to a laser with mirrors on either end, stimulating and amplifying crud until it burst out, decimating facts in its path.

Most of these critiques are of the internet, not of AI or constructed LLMs.

The only major failure I continue to see exhibited in the fiction and nonfiction novels, screenplays, papers, training and analysis texts I generate is embodiment-related.

Encoded sensations within perceptual systems that share cognitive strata with abstract reasoning don't yet translate into LLMs - perception of time, or physical position (proprioception), or similar nonlinguistic perceptual models we hold.

My regression test set for seeing how generations are doing includes hardcore erotica (quite good now); when moments arrive which are purely sensory, glitches appear in human-body reasoning.

We live with our cognitive systems having encoded reality, which we access through consciousness, modelled by multiple overlapping sensory/feedback loops. LLMs are already "multimodal" (visual, perhaps auditory) and only lack a dozen more sensory encodings to make them even more stunning: chronosensory, chemosensory, proprioceptive, nocisensory, interosensory, thermo-, hygro-, equilibrio-, mechano-, and perhaps electro-, magneto-, and spatio-sensory encodings during training are the only way to add the dimensions required to encode and connect embodiment, that and homeostatic feedback loops like fatigue, thirst, hunger, temperature, immune systems, reproductive hormones and so on.


Interesting, and that's a lot of encodings, but I'd like to know what efforts/outcomes deal with the problem that the LLMs don't care about anything, themselves or otherwise. The argument about the 'unreliable narrator' doesn't convince, because it's not so much that 100% reliability is not achievable by humans, but that reliability only makes sense when the rational agent strives to maximise reliability through a concern for reliability. Despite your enthusiasm, all LLMs are vitiated by a complete lack of care and concern for truth. So they fall, without knowing or caring, into bullshit in the Frankfurtian sense.


That's why I use multipass filtering on outputs to condition higher reliability. Works like a charm. Not unlike GAN networks. A spell-check is a simple filter; grammar, facts, and so on.
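(A minimal sketch of what such a multipass filter over generator output could look like; the individual passes and the pass/repair logic here are invented stand-ins, not a description of the commenter's actual system.)

```python
from typing import Callable

# Each pass takes a draft and returns (possibly repaired) text plus a verdict.
Filter = Callable[[str], tuple[str, bool]]

def spell_pass(text: str) -> tuple[str, bool]:
    # Stand-in: a real pass would call a spell-checker and patch the draft.
    return text.replace("teh ", "the "), True

def fact_pass(text: str) -> tuple[str, bool]:
    # Stand-in: a real pass might check claims against a reference source
    # and reject drafts it cannot verify.
    return text, "the moon is made of cheese" not in text.lower()

def multipass(generate: Callable[[], str], filters: list[Filter], max_tries: int = 3) -> str:
    """Regenerate until a draft survives every filter (or tries run out)."""
    for _ in range(max_tries):
        draft = generate()
        for f in filters:
            draft, passed = f(draft)
            if not passed:
                break
        else:
            return draft
    raise RuntimeError("no draft passed all filters")

# Toy usage with a stand-in generator
print(multipass(lambda: "teh quick brown fox", [spell_pass, fact_pass]))
```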

I have a feeling that I'm talking to people about a book nobody has read. It's not hard.


LLMs at scale over multiple decades, hmm? I nearly fell off my chair 😂


I wrote a small one in 1993 as a toy, put it on a website at xs4all.nl.

Trained on texts from Project Gutenberg, you would give it a prompt string in a textarea, select trained models (texts) in subject areas from pulldown menus, and it would perform a completion for a paragraph or two, using weight models (or combined models) with a sliding context window of 10 and a small attention layer. It would create amusing paragraphs.

It's not very hard. It was a small implementation of a Markov chain traversal system, which is congruent to the current models, only tiny.
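(For the curious, here is a minimal word-level Markov-chain completer in modern Python: the same family of idea, though the window size, corpus, and details below are illustrative, not a reconstruction of that 1993 system.)

```python
import random
from collections import defaultdict

def train(text: str, order: int = 2) -> dict:
    """Map each `order`-word context to the words observed to follow it
    (a simple word-level Markov chain)."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        model[tuple(words[i:i + order])].append(words[i + order])
    return model

def complete(model: dict, prompt: str, length: int = 30, order: int = 2) -> str:
    """Continue a prompt by repeatedly sampling a next word for the
    current sliding context window."""
    out = prompt.split()
    for _ in range(length):
        choices = model.get(tuple(out[-order:]))
        if not choices:          # unseen context: stop
            break
        out.append(random.choice(choices))
    return " ".join(out)

# Toy usage: train on any plain-text corpus (e.g. a Project Gutenberg file)
corpus = "the cat sat on the mat and the cat saw the dog sit on the mat"
print(complete(train(corpus), "the cat"))
```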


That was what I suspected (and hoped to draw out from you) - it was a (non-large) language model. Thanks for confirming and sharing; and I admire folks who have been at this game for a long time. Apologies for the tone of my earlier comment; I took a calculated risk and was rewarded with rich detail 😉.

And I’m just blown away by what one extra letter L can mean when attached to the phrase, “language model”. I marvel at what it takes to make one available: about USD20 million capex, a small army of elite engineers and scientists (likely with generous profit sharing plans) and a mad, complex pipeline involving huge amounts of compute and storage.

To me it’s like comparing a paper airplane (language model) with a large wide-body passenger airliner (LLM).


Simon, I never take offense. I was among the first users of USENET over 50 years ago; I know text is tricky.

Here are recent examples:

Suzanne - in examination of an idea she had about “mother as artist” and feminist perspectives, 50 different authors, essay and story form with illustration, took about 2h to generate

https://www.dropbox.com/scl/fi/cmnmgpmn4a09zs6wqekqq/Mothers_203040full.pdf?rlkey=cjn1c6rz274p7sgrmz8imyr5u&dl=0

Here’s a complex long-form refutation of gender by the high priestess of Gender, in 3 versions - Judith Butler, Nancy Drew, and Michel Foucault - plus Isabel Allende (Chilean Spanish) and Assis from Brazil. I didn’t force hard vocabulary style control.

https://www.dropbox.com/scl/fi/5vwtrgnc06kipela6muug/Butler_112233full.pdf?rlkey=39zh8v4p2dsluni0xznlt1k8j&dl=0

https://www.dropbox.com/scl/fi/l64vkkcs9q1z3hzt5tyi1/ButlerDrew_112233full.pdf?rlkey=d1k8968emvlmvkhw9i6s1hm61&dl=0

https://www.dropbox.com/scl/fi/11z67n1vrpj2rdb1adhj5/ButlerFoucault_112233full.pdf?rlkey=ehmkzggyowztnsazgwsw92qbp&dl=0

Allende

https://www.dropbox.com/scl/fi/1usbiqiltdgpg2etha17x/ButlerAllende_112233full.pdf?rlkey=yuv8pjacvsvx8fq7sejlq7qje&dl=0

Assis

https://www.dropbox.com/scl/fi/a1w3f3lgnku1zlxowoiik/ButlerAssis_112233full.pdf?rlkey=lq6qoadvdgkixbshtlej5hqio&dl=0

Here’s a children’s story collection by the master of Macabre

Gorey

https://www.dropbox.com/scl/fi/9rs1qzkytoay9be2liyam/GettingBetterGorey_203040full.pdf?rlkey=31pouz64vqt3m5un0mthqnn74&dl=0

I have 50,000-60,000 recent ones, the most recent erotica are novel-length nonfiction of marauding pirates with illustrations and lewd pirate songs scattered throughout.

All images are generated from section-by-section summarized instructions, then run through a classifier to eliminate deformities, and linked back into the chapter.

I also have a live Tarot Card reading system with the voice of Lorne Greene, website;


Thank you for this summary. I'm so curious as to why there are no well-funded alternative approaches.


The source of the hallucinations is you, human.


The prospect of people becoming mere fact-checkers for AI is remarkably dehumanizing.

I have yet to see a product of AI that isn't derivative, pedestrian drivel in an unctuous voice of fake sincerity.

The AI gold rush is this year's cryptocurrency--a new recipe for irresponsibility and falsehood.


> Gary Marcus desperately hopes the field of AI will again start to welcome fresh ideas.

And I see it as a Human Language Project for all humankind.


Similar to the Human Genome Project - BTW, a genome is now used as a language, e.g. Gena-LM - https://www.youtube.com/watch?v=AygUdMl8ils - it's a neural-semantic approach. The Human Language Project is a global language platform that combines Wiktionary, Panlex, and a semantic multilingual model - activedictionary.com


The thing is though, AI automation of the white collar world is upon us. It may be a good idea or a bad idea, we may like it or hate it, be confused or clear, enthusiastic or bored. Whatever our personal situation, AI automation of the white collar world is still going to proceed. And it will proceed for the same reasons agriculture was mechanized and factories went robotic. This process of automation is now more than a century old, at the least. Our opinions on the current automation transition don't really matter, because we have little power to change the course of history.

I've been yelling about the overheated knowledge explosion for years now. Even if all my rants were published on the front page of the New York Times, it wouldn't make a bit of difference. Such things are bigger than any of us. They're bigger than all of us.

We are entirely within our rights to yell about AI. But doing so makes about as much sense as yelling at the weather. What does make sense is trying to figure out how we're going to adapt to the inevitable.
