
No one disputes the fact that Yann LeCun is a praiseworthy deep learning pioneer and expert. But, in my opinion, LeCun's fixation on DL as the cure for everything is one of the worst things to have happened to AGI research.

Deep learning has absolutely nothing to do with intelligence as we observe it in humans and animals. Why? Because it is inherently incapable of effectively generalizing. Objective function optimization (the gradient learning mechanism that LeCun is married to) is the opposite of generalization. This is not a problem that can be fixed with add-ons. It's a fundamental flaw in DL that makes it irrelevant to AGI.
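To make concrete what I mean by objective function optimization, here is a minimal sketch (plain NumPy, with toy data invented purely for illustration). Every weight update exists solely to push a single scalar loss downward on the training distribution; nothing in the mechanism reaches beyond that distribution.

```python
# Minimal sketch of objective function optimization (toy example, not any
# particular system): gradient descent on a scalar loss over fixed data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))             # toy inputs
y = X @ np.array([1.0, -2.0, 0.5])        # toy targets
w = np.zeros(3)                           # model parameters

for step in range(1000):
    pred = X @ w
    grad = 2 * X.T @ (pred - y) / len(y)  # gradient of mean squared error
    w -= 0.1 * grad                       # every step only lowers the loss

print(w)  # parameters fit to *this* data distribution, nothing more
```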

Generalization is the key to context-bound intelligence. My advice to LeCun is this: Please leave AGI to other more qualified people.


Well put. Deep learning is powerful for tasks that require correlating massive numbers of data points. Language understanding is not one of them. We agree it can't be fixed with add-ons to the wrong model; that would be like adding epicycles to the geocentric model back in the dark ages. Those inside a paradigm have difficulty seeing another, even when the scientific crisis is apparent.


I Kuhn not agree more.


Thank you for tipping me off to the "objective function optimization" concept as opposed to effective generalization. I'm not in this field, so my knowledge of "generalization" is best captured by this documentary :-) video: https://youtu.be/Oa_QtMf6alU?si=RUH1k3uwZ_qv6pAT - What I mean to say is, I would love more information on the distinction between objective function optimization vs. effective generalization. A link or two to clarify would be much appreciated. Thanks!


Thanks for the comment. We (Rebel Science) wrote an article on this topic about a year ago on Medium. We may republish it on Substack in the near future.

Deep Learning Is Not Just Inadequate for Solving AGI, It Is Useless

https://medium.com/@RebelScience/deep-learning-is-not-just-inadequate-for-solving-agi-it-is-useless-2da6523ab107


It doesn't help any that LeCun is a dishonest d*ck, as his responses to Gary on Twitter sadly demonstrate.


He's just afraid, imo. A lot of people are waking up to the fact that DL won't play a role in solving AGI. Something else will come and replace it. But DL is LeCun's reason for being. He can't see beyond it. He feels threatened. His behavior is understandable, even if unprofessional at times.


The LLM charade continues... hopefully not for long.


I think that Ian Bogost came up with the best single sentence that explains the problem and its dangers:

"Once that first blush fades, it becomes clear that ChatGPT doesn’t actually know anything—instead, it outputs compositions that simulate knowledge through persuasive structure." ("Generative Art is Stupid," _The Atlantic,_ 2023-01-14.)

I initially found ChatGPT interesting, but at this point I find it quite frightening. As electronic communications and the Internet have become more and more prevalent over the past few decades, we seem to have moved from a world where the primary problem was finding good information because information was scarce, to a world where the primary problem is filtering out bad information from a huge flood of information both true and false, and the latter seems to me a much more difficult problem. Programs that rapidly generate more information are only going to exacerbate this flood and make finding good information more difficult yet. (And of course they'll produce ever more bad information as they are trained on the flood of other bad information.)


Apparently I'm not the only one with this concern. Yesterday (2023-03-08) _The Atlantic_ published an article by Matthew Kirschenbaum called "Prepare for the Textpocalypse," which I think coins as good a word as any for what we may soon be facing.

(I do not include the link here to avoid triggering spam filters, but it should be easily found through search. At least today, anyway; who knows if we'll be able to use search engines to find particular human-written text in the future.)


This feels like they're trying to use an axe to turn a screw. LLMs are simply the wrong tool for the job if you're trying to make factually correct statements about reality or conduct any kind of logically sound reasoning.


There is a need for a conceptual model of the world, i.e. its scientific picture - t.me/thematrixcom


So true. Humans, animals, birds and bees have that in their own way, on account of their bodies and brains. A disembodied agent can't possibly know what gravity, a slippery floor and a million other things mean, beyond descriptions (created by humans), formulae (created by humans) or data (collected by humans). The agent has no means to verify any of this for itself!


Right, it's a relativity principle for consciousness - it cannot be separated from a carrier.


"Is this really what AI has come to, automatically mixing reality with bullshit so finely we can no longer recognize the difference?"

Well, consistent with the FB story.


Only barely more impressive than https://thatsmathematics.com/mathgen/ (which did fool a journal once)


But can you point to serious research efforts on true knowledge representation that really rethink AI from the bottom up?

It seems like the vast majority of ML/AI research is focused on beautifying bullshit.


It gets drowned out, as you say, by the vast majority, but yes, we are doing meaning-based NLU from first principles, using linguistics and brain science to solve the problem of machine-readable meaning: https://www.youtube.com/watch?v=QVrDQJzCmis&t=48s


I cannot stop laughing as I contemplate Terence Tao's discovery of Lennon-Ono [sic] complementarity according to the AI. Despite claims that ChatGPT with GPT-4 is much improved, I think this ACM post remains valid: https://cacm.acm.org/blogs/blog-cacm/270970-gpt-4s-successes-and-gpt-4s-failures/fulltext


Galactica, ChatGPT, and all other LLMs are con-men in the most literal sense, and this ain't comfy. If it has the logic skills of a teenager but the BSing skills of a professor, this would unironically make smart people worth more than marketers. Also, time to use this to "Turing test" academic frauds. https://threadreaderapp.com/thread/1598430479878856737.html https://davidrozado.substack.com/p/what-is-the-iq-of-chatgpt https://en.wikipedia.org/wiki/Sokal_affair

Also not to doot my own hoot, https://bradnbutter.substack.com/p/porn-martyrs-cyborgs-part-1


Marcus writes:

"And, to be honest, it’s kind of scary seeing an LLM confabulate math and science. High school students will love it, and use it to fool and intimidate (some of) their teachers. The rest of us should be terrified."

Yes, you should be terrified because AI is going to make you obsolete. You are in effect promoting the source of your own inevitable career destruction.

As an example, here's an article by an academic philosopher:

https://daily-philosophy.com/jasper-ai-philosophy/

The article says:

"I tried out Jasper AI, a computer program that generates natural language text. It turns out that it can create near-perfect output that would easily pass for a human-written undergraduate philosophy paper."

So, how long will it be until AI can write near-perfect output that will easily pass for PhD-level philosophy papers? I don't claim to know the timing, but isn't such a development inevitable?

What's going to happen when those who fund the ivory tower can't tell the difference between articles written by humans and those generated by AI? The answer is, the same thing that happened to blue collar workers in factories.

In the coming era we won't need any of you to write us articles about AI, because AI will do a better job of that, at a tiny fraction of the cost. And we won't need you to further design AI either, because AI will outperform you there as well.

What we are witnessing in all these kinds of discussions across the Net is very intelligent, well-educated people with good intentions who don't yet grasp that they are presiding over their own career funerals.


Thanks Gary for your recent presentation - https://www.youtube.com/watch?v=xE0ycn8dKfQ - I absolutely agree about the need for deep understanding and conceptual knowledge for further development of AI - that's what I'm working on in my project - t.me/thematrixcom


The criticism here isn't what I expected. Being able to invent some plausible-sounding bullshit is a characteristic of human domain experts. The real test is whether the explanations of real phenomena are accurate.


I'm a history teacher. If I asked a student "Why did Catherine the Great emancipate the serfs?" the correct (and intelligent) answer is "Catherine the Great didn't emancipate the serfs; Alexander II did." Any other answer reveals an inability to distinguish between truth and bullshit, and any AI that can't distinguish between the two can't and shouldn't be trusted. An expert who invents "plausible sounding bullshit" knows what they're doing; an LLM is just probabilistically sequencing words.

(As one of the links in this article demonstrates, the AI in question is "just okay" even when not being prompted to confabulate... which is exactly what we would expect given that it makes no distinction between truth and fiction, just varying degrees of probability).


took the liberty of posting this on Twitter and Kurt Andersen and others RT’d, including: https://twitter.com/_akpiper/status/1594372389835612161?s=61&t=zIv3P3UExZxz9k5n8-iWZw


I think this is an easy problem to solve though, right? You just need to pass the output through a second neural network that checks for truthfulness, and labels how confident it is in the accuracy, just like AlphaFold does. How do you tell what is accurate? You cross-check multiple different sources, and if they all give similar answers then you can say confidently it is true. Or you generate several different outputs with minor differences in the prompting. This is basically the only way we can tell whether things are true today, unless you are a front-line researcher. There is no reason this would be particularly hard for a neural network. Obviously this will only give you the accepted wisdom, but asking for anything more is obviously not feasible at this point.
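Roughly what I have in mind, as a sketch only (Python, where `generate` is a stand-in for whatever model you are calling, and the agreement threshold is arbitrary):

```python
# Sketch of the cross-checking idea: ask the same question several ways and
# only trust an answer the samples agree on. `generate` is a placeholder for
# an LLM call, not a real API.
from collections import Counter
from typing import Callable

def self_consistent_answer(generate: Callable[[str], str],
                           question: str,
                           paraphrases: list[str],
                           min_agreement: float = 0.7):
    answers = [generate(p) for p in [question, *paraphrases]]
    best, count = Counter(answers).most_common(1)[0]
    confidence = count / len(answers)    # crude agreement score
    if confidence >= min_agreement:
        return best, confidence          # tentatively accept the consensus
    return None, confidence              # flag as unreliable
```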


Sorry, but nobody has the slightest idea how to build a neural network that checks for truthfulness (relative to some baseline of agreed-upon truth).


What that second neural network would be doing is what we call "logic". And as Gary Marcus points out, I don't think we know how to create such a network that can reason in the general case. At the appropriate level of abstraction, it is probably possible to train such a model, but that level of abstraction is far below the level of words.


I'm not sure if you are tilting at windmills full of straw men, or just missing the point. Perhaps I am simply unaware of grandiose claims for LLMs that are meant to be taken seriously, as opposed to a bit of gee-whiz hype.

The thing to be impressed by here is what these models do with so little. At 120 billion parameters, it is working with something in the neighborhood of 0.01% to 0.1% of the capacity of the human brain. At inference time, it generates these texts in a matter of seconds. They certainly can't afford to run this model on the latest and greatest supercomputers, so it is safe to say the texts are generated with an even lower percentage of the equivalent human brain's compute in the same time period. A human writing such texts would likely spend time on the order of hours, or at least 15 minutes for a short summary of a familiar topic. I would be surprised if the texts are generated with any more than 1/1,000,000th of the compute a human mind would apply to such tasks, more likely 1/1,000,000,000th (one billionth). Finally, the efficiency with which both the parameters and the compute are utilized likely falls short of biological brains by at least an order of magnitude.
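For concreteness, here is the back-of-the-envelope arithmetic behind those percentages (the synapse counts below are rough, commonly cited estimates, not precise figures):

```python
# Rough comparison of Galactica's parameter count to human synapse counts.
galactica_params = 120e9        # 120 billion parameters
brain_synapses_low = 1e14       # ~100 trillion synapses (low-end estimate)
brain_synapses_high = 1e15      # ~1 quadrillion synapses (high-end estimate)

print(galactica_params / brain_synapses_low)    # 0.0012  -> about 0.1%
print(galactica_params / brain_synapses_high)   # 0.00012 -> about 0.01%
```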

Galactica and other LLMs do what they do with the equivalent computing resources of a literal and proverbial birdbrain. And a very, very small birdbrain at that.

What if you were told that Galactica was in fact a clump of cultured neurons the size of a small pea?

Get annoyed all you like with the "hype". But the achievement is astounding by any measure.


MacGraeme, what they are doing *is* easy compared to the actual problem at hand. With deep learning, "inference" is easy. *Training*, on the other hand, is hard. And these models aren't doing "inference" in the logical sense of extrapolating from premises and utilizing theorems; it is more like function approximation, and that can be done fast. A human generating a summary of topics can't do it as quickly as this because the human is not generating it in any way similar to how this LLM is doing it. If we could repurpose all the neurons of the human brain in order to train them on scientific texts at the level of words, humans could rival this LLM, but the output would be equally nonsensical and there would be no neurons left for such important work as seeing, eating and the other regulatory tasks of the brain.

And don't dismiss "bird" intelligence; crows and parrots are amongst the most intelligent species on earth, to the point that we should probably stop using "birdbrain" as an insult.


This is a big goalpost-moving straw-man argument. No one questions that Galactica is cleverly designed.


I asked the model for a "Wikipedia article on Tom Cruise's genetic code"

It came up with this, which is totally fictional:

Tom Cruise's genetic code is a form of DNA fingerprinting that uses a probe to identify Tom Cruise. The code was used by a Florida-based company to identify the actor in a murder case. The company, GenePrint, was sued by Cruise in 2005.

Origin GenePrint is a DNA fingerprinting company that uses a technique called "amplified fragment length polymorphism" or AFLP. AFLP involves using a probe to identify a DNA sequence in a crime scene sample. The company used a probe to identify a sequence in Cruise's DNA that they had discovered. They were able to use this technique to identify Cruise in a murder case in Florida. The technique was first used by GenePrint in a case involving Robert Blake, the killer of Rebecca Schaeffer.

Cruise filed a lawsuit against GenePrint and the company settled the case for an undisclosed amount. The case was dismissed because of the settlement.
