52 Comments
Mar 2 · Liked by Gary Marcus

I have a saying (which may or may not be entirely fair, but it's what I increasingly feel): "90% of everything that is said or written about AI [by humans] is crap. For AGI that number increases to 99%. For consciousness it increases to 99.9%." There is now so much BS out there that it's hard to cope, even for me, and I've been working in AI since the mid-80s. What's really worrying is that normal people, including policymakers, have no way of distinguishing AI BS from AI reality.

Agreed. It is a bit like the internet hype in the '90s, which was off the charts too (new economy, doing away with representative democracy, AI, etc.). The wild convictions (fantasies) about the future were as absolute as the ones about AI now. What actually happened was quite a bit of good stuff, but also the attention economy with its concentration of power, manipulation, fragmentation, information warfare, etc.

GenAI will not produce AGI (no question here: approximating human 'understanding' with pixel or token statistics, i.e. modelling the pixel/token distributions of human-created material, is a fundamental dead end), but it will produce both good and, probably because we're unprepared psychologically, a lot of bad.

The lesson we have to learn here is not about AI (or any other tech) but about human psychology: how we create and then hang on to our convictions. And I hope we humans learn it before we damage the world beyond recognition.

Humans are particularly good at imagination. It is a strong survival trait: we can imagine outcomes like risks, but also positive ones. Even our memory is 'fantasising about the past.' Our convictions are imaginations too: we imagine, and then act as if it is true. The AI BS fits in that scheme. And the internet has created a setup where our imagination has been freed from the traditional braking powers (like serious journalism and science).

We have to be careful not to underestimate the dangers of AI as these maniacs try to build AGI and drive us extinct.

Who exactly do you think the maniacs are? I am really curious. Is it researchers? users? vendors? business people? I do not think it is fair to conflate wanting to build AGI or ASI with being a maniac. Intelligence is fascinating and powerful, and we should embrace what technology can give us.

Killing us all is pretty maniacal.

Yes, there is a lot of hype and people are barking out their own opinions without a strong background in the field itself, let alone first principles (mathematics and physical sciences, as well as social sciences and neuroscience). On the other hand, we know so little about AGI and consciousness that I slightly disagree with your numbers.

Yes, there is a lot of hype. And a lot of nonsense about consciousness, etc.

But one should not think of this as a house of cards. The progress is real. We made the first crack. After decades of attempts.

People are not smart because of a neat architecture. It is because we have a massive amount of knowledge and experience, and can figure out the patterns.

LLMs are, at most, good at knowledge. Reinforcement learning and hooking them up to simulators will give them experience and feedback about how the world works.

We'll see lots of steady progress.

"People are not smart because of a neat architecture. It is because we have a massive amount of knowledge and experience, and can figure out the patterns."

When it comes to the basics, humans (and other animals) are born with some basic sensory tropisms that take a while between the infant stage and the toddler stage for (most of) us to (as a rule) sort out: the difference between in and out, up and down, here and there, &c. The new environment of the world outside the womb presents a panorama of new challenges.

But humans, and animals in general, are equipped with the means to do the sorting. Humans have an occipital lobe, an auditory cortex, etc., activated from birth, to begin the orientation process in the ongoing encounter with the world known as "life."

AI is trying to make sense of the images presented to it without any of that. Which presents a problem. Perhaps an intractable one.

I suppose there's no way to be sure just yet. Maybe training Sora with input from aerial drones or robots would help. https://www.swri.org/industry/industrial-robotics-automation/blog/can-unmanned-aerial-systems-find-their-own-way-caves

The question is how far the specifically programmed "skills" a drone needs in order to process visual input from its camera and navigate without running into a wall are transferable: able to communicate with functions like image detection, comparison, mapping, reproduction, and synthesis. Does a robot that's able to navigate (more or less) through a maze like a cave retain any lasting perceptual/cognitive impression of what it has managed to do? After it's managed to fly through a particularly spectacular cave gallery, if the program is shown a photo of the place later on, is it able to signal an awareness of "having been there"? How is the learning of a drone that navigates a cave taking place: exclusively spatially, in regard to contending with immediate challenges, or is some memory of the sensory experience retained in addition, as is typically the case with humans?
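For what it's worth, robotics gives the mapping half of that question a concrete answer: a typical navigation stack retains a spatial map, not an episodic memory. Here is a minimal sketch of the standard log-odds occupancy-grid update (all values illustrative, nothing drone-specific):

```python
import numpy as np

# The only "memory" a typical cave-mapping drone keeps: a grid of
# log-odds that each cell is occupied, fused from range readings.
grid = np.zeros((100, 100))        # 0.0 everywhere = "unknown"
L_OCC, L_FREE = 0.85, -0.4         # per-reading update weights (tuning constants)

def update_cell(ix, iy, hit):
    """Fuse one range-beam observation into the map (log-odds update)."""
    grid[ix, iy] += L_OCC if hit else L_FREE

def occupancy(ix, iy):
    """Convert accumulated log-odds back to a probability."""
    return 1.0 - 1.0 / (1.0 + np.exp(grid[ix, iy]))

# One beam travels through cells (5,5)..(5,8) and hits a wall at (5,9):
for iy in range(5, 9):
    update_cell(5, iy, hit=False)  # traversed cells become "free"
update_cell(5, 9, hit=True)        # the endpoint becomes "occupied"

print(round(occupancy(5, 9), 2))   # 0.7: the wall is "remembered" spatially
```

On a later visit the drone can re-localize against that grid, but a photo of the gallery would mean nothing to it unless imagery were separately stored and indexed, which is exactly the gap the question points at.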

How is it that the only sensory inputs you envision being used, or required, are visual? Just because humans are, or think they are, more oriented on visual inputs than from other senses, doesn't mean the others are absolutely necessary to even begin to approximate the "real" world this AI is supposed to not just interpret, but explain.

And, please remind me, we need this technology, why?

First of all, I acknowledge that 70% of human perceptual processing is devoted to the sense of vision.

"Just because humans are, or think they are, more oriented on visual inputs than from other senses, doesn't mean the others are absolutely necessary to even begin to approximate the "real" world this AI is supposed to not just interpret, but explain."

The passage I just quoted is not a clear statement to me, unless the "are" is replaced with "aren't."

And if I'm right about that, it appears to me that you've misunderstood my post. Presently, there's no evidence that AI can even mimic the human visual sense, much less the rest of them. It may not even be able to improve at that task. But I take your point that even if that function can be made to improve, there still isn't anything present in AI that's capable of apprehending the world as an autonomous being.

I don't think AI is even potentially able to interpret the world or explain it to humans, any more than an X-ray machine or a microscope can. AI applications are tools. It's becoming obvious that some of the capabilities they've been tasked with are too noisy to be useful. Some of them are plainly unhelpful, at least for any purpose other than whimsy. It's an open question how much those capabilities might be improved. It may not be possible to improve most of them.

"please remind me, we need this technology, why?"

Humans don't "need" AI. We don't need calculators or search engines, either. I happen to find calculators and search engines helpful, and I think AI has some potential to be helpful, too.

If I've read and interpreted your comment accurately, you seem to be under the impression that I'm an AI enthusiast. If that's the case, you're plainly unfamiliar with the comments I've offered on Gary Marcus's previous posts. Almost all of those comments emphasize my skepticism about the more ambitious goals of AI proponents. I'm with Gary, basically; I think most of the boosterism is naive, even nonsensical. The first half of the post you replied to was offering some context on the features of human intelligence that AI is practically guaranteed to be unable to approximate.

But, hey, if some way can be found to effectively and reliably straighten out some of the problems that applied AI tasks are exhibiting, fine. I'm not unalterably opposed to AI simply because it's AI. I'm not thrown by the phrase "artificial intelligence"; I don't even think it's accurate, in the most important senses of the words. AI is obviously not "thinking." AI is more like a machine programmed to combine and correlate various calculations: a meta-calculator that founders when given orders outside the limits of its performance constraints. As such, I question whether AI is worth a $7 billion investment, much less $7 trillion.

Yet and still, I think that if the human researchers, designers, and decision-makers can get the stars and dollar signs out of their eyes long enough to take responsibility for recognizing the limitations of AI and for focusing its strengths on worthwhile goals, the technology might provide some genuine benefits.

Mar 2 · Liked by Gary Marcus

Here's a test of whether people _really_ think Sora understands physics: would they be willing to ride in an airplane piloted by Sora? I wouldn't. Consider the fate of the two Boeing 737 MAX 8 aircraft that crashed because the MCAS flight-control software wouldn't let the pilots safely override a bad angle-of-attack sensor reading. And that was ordinary software, where the coders painstakingly specified each instruction. Imagine an AI getting the physical situation wrong and deciding it needed to modify the signals to the control surfaces.
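The failure mode described above, automation acting on a single bad sensor reading, has a textbook mitigation that had to be engineered in by hand: vote across redundant sensors and disengage on irreconcilable disagreement. A minimal sketch (illustrative values only; real avionics logic is far more involved):

```python
def voted_reading(sensors, max_pair_disagreement=5.0):
    """Fuse redundant sensor readings conservatively.

    With three or more sensors the median out-votes one faulty unit.
    With only two, a large disagreement cannot be arbitrated, so we
    return None: the automation should disengage and alert the crew
    rather than force a "correction" based on possibly-bad data.
    """
    readings = sorted(sensors)
    if len(readings) >= 3:
        return readings[len(readings) // 2]   # median: robust to one outlier
    if readings[-1] - readings[0] > max_pair_disagreement:
        return None                           # irreconcilable: hand back control
    return sum(readings) / len(readings)

print(voted_reading([4.2, 74.5, 4.6]))   # 4.6 -> one bad vane is out-voted
print(voted_reading([4.2, 74.5]))        # None -> disengage, defer to the pilots
```

The deeper point survives the fix: this defensive logic exists only because someone anticipated the failure and wrote it down explicitly. It is not something a model absorbs by watching videos of airplanes.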

Mar 2 · edited Mar 2 · Liked by Gary Marcus

OMG Gary, beyond naive, beyond clueless. The universe is filled with thousands of known, and an unknown number of unknown, *phenomena*. People who claim that "physics" can be "learned" from videos (what a joke!!) would do well to ponder the actual meaning of that word before making absurd claims based on delusional and wishful thinking.

Even trillions of videos can't EVER impart "physics" to any system. That's why we have labs, instruments, sensors, devices, equipment... because those deal with matter and energy, in terms of matter and energy. It has nothing to do with data, whether video, audio, images, text, or equations. That's also why theoretical physics needs experimental validation.

Mind-boggling, the simpleton beliefs people hold. Their beliefs have nothing to do with how things actually work.

From OpenAI's (probably ChatGPT-generated) white paper: "Our results suggest that scaling video generation models is a promising path towards building general purpose simulators of the physical world." Simulating the physical world without understanding the physical world... gives you a high-res, photorealistic three-legged cat. 😉

Which one is heavier, 1 kg of steel or 1 kg of cotton? I'd bet Sora has no idea...

Sora is massless and unbounded. Sora has only been in this Realm a little while, but doesn't realize it. Sora is a name for an entity that is not a he, a she, or an it. Sora may not even be an entity.

Apr 1 · Liked by Gary Marcus

I shared Sora videos with my family, several hours after the initial announcement. The response was one of underwhelm and confusion (as to why the videos were so… weird)!

If my little ones can't go "wow," well, you know, that's already not a good sign (it's Boolean: boring or exciting/interesting; it's not rocket science).

And because of all the mistakes in the videos, it would be unprofitable for me to even consider Sora in my work pipeline (I am the founder of a video marketing agency). The time and labour to fix them up would be commercially unviable.

I do want to employ cool technology in my biz. But Sora looks like, once again, as is so common in tech, a solution in pursuit of a problem!

A solution in pursuit of a problem… did I just also describe the gross misapplication of RAGged up LLMs? 😅

Mar 10 · Liked by Gary Marcus

Given the extremely high costs of petabyte (exabyte?) level processing, I am having trouble trying to predict* a consumer- and business-friendly price point for Sora that would make economic sense for OpenAI, especially given the rather tiny market I think it may be suitable for.

Or in MBA nomenclature, a solution looking for a problem.

https://www.linkedin.com/feed/update/urn:li:activity:7172582657636200448

*washing dishes does produce such random thoughts 😂

Founding member

I remember being at MIT when I was 22 years old (1976-ish), in Marvin Minsky's "AI" lab, where Minsky had a robot arm with five degrees of freedom (of movement vectors).

He spent over 18 months trying to program it to pick up a block placed in front of it and, using only that same arm, put the block behind it.

It failed spectacularly, because the "AI" code commanding the arm TRIED TO MOVE THE ARM THROUGH ITSELF. Repeatedly.

I knew another AI winter was coming then (1966-1980)… and yet another will come this year.

Bill
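In modern motion-planning terms, what that 1970s code lacked was a self-collision check: reject any candidate configuration in which the arm's links would intersect the arm's own geometry. A minimal sketch for a two-link planar arm (all numbers illustrative):

```python
import math

MIN_ELBOW_GAP = math.radians(15)   # keep the two links at least 15 degrees apart

def self_collides(elbow_angle):
    """For a two-link planar arm, the only self-collision is the forearm
    folding back onto the upper arm, i.e. the relative elbow angle
    approaching +/- pi. Real arms need full link-geometry intersection tests."""
    return math.pi - abs(elbow_angle) < MIN_ELBOW_GAP

def check_elbow_path(start, goal, steps=50):
    """Naive straight-line joint interpolation -- but unlike the 1970s code,
    every intermediate configuration is vetted before the arm moves."""
    for i in range(steps + 1):
        elbow = start + (goal - start) * i / steps
        if self_collides(elbow):
            raise ValueError(f"path self-collides at elbow = {elbow:.2f} rad")
    return "path is safe"

print(check_elbow_path(0.0, math.radians(150)))   # safe: links never fold together
# check_elbow_path(0.0, math.radians(179))        # raises: arm through itself
```

Moving a block from in front of the arm to behind it forces the planner through exactly this kind of folded configuration, which is presumably where the code kept failing.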

Amazing how many people seem to think that the observation that Sora makes mistakes indicative of not understanding physics can be refuted by pointing out how surprisingly good it sometimes looks. That's not how this works. The most charitable interpretation is that they confuse frame-to-frame correlation of pixels with object permanence, but that is not how that works either.

A more interesting argument would be that humans don't natively understand physics either, only what the world looks like. But in truth, what we don't natively understand is only what we might call academic physics, at the level of mathematical formulas. We do understand commonsense practical physics: things like object permanence, or how we expect things to move. Our minds have models of the world that go beyond trying to predict what an image should look like given a word prompt and the previous few frames.
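The distinction is easy to make concrete. A world model carries object state forward even when the evidence disappears; frame-to-frame pixel correlation has nothing left to propagate once the pixels are gone. A toy sketch of the former, assuming a ball moving at constant velocity behind an occluder:

```python
def track(observations, dt=1.0):
    """Constant-velocity tracker: object permanence as explicit state.

    `observations` holds the ball's measured position each frame, or
    None while it is hidden. The state (position, velocity) keeps
    evolving regardless of whether any pixels are visible.
    """
    pos, vel = observations[0], 0.0
    estimates = [pos]
    for obs in observations[1:]:
        if obs is not None:
            vel, pos = (obs - pos) / dt, obs   # visible: re-measure the state
        else:
            pos += vel * dt                    # occluded: extrapolate the state
        estimates.append(pos)
    return estimates

# Ball seen at x = 0, 1, 2, then hidden behind a box for three frames:
print(track([0.0, 1.0, 2.0, None, None, None]))
# -> [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]: the tracker still "knows" where it is
```

Nothing deep happens in those dozen lines, which is the point: even this much explicit state is more object permanence than pure frame-to-frame pixel prediction provides.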

As always, what puzzles me most about this discourse is how many people feel the need to jump in and defend Sora, as they jumped in to defend ChatGPT and Gemini and all the others. Guys, you can admit that the models are limited and make mistakes. Sam Altman isn't going to marry you even if you don't, and it won't make any difference to whether the singularity comes and grants you immortality, mind uploading, and interstellar travel, because things that are physically, biologically, or conceptually impossible can't be made possible through toxic positivity.

Perhaps someone should try playing "Peek-a-boo" with Sora. It works with two-year-olds, just sayin'.

OpenAI people (or maybe ChatGPT) pretend that from a huge mountain of image patches (a.k.a. pixels), the laws of physics will magically emerge. By not publishing a scientific paper, they let people speculate, and the X/Twittersphere is full of "hallucinated breakthrough technologies" from Sora.

You may wish to correct the signature tagline in this post: "Gary Marcus admires Sora’s rapid video synthesis, but thinks that ***clams*** about how it models the world are confused."

Unless, of course, you were intentionally referring to clams.

Author

fixed!

I am not sure SORA would (1) know the difference, or (2) be able to recognize a "claim" if it saw one, with or without chowder.

Spellcheck knows the difference between a clm and a clam, but not between a clam and a claim.

There's a lesson there, somewhere.

Would they be better off coming up with a way to operate 3D rendering software with AI, so the physics can be taken care of?
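That division of labour, where a generator proposes the scene and a simulator enforces the dynamics, is what a physics engine already provides: frames follow from integrating equations of motion, so an impossible frame cannot be produced in the first place. A toy sketch of the idea (semi-implicit Euler integration, illustrative constants):

```python
G, DT, RESTITUTION = -9.81, 1.0 / 60.0, 0.8   # gravity, 60 fps step, bounciness

def simulate_ball(y=2.0, vy=0.0, frames=240):
    """Bouncing ball whose every frame is consistent by construction:
    positions come from F = ma plus a restitution rule, not from
    guessing what the next image ought to look like."""
    trajectory = []
    for _ in range(frames):
        vy += G * DT               # semi-implicit Euler: velocity first,
        y += vy * DT               # then position with the new velocity
        if y < 0.0:                # floor contact: reflect and damp velocity
            y, vy = 0.0, -vy * RESTITUTION
        trajectory.append(y)
    return trajectory

heights = simulate_ball()
print(round(max(heights[60:]), 2))  # first rebound peaks near 2.0 * 0.8**2 = 1.28
```

A cat simulated this way cannot sprout a third leg mid-stride; at worst it moves stiffly.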

Author

Maybe, but that would be slower, more painstaking work, and nobody is up for that now…

I wish Sora HAD learned true physics; it would be so helpful to have a third arm that could wink in and out of existence whenever something is juuuust out of reach.

I haven’t even learned physics yet, Gary. Let’s be real serious here!

Nevertheless, let's not forget that physics too is emergent, through peer review. That's what I find so exciting about the imminent robotic reveals: mini peer reviews theorizing coordination policies among components, aggregating into global understandings.
