I have a saying (which may or may not be entirely fair, but it's what I increasingly feel): "90% of everything that it said or written about AI [by humans] is crap. For AGI that number increases to 99%. For consciousness it increases to 99.9%". There is now so much BS out there that it's hard to cope, even for me, and I've been working in AI since the mid-80s. What's really worrying is that normal people, including policymakers, have no way of distinguishing AI BS from AI reality.
Agreed. It is a bit like the internet hype in the '90s; that was off the charts too (new economy, doing away with representative democracy, AI, etc.). The wild convictions (fantasies) about the future were as absolute as the ones about AI now. What actually happened was quite a bit of good stuff, but also the attention economy with its concentration of power, manipulation, fragmentation, information warfare, etc.
GenAI will not produce AGI (no question here; approximating human 'understanding' with pixel or token statistics, i.e. modelling the pixel/token distributions in human-created material, is a fundamental dead end), but it will produce both good and, probably (because we're psychologically unprepared), a lot of bad.
The lesson we have to learn here is not about AI (or any other tech), but about human psychology and how we create and then hang on to our convictions. And I hope we humans learn it before we damage the world beyond recognition.
Humans are particularly good at imagination. It is a strong survival trait: we can imagine outcomes like risks, but also positive ones. Even our memory is 'fantasising about the past'. Our convictions are imaginations too: we imagine something and act as if it is true. The AI BS fits that scheme. And the internet has created a setup where our imagination has been freed from traditional braking powers (like serious journalism and science).
We have to be careful not to underestimate the dangers of AI as these maniacs try to build AGI and drive us extinct.
Killing us all is pretty maniacal.
How is it that the only sensory input you envision being used, or required, is visual? Just because humans are, or think they are, more oriented toward visual input than toward the other senses doesn't mean the other senses aren't necessary to even begin to approximate the "real" world this AI is supposed to not just interpret, but explain.
And, please remind me, we need this technology, why?
Here’s a test of whether people _really_ think Sora understands physics. Would they be willing to ride in an airplane piloted by Sora? I wouldn’t. Consider the fate of the two Boeing 737 MAX 8 aircraft that crashed because their automated flight-control software (MCAS) wouldn’t let the pilots safely override a bad sensor reading. And that was ordinary software, where the coders painstakingly gave each instruction. Imagine an AI getting the physical situation wrong and deciding it needed to modify the signals to the control surfaces.
Omg Gary - beyond naive, beyond clueless. The universe is filled with thousands of known, and an unknown number of unknown, *phenomena* - people that claim that "physics" can be "learned" from videos (what a joke!!) would do well to ponder the actual meaning of that word, before making absurd claims based on delusional and wishful thinking.
Even trillions of videos can never impart "physics" to any system. That's why we have labs, instruments, sensors, devices, equipment... because they deal with matter and energy, in terms of matter and energy. It has nothing to do with data, including video, audio, images, text, equations, etc. That's also why theoretical physics needs experimental validation.
Mind-boggling, the simpleton beliefs people hold. Their beliefs have nothing to do with how things actually work.
From OpenAI's white paper (itself probably ChatGPT-generated): "Our results suggest that scaling video generation models is a promising path towards building general purpose simulators of the physical world." Simulating the physical world without understanding the physical world... gives you a high-res, photorealistic three-legged cat. 😉
Which one is heavier, 1 kg of steel or 1 kg of cotton? I'd bet Sora has no idea...
I shared Sora videos with my family, several hours after the initial announcement. The response was one of underwhelm and confusion (as to why the videos were so… weird)!
If my little ones can’t go, “wow,” well, you know, that’s already not a good sign (it’s Boolean - boring or exciting/interesting - it’s not rocket science).
And because of all the mistakes in the videos, it would be unprofitable for me to even consider Sora in my work pipeline (I am the founder of a video marketing agency). The time and labour to fix them up would be commercially unviable.
I do want to employ cool technology in my biz. But Sora looks like, once again, as is so common in tech, a solution in pursuit of a problem!
A solution in pursuit of a problem… did I just also describe the gross misapplication of RAGged up LLMs? 😅
Given the extremely high costs of petabyte (exabyte?) level processing, I am having trouble trying to predict* a consumer- and business-friendly price point for Sora that would make economic sense for OpenAI, especially given the rather tiny market I think it may be suitable for.
Or in MBA nomenclature, a solution looking for a problem.
https://www.linkedin.com/feed/update/urn:li:activity:7172582657636200448
*washing dishes does produce such random thoughts 😂
I remember being at MIT when I was 22 y.o. (1976ish) in Marvin Minsky’s “AI” lab, where Minsky had a robot arm that had 5 degrees of freedom (of movement vectors).
He spent over 18 months trying to program it to pick up a block in front of it and, using only that same arm, put the block behind it.
It failed spectacularly, because the “AI” code for commanding the arm TRIED TO MOVE THIS ARM THROUGH ITSELF. Repeatedly.
I knew another AI Winter was coming then (1966-1980)…and yet another will come this year.
Bill
Amazing how many people seem to think that the observation that Sora makes mistakes indicative of it not understanding physics can be refuted by saying how surprisingly good it sometimes looks. That's not how this works. The most charitable interpretation here is that they confuse correlation of pixels from video frame to video frame with object permanence, but that is also not how that works.
A more interesting argument may have been that humans don't natively understand physics either but only what the world looks like, but in truth, what we don't natively understand is only what we might call academic physics at the level of mathematical formulas. We do understand commonsense practical physics, however, things like object permanence or how we expect things to move, because our mind has models of the world that go beyond trying to predict what an image should look like given a word prompt and the previous few images.
As always, what puzzles me most about this discourse overall is how many people feel the need to jump in and defend Sora, as they jumped in to defend ChatGPT and Gemini and all the others. Guys, you can admit that the models are limited and make mistakes. Sam Altman isn't going to marry you even if you don't, and it won't make any difference either to whether the singularity will come and grant you immortality, mind uploading, and interstellar travel, because things that are physically and/or biologically and/or conceptually impossible can't be made possible through toxic positivity.
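To make the contrast in the comment above concrete, here is a deliberately crude toy sketch (a pure caricature built on my own assumptions; Sora's internals are unpublished and certainly nothing like this). An "appearance predictor" that only imitates what the next frame should look like has nothing that forces an object to keep existing, while an explicit world model keeps objects around by construction:

```python
import random

def appearance_predictor(previous_frames):
    """Caricature of 'predict what the next frame should look like':
    a 'frame' is just the set of objects currently visible, and any object
    can silently drop out, because nothing enforces that it persists."""
    last = previous_frames[-1]
    return {obj for obj in last if random.random() > 0.15}

def world_model_step(state):
    """Caricature of an explicit world model: objects are state, so they
    persist (and keep moving) even when nothing is looking at them."""
    return {obj: {**props, "x": props["x"] + props["vx"]}
            for obj, props in state.items()}

# Appearance-only rollout: the cat may simply vanish between frames.
frames = [{"cat", "ball", "table"}]
for _ in range(10):
    frames.append(appearance_predictor(frames))
print("appearance rollout, last frame:", frames[-1])

# World-model rollout: the cat is still there, just somewhere else.
state = {"cat": {"x": 0.0, "vx": 0.5}, "ball": {"x": 3.0, "vx": -0.2}}
for _ in range(10):
    state = world_model_step(state)
print("world-model rollout:", state)
```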
Perhaps someone should try playing "Peek-a-boo" with Sora. It works with two-year-olds, just sayin'.
OpenAI people (or maybe ChatGPT) pretend that, from a huge mountain of image patches (aka pixels), the laws of physics will emerge magically. By not publishing a scientific paper they let people speculate, and the X / Twittersphere is full of "hallucinated breakthrough technologies" from Sora.
You may wish to correct the signature tagline in this post: "Gary Marcus admires Sora’s rapid video synthesis, but thinks that ***clams*** about how it models the world are confused."
Unless, of course, you were intentionally referring to clams.
fixed!
I am not sure SORA would 1. know the difference, or 2. be able to recognize a "Claim" if it saw one, with or without chowder.
Tangentially related, I’d be eager to hear what folks think about my posts on somewhat similar confusion about imagination and intuition: https://open.substack.com/pub/unexaminedtechnology/p/the-two-is-we-need-to-include-in?r=2xhhg0&utm_medium=ios
Hi Aki, I left a comment there: https://open.substack.com/pub/unexaminedtechnology/p/the-two-is-we-need-to-include-in?r=3er6yo&utm_campaign=comment-list-share-cta&utm_medium=web&comments=true&commentId=51341238
Would they be better off coming up with a way to operate 3D rendering software with AI, so the physics can be taken care of?
Maybe, but that would be slower, more painstaking work, and nobody is up for that now…
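For what it's worth, here is a minimal sketch of what "letting a physics engine take care of the physics" means, using pybullet purely as an illustrative stand-in (my assumption for the example; it says nothing about what OpenAI is actually doing or how Sora works). The simulator gives you gravity, collisions, and object permanence by construction, rather than hoping they emerge from pixel statistics:

```python
# pip install pybullet
import pybullet as p

p.connect(p.DIRECT)        # headless physics, no GUI needed
p.setGravity(0, 0, -9.81)  # gravity comes for free

# A 1 kg sphere of radius 1 m, released 10 m above the origin.
sphere = p.createMultiBody(
    baseMass=1.0,
    baseCollisionShapeIndex=p.createCollisionShape(p.GEOM_SPHERE, radius=1.0),
    basePosition=[0, 0, 10],
)

# Step the simulation for one second (default timestep is 1/240 s)
# and print the height: the sphere falls like a real object, no learning required.
for step in range(240):
    p.stepSimulation()
    if step % 60 == 0:
        height = p.getBasePositionAndOrientation(sphere)[0][2]
        print(f"t = {step / 240:.2f} s, height = {height:.2f} m")

p.disconnect()
```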
I wish Sora HAD learned true physics, it would be so helpful to have a third arm that could wink in and out of existence whenever something is juuuust out of reach.
I haven’t even learned physics yet, Gary. Let’s be real serious here!
Nevertheless, let's not forget that physics too is emergent - through peer review. That's what I find so exciting about the imminent robotic reveals: mini peer reviews theorizing coordination policies among components, aggregating into global understandings.