Hi Gary, the errors seem egregious partly because, when the imagery, text, video... seem correct to us, we tacitly assume they 'knew' what to generate.
But the blander reality is that they have zero clue about anything at all, even when they compute values that turn out correct (to us).
Stochastic monkeys pounding on loaded keyboards, throwing paint on loaded canvases - with zero understanding of what results...
"The painting isn’t really particularly in the style of Michelangelo."
We can say something stronger than that - it looks like a creepy digital image, not a painting, and absolutely nothing like Michelangelo. Here is what a painting by Michelangelo looks like: https://s3-us-west-2.amazonaws.com/courses-images-archive-read-only/wp-content/uploads/sites/1122/2016/08/16190403/sibyls.jpg
These AI images always look creepy, and it is not just the problems of anatomy and physical space that you have pointed out. An art historian would be able to explain better than me. I think some factors are the excessive detail, unnatural lighting, and the way the textures (skin, hair, fabric, etc.) all look wrong.
What a beautiful piece of art, it seriously made me laugh out loud hard along with Chris McKillop's rephrasing of the prompt and your analysis. These AI mistakes are so funny. Thanks for all the (educational) laughs, Marcus.
Even when they aren't full of egregious errors, AI-generated images have a certain immediately recognisable otherness. Earlier this week, I saw a presentation at work that was illustrated along the lines of 'generic stock image of scientist who examines data on a screen', except AI-generated instead of a stock image, which made sense because the presentation topic was AI, so no problem with that at all. But the scientist had a kind of uncannily shiny appearance and just slightly too idealised features, and of course the screen showed garbled nonsense. Stock photos of people in immaculate white coats pipetting colourful dye into glassware more appropriate for a late Medieval alchemy lab are bad enough, but I feel this is worse.
There are two reasons why, despite horn-through-head-style errors and off-putting appearance, people are excited about AI media. One is that we are pretty good at seeing what we expect to see, so we often don't notice errors or gloss over the weirdness, just like how we recognise a face in :-).
Second, cultish hype. I do not really understand what is going on inside the heads of people who fill the comment thread under a video of unnaturally moving people with melted features with responses like "so amazing" and "just like Hollywood", but these people exist, they aren't all bots, and for some reason they don't seem to have anything more pressing to do than shill for the original poster's business of online courses on how to generate AI 'art'.
My best guess is that it is a mixture of some who hope to grab onto the coattails of fellow hustlers and make some money off a hype cycle, and others who seriously believe that Elon Musk and Sam Altman are saving humanity by building a robot Messiah to usher in prosperity, interstellar travel, and immortality for all, and that even the tiniest bit of scepticism or criticism will delay or even endanger this utopia. Having reasoned thus far, the one thing I get stuck on is why anybody would trust a billionaire to share hypothetical, future AI-generated wealth and rejuvenating drugs with his fan "John12387 laser eye avatar blue tick", but that thought is now really out of scope.
Point is, despite all the flaws, people on social media will continue to claim these images and videos are amazing. And that dissonance can't be good for their psyche.
The horn shows rings, but not helical twists.
The guy is unrealistically tall--in addition to needing a very long right arm, an adult unicorn would be expected to be the size of an adult horse. But he's taller than the animal.
It's obvious Joe, the man is God.
So what I'm seeing is that, as an artistically talentless twitter troll, you don't like the fact that AI draws better in seconds than you might achieve in days, and that it makes some mistakes (some of which artists famously made).
Every exceptional achievement is met with derision. Honestly, it's getting a little embarrassing. I pity the people whose papers you review.
please do us both a favor and unsubscribe.
Idk, Stuart, so when the same "Gen AI" system makes a tiny error (in mere seconds) by launching a nuclear device, that's not important either?
You clearly do not understand the concept of "error propagation"...look into it.
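A toy sketch of the idea, with entirely made-up numbers for illustration: if each step in a chain is correct with probability p, and errors are independent, then all n steps come out right with probability p^n, which collapses fast.

```python
# Toy illustration of error propagation (illustrative numbers only, not from
# any real system): if each step in a pipeline is correct with probability p,
# and errors are independent, the chance that all n chained steps are correct
# is p**n -- small per-step error rates compound into large end-to-end ones.
for p in (0.99, 0.999):
    for n in (10, 100, 1000):
        print(f"p={p}, n={n}: all-correct probability = {p**n:.3f}")
```

At p = 0.99, a hundred chained steps all come out right only about a third of the time; "tiny" per-step errors are anything but tiny once they propagate.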
Well said and exactly right.
Rapunzel hair on the horsie, very unusual hair braid on the dude (no judgement)
This kind of inherent weakness at seeing globally (or at all) has been exploited by a game player to beat a supposed world-champion Go-playing AI. (I'll post the reference if I can find it again...).
It's from Stuart Russell's lab, and I wrote about it in the Substack many months ago (but the paper has since been updated).
Is this the article of yours?:
https://garymarcus.substack.com/p/david-beats-go-liath
I figured you must know about it – I'll check out your article.
Ha, just found that out – you beat me to the punch by one second (or I beat you? ;) ). :)
Ah here we Go (no pun intended):
"The tactics used by Pelrine involved slowly stringing together a large “loop” of stones to encircle one of his opponent’s own groups, while distracting the AI with moves in other corners of the board. The Go-playing bot did not notice its vulnerability, even when the encirclement was nearly complete, Pelrine said.
“As a human it would be quite easy to spot,” he added.
The discovery of a weakness in some of the most advanced Go-playing machines points to a fundamental flaw in the deep-learning systems that underpin today’s most advanced AI, said Stuart Russell, a computer science professor at the University of California, Berkeley.
The systems can “understand” only specific situations they have been exposed to in the past and are unable to generalize in a way that humans find easy, he added."
https://arstechnica.com/information-technology/2023/02/man-beats-machine-at-go-in-human-victory-over-ai/
Isn't the left arm actually sculpted in marble? I think that could be a reaction to the Michelangelo prompt.
Errors schmerrors: https://www.dirtyhans.co.uk/
I'd buy it!
You can actually do a spot-the-difference with GPT - try it here: "create a spot the difference image of an old wizard with a unicorn" https://chat.openai.com/g/g-Yas2WSu7S-cognitive-coach
Honest image generation cannot be done in 2D. One has to start with 3D shapes, each anatomically correct, and combine them in physically legal ways. Then add lighting, reflections, background blur, lens distortion, etc.
This has been known for a very long time. I have, however, been positively shocked that AI art gen has been able to do so much with so little. That should not have been possible.
So yeah, your point is valid. And 3D-generated movies, with rigorous physics but with recent highly imaginative advancements, are not far off.
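To make the 3D-first idea concrete, here is a minimal toy sketch in Python - no real renderer, and every part name and coordinate is an invented illustration: parts are reduced to bounding boxes, assembly rejects interpenetrating solids, and a stub render step stands in for the 2D passes listed above.

```python
# A toy, pure-Python sketch (no real renderer; all names and numbers are
# made up for illustration) of a 3D-first pipeline: anatomically constrained
# parts, physically legal assembly, then 2D post-processing.
from dataclasses import dataclass

@dataclass
class Part:
    name: str
    lo: tuple  # bounding-box min corner (x, y, z)
    hi: tuple  # bounding-box max corner (x, y, z)

def overlaps(a: Part, b: Part) -> bool:
    """True if two bounding boxes interpenetrate -- a physically illegal pose."""
    return all(a.lo[i] < b.hi[i] and b.lo[i] < a.hi[i] for i in range(3))

def assemble(parts):
    """Reject scenes where solid parts occupy the same space (horn through head)."""
    for i, a in enumerate(parts):
        for b in parts[i + 1:]:
            if overlaps(a, b):
                raise ValueError(f"illegal intersection: {a.name} / {b.name}")
    return parts

def render(scene):
    # Stand-in for the 2D stage: lighting, reflections, background blur, lens distortion.
    return f"rendered {len(scene)} parts with lighting, reflections, blur, distortion"

# A horn placed so that it passes through the man's head fails the legality
# check -- exactly the class of error a pixels-only generator never 'sees'.
scene = [
    Part("man_head", (0.0, 0.0, 1.6), (0.2, 0.2, 1.8)),
    Part("unicorn_horn", (0.1, 0.1, 1.7), (0.3, 0.3, 2.1)),  # intersects the head
]
try:
    print(render(assemble(scene)))
except ValueError as e:
    print(e)  # illegal intersection: man_head / unicorn_horn
```

Even this crude legality check catches the horn-through-head class of error, because the constraint lives in 3D space rather than in pixel statistics.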
> What unifies all of the above is that current systems are good at local coherence, between words, and between pixels, but not at lining up their outputs with a global comprehension of the world.
First, you have to have one - a global comprehension of the world - laid out and built into the system's ontology, then working and developing incrementally in the real world the way a child grows and adapts to the world, so that you have a general intelligence system -
A Unified Theory - Universal Language - https://www.linkedin.com/pulse/unified-theory-consciousness-michael-molin/
24:10 - "You need to be doing some kind of manipulation, if only to generate internal consistency. So there's no way a large language model can have internal consistency. It's learned everything on the internet... And so to get to another level of cognition, you're going to need something that builds an internally consistent model of what's out there, whether that needs..." - Simon Prince, "Understanding Deep Learning"
https://www.youtube.com/watch?v=sJXn4Cl4oww
there needs to be an in there, for there to be an out there.
This image is disturbing.
Horses can have engorged veins like the ones depicted in the image, but that typically happens after a lot of exercise or possibly conditions such as blockage or clotting (or heart issues). Not so much if the horse is just standing around being hugged by older gentlemen.