"Robot chickens coming home to roost."
User and customer experience is badly lacking from most assessments of tech's impact on business performance in my view.
And just like that, I finished re-reading "The Hitchhiker's Guide to the Galaxy," and in times such as these it feels like such a powerful premonition. I published a reflection from the re-reading on this very topic. | Now posted: https://everward.substack.com/p/infinite-improbability-drives-and
I’m rewatching LOTR because I’m starting to suspect that Thiel, Musk, and co. only saw Peter Jackson's trilogy but never read the books. As a result, they don’t understand the full metaphors of Tolkien's universe.
Those guys probably think Saruman is the hero.
I definitely think it is the best sci fi novel yet written. It was prescient in so many ways.
"Share and enjoy!"
Sharing here: https://everward.substack.com/p/infinite-improbability-drives-and
Excellent!
--Forgive me. I can never figure out whether others have heard the jingle: https://www.youtube.com/results?search_query=hitchhiker%27s%20guide%20share%20and%20enjoy
I discovered your Substack recently. I still follow you on Bluesky, X, and LinkedIn, but I prefer the Substack format.
In terms of branding, the tech industry forgot that 30 years ago, when I ordered a book on Amazon, people thought I had bought the book in the Amazon. Apple wasn’t exactly inventive either. But it was rather innocent compared to Palantir and the various names you mentioned. Next, Sauron and Mordor? Any brand outside of the tech world would suffer from brand damage. Oh, wait… it happened.
They should call it Juggernaut instead of Behemoth. The word comes from "Jagannātha," a Hindu deity whose name means "lord of the world." Worshippers formed huge processions with massive chariots, and people would sacrifice themselves by jumping under the wheels.
Shivon Zilis, one of Musk's surrogates, wrote that their 4th child was a juggernaut. Grok explained that the 4th child doesn’t exist, noting that she had posted about the 4th child's birth on the 3rd child's first birthday. Grok concluded that it was impossible for the 4th child to exist, and went on to doubt the existence of the 3rd.
I agree that the AI industry is full of absurd self-parodic aggrandizement of capabilities -- but I think the frequent examples you give of image generators failing to conform to precise instructions are not an effective refutation of their purported capabilities. Image generators are a different, deeply flawed technology from LLMs, and all the LLMs can do is give a prompt to the image generators, the same as any of the rest of us. Image tech is a different, orthogonal field of development. Their failure to produce these images isn't really an indictment of their intelligence -- the hallucinations and failures to learn or remember anything are the true limitations of these models, in my opinion.
According to ChatGPT:
"
While image generation and text generation models are specialized for different tasks, they're not entirely orthogonal. Many models use similar techniques — for example, transformers are used in both text (GPT) and image (Vision Transformers, or ViTs). Moreover, multimodal models like GPT-4V or Gemini integrate image and text understanding in the same architecture.
“Failures to produce images aren’t an indictment of intelligence”
That’s a fair philosophical stance — LLMs aren’t truly “intelligent” in a human sense anyway. But the claim sidesteps the fact that multimodal AI is becoming more integrated, and the boundary between text and image understanding is blurring. So image failure can be a sign of limitations in the system’s reasoning or context awareness, depending on the nature of the failure.
"
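The shared-machinery point in the quote above can be made concrete. Below is a toy numpy sketch (all dimensions, function names, and the untrained random weights are purely illustrative, not taken from any real model) showing how a text sequence and a ViT-style patched image both become plain token matrices that the very same attention layer can consume:

```python
import numpy as np

rng = np.random.default_rng(0)

def text_to_tokens(ids, d_model=32, vocab=1000):
    # Text: each token id indexes a row of an embedding table.
    table = rng.normal(size=(vocab, d_model))
    return table[ids]                                  # (seq_len, d_model)

def image_to_tokens(img, patch=4, d_model=32):
    # ViT-style: cut the image into patches, flatten, project linearly.
    H, W, C = img.shape
    patches = (img.reshape(H // patch, patch, W // patch, patch, C)
                  .transpose(0, 2, 1, 3, 4)
                  .reshape(-1, patch * patch * C))     # (n_patches, p*p*C)
    proj = rng.normal(size=(patch * patch * C, d_model))
    return patches @ proj                              # (n_patches, d_model)

def self_attention(x):
    # One bare attention layer: it only ever sees a (tokens, d_model)
    # matrix, so it is indifferent to whether the tokens came from
    # words or from pixels.
    scores = x @ x.T / np.sqrt(x.shape[-1])
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)
    return w @ x

text_tokens = text_to_tokens(np.array([5, 42, 7]))            # 3 word tokens
image_tokens = image_to_tokens(rng.normal(size=(16, 16, 3)))  # 16 patches

print(self_attention(text_tokens).shape)   # (3, 32)
print(self_attention(image_tokens).shape)  # (16, 32)
```

This is why "orthogonal" overstates the split: the modalities differ at the tokenization step, but the sequence-processing core is shared.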
It's true that the models are becoming more deeply integrated with their image generators -- however, I stand by the claim that image generation capability is quite distinct from text generation capability. There is nothing approaching 'intelligent design' in the generation of an AI image; the systems are not capable of that yet. Grok and o3 making images are still fundamentally using tools that humans have access to (however 'integrated' those tools may be), and the best humans are equally imprecise in their manipulation of those tools. To show the limitations of an LLM, it is better to show it failing to do something a human can easily do, like play Pokemon Red, rather than something no human can do, like use transformer-based image tech to produce highly precise outputs.
"There is nothing approaching 'intelligent design' in the generation of an AI image"
Nor in the generation of text by LLMs. You seem to be completely missing the point. You say
"the frequent examples you give of image generators failing to conform to precise instructions are not an effective refutation of their purported capabilities. Image generators are a different, deeply flawed technology from LLMs,"
So? Gary gives examples of failures by both text generators and image generators. Both purport to be "AI". Both lack any "I", while being promoted as something they are not.
"To show the limitations of an LLM, it is better to show it failing to do something a human can easily do"
Which Gary has repeatedly done. *Also* presenting the failures of image generators doesn't negate that. I don't think Gary or anyone else is presenting the failures of image generators to be failures of language models per se. And your comments are particularly off base on this post, where Gary highlights a number of different sorts of failures. The rubber ducks are a failure to integrate the language model into the image generation ... whether that's a technical failure or a limitation of the underlying technology really isn't relevant ... it's a failure of *AI* and the people who are promoting it. The same goes for tic-tac-toe ... the LLM could provide the image generator with excruciatingly precise details as to what image to generate ... if the image generator fails to follow such prompts then the people who have implemented this integrated system have screwed up *somewhere* ... exactly where doesn't matter. "not an effective refutation of their purported capabilities"? Of course it is.
I think we just have a different view of the relevance of image generators to the project of generative AI. As I understand it, the image generators are more or less a side show: they receive attention insofar as OpenAI is able to use them to get attention (e.g., the Ghibli memes), but the technology that receives all the hype is the text generators. These technologies may evolve in tandem, but only one is invoked with 'machine god', only one is called intelligent -- only one 'simulates' the process of design. The 'Sparks of AGI' video shows off the difference quite clearly.
Gary will frequently invoke the image generator flaws as reasons to disregard the capabilities of o3 or Grok, when these image generators are more like an external tool that they have been set up to use than a core part of the model.
The AI companies do not claim to have merged their image generators with their AIs; they don't even discuss the image generators that much, and the current plan for advancement in the field is to upgrade the text models into 'agents' that can automate the workflow of AI research. Because of the flaws I've mentioned, this plan will almost certainly fail, but image generators are an entirely irrelevant part of that plan -- few are under the impression that they will soon be able to precisely generate images; it is not an issue of much concern. What *does* matter are the constant claims of forthcoming 'agency' in the text models, and these are the claims that nobody in the AI industry can refute.
These models *may be* stupid and limited, but it is somewhat hard to make them look stupid and limited by just chatting with them in a text interface. I think Gary's constant showcasing of the flaws of the image gen tech is an easy but somewhat misleading way of doing this, and doesn't put the spotlight on the most important flaws with the models.
as it happens, i will have more to say about all this in an upcoming essay
"I think we just have a different view of the relevance of image generators to the project of generative AI. "
No, you simply ignored what I wrote.
"Gary will frequently invoke the image generator flaws as reasons to disregard the capabilities of o3 or Grok"
Sez you.
"The A.I companies do not claim to have merged their image generators with their A.Is"
Sigh. What do you think the URL "https://sora.chatgpt.com" means? Let's ask Claude about your claim:
"OpenAI (the company behind ChatGPT) does present their image generation technology (DALL-E) as artificial intelligence. They consistently describe their image generation capabilities as AI-powered technologies in their public communications and documentation.
OpenAI frames DALL-E and their other image generation tools as part of their broader suite of AI products, alongside their language models. In their marketing materials and technical documentation, they discuss these image generation systems using AI terminology and concepts (like diffusion models, neural networks, etc.).
Unlike some debates about whether certain technologies qualify as "true AI," OpenAI has been fairly straightforward in positioning their image generation capabilities within the AI technology category. They present it as an application of machine learning that enables computers to create visual content based on text descriptions."
You: "I think Gary's constant showcasing of the flaws of the image gen tech is an easy but somewhat misleading way of doing this, and doesn't put the spotlight on the most important flaws with the models."
It's remarkable how *selective* you have to be to put forth this nonsense. Gary has written reams of material that address "the most important flaws with the models" with nary a mention of image generation.
Except that if these systems are to be (some day) "intellectual generalists" as we claim of ourselves, they'd better be able to do some "language to drawing" (or similar) tasks, as many humans can. It is very easy to carve off more and more slices and say "this isn't core," but do this enough and nothing is left. For example, if the systems can't do the duck drawing thing but can work with a human to hunt a seal in the Arctic, I'm more than happy to grant a replacement show of intelligence that way.
You're correct that if the systems can't draw, that is a limitation of their generality. My point is that the path by which the AI companies claim their models will reach AGI does not travel through the fjord of perfect image generation -- they want capable digital programmers that they can put to work on automated AI research; they are betting the farm on the apparent linear gains to programming ability in reasoning models and that they can somehow solve hallucinations and persistence in time and memory.
LOLOL, Gary!!! Thanks for the laughs - really :) :)
"Thinking"... lol.
I really dig the pic of the Behemoth! ---What about this slogan for a bleeding-edge new Meta-AI promotional campaign: "It really whips the llama's ass!" 'C'mon. All the leet kids on p2p will love it!
Just tried ChatGPT with "Make an image with 13 shoes". Got 14. ChatGPT hates prime numbers!
Can someone start a cron job, watching for the first LLM to emit “self-modern, post-parody”?
Someone must have some grad students at their command...
"even in the worst case it probably won’t lead to every single human on the planet dying" I hate to be the one to break it to you, Gary, but actually, every single human on the planet is going to die. The better question might be, with AI in the mix, in just how much agony, and how quickly is this dying going to happen?
i answered this elsewhere: the question was whether AI would lead to human extinction. sorry for the imprecise wording
It won’t, at least during what is left of our lives, Gary. It’s peculiar that we were born the same year. It may happen in 30 to 40 years; we would die in the most stupid way ever at the age of 90-something.
I am told that, for some R&D work, hallucinations can be helpful. Yet I don’t see that advertised anywhere.
How would they create an affordance to actually edit the answer, rather than starting from scratch?
Alexa did manage to figure out the animal my daughter was thinking of (“like a dog, starts with w, not a wolf”): Wolverine.
Behemoth reminds me of a comedian talking about nerds and jocks meeting on podcasts and the nerds thinking they’ve become cool while the jocks think they’ve become smart - and neither outcome is real.
As promised...here is that reflection essay. https://everward.substack.com/p/infinite-improbability-drives-and
Norbert Wiener later acknowledged he'd been too confident in machine intelligence in 1950 when he thought AI was at a tipping point. The same debate is happening 75 years later.
From August 18, 1950:
Says If Reds Don’t Get Us, Robots Will
Cambridge, Mass. — If Russia doesn’t ruin us the robots will, a noted scientist predicted today. Dr. Norbert Wiener, professor of mathematics at Massachusetts Institute of Technology, said Moscow and the new mechanical brains might even prove unwitting allies in driving the United States into a “decade or more of ruin and despair.”
Wiener is the bearded former boy prodigy who earned his doctorate of philosophy at the age of 19 and went on to develop the new science of ‘cybernetics’–the use of communication in controlling men or machines.
*Will Take Over Tasks*
He said the United States is on the verge of a “second industrial revolution” in which robot factories operated by so-called mechanical brains will take over all the routine tasks of production from men.
“Short of any violent political changes or another great war, I should give a rough estimate that it will take the new tools 10 or 20 years to come into their own,” Wiener said.
But he added that the demands of a war with Russia would speed the development of robot factories and “almost inevitably see the automatic machine age in full swing within less than five years.”
What happens to humans when the robots take over?
*May Be a Good Thing*
Wiener has a word of warning about that in a new book, The Human Use of Human Beings, which will be published Monday by Houghton Mifflin Company.
If the new machines are used wisely, he said, it may in the long run “make this a good thing and the source of the leisure which is necessary for the cultural development of man on all sides.”
But Wiener said the depression of the 1930s will look like a “pleasant joke” in comparison with what will happen if the nation misuses the new machines, which can calculate, remember, pass judgement and even succumb to nervous breakdowns.
“Thus the new industrial revolution is a two-edged sword,” he said. “It may be used for the benefit of humanity, assuming that humanity survives long enough to enter a period in which such a benefit is possible.”
https://bklyn.newspapers.com/image/52903471/?terms=says%2Bif%2Breds%2Bdon%27t%2Bget%2Bus%2Brobots%2Bwill
The problem was, is, will be this: the belief that computation can produce intelligence of the biological kind.
Dear lord, why are you letting Musk off easy? That a$$klown actually thinks that by NOT spending $1 trillion into the private sector, there will be MORE MONEY in... (wait for it) the PRIVATE SECTOR.
In the modern world, all money is created by debt, so there is a well-known accounting identity between the private sector's positive equity and the government sector's debt. The national debt of all countries together IS the positive equity of the total private sector globally. Everything else in the private sector, all loans and the private-sector money temporarily created by those loans, has a net valuation of zero.
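A toy bookkeeping sketch of that identity (the numbers are invented purely for illustration): every financial asset is someone else's liability, so the net balances summed across all sectors cancel, and a loan originated inside the private sector adds nothing to that sector's net position.

```python
# Sectoral balances, in made-up units. One sector's financial asset is
# always another sector's liability, so the totals must net to zero.
government = -120   # government sector: net debtor (the "national debt")
private = +100      # domestic private sector: net financial assets
foreign = +20       # rest of the world: remaining net claims

# The identity: balances across all sectors sum to zero.
assert government + private + foreign == 0

# A loan originated inside the private sector creates an asset and a
# liability simultaneously, so its net contribution is zero.
loan_asset, loan_liability = +30, -30
assert loan_asset + loan_liability == 0
```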
We are living in a post-parody world. Moronic billionaires hoisted up by probability are doing everything possible to wreck our world and make it difficult or impossible to survive in.
The Grok 3 example is truly hilarious! It just about sums up the situation.
"Behemoth" is also spoken of in the same terms as "leviathan". Both were used by Thomas Hobbes in his discussions of political philosophy, the history of political authority, etc. You don't see the former discussed as much ... given the former is the more historical work, perhaps that's why.