Since the tech can’t ever deliver on the promises made by the execs, the outcomes are either going the WeWork way or getting acquired by MSFT for a few cents on the dollar. This inspired me to write:
R.I.P. TechBro Era 2008–2025: The Inevitable GenAI Crash (Part 1)
https://srinipagidyala.substack.com/p/rip-techbro-era-20082025-the-inevitable
Wonder if this would’ve still happened if the Board had the intestinal fortitude to fire him and stay fired…
I think it probably would still have happened. They were already on this trajectory.
I think this is right. Bipolar behavior. Talent drain. Failure to see the true field of play. Loss of focus.
Those are leadership problems.
AI-polar behavior: belief that AGI is just around the corner
"Wonder if this would’ve still happened if the Board had the intestinal fortitude to fire him and stay fired…"
Too many "Bros" on the Board.
Won't happen.
They'll first run to the gov't teat.
Well, they now have one fewer (Jeff)Bro
The notion that it is *training* that is costly for LLMs is — I suspect — wrong. The situation is far worse, economically: it is *running* these extended transformer-based systems that has become very expensive. This trade-off was already there when the transformer architecture was invented in 2017. But over time, producing text has grown from generating basically a single continuation to generating orders of magnitude more, just to select a single one to present to the user. The ‘indirect’ models (‘thinking’ is the usual label, but it is misleading) may produce thousands or maybe even tens of thousands (we do not know) of continuations to produce a single result.
Training you do essentially once per model. It produces a set of numbers. Training can be layered (just add training on top of the current state). Running you do for every single interaction.
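To put rough numbers on that asymmetry, here is a toy back-of-envelope sketch in Python. Every figure in it is an assumption invented for illustration, not a vendor number:

```python
# Toy amortisation sketch: training is a one-off cost, inference recurs
# with every query. All figures below are invented assumptions.

TRAINING_COST_USD = 100e6    # assume a $100M training run (one-off)
COST_PER_QUERY_USD = 0.01    # assume one cent of compute per query
QUERIES_PER_DAY = 1e9        # assume a billion queries per day

daily_inference_usd = COST_PER_QUERY_USD * QUERIES_PER_DAY
breakeven_days = TRAINING_COST_USD / daily_inference_usd

print(f"daily inference spend: ${daily_inference_usd:,.0f}")
print(f"inference outspends the whole training run after {breakeven_days:.0f} days")
```

With those made-up numbers, serving exceeds the entire training bill in ten days; change the assumptions and the date moves, but the recurring term always wins eventually.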
The cost of such systems is real, but manageable longer term. Gemini uses a mixture of experts, so only a sub-model or a handful of them are working at any one time. I don't think they generate thousands of reasoning paths in practice; there are likely a handful, and it only gets worse for the more complex problems. Then, anything well-understood will call tools (some of which are optimized to live in memory in the same process). Lastly, there's research into moving beyond transformers to architectures like Mamba, which does not have quadratic complexity.
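On that last point, the quadratic-versus-linear difference is easy to see with a toy FLOP count. The cost shapes below (n² · d for attention, n · d · state for a state-space model) are the standard rough forms; the constants are placeholders, not measurements of Gemini or Mamba:

```python
# Rough per-layer cost shapes: self-attention grows with n^2 in sequence
# length, a state-space model (SSM) such as Mamba grows linearly.
# All constants are placeholders, not measured values.

def attention_flops(n, d=4096):
    return n * n * d          # dominant n^2 term of attention

def ssm_flops(n, d=4096, state=16):
    return n * d * state      # linear-time recurrence

for n in (1_000, 10_000, 100_000):
    ratio = attention_flops(n) / ssm_flops(n)
    print(f"n={n:>7}: attention/SSM FLOP ratio ~ {ratio:,.0f}x")
```

Doubling the context roughly doubles the SSM cost but quadruples the attention cost, which is why long-context serving is so expensive on pure transformers.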
The normal models will very likely use a smaller, continuously optimising set of continuations, say something in the order of 10; I suspect that the threads which end up producing low-average-likelihood token sets get pruned and replaced by a clone of the best-performing thread (a toy sketch of this prune-and-clone idea follows below). Such details are secret, though, so all of this is a guess. I did read that the indirect ('thinking') models produce orders of magnitude more (reasoning-form) continuations: in extreme cases they may calculate for up to 20 minutes, they are not doing nothing in that time frame, and each of those continuations is based on that small optimising set. But maybe I am overestimating how much can be done in such time frames. Hundreds is reasonable, I think, and thousands in extreme cases would not surprise me (don't forget that each continuation with its mini-optimising threads can be run in parallel). I admit it is all guesswork. It would be fascinating to have a look in the kitchen of these parties...
MoE is indeed one of the efficiency approaches, but afaik this doesn't minimise the number of continuations, only which parts of the model are working on them.
I am quite sure they have ways of keeping the total cost of each user query in check.
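For what it's worth, the prune-and-clone behaviour guessed at above is close in spirit to classic beam search. The sketch below is only a mechanism illustration, with random numbers standing in for token likelihoods, since the real decoding pipelines are proprietary:

```python
import random

# Toy prune-and-clone decoding pool: keep ~10 continuation "threads",
# extend each by one token, drop the worst scorer, clone the best.
# Random numbers stand in for token log-likelihoods.

POOL, STEPS = 10, 50

def extend(tokens, score):
    tok = random.random()                 # stand-in for a sampled token
    return tokens + [tok], score + tok    # stand-in for its likelihood

threads = [([], 0.0) for _ in range(POOL)]   # (tokens, cumulative score)
for _ in range(STEPS):
    threads = [extend(toks, s) for toks, s in threads]
    threads.sort(key=lambda t: t[1], reverse=True)
    threads[-1] = threads[0]              # prune the worst, clone the best

print(f"pool of {POOL}, best cumulative score: {threads[0][1]:.1f}")
```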
Truth is, the LLM is an imperfect approach. Some problems won't be solved correctly even after running for days; there are hard problems out there, and LLMs don't model things properly.
Other overhead comes from running complex tools and doing real-time searches. These don't map neatly to the massively parallel GPU paradigm.
This is all something to be managed though.
The big uncertainty for me personally is how efficient it is all going to be in relation to the quality (i.e. 'good enough'; AGI is completely out of the question afaic). That is an economic uncertainty, and so far it doesn't look very good (it is difficult to see how current levels of data center investment can ever make economic sense). I suspect that a pure GPU route will not be economically viable and that (combinations of) different hardware architectures will fare better; I have seen small signs of medium-term improvements that involve non-discrete hardware as part of the far-future mix.
'To be managed' is thus still a question mark for me: will we have solutions that make it manageable? Uncertainty there is rife, and the signs aren't very positive yet as far as I can see today.
As I see it, if one does not hype up AGI much, things become clearer. The goal is tools that help coders, customer service people, writers, etc.
From that angle, the progress is remarkably good. I have been using GeminiCLI in my code work, and it is of great value. This is not a chatbot, but a tool on a local machine that understands your code.
The image generation tools do some limited reasoning now, which helps with accuracy. Gemini also does well with math and calculation problems, going beyond LLM as need be.
Google is busy hooking up such agents to all its products. It will make boring work easier, and Google will charge extra.
I don't think we'll get to AGI in one shot. The current methods can be made reliable enough and cheap enough for much work.
I am quite sure everybody is exploring additional methods for spatial awareness, etc., likely still based on neural nets but not LLM. Then we will iterate.
Right. My motto is "The Right Tool to the Right Task": don't use an LLM to do approximate work when you have tools that can do precise work.
Example.
You could ask the LLM to generate machine code from an "English" prompt (read: fuzzy, incomplete requirements) and pack it in a base64-encoded blob that encodes an ELF executable, right?
It is a legitimate prompt (I'll try it just for the sake of the argument).
But why would you skip the precise compiler and linker logic that produces precise output for structured source code?
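To make the contrast concrete, here is a minimal sketch of the "right tool" path: hand structured source to a real compiler and run the result. It assumes gcc is on the PATH; nothing LLM-shaped is involved:

```python
import os
import subprocess
import tempfile

# The "right tool" path: a compiler turns structured source into a precise,
# deterministic executable; no probabilistic base64 ELF blobs required.
C_SOURCE = r"""
#include <stdio.h>
int main(void) { printf("hello from a real compiler\n"); return 0; }
"""

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "hello.c")
    exe = os.path.join(tmp, "hello")
    with open(src, "w") as f:
        f.write(C_SOURCE)
    subprocess.run(["gcc", src, "-o", exe], check=True)   # precise compile + link
    result = subprocess.run([exe], capture_output=True, text=True, check=True)
    print(result.stdout, end="")   # -> hello from a real compiler
```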
But if this is the case, what are the actual useful and legitimate uses of an LLM?
I've always pushed for all online systems to reveal the amount of energy (compute etc.) used to produce each result. It should be present in APIs and available in web interfaces or dedicated apps.
Then people can see for themselves how much energy an LLM eats up to produce the answer to the prompt "what is 2+2?". Then compare it to other means of doing the same thing and realize the magnitude of the waste.
I'm working on systems where each mW of power matters and here we are spending millions of times more power to basically do the same thing.
I mean what is wrong with us?
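For illustration, the disclosure argued for here could be as small as one extra field on the response. Everything below is hypothetical: no major LLM API exposes such a field today, and the field name and both energy figures are rough assumptions invented for the example:

```python
from dataclasses import dataclass

# Hypothetical per-result energy disclosure; field name and all energy
# figures are invented assumptions, purely to show how small the ask is.

@dataclass
class Completion:
    text: str
    energy_joules: float      # metered or estimated compute energy

def answer(prompt: str) -> Completion:
    # stand-in backend; a real service would meter GPU power per request
    return Completion(text="4", energy_joules=1_000.0)   # assume ~1 kJ/query

resp = answer("what is 2+2?")
local_calc_joules = 1e-3      # rough guess for a local calculator doing 2+2
print(f"LLM answer: ~{resp.energy_joules:.0f} J; local calc: ~{local_calc_joules:g} J")
print(f"ratio: ~{resp.energy_joules / local_calc_joules:.0e}x")
```

With these rough guesses the ratio lands around a million to one, which is the order of magnitude of waste the comment above is pointing at.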
The problem is that most users of these systems don’t care about how much energy is being used because they are either not paying at all (using free versions) or not paying for the actual cost of the calculations.
These users also obviously don’t care about all the pirating of intellectual property involved.
That makes them complicit.
Agree, but: are the makers of these systems as stupid and careless as some of their users?
I'd hope that's not the case.
As a user and even more as a system builder I do care about the energy efficiency.
So, I am puzzled that this information isn't provided even if not by default.
This is, of course, where losing money on every transaction, but making it up on volume, rears its ugly head. If current users had to pay the actual cost of all those electrons, would it be worth it to them?
My speculation is that a Google could eat the loss, if AI were part of a suite of services that was profitable overall. OpenAI can't do that.
In general, AI electrons are not worth the screens they are written on.
Very impressive understanding. How do I gain such knowledge too… any literature recommendations?
Assuming you meant my comment: thanks for the kind words. Some insight may be had via the explanations at https://ea.rna.nl/the-chatgpt-and-friends-collection/ 😉
(or having a broad and deep IT engineering background with some AI in it and reading roughly 140 research papers — just kidding, but I have been doing IT engineering, IT architecture & strategy, and occasionally publishing/speaking about it for 35 years)
I can highly recommend it; Gerben has been my source of truth on this topic for almost two years now! :)
I second that.
His posts are always educational and thought provoking.
Thanks for sharing this Gerben. Excellent resource. Just finished watching your YT truths about GPT and Friends. Great!
What the hell is CalPERS doing making such a sizable investment in one company?
Per Google AI: "The California Public Employees' Retirement System (CalPERS) has invested in artificial intelligence (AI) but has not confirmed a direct investment in the private company OpenAI. Instead, CalPERS invests broadly in the AI sector through publicly traded companies and venture capital funds."
They have made, I think, at least two investments in Thrive (e.g. https://www.theinformation.com/articles/thrive-to-raise-300-million-from-calpers-amid-dry-spell-for-vcs?utm_source=ti_app&rc=dcf9pt), and Thrive has made multiple investments in OpenAI. The exact numbers are perhaps not known.
CalPERS has over $500 billion in assets, so a $300 million investment, even if it ALL went to OpenAI (which it didn't) and then went to zero, would represent a loss of 0.06%.
I don’t know what the total exposure is, but I linked an example, not the total.
Any exposure is too much exposure.
I assume that’s why Altman wears shades so much.
And prolly SPF100
Someone did a really great piece on how these decisions are made by union reps who are wined and dined and given billionaire treatment, yet don’t really understand AI, or crypto, or any of these investments that they are making on behalf of millions of workers. I’m not throwing shade; I’m ill-equipped to make those decisions myself, and also supremely disinterested. The article was about opening state pensions up to crypto investment. Maybe in The Nation?
Playing the old game of "If I win I am a great fund manager, if I lose the pensions are guaranteed by the state anyway". The silver lining, at least for me, is that the moment we are all waiting for is a step closer: Prop 13 meets guaranteed pensions and health coverage for government employees.
The collapse of the house built on sand in 2008 (and the 2000 bubble; heck, go back to the 1929 intentional depression, tulip mania, etc.) was visibly inevitable, years ahead of the official chickens-come-home-to-roost meltdown, to anyone with eyes to see. The pattern repeats itself, as patterns seem to, and Altman is a feature, not a bug: a talented magician pointing to a distraction from the deception being performed. Meanwhile, lots of value and loss occurs, which hurts the plebs, but the top are always protected.
alwayscurious: Death, taxes, climate change, and polluting the earth come for us all, though taxes and maybe even death seem to be taking a hit for those at "the top."
The only good news is that Trump has reduced the exorbitant privilege of the dollar, so the damage will be distributed less around the world than in 2008.
Couldn't happen to a nicer guy/company... ;)
"Waiter, more champagne! And plenty of ice." -- Sam Altman, 'Time Bandits'
The scary thing for OpenAI is that Google has shown time and again that they're really good at grinding away at quality improvements. That's how Google search and Waymo got where they are: Not from "big insights" so much as relentless grinding across a hundred quality levers over a long span of time. Which is what AI is.
I have a persistent fantasy where I drive past Sam Altman, Marc Andreessen, and Elon Musk as they fight over the right to try and clean my windshield at a stoplight.
They would also be a perfect crew for a one-way expedition to Mars. Peter Thiel should join them too. They can build their techno-feudalist utopia there, totally unhindered by any deep state or democracy.
Exactly.
People who believe in free enterprise are ruining the world.
Those guys have never built anything that people willingly spend their own money on.
Only we leftists know how to build the world we all deserve.
Who exactly is "we leftists"? You're not doing a very good job of false-flagging.
“Who exactly is ‘we leftists’?”
Everyone who calls out Andreessen, Musk and Thiel as rich entrepreneurial Silicon Valley types specifically being worthy of impoverishment and/or elimination from the earth.
How is this not incredibly obvious from the prior 2 comments?
They might be able to get to Mars (if the radiation in transit doesn’t kill them).
But life on Mars will require doing stuff like building and maintaining a habitat and growing food.
The tech titans would not last a week on Mars.
They can’t actually DO anything - except write computer code and shuffle money around, which wouldn’t help them survive on Mars.
That’s why they are the perfect crew to go there. They will cause a little bit of bio contamination, but not for too long. :)
Ah, so that’s why it’s called Mars colon-ization?
It annoys me that in the press Sam Altman is usually described as an AI guru, when he is nothing more than a salesman bordering on a conman (and I am not sure if 'bordering' is putting it strongly enough). I am convinced he doesn't know much about the technical details of AI, and he has certainly not come up with anything of significance in the field.
Well, he does know how to build AGI cuz he said so (and Phoenix IS bordering on Arizona)
"dollar store elon musk" is what i started calling him 3 years ago and it's been spot on (for both men)
So AI will be a commodity without profits for the many competing companies. I was looking for a catalyst to end this long bull market. Failing AI should do the job. We'll have to wait a little longer of course.
In the “It’s Far Worse than Y’all Know” department…
We now have an “Office of AI and Quantum” in the Department of Energy.
Setting aside the linguistic and logical inconsistencies inherent in this maneuver…
It’s been long known that “Quantum” was going to be the fallback futuristic funding distraction once the AI circus ran out of runway.
However, now that AI has what is essentially a superhighway for the plunder of public funds, the bubble will remain artificially inflated a while longer.
The FAS analysis linked below provides a very carefully worded expression of “concerns” regarding the giveaway of the National Laboratory system to Trump’s overlords via Wright’s lickassery.
Just another log on the fire 🔥
https://fas.org/publication/new-doe-re-organization-raises-uncertainty-for-american-science/
As someone who is routinely exposed to the upper levels of the Federal Contracting System and the manipulation of the FARs, I can assure you that not only will they get it, but they’ve already gotten a ridiculous amount.
Who’s going to stop them?
Congress?
Surely you jest.
How long before people start talking about “Quantum AI”?
Everything else is quantum these days, why not AI?
The conjunction of the terms would probably square the funding for either Quantum Computing or AI alone.
Just think how powerful Quantum AI Supremacy would be. Orders of magnitude more powerful than Superhuman AI or simple Quantum Supremacy alone.
As alarming, and sickening, as that prospect is, my guess is they'll try but won't be able to lay their hands on the money. I really doubt they can "repurpose" enough; they'll have to go to Congress, where they'll be met with populist rage from both right and left.
I need to get my superannuation fund (in Australia, 10% of our salary goes into compulsory retirement savings accounts) to ensure none of my money is invested in OpenAI for even one more day.
It is true that OpenAI no longer has a lead, and Google will likely continue to go on the offensive in market share and product quality.
Yet, the concerns about OpenAI suffering a catastrophe are vastly premature.
The canaries in the coal mine will be the small me-too startups. They will be followed by xAI, which exists on vibes alone. Even Meta may decide that the race is too expensive and that it needs to refocus on lesser goals.
Anthropic is doing remarkably well, and for now this is a three-horse race, with OpenAI and Google. If history is any guide, there is room for two or three winners in such a big industry.
OpenAI may need to chase fewer distractions, such as video, and focus more on enterprise, but likely it will do well, even if it cedes the crown to Google.
The biggest concern is that OpenAI tries to grow way too fast with too much debt, and likely the revenue won't keep up. So it would be a self-inflicted blow.
“the concerns about OpenAI suffering a catastrophe are vastly premature.“
Who is concerned?
Read the article. The concern is that OpenAI would fail spectacularly, or at least be vastly diminished, resulting in big losses for investors and likely taking down the whole house of cards.
Well, this is a delicious cherry on top: Microsoft cuts sales quotas for AI products that customers do not want. Bolts are shaking a bit loose in the AI-complex machine at the moment.
“BoltinAI”
Abruptly waking
From his napping
Sammy’s losing
Ray-Ban cool
Bolts are shaking
Users bolting
Sammy’s hyping
Code Red rule
“Google-bots and clan”
“I do not like Goog-bots and clan
I do not like them
Sam I am”
Would you like them IN LA?
Would you like them IN San Fran?
Would you like them for a chat?
Would you like to chew the fat?
“No, NOT for chat,
To chew the fat,
Nor IN LA
Nor IN San Fran”
“I do not like Goog-bots and clan
I do not like them
Sam I am”