Seriously, thank you. You, the Eds, Jathan Sadowski, and Paris Marx are doing God's work out there.
Don’t forget Abeba, the Mitchell’s, etc. It takes a village to fight the hype.
Absolutely! I should follow them more, so thanks for the shout out.
I think Ed dwells too much on the costs of AI. Those are real, but the market is also huge. Some companies will go belly-up, like Cruise, but there's lots of money to be made.
It must take a lot of work to be his Reply Guy and mine.
The market for a great product is assuredly real. Two years and so many hundred billion dollars later, potential buyers' powder is still mostly dry.
I’m still surprised no one has taken a second look at Hard Fork’s 5/12/23 episode trying out the now-defunct GM Cruise self-driving car.
Starting at 28 minutes, the self-driving car drives erratically and almost gets into an accident; then you hear another driver pull up shouting, “Did you see what happened!? You should probably want to report that thing!” Of course, in classic Casey Newton fashion, he makes a sarcastic comment, saying “We’re going to bring this thing to justice,” followed by more downplaying of the incident.
Later in the episode they interview the Cruise CEO and never mention what happened, giving the softest of softball interviews. I was so surprised at the time, especially given the bad reputation Cruise’s self-driving tech already had.
Fast forward five months: Cruise loses its license after another malfunction (we all know the story), and now Cruise has been completely shut down by GM.
Casey has no business reporting on AI and it’s sad how many people take him seriously. Kevin is slightly better but not much.
There is a difference between "AI skeptics" (where I'd put myself, with my "The Skeptic AI Enthusiast" newsletter) and "AI denialists," for whom "AI is a dud." I'm afraid Gary could be in the latter.
I think one aspect Gary misses altogether is the compound power of incremental improvements. I see almost every week one new, very promising paper reporting a small improvement in LLM-based AI. I won't argue this is "exponential" or "bombastic" or that the ROI could keep the investors happy. What I'd say is that the cumulative effect of those incremental improvements will result, in a few years, in (mostly) reliable, functional LLM-based AI, and certainly not a "dud."
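To make the compounding claim concrete, here is a minimal back-of-the-envelope sketch; the 0.5% weekly gain is a purely hypothetical rate chosen for illustration, not a figure measured from any benchmark:

```python
# Minimal sketch of how small weekly gains compound over time.
# The 0.5% weekly improvement is a hypothetical illustration, not a
# measured rate from any benchmark or paper.
weekly_gain = 0.005   # assumed 0.5% improvement per week
weeks_per_year = 52

for years in (1, 2, 3, 5):
    factor = (1 + weekly_gain) ** (weeks_per_year * years)
    print(f"{years} year(s): ~{factor:.2f}x cumulative improvement")
```

Even at that modest assumed rate, the cumulative factor is roughly 1.3x after one year and about 3.7x after five: unremarkable week to week, substantial over a few years, which is the shape of the argument being made here.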
I am sad to see an interesting debate among well-intentioned protagonists and commentators becoming so confrontational. There is too much money at stake.
It's easy to disagree with one another in good faith because we simply don't know: the engineering is way ahead of the science currently, and is likely to remain so for a considerable time.
For the extremely little it's worth, as an academic on the theory side of things, my perspective is that there is no reason why some form of LLM can't become technically highly proficient at processing language in a manner that is meaningful for humans, with hallucinations continuing to fall, but linguistic competence is not general intelligence. We are back to aiming for the moon by climbing a ladder/tree.
Thanks for posting that screenshot of Newton's response. This is the line that grates on me the most:
"I found a genuine lack of curiosity in whether the scaling laws will get us to superintelligence."
So I take it that answering "no" to this bit of ungrounded speculative fantasy masquerading as a serious question makes me uncurious? It's not clear whether Newton thinks we get to answer "no," or if he expects everyone to perpetually entertain the question for its own sake.
What if I've actually taken the time to read these arguments and found them unpersuasive? What if I've put in my time scrolling threads on LessWrong, listening to Geoff Hinton lectures, watching debates, and reading the statements/papers/system cards OpenAI likes to post on its website? And what if my takeaway is that "superintelligence" is poorly defined and, at any rate, unlikely to have the properties its advocates claim for it, and that the path from "scaling laws" (which are also speculative) to "superintelligence" looks to me like that famous math-proof comic where the middle step says "and then a miracle occurs"? That this whole endeavor amounts to a case study in what can happen when math-and-logic people let their imaginations get the better of them?
In short: just because I think "scaling laws will get us to superintelligence" is unserious bullshit, doesn't mean I lack curiosity. Just means I came to a very different conclusion than he did.
Whatever happened to the "sparks of AGI" from almost two years ago now? Given the dazzling pace of progress and the mega-billions invested, one would imagine we'd at least see tiny flames by now, if not a full-blown conflagration. But at least we now have AI conferences where a single institution out of the masses can publish over a hundred papers...
Speaking of being "needlessly disrespectful":
His glib and misleading
he misrepresented virtually the entire field (AI skepticism)
The person who is apparently being “intellectually dishonest” (to use Casey’s invidious phrasing) here is Casey
An old proverb tells us “If three people at a party tell you you are drunk, lie down”.
Casey refuses to read the memo.
The point is not just that Newton is wrong or lazy or lacking in objectivity.
Fan boys like Newton
If Casey had balls
As is often the case, the truth is somewhere in the middle.
No progress is ever exponential, except in very small bursts.
Good advancements were made in the last 3 years, and now we need to focus more on aspects other than just language.
AGI will likely have many moving parts, just as our own intelligence does.
QED, e.g., 2018, Deep Learning: A Critical Appraisal: “Despite all of the problems I have sketched, I don’t think that we need to abandon deep learning. Rather, we need to reconceptualize it: not as a universal solvent, but simply as one tool among many, a power screwdriver in a world in which we also need hammers, wrenches, and pliers, not to mention chisels and drills, voltmeters, logic probes, and oscilloscopes. In perceptual classification, where vast amounts of data are available, deep learning is a valuable tool; in other, richer cognitive domains, it is often far less satisfactory.”
The challenge is how to model human cognition. When we do work, we are precisely aware of the context and have a highly detailed mental model of what we are dealing with. That seems no easier to spell out than perception. It appears to be no accident that neural nets do better at cognitive modeling than other approaches.
Great post here, and if it weren’t for his following, I’d totally ignore much of what Casey has to say, as he’s just bloviating. The one aspect I’d defend, though probably not in how he did it, can be summarized by the ol’ adage that “perfection is the enemy of good enough”: I do find measurable societal value in some of what has come along so far.

I’ll use a specific example. I’ve been using Perplexity.ai for the past few months, and it has changed my relationship with things I have questions about and need to quickly get an idea of, well beyond the use of search engines. It has also greatly enhanced my ability to answer my kids’ infinite questions quickly and relatively accurately. Heck, I’ve even gotten them to use it themselves. I’m quite surprised by the quality of how it forms its answers, the references it provides, and the actual quality of the answers themselves.

Is this helping AI live up to its promises and hype? Not likely. Is this moving the needle forward on AGI? Not IMO. Has it added a ton of value to my life and my ability to grasp new information? Absolutely, and as a result I consider it a general win. So while at the more macro view of AI progress this may be a small and relatively insignificant milestone (for some) along a much longer road, I see it as an incremental net-positive achievement that did not exist, and was not easily accessible to the average person, before the advent of these LLMs. They also seem to have obfuscated, or at least tuned down (or out), the hallucinations, as I have yet to run into any of note. While it’s not clear that they have a defensible business model, and Google and OpenAI have been iterating quickly on similar capabilities, without knowing much about their “secret sauce” (if any), their tool is still well ahead of the others, again, IMO. It’s also what I wish Siri were like (are you listening, Apple? Could be a good acquisition ;).

Anyway, I totally agree with the greater points you’ve been making for a long time on AI and the need to bring many more advancements to bear if AGI is ever to be achieved (though I’m not convinced it’s something that should or needs to be achieved), but the incremental benefits being achieved at the current stage of these technologies’ life cycle could and should be what is celebrated.
I see parallels between the trajectory of the current AI hype cycle and the blockchain hype cycle of a decade ago. Money poured into blockchain startups on the premise that there would be a very large consumer-facing market for blockchain-enabled transactions. People would buy stuff on a blockchain, listen to music on a blockchain, engage in social media on a blockchain, etc. Nearly all of that money went up in smoke, and today, apart from recording cryptocurrency transactions and the (very) narrow NFT market, blockchains are where they were always destined to be: in the tech stack of enterprises. That can be a business, but not a mass-market business.
Much the same seems to be happening with AI. Companies like OpenAI and Anthropic are earning valuations premised on the assumption of a mass consumer market for generative AI use cases. While non-professional consumer uses of generative AI abound, they are frequently dubious at best, if not outright nefarious, and generally unmonetized: creating deepfakes, cheating on term papers, etc. Where LLMs are likely to provide their greatest value-add is precisely where blockchain landed: in the tech stack of enterprises. Again, that's a business, but not a business that justifies current valuations.
It is fascinating just how pathetically these models perform on so many metrics, and the improvement rate suggests that it will be a very long time until they reach an acceptable level of capability.
On most of the Gemini 2 table metrics, it would take at least a decade to see a 20% improvement, assuming linear improvement, which doesn't even seem to be happening. If the past is any guide, that could require a ten-order-of-magnitude increase in training compute, training data, and inference compute (an order of magnitude per year).
This would seem somewhat unlikely to happen.
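For what it's worth, the arithmetic behind that pessimism fits in a few lines; both numbers below are the comment's own assumptions (a decade for a 20-point gain implies roughly 2 points per year, paired with an order of magnitude more compute per year), not measured values:

```python
# Back-of-the-envelope sketch of the comment's assumptions:
# ~2 percentage points of benchmark gain per year (linear), with compute
# growing an order of magnitude per year to sustain even that. Both
# figures are assumptions taken from the comment, not measurements.
linear_gain_per_year = 2.0    # percentage points per year (assumed)
target_gain = 20.0            # desired total gain in percentage points

years_needed = target_gain / linear_gain_per_year
compute_multiplier = 10.0 ** years_needed   # one order of magnitude per year

print(f"Years to a {target_gain:.0f}-point gain: {years_needed:.0f}")
print(f"Implied compute growth: 10^{years_needed:.0f} = {compute_multiplier:.0e}x")
```

A roughly ten-billion-fold increase in compute for a 20-point gain is exactly the mismatch the comment is pointing at.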
At this point, when people have nothing left to support or substantiate their points, the discussion degenerates into petty name-calling. I would tell you not to respond to that, because falling to that level makes it hard for the uninformed casual observer to distinguish between truth and lies.
I would just keep putting out the facts as you see them and not respond to personal attacks. As Einstein once said: human stupidity is infinite… and there's no cure for that. In the end you're right, and reasonable, thinking people who apply logic, not hype or adherence to nice stories, can clearly see for themselves… that you're right! Take the higher intellectual ground, GM👊🏿
Same Newton who left Substack in a huff over an alleged huge proliferation of neo-Nazis. In his telling, it was the next 4chan or 8chan just waiting to happen! And Substack refused to deal with this grave issue, despite Newton's publicly expressed concern. But something was weird about Newton's reporting: he didn't give numbers for this tidal wave. Then it turned out it was only 5 blogs, 4 of which made no money and 1 of which made almost no money. 5 out of 17,000 Substacks! Way to protect us from that tidal wave of far-right hate, Casey! Hard Fork is generally entertaining, maybe due to Kevin Roose's influence, but Newton's work veers between decent and lazy/biased.
Gary, do you use LLMs on a regular basis, and if so, for which purposes? Or asked differently, have you found professional or personal use cases for LLMs?
Is Casey Newton the dude who left Substack because he said it was overrun by Nazis?