As an AI researcher, I just wish they could have given us, spread over the next decade, the same amount of money they have all just splurged doing exactly the same things repeatedly with an architecture that is not fit for prime time and will need to be replaced before we make much more progress.
A major failure of (intellectual) imagination. Not confined to AI/ML.
Or, the other way around, current methods are a major success of imagination.
Why diligently and painstakingly build models for the nearly infinite number of situations that can occur in the world, if the machine can map it all for us?
Granted, the ambition is great and the methods not great. But this is just the start.
Irrational exuberance - https://en.wikipedia.org/wiki/Irrational_Exuberance_(book)
"all the major AI companies are spending billions producing almost exactly the same results using almost exactly the same data using almost exactly the same technology — all flawed in almost exactly the same ways."
It's much worse than that. All the major Silicon Valley companies, save perhaps Apple (though we should reserve judgment till this year's WWDC), are restructuring and pivoting around this flawed technology. Google recently reorganized its Android/Products team, with the stated rationale of faster integration of AI into some of its most successful products. Meta is adding AI chat to every app it owns. Microsoft is going big on AI...
These are successful companies with successful products and product roadmaps pivoting on the basis of... what?
Certainly, part of this is performative: to keep investors from panicking, you add "AI" to everything to show that you're in the game. But given the many, many issues with LLMs that you and others have demonstrated, what we're seeing is groupthink driving action at a head-spinning scale, with economic consequences that will reverberate for a good long time. You'd think at least some of these companies would resist giving in and stick to a more reasoned pace, but that just doesn't seem to be the case.
In one sense, this is the best proof of how concentrating resources, power, and talent in Silicon Valley is a terrible idea. Being surrounded by proponents of the cult of AI has reoriented some of the biggest companies in the world so much that they've prematurely reimagined their entire business as being infused with AI.
For a while, I thought there must be some secret, yet-to-be-revealed implementation of LLMs that some of these folks were sitting on that was driving their supreme confidence. It's clear that's no longer the case.
The best evidence that Silicon Valley tech companies don’t have any AI aces up their sleeve is their need to discredit so many people in the field who disagree with them. If you have a winning hand you show your cards, you don’t fire or lay off your engineers and scientists.
The real canary in the coal mine was the decision to suddenly lay off teams of UX designers and researchers. Those were the teams that translated user needs into product, but just as the AI hype took off, many companies decided they no longer needed them. Instead of “intuitive design,” the goal everyone pursued to convince consumers to adopt the internet from the 1990s onward, we now have calls for “prompt engineering.”
Any technology that needs to painstakingly teach people how to use it properly is teetering on the edge of irrelevance. The problem with LLMs is that they can automate such plausible-sounding rubbish that it’s harder for users to tell that their prompts are the problem, creating a flawed feedback loop.
At some point AI hype needs to get back to end user needs, not investor needs.
Sad, but true.
Just because I have a moment ....
Hallucination: "a perceptual experience with all the compelling subjective properties of a real sensory impression but without normal physical stimulus for that sensory modality. Hallucinations are taken as classic indicators of a psychotic disturbance and are the hallmarks of various disorders like schizophrenia. " --- Reber, Arthur S. The Penguin dictionary of psychology. penguin press, 1995.
So when the AI numerologists say LLMs exhibit hallucinations, they are saying they have successfully created a system with serious and debilitating psychological disease(s).
Curious about your thoughts on Meta claiming its new model can reason. I’m regarding this as a red flag (or as them redefining the meaning of reasoning, as they have with AGI) until I hear otherwise from skeptics. Thomson Reuters released something for law firms and said it could reason about ten times in the press release.
I am not sure that opening up AI technology by any company, without a lot of heavy precautions, is a good thing in general. Releasing the source code behind AI technology is a huge responsibility, an incommensurate responsibility relative to the potential consequences. Not only should one be sure that the code is inherently reliable (constrained hallucinations, verified and secure databases, etc.) for benevolent use, but also that it is safe against unavoidable malevolent use. And that probably means the code should not be totally open, or should have some built-in limitations or guardrails.

The AI technology in its present state, i.e. not yet reliable, not yet safe, not yet regulated, is not ready for release in my opinion, neither as a closed paid application nor as free open source. All GenAI products on the market today are essentially beta versions or even demonstration versions, and in fact should not have been sold or even offered to a large audience yet. The AI companies are, in effect, testing their products on the global audience instead of testing them with a selected group of users according to a well-defined protocol. And that completely amazing situation is possible because we have not established any norms, standards, regulations, or institutions for checking and approving these products before general use.
And not one is investigating semantic AI solutions that are at scale and off-the-shelf.
Or with symbolic logic.
The semantic AI model includes symbolic logic at least to first order.
It's a rather smart strategic move for Meta to turn other companies' investments (competitors like Google and Microsoft) into something far less valuable.
It's not such a good thing for society (proliferation of stuff you can do lots of bad things with), but then, pure, hard capitalist moves seldom are.
Venture capital hit a technology that broke their model. They're very used to funding ideas when the technology isn't baked. But now show them a product whose unique feature is that nobody needs to know how it works, ever. It does an end run around all that risky development. Of course they went for it. But it broke their ability to tell a finished product from a demo, so here we are, shipping demo after demo.
> Historians are going to be scratching their heads
Sounds very human to me, particularly when large publicly owned corporations are involved.
"Nobody ever got fired for buying IBM" - or otherwise going along with the majority, however foolish.
Maverick inventors and sole proprietors sometimes go for things off the beaten path, convinced they are on to something, and many of them fail. Professional executives play it safe. They have limited skin in the game, and that mostly short term; so they prefer same-as-their-peers to a slim chance of producing the next great thing.
Venture capital is supposed to fill this niche, funding maverick inventors and staying content with a stable of extreme long shots, as long as there's a good chance at least one will pay off big. But AFAICT they too play "follow the leader" a lot, as well as treating five years as extremely long term.
“an increase of realism might lead to a renaissance of new approaches to AI.”
As long as the target continues to be the Holy Grail of AGI, I don’t see that happening.
Yes, we are all individuals... https://www.youtube.com/watch?v=QereR0CViMY
Let's hope this ginormous fiasco does not bring in yet another winter or, even worse, an ice age.
While the counterpoints against current approaches are always useful to hear to keep a balance against the hype, I do think it risks starting to sound a little like a broken record. One might even argue that there is less effort in critiquing a bad design or highlighting the problems than there is in proposing solutions.
What I think would be incredibly helpful would be to intersperse your counter-LLM posts with some more solution-focused, forward-looking ones - perhaps highlighting niche areas of research that deserve more attention, or hypothesising other approaches.
I suspect your regular readers have all already got the point, and the remainder who really do need to hear the message aren't reading it.
How do you propose to get the message to the ones not reading it?
I share Gary’s articles; some people in power are learning something new.
There is a very good reason why AI companies do what they do.
It is simply not possible to code up everything that we know as a neat framework. LLMs offer a very powerful and general alternative: given many detailed examples of how we do our work, an LLM can learn to mimic them.
There is hallucination, of course. That happens when the task being done is too different from the examples, when the LLM lacks spelled-out examples of how it should verify its work, and when it is not skilled enough to figure out when to invoke third-party logic to do the work. That is fixable with more specialized data and methods.
The next step where we will see great progress is the MATH benchmark, where current scores hover around 50%.
So no, no plateau. We've hit a vein of gold. We won't see a jump as from GPT-2 to GPT-3, but we will see solid advances over the next year or two.
Hi Andy, one basic problem with LLMs is that they do not “understand” constraints. Have you tried constraining the output, just as humans are given constraints at work when delivering something?
I discovered that a few months ago and keep finding it to be a problem even today.
LLMs do not understand anything. :) They should be paired up with tools that understand things.
What kind of tools do you suggest?
I think it depends on the application. For example: formal verifiers, calculators, conversion to code and running it, tools for symbolic reasoning. LLMs do suck for now once they go beyond word-salad generation, but I think that will change. Here's a summary for math, for example: https://www.aei.org/technology-and-innovation/why-ai-struggles-with-basic-math-and-how-thats-changing/
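To make the tool-pairing idea concrete, here is a minimal sketch in Python, assuming a hypothetical `ask_llm` function standing in for whatever model API you use. The model only translates a word problem into an arithmetic expression; a small deterministic evaluator produces the actual number, so the final answer never depends on the model's own arithmetic.

```python
# Minimal sketch of pairing an LLM with a calculator tool.
# `ask_llm` is a hypothetical stand-in for a model API call; the point
# is the division of labor, not any particular library.
import ast
import operator

# Only plain arithmetic is allowed; anything else is rejected.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    """Evaluate an arithmetic expression without executing arbitrary code."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

def answer_with_calculator(question: str, ask_llm) -> float:
    # The model only drafts an expression; the calculator computes the number.
    expr = ask_llm(
        "Translate this question into a single arithmetic expression using only "
        "numbers, parentheses, and + - * / **. Return the expression only.\n"
        f"Question: {question}"
    )
    return safe_eval(expr.strip())
```

The same division of labor extends to the other tools mentioned above: the model drafts, and a verifier or solver checks or computes.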
Ok, so use something else for anything other than word salad generation?