There is an infantile assumption in anything to do with AI that exponential growth is normal, and also sustainable. I wonder why? Is it just a lack of numeracy skills, or basic scientific ignorance? Exponential growth is very rare in the natural world. Things saturate or break very quickly. I wouldn't be surprised if we only get minor incremental improvements in GPTs from now on, improvements that would be hard to justify financially.
The main obstacle is the fundamental impossibility of eliminating shortcomings by increasing size alone.
that's what she said
Who are you, sir, who are so wise in the ways of science?
Well, if GPT-5 goes bust, so what? Since when did millennialists stop believing in the end of the world simply because it refuses to come when they predict it? It seems to me that faith in AGI-via-scaling is in the same category of beliefs. But there's always the possibility that Microsoft won't be willing to go to the brink of bankruptcy to test this particular belief.
Building AGI is really hard, yes. Simply scaling up won't be enough, indeed.
Most likely GPT-5 and Google's Gemini will be incremental improvements.
It is clear by now that LLMs are not good at accuracy, verification, search, or math. What they are, however, is a flexible and versatile approach to generating language and doing language tasks.
Likely OpenAI and Google will build an agent that will classify the input task, prepare background information, run one or more expert LLMs, collect the results, validate them against a vast knowledge base, and for some tasks generate and run code in the background.
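As a rough illustration of that kind of orchestration, here is a minimal sketch in Python. Every function in it (classify_task, retrieve_background, run_expert_llm, validate) is a hypothetical stand-in, not any real OpenAI or Google API:

```python
# Purely illustrative: these components only sketch the proposed flow.

def classify_task(prompt: str) -> str:
    """Route the request; a real system would use a classifier model."""
    return "math" if any(ch.isdigit() for ch in prompt) else "general"

def retrieve_background(prompt: str) -> str:
    """Stand-in for retrieval from a vast knowledge base."""
    return f"[background notes for: {prompt!r}]"

def run_expert_llm(task_type: str, prompt: str, context: str) -> str:
    """Stand-in for one or more task-specific language models."""
    return f"[{task_type} answer to {prompt!r} using {context}]"

def validate(draft: str) -> bool:
    """Stand-in for checking the draft against the knowledge base,
    or against code generated and run in the background."""
    return bool(draft.strip())

def handle(prompt: str) -> str:
    task_type = classify_task(prompt)
    context = retrieve_background(prompt)
    draft = run_expert_llm(task_type, prompt, context)
    return draft if validate(draft) else "needs human review"

print(handle("What is 12 * 7?"))
```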
It is likely that Google will overtake OpenAI as Google has deeper and more diverse technical expertise.
"Most likely GPT-5 and Google's Gemini will be incremental improvements."
Another possibility is that the larger models will be _worse_ when it comes to benchmarks* of safety, not spewing vile rants, and the like. My reading is that a lot of manual effort has gone into the current models to patch them up each time an infelicity is found by a user, and there have been a lot of users and a lot of infelicities found. Thus making a larger model may just return them to the start of the patch-the-infelicities game. And that patching process may be seriously expensive. And perhaps not even possible, since the user base expects the infelicities to have been fixed. (And the game of figuring out how to make a model do something stupid may have already gotten a bit old.)
*: One research group found GPT-4 to be worse than GPT-3.5 on such benchmarks. DecodingTrust is an open-source toolkit for assessing how trustworthy GPT models are.
https://github.com/AI-secure/DecodingTrust
By and large, GPT-4 was better aligned and less biased than GPT-3. And Bard has not had notorious incidents like Microsoft's Sydney.
Also, hate speech detection can be done after generation, if the LLM can't be fully lobotomized. Such detection is a well-researched topic with many applications, and I think companies have a vested interest in doing a good job.
"hate speech detection can be also done after generation" - yeah, Facebook is just great at that :)
Classification based on similarities, as in neural networks or statistical ML in general, will inevitably fail because of "the curse of dimensionality" and "combinatorial explosion". That is why adding new models will not help, I am afraid.
I propose a "combinatorial hammer" - classification based on differences. It works like the game "20 Questions". It has two advantages - logarithmic complexity and the fact that in any context we consider only a limited number of objects.
Welcome to my Substack for details. It is time to start working on a new AI paradigm.
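To make the "20 Questions" point concrete, here is a minimal sketch (not the commenter's actual system): each answered yes/no question halves the remaining candidates, so k questions can separate up to 2^k objects.

```python
# Illustrative only: a "20 Questions"-style narrowing over a tiny candidate
# set. Each answered yes/no question halves the remaining candidates, so
# k questions can separate up to 2**k objects (logarithmic complexity).

candidates = {
    "sparrow": {"alive": True,  "flies": True},
    "penguin": {"alive": True,  "flies": False},
    "drone":   {"alive": False, "flies": True},
    "rock":    {"alive": False, "flies": False},
}

def identify(answers: dict) -> list:
    """answers maps a differentiating question to its yes/no value."""
    remaining = dict(candidates)
    for question, value in answers.items():
        remaining = {name: props for name, props in remaining.items()
                     if props[question] == value}
    return sorted(remaining)

print(identify({"alive": False, "flies": True}))  # -> ['drone']
```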
The objections to statistical machine learning, due to its statistical nature, have been around for decades. I am not knowledgeable enough to understand the technical details of your proposal, but it is not as if people have not tried mightily to find alternatives.
So far, statistical learning has produced results beyond anybody's wildest dreams. What is also very exciting about language models is that they do not count only on adding ever more data; they form a framework that uses language for communication and integration with other techniques.
A very flexible, general, extensible system.
Any system has its limits. Because of that one has to choose what to do. Many people decided to do LLMs, I decided to do an alternative. Who knows, maybe one day I will push some limits!
There is also the matter of data. For one thing, copyright issues have become thornier since GPT-4. For another, Generative AI has *already* poisoned the web. Finding good training data has become much, much harder.
Also, training (or perhaps running) a larger model might require resources that are not just expensive but simply nonexistent. Even Microsoft Azure cannot provision an infinite number of GPUs, or infinite disk space.
I doubt GPT-5 makes sense financially. GPT-4 hasn't found a market yet beyond curiosity - which is probably fading fast. Language modelling has a great future but a lot of work needs to be done before the technology is really "market ready" for serious applications.
"People might even start to realize that building truly reliable, trustworthy AI is actually really, really hard." I would agree with that. If you take short cuts, they come back to bite you. Size alone is not the problem - we have 100 billion neurons at our beck and call (and have a four pieces limit), but if you don't know what meaning is, and hope to do better by more statistics, the answers are likely to get worse. All this stuff about the new release will fix all the problems - how do you fix a generative AI that gets the wrong answer (and gets stroppy about it)?
In the hopefully-soon-to-be-released video of my talk at EACBPM 2023, there is an example from the GPT-3 paper (the last real paper by OpenAI) where, if you look at the quality improvement as a function of model size (and they use a logarithmic x-axis, which is always a warning sign), a quick and dirty calculation tells you that to get to the level of a human on that particular test, the model needs to be something like 10,000 to 100,000 *times* as large... There probably simply is no business case to grow more, and their attempts to scale back (I guess some should be possible) have failed (but would probably never have reached that order of magnitude anyway).
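The shape of that back-of-the-envelope argument, with invented numbers rather than the actual figures from the GPT-3 paper: if accuracy gains a fixed number of points per tenfold increase in parameters, closing the remaining gap requires an enormous multiplier.

```python
# Invented numbers, just to show the shape of the extrapolation: if accuracy
# gains ~6 points per tenfold increase in parameters (a straight line on a
# logarithmic x-axis) and humans are ~30 points ahead, the required model is
# 10 ** (30 / 6) = 100,000 times larger.

points_per_decade = 6.0   # hypothetical slope on a log10(parameters) axis
gap_to_human = 30.0       # hypothetical remaining accuracy gap, in points

decades_needed = gap_to_human / points_per_decade
scale_factor = 10 ** decades_needed
print(f"Model would need to be roughly {scale_factor:,.0f}x larger")
```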
The next AI paradigm will replace similarities with differences and statistics with comparisons and filtering.
In 1973, Allen Newell wrote, "You can't play 20 questions with nature and win". But the game "20 Questions" perfectly illustrates how intelligence works. Note the underlying logarithmic complexity of the process. The real-time pressures of most real-world scenarios justify the adoption of this algorithm. It is time to abandon guessing and adopt the 20-questions approach to AI.
https://alexandernaumenko.substack.com/
that strategy only works if the variables (questions) you are examining are independent, which is only very rarely the case
I don't understand why you think so. If I have a context - a bunch of interrelated objects - how does that interfere with my recognizing objects and relationships among them? Or with my preferred course of action in that context?
Note that humans long ago developed the algorithm for dealing with a mess - divide and conquer. If we follow a single priority, a messy context is rarely a problem.
OK, suppose you are trying to divide objects according to colour and you work with the primary colours - red, green and blue. How would you divide cyan, magenta and yellow objects, since those colours are mixtures of red, green and blue? You would have to make new division criteria using cyan, magenta and yellow.
I do not work with three colors, I use all the colors of the rainbow and even shades if necessary. But everything depends on the person - one will go with seven colors only, another will differentiate dozens of colors and shades.
One may point to a magenta object, and another will recognize that object as purple - that reflects the idea of "close enough".
In my theory, I view properties as rulers with ranges. When we learn and specialize we introduce subranges, but there is always a place for vagueness (see the SEP entry). If one is OK with a "close enough" reference, so be it; otherwise one will need to introduce additional differentiating factors.
Note how generalization follows from that idea - we generalize when we ignore differentiating factors. So generalization depends on differences, not similarities. For example, the defining feature of mammals differentiates them from other animals; it does not explain all their similarities.
Neural networks work like that: a neuron is basically a hyperplane that splits the input space into two halves. Decision trees do a similar thing.
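A tiny sketch of that point, with arbitrary weights and thresholds: a neuron reports which side of a hyperplane an input falls on, and a decision-tree split does the same with an axis-aligned threshold.

```python
# A single neuron as a hyperplane test, and a decision-tree split as the
# axis-aligned special case. Weights and thresholds here are arbitrary.

def neuron(x, w, b):
    """Which side of the hyperplane w.x + b = 0 does the point x fall on?"""
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0

def tree_split(x, feature_index, threshold):
    """A decision-tree node: an axis-aligned version of the same test."""
    return x[feature_index] > threshold

point = (0.4, 0.9)
print(neuron(point, w=(1.0, -1.0), b=0.1))                # x - y + 0.1 > 0 ?
print(tree_split(point, feature_index=1, threshold=0.5))  # y > 0.5 ?
```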
Basically. That covers how 20 Questions works. But you will also need set intersections - to cover how Venn diagrams work. I would say it is about obligatory vs. optional features (one of multiple possibilities, like red or green apples) and their values.
Is "key" about "door locking tool" or "pitch level"? Is "high" about "height" or "voice"? In "high key" which word constrains the meaning of the other? That is why we need set intersections.
Scaling limits eventually set in for any enterprise. No surprise there.
Meh, kinda feels like the moment Intel discovered you couldn't simply run a core at 5 or 6 GHz. So they took a different path. I think LLM "training" is probably where the most work needs to be done.
The future of LLMs does not look very bright in my opinion. I predict that LLMs will become obsolete as soon as AGI arrives on the scene.
Instead of gigantic, expensive, know-it-all systems that can't tell the difference between truth and falsehood, or good and bad, we will have many smaller embodied systems. Each will have been raised in the world to acquire a specific type of expertise. The systems that I envision will develop an experiential understanding of the world based on visual, tactile and auditory sensors, complemented by a full range of motor effectors. After an initial upbringing, they will develop their expertise pretty much the way humans do, i.e., by going through a specific training or schooling regimen. True language understanding can only be built on top of a perceptual and behavioral framework immersed in the world.
I just don't expect this kind of AI to come from the generative AI paradigm.
"I predict that LLMs will become obsolete as soon as AGI arrives on the scene"
Why would anyone use an LLM if they had the real thing?
I gather they might still be useful for some purposes, the same way calculators remained popular for many years after computers were introduced. For example, I believe that LLMs will continue to discover hidden statistical correlations that specialized intelligences might miss.
I predict that LLMs will become less of a hype (not totally useless - there are possibilities for them to be a real productivity enhancer, especially in creative tasks) long, long *before* AGI comes on the scene, for the simple reason that it will be a very long time (and probably requiring different hardware architectures) before we will have AGI. AGI is not fundamentally impossible, but as it is now, it can be considered 'practically impossible' still.
Yes. I agree that LLMs are so fundamentally unlike natural intelligence that, when AGI arrives on the scene, LLMs will continue to occupy a niche market as a tool for tasks that normal intelligence is not suited for. Even AGI will need to use tools for various purposes.
This being said, I see the failure of generative AI to solve AGI as the catalyst that will cause many researchers in the field to retrace their steps and explore new plausible paths toward that goal. In this light, AGI may be closer to being cracked than most experts estimate. One never knows.
I do not share that conviction, because people 'wanting' or 'expecting' AGI and the necessary breakthroughs is too flimsy a foundation, and it is also the reason we experience hypes like GPT-fever. I am reminded of a cartoon in "Einstein for beginners" (good, btw) where a character says something to the effect of "Not faster than the speed of light? That's un-American. We will break that barrier too".
Haha. I'm a little bit more optimistic than you are. AGI is possible and where there is a will, there is a way. Also, there is a growing number of serious researchers who can clearly see through the hype. Gary Marcus and his colleagues, for example, are not easily fooled.
I only wish people talking about AGI would actually define what they mean by it.