The scaling hypothesis is wrong. It depends on magic. https://www.linkedin.com/pulse/state-thought-genai-herbert-roitblat-kxvmc
The measure of scaling is wrong. Data scaled as compute scaled, so it was probably the amount of data, not the compute, that improved the models. https://arxiv.org/abs/2404.04125
The predicted shape of the scaling function is wrong. If it requires exponentially more data for linear improvements, then it must slow down over time.
The measure of intelligence is wrong. Intelligence cannot be measured by existing benchmarks when the model has the opportunity and the means to memorize the answers (or very similar answers).
The models are wrong. LLMs model language, not cognition.
So, what's next? That is what my book is about. Here is an excerpt: https://thereader.mitpress.mit.edu/ai-insight-problems-quirks-human-intelligence/ In the book I lay out a roadmap for the future of artificial intelligence. As Yogi Berra said: "If you don't know where you're going, you might end up someplace else."
"If it requires exponentially more data for linear improvements, then it must slow down over time." This. Yes.
It seems it should act like a classic resource and network problem. So it has to exhibit asymptotic behavior.
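The claim above can be made concrete with a toy curve. This is a minimal sketch, assuming (purely for illustration) that benchmark score grows logarithmically with training-set size, a common reading of power-law scaling results; the parameters `a` and `b` are invented, not fitted to any real model:

```python
import math

# Toy scaling curve: score grows logarithmically with data, so each
# fixed, linear gain in score requires multiplying the dataset by a
# constant factor -- i.e., exponentially more data over time.
# Parameters a and b are hypothetical, chosen only for illustration.

def score(n_tokens: float, a: float = 0.2, b: float = 0.05) -> float:
    """Score as a logarithmic function of training-set size."""
    return a + b * math.log10(n_tokens)

def tokens_needed(target: float, a: float = 0.2, b: float = 0.05) -> float:
    """Invert the curve: data required to reach a target score."""
    return 10 ** ((target - a) / b)

if __name__ == "__main__":
    # Under these parameters, each +0.1 of score costs 100x more data.
    for t in (0.5, 0.6, 0.7):
        print(f"score {t:.1f} needs ~{tokens_needed(t):.0e} tokens")
```

Under these assumed parameters, going from a score of 0.5 to 0.6 multiplies the required data by 100, and so does going from 0.6 to 0.7, which is exactly the asymptotic behavior described above.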
How much of this is a question human beings just being human beings, and, how much is just the professional media relations people feeding the financial hype machine? Where does people's wishful thinking end and actual fraud begin?
The surprise is not that CEOs hype their products. Instead, it's the ignorance of how LLMs (and artificial neural networks generally) actually work that allows the hype to be believed. If they were making cars, say, and claimed that future models would go 1,000 mph within 5 years, they would immediately be asked what technology they would use, and ridiculed if they didn't have a good answer.
Wish this were true! The claims about autonomous vehicles have gone largely unchecked, as have overly ambitious corporate commitments to reduce climate change.
Unsupported claims about autonomous vehicles have gone largely unchecked…except by the concrete barriers, tractor trailers, parked fire trucks and other stuff they have hit, that is.
Trees. Don't forget the trees.
Can the trees ever forgive me?
The fourth estate, an open and free press, is broken, that's why they get away with it. Social media and big tech empires have absolutely destroyed journalism in a way that Hearst and Pulitzer could only dream of.
The 4th Estate was already in decline due to the FCC ending enforcement of use of airwaves, and consolidation of news outlets due to an end of SEC enforcement.
Rupert Murdoch was the biggest consolidator, and by 1990 put his imprint on news around the world. To survive, other outlets consolidated, and this has become nearly complete today. This drove the death of the 4th Estate more than anything else.
We have a listening problem. Knowledge without understanding. The incentives are all wrong.
Understanding comes from applying knowledge. It is the next step. It made sense to put data in the pot for as long as that did well; now we need to focus more on validating hypotheses learned from knowledge, and on interactive work with a feedback loop.
I was speaking more broadly about society, the context in which hype gets so out of hand. Insisting that an unproven hypothesis (that scaling will continue to yield exponentially greater improvements) is true is the opposite of science.
That’s why it’s spelled “hype-othesis”, to differentiate it from a scientific hypothesis
A hype-othesis doesn’t have to be true to generate investment.
All it has to be is sufficiently hyperbolic and “sciencey” sounding.
And “scaling” sounds very sciencey.
And combined with “exponential” you will have investors eating out of your hand
That is true. Nothing is really exponential, and even AI people understand that. They push a trend while it lasts, but any one company that bets on one thing only will lose out.
I agree, but I have one nit to pick: you misspelled “incentive$”
Money does talk 🤑
Unfortunately, when it comes to AI, money moneypolizes the conver$ation
ha! Who follows you is how I judge who knows something about AI. The people who are recognized as "experts" constantly display how little they know about the "I" part. And yes, there's a lot of wishful thinking for a "next tech" -- especially the wealth aspect, and a kind of Dear Santa: why doesn't Moore's law apply to everything? But the lemming mentality of tech and tech financing has jumped the shark! It's dogma founded on wishes from the self-proclaimed rational, science-based tribe -- hypocrites who threw shade at people for learning via dead trees, then turn a blind eye to the immoral energy consumption of their wannabe tech.
If you want to start understanding venture capital, Mulcahy is required reading. Her candid writing is unusual in the field, and can be quite funny in an understated, dry way.
https://www.kauffman.org/reports/we-have-met-the-enemy-and-he-is-us/
The whole thing reminds me of Hollywood investing in yet another rehash or sequel in a franchise instead of trying out a fresh idea. Investor risk aversion is why we'll eventually get Shrek 19 or whatever. LLMs had such a string of blockbusters, the ROI is gonna have to bottom out pretty hard before people will pivot, and as we've seen in the movie industry, even then they might not.
It does go downhill from here, Gary. The worst is that 25 years from now they will still be denying and still committing the same errors, because professors don't want to update their teaching.
An example of this in another field is the TLR4 problem in human immune systems. Unlike every other animal with TLR4 (insects branched off long before) humans have no functional siglecs to damp the signal. This is why humans are uniquely susceptible to septic conditions. It's a feedback scream in the immune system.
One technology this critically affects is gene therapy, because the dose of carrier vectors easily goes over the TLR4 threshold. And because this is not taught, protocols don't list it, so there are no diagnostic criteria and physicians don't expect it. Because they don't expect it, there is no comprehensive response protocol, and it happens so fast that the person (often a kid) dies "mysteriously" from a "cytokine storm".
This was figured out a couple years after the first death in gene therapy, Jesse Gelsinger.
It still happens. Mysteriously.
The entire Computer Industry has a deep and abiding ignorance of Biology. Theranos is the poster child but AlphaFold's absurd claim to have "solved protein chemistry" is Right Up There.
I thought the physicists had already claimed to have “solved chemistry.”
And that the chemists had already claimed to have solved biology.
it would seem that all that remains is for the chatbots to solve physics.
Then, everything will be solved.
Thomas Kuhn wrote an extended elaboration on Schopenhauer's thesis, with copious examples from the history of science. Highly recommended.
Andy Grove and I pointed out the asymptotic inflection of Moore's Law a decade ago. I was an "enforcer" of Moore's Law from '85, when sub-ppb Carbon Analysis was made available by Anatel, and although everyone would like to compare their work to Moore's Law, it was a singularity. Nothing else will EVER decrease in size, mass, energy, and cost by 10 orders of magnitude.
People want to forget that the drivers behind ALL "high tech" were the inventions of transistors and integrated circuits. FETs made flash and low voltage high speed circuits practical, and Silicon Carbide made EVs, next gen robots and rockets possible, as well as many of the grid upgrades for (somewhat greener) energy. All fields of STEM benefitted from quantum effect transduction and electronic data processing.
It is interesting that the current processor fad for AI is architecture optimized for graphics processing. That makes sense for imaging systems, but not for language -- possibly reasoning? What is really needed, and what few recognize in organic intelligence, is a computing system based on "morphic resonance" (cf. Rupert Sheldrake). This follows heuristic principles.
Meanwhile the salesmen, marketers, and hypemasters become the richest and most famous men, while the hands-on legions who effected all these changes fade into anonymity. One definition of "entrepreneur" is someone who jumps on the bandwagon when the product development is 95% done.
Yes. And a VC needs to attend to their ability to cash out; that is their first principle, and it is driven by publicity. The ultimate in hype is Uber, which has only ever lost money.
It's nice and helpful when an investment is solid. But, as with Uber, pure hype flies. Solid investments tend to walk.
Pure hype attracts flies too
One datapoint isn't enough to draw a strong conclusion. Two? That defines a trend. And three might be defining a law, particularly when umpteen cognitive biases are operating. And it's not just technology analysts that make these mistakes! (Former head of research at Gartner, Inc.)
Well, Gartner does have a wee bit of incentive from time to time to extrapolate based on pleasant cognitive biases and a little data. This could even be called Gartner's business model.
Just recently: https://x.com/sama/status/1856941766915641580
It should come as no surprise that the CEO of a company that makes LLMs wants people to believe that LLMs will continue to improve at a high rate forever.
Nadella's quote makes it clear that in order to remain competitive, he took to behaving like an LLM and lifted your words without attribution. Great Timeline!
This is an example of scientific thinking being co-opted by greed and capitalism. And wishful thinking…
"The question now is, what we will do next?" - On it! :-)
What we will do next is simple. Deep learning is one component. Architecture can be built on top of it, of which LLMs are just a single approach. Better models, with a lot more data, together with retrieval, verification, invocation of world models, simulators, and symbolic methods as needed, will continue to make the tools better.
"What we will do next is simple"
The hype never stops!
Your analysis is accurate. I agree with your overall sentiment.
As for the future. This is my proposal. Not spamming. This is genuine conversation starter: https://ai-cosmos.hashnode.dev/beyond-correlation-giving-llms-a-symbolic-crutch-with-graph-based-logic
Whoops, Nadella is CEO, not CTO, no?
yikes, not sure how I missed that. thanks! (and now fixed in the online version)
“Not only is AI not hitting a wall, but cars with AI-powered driving assistance aren’t hitting walls, or anything else, either” — Yann LeCun
Depending on one’s definition of “wall”, Teslas MIGHT not have hit any walls.
But they did semi-regularly hit concrete highway barriers (walls?), giant tractor trailers, parked emergency vehicles and other stuff that LeCun was obviously not aware of.
🤣
Really. LeCun didn't know?
The accident: police were sure nobody was driving, and the car burned for 4 hours.
https://www.washingtonpost.com/nation/2021/04/19/tesla-texas-driverless-crash/
The AI Incident report.
https://incidentdatabase.ai/cite/337/
"Alleged: " But yer honor. Nobody could tell anything from those crispy critters!
I am being generous today