"Autocomplete on steroids". What we (edit: laypersons) need so much are simple and efficient metaphors. Thank you very much for this one.
@garymarcus I wrote something yesterday and had DeepSeek R1 rewrite it.
My version's teaser was: "What if the very controls meant to stifle AI development are fueling a new wave of innovation instead?"
Here is my original version: https://bit.ly/3Cie44l
DeepSeek R1's teaser for its version was: "U.S. sanctions aimed to cripple China's AI ambitions - instead, they forged an antifragile juggernaut. Discover how DeepSeek R1 turned chip shortages into a blueprint for dominance."
Here is DeepSeek R1's version: https://bit.ly/4hvPvQ6
While DeepSeek R1's version has a notably patriotic tone from a Chinese perspective, that's not my main point.
Rather, I argue that sanctions only serve to make China antifragile. If they were allowed the ease and comfort of being dependent on NVIDIA, they would likely take the path of least resistance and inadvertently support the U.S. economy. This isn't to suggest China is lazy; they are extremely hard-working and smart. It's simply human nature, regardless of nationality, to take the easier path when it's available.
There is no doubt everyone has taken DeepSeek R1 seriously. NVIDIA shareholders certainly did and worked quite hard to defuse the wave of fear. U.S. frontier labs certainly did too.
Fully agree with your concluding comments about the need to sponsor innovation and nurture new contenders rather than ploughing more cash into the same behemoths.
It would be great to see you devote more blogs to highlighting some of the green shoot innovation that you feel needs more attention.
China is rapidly becoming an innovation powerhouse
https://itif.org/publications/2024/09/16/china-is-rapidly-becoming-a-leading-innovator-in-advanced-industries/
while the US is returning to its historic position as an intellectual backwater
https://www.scientificamerican.com/article/trump-cancels-science-reviews-at-nih-worlds-largest-public-biomedical/
Why is it always driven by "winning"? This is not a race that can be won by anyone, nor should it be. We have more in common than what separates us, despite what politicians in most countries would like us to believe in order to catch votes.
Why not develop a more collaborative attitude to collectively improve AI capabilities for the benefit of humankind? Digitization, and AI in particular, can enable equalization. My hope is that AI can help us prepare for the consequences of climate change and build a better version of democracy, without the need for a 'middleman' (e.g. political parties and leaders) and with everyone equally empowered to help develop the policies and rules by which we govern ourselves.
Call me naive, but as countries we all should stop viewing our 'progress' as a race to win. Win what?
Money.
And money equals power, as the expression goes. That brings up many ethical dilemmas. It would be a good start if countries didn't consider themselves for-profit businesses and seek to dominate other sovereign nations under the guise of 'national interest'. Not trying to get political.
People keep talking about first-mover advantage and use it to justify massive over-valuations of certain companies, but isn't the history of the internet a continuing story of the success of companies that were second but better?
Google over Yahoo in search
iPods over mp3 players
Facebook over MySpace
There are so many others. Why should ChatGPT be different?
"The country that does its best to develop a new tech that is more reliable and more difficult to replicate will win."
Any single technique anybody will come up with will be replicated in a short while. This has always been the case. Always.
I understand the belief that LLMs are a dead end. They do, however, model some aspects of human cognition.
I doubt any new architecture will do away with the need for implicit representations. The world is too complicated to spell out fully in a "principled" way.
So, most likely a new iteration on AI will keep a lot of the current paradigm, and do more careful work when it comes to memory, learning on the fly, better integration with world models. It will likely not be cheap computationally.
Addendum. China is having a hard time catching up on semiconductors. But that is because the semiconductor toolchain consists of many thousands of hardware companies at various levels. It took decades to evolve that. Software has much lower barriers to entry.
In his essay, Amodei writes: "In the end, AI companies in the US and other democracies must have better models than those in China if we want to prevail." - He doesn't really provide justification for this; do you know what his line of argument is, and that of other tech leaders?
Sounds like sore-loser excuses.
Speaking of the bets, when do you believe a system "smarter than any human" will be deployed? (Or just one capable of replacing top skills in arts and sciences.)
'Smarter than almost all human beings' is too tough a hurdle, but smarter than Mr Trump would be a safe bet.
"Amodei's takeaway is to urge the U.S. to double down on export controls that would help his company. But the truth is that so far those controls have not been terribly successful;" Economist Krugman made this point to the Biden administration when they were putting together the export controls.
I have to wonder, does an LLM, or any kind of large neural net, fit into an architecture for an AGI? GPUs provide vast amounts of computing power but it's pretty much matrix multiplication all the way down. What kind of hardware/firmware architecture do we need for symbolic computing?
We can't really know until we've done the research to identify what general intelligence is, and then how to copy it in silico. So, at least at this point, doubling down on GPU technology is not necessarily a long-term winning bet.
"We can't really know until we've done the research to identify what general intelligence is, and then how to copy it in silico."
The problem is: what is intelligence? It appears to be deeply connected to consciousness. And since we cannot be certain that consciousness is an emergent property of our biology, then perhaps the best we can ever hope to achieve is a facsimile of general intelligence.
Sort of in agreement (though I'm not at all sure about the connection to consciousness; that's a whole separate discussion), and it's not clear to me that we can tell the difference between simulation and emulation in this context. I think we're always going to be stuck with what I call the Turing Dilemma: how to properly recognize, let alone understand, an intelligence (or lack of it) different from our own.
Neah, this is not spamming, this is indeed being on top of things. Thanks for your insights.
what i got from the Amodei essay was that reasoning fine-tuning costs much less than a de novo model build. so going from V3 to R1 cost $5.5 million in power costs. is this a lot? how much did openai pay to build o1 from GPT4? is inference on R1 30x cheaper than using o1 ($2 vs $60 per million output tokens)? these are not small optimizations. one can run the R1 model for inference on a private server to see what the cost is, and therefore what the efficiency gains on DeepSeek's cloud server are. i assume these optimizations are not baked into the model.
Funny how no one talks about how heavily Chinese companies are subsidized by the Communist Party. They get massive government backing, which lets them undercut competition while hiding their real financials. Their numbers are totally underreported. Between that and all the copyright violations, is it any wonder they're outperforming U.S. companies?
It takes capital.
In a Capitalist economic system, capital is provided one way.
In a Communist economic system, capital is provided another way.
All this subtext of "it's not fair", "they cheated", "it's just a trivial optimization artifact", "they couldn't have done it without help" is a distraction.
Let's focus on the fact that will move everyone forward: They produced something that scored highly according to benchmarks established by leaders in the field.
Let's not forget both OpenAI and Anthropic have billions of capital to put to work. They can readily demonstrate whether this result is significant.
As for copyright violations, it has been a free-for-all by everyone since this thing started.