DeepSeek r1 is not smarter than earlier models, just trained more cheaply
It doesn't solve hallucinations or problems with reliability.
It is still expensive to operate at inference time, especially if you want to make it "think" longer, as o3 does.
The biggest threat isn't to Nvidia (people will still need GPUs, albeit fewer high-end ones) but rather to OpenAI and Anthropic, because price wars will undercut their hopes of making a profit. Furthermore, unlike some companies I could name, DeepSeek is actually, well, open, which may help it lure talent from more walled-off operations.
DeepSeek is an economic revolution and a geopolitical wake-up call, but that doesn't directly bring us any closer to AGI.
Gary Marcus called the Nvidia correction here on Sunday, before it happened. He will discuss DeepSeek’s implications further today on CNBC’s Squawk on the Street, at roughly 10:30am ET.
DeepSeek has disrupted several long-standing assumptions in AI development:
1. “We have a special sauce, and we are very smart.” No, you don’t. True progress comes from disciplined, principled work grounded in the scientific method. Success is achievable by anyone who approaches the challenge with persistence and rigour, not by clinging to secrecy or overconfidence.
2. “Brute force is the answer.” Relying on vast amounts of data or compute power has never been the optimal strategy. Stacking GPUs without deeper insight is uninspired. A purposeful understanding of training processes and operational mechanisms is far more effective than brute force.
3. “The transformer is enough; let’s focus on propaganda and market dominance.” Incorrect. Real innovation requires reviewing, improving, and evolving current techniques, not settling into complacency and prioritising market share over meaningful advancement.
4. “Heavy investment drives innovation.” Not necessarily. Sharing knowledge and fostering collaboration are more powerful than centralising resources in a few hands. Knowledge distribution outpaces capital concentration in driving progress.
The AI bubble has burst, and it’s a critical moment for the industry. Vast resources have been squandered on unsustainable practices, and now is the time to take stock and recalibrate. We need to leave behind speculation and hype, addressing the fundamental problems transformers have revealed over the last eight years. With lessons learned, the focus must shift to creating something truly innovative and sustainable.
It’s time to abandon brute force as a stand-in for understanding. Prioritise evidence-based analysis and raise the bar for the industry as a whole.
Practical steps forward:
• Investigate the learning process, tracing outputs back to training data to identify what skills or behaviours are being promoted.
• Embrace curriculum-based training data to shape more effective and purposeful learning.
• Move beyond traditional Euclidean geometries, adopting structures better suited to the discrete and hierarchical nature of language.
• Replace black-box evaluations with holistic frameworks based on clear mathematical models, free from anthropomorphic or subjective biases.
• Develop hybrid approaches that integrate continuous stochastic distributions with symbolic latent spaces.
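To make one of the steps above concrete, the point about moving beyond Euclidean geometries is usually cashed out with hyperbolic embeddings, where distance in the Poincaré ball grows rapidly toward the boundary and so naturally accommodates tree-like hierarchies (general concepts near the origin, specific ones near the edge). The sketch below is illustrative only; the function name and example coordinates are made up for this demonstration, not drawn from any particular library.

```python
import math

def poincare_distance(u, v):
    """Hyperbolic distance between two points inside the unit Poincaré ball.

    d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    """
    sq_norm = lambda x: sum(c * c for c in x)
    diff_sq = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff_sq / denom)

# A "root" concept near the origin and two "leaf" concepts near the boundary:
root = (0.0, 0.0)
leaf_a = (0.9, 0.0)
leaf_b = (0.0, 0.9)

# Each leaf stays comparatively close to the root...
print(poincare_distance(root, leaf_a))    # ≈ 2.94
# ...but the two leaves are far from each other, reflecting that the
# natural route between them runs "through" their shared ancestor.
print(poincare_distance(leaf_a, leaf_b))  # ≈ 5.20
```

The asymmetry in the printed distances is the property that makes hyperbolic spaces attractive for hierarchical vocabularies: sibling leaves end up much farther apart than either is from their parent, mirroring a tree metric in a way flat Euclidean space cannot.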
Now is the moment to reimagine AI development with clarity, creativity, and accountability.
I appreciate that this development undercuts all the conversations about how wonderful AI will be for humanity. If that's why we're doing it, wouldn't folks celebrate someone doing it cheaper and openly? China was also at the forefront of making cheap textbooks that were not subject to the intellectual property and copyright laws that make them so difficult to access in the West.