The affordability of DeepSeek is a myth: The revolutionary AI actually cost $1.6 billion to develop
DeepSeek's surprisingly inexpensive AI model, DeepSeek V3, has shaken the AI market and triggered a significant drop in Nvidia's stock price. While DeepSeek boasts a remarkably low training cost of $6 million using only 2,048 GPUs, a closer look reveals a more complex reality.
Image: ensigame.com
DeepSeek V3's innovative architecture is key to its performance. It utilizes:
- Multi-token Prediction (MTP): Predicting several upcoming tokens at once rather than one at a time, for increased accuracy and efficiency.
- Mixture of Experts (MoE): Employing 256 expert sub-networks but activating only eight for each token, which speeds up training and improves performance.
- Multi-head Latent Attention (MLA): Compressing the attention keys and values into a compact latent representation, which cuts memory use while minimizing information loss and preserving crucial nuances.
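To make the Mixture of Experts idea concrete, here is a minimal sketch of top-k routing using the 256-expert, eight-active figures quoted above. It is an illustration only, not DeepSeek's actual router: the softmax gate, the toy "experts" that merely scale a number, the random router scores, and the names `route_token` and `moe_output` are all assumptions made for this example.

```python
import math
import random

NUM_EXPERTS = 256  # expert count quoted in the article
TOP_K = 8          # experts activated per token, per the article


def route_token(scores, top_k=TOP_K):
    """Pick the top_k highest-scoring experts and normalise their weights.

    `scores` holds one router score per expert (hypothetical values here).
    Returns (expert_index, weight) pairs whose weights sum to 1.
    """
    # Indices of the top_k largest scores.
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Softmax over the chosen scores only (an assumed, generic gating choice).
    peak = max(scores[i] for i in chosen)
    exps = [math.exp(scores[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_output(token, experts, scores):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](token) for i, w in route_token(scores))


# Toy experts: each one just scales its input by a fixed random factor.
rng = random.Random(0)
experts = [lambda x, f=rng.uniform(0.5, 1.5): f * x for _ in range(NUM_EXPERTS)]
scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

gates = route_token(scores)
print(len(gates), "of", NUM_EXPERTS, "experts active")
print(moe_output(1.0, experts, scores))
```

The point of the design is visible in `moe_output`: only eight of the 256 experts are ever called for a given token, so the compute per token stays small even though the total parameter count is large.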
However, SemiAnalysis reports that DeepSeek operates a massive infrastructure: approximately 50,000 Nvidia Hopper GPUs, including H800, H100, and H20 units, spread across multiple data centers. This represents a total server investment of roughly $1.6 billion, with operational expenses estimated at $944 million. The $6 million figure reflects only pre-training GPU costs and excludes research, refinement, data processing, and infrastructure.
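For a sense of scale, the short calculation below compares the headline pre-training figure with the reported server investment. It uses only the dollar amounts quoted in this article; the variable names and the `share` helper are illustrative, and the comparison sets a single training run against total hardware spending, so it shows the size of the gap rather than a like-for-like cost.

```python
# Figures as quoted in the article, in US dollars.
PRETRAINING_GPU_COST = 6_000_000
SERVER_INVESTMENT = 1_600_000_000


def share(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


headline_share = share(PRETRAINING_GPU_COST, SERVER_INVESTMENT)
multiple = SERVER_INVESTMENT / PRETRAINING_GPU_COST

print(f"Headline cost as a share of server investment: {headline_share:.3f}%")  # 0.375%
print(f"Server investment is about {multiple:.0f}x the headline cost")          # about 267x
```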
DeepSeek, a subsidiary of the Chinese hedge fund High-Flyer, owns its data centers, unlike cloud-reliant competitors. This ownership grants greater control and faster innovation, and the company's self-funding model enhances its agility. Furthermore, DeepSeek attracts top talent, recruited primarily from Chinese universities, with some researchers earning over $1.3 million annually.
DeepSeek's claimed $6 million training cost is therefore misleading: its overall investment exceeds $500 million. Even so, its lean structure allows for efficient innovation, in contrast to larger, more bureaucratic companies.
DeepSeek's success highlights the competitive potential of well-funded independent AI companies. While the "revolutionary budget" claim is exaggerated, its achievements are undeniable, especially considering competitors' substantially higher costs (e.g., DeepSeek's R1 model cost $5 million versus ChatGPT4's $100 million). The company's success is a testament to significant investment, technical breakthroughs, and a strong team.