“Most of the infrastructure cost for AI is for inference: serving AI assistants to billions of people.”
— Yann LeCun, VP & Chief AI Scientist at Meta
Yann LeCun made this comment in response to the sharp drop in Nvidia’s share price on January 27, 2025, following the release of DeepSeek R1, a new AI model developed by DeepSeek. The model was reportedly trained at a fraction of the cost incurred by leading AI labs such as OpenAI, Anthropic, and Google DeepMind, raising questions about whether Nvidia’s dominance in AI compute was at risk.
The market reaction stemmed from speculation that the training costs of cutting-edge AI models, previously seen as a key driver of Nvidia’s GPU demand, could fall significantly with more efficient methods. LeCun, however, pointed out that most AI infrastructure costs come not from training but from inference: the process of running trained models at scale to serve billions of users. This suggests that long-term demand for Nvidia’s hardware may remain strong, as inference still relies heavily on high-performance GPUs.
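To make the training-versus-inference distinction concrete, the back-of-envelope sketch below compares a one-time training budget against ongoing inference spend. All figures (training cost, queries per day, cost per query) are hypothetical placeholders chosen purely for illustration, not reported numbers for any specific model or company.

```python
# Back-of-envelope comparison of one-time training cost vs. ongoing inference cost.
# Every number here is a hypothetical assumption for illustration only.

TRAINING_COST_USD = 100_000_000   # assumed one-time cost to train a frontier-scale model
QUERIES_PER_DAY = 1_000_000_000   # assumed daily queries for an assistant serving ~1B users
COST_PER_QUERY_USD = 0.002        # assumed GPU cost to serve a single query

def inference_cost(days: int) -> float:
    """Total inference spend over `days`, assuming constant traffic and unit cost."""
    return QUERIES_PER_DAY * COST_PER_QUERY_USD * days

# Days until cumulative inference spend exceeds the one-time training cost.
breakeven_days = TRAINING_COST_USD / (QUERIES_PER_DAY * COST_PER_QUERY_USD)

print(f"Daily inference spend:  ${inference_cost(1):,.0f}")
print(f"Yearly inference spend: ${inference_cost(365):,.0f}")
print(f"Inference overtakes the training budget after ~{breakeven_days:.0f} days")
```

Under these illustrative assumptions, inference spend passes the entire training budget within about seven weeks and keeps accruing indefinitely, which is the intuition behind LeCun’s point: even if training gets dramatically cheaper, serving billions of users dominates the infrastructure bill.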
LeCun’s view aligned with analyses from prominent AI investors and industry leaders. He echoed Antoine Blondeau, co-founder of Alpha Intelligence Capital, who described the reaction to Nvidia’s stock drop as “vastly overblown” and “NOT a ‘Sputnik moment’”, pushing back on the idea that Nvidia’s market position had suddenly become insecure. Similarly, Jonathan Ross, founder of Groq, shared a video titled “Why $500B isn’t enough for AI,” explaining why demand for AI compute remains insatiable despite efficiency gains.
This discussion underscores a critical aspect of AI economics: while training costs may drop with better algorithms and hardware, the sheer scale of inference workloads—powering AI assistants, chatbots, and generative models for billions of users—remains a dominant and growing expense. This supports the case for sustained investment in AI infrastructure, particularly in Nvidia’s GPUs, which continue to be the gold standard for inference at scale.