Our selection of the top business news sources on the web.
AM edition. Issue number 1115
Latest 10 stories.
“The talk about AI bubbles seemed very divorced from what was happening in frontier labs and what we were seeing. We are not seeing any slowdown of progress.” - Julian Schrittwieser - Anthropic
Those closest to technical breakthroughs are witnessing a pattern of sustained, compounding advancement that commentators and investors often underestimate. This perspective underscores both the power of exponential technological progress and the limits of conventional intuition about it.
Context of the Quote
Schrittwieser delivered these remarks in a 2025 interview on the MAD Podcast, prompted by widespread discourse on the so-called ‘AI bubble’. His key contention is that debate around an AI investment or hype “bubble” feels disconnected from the lived reality inside the world’s top research labs, where the practical pace of innovation remains brisk and outwardly undiminished. He outlines that, according to direct observation and internal benchmarks at labs such as Anthropic, progress remains on a highly consistent exponential curve: “every three to four months, the model is able to do a task that is twice as long as before completely on its own”.
He draws an analogy to the early days of COVID-19, where exponential growth was invisible until it became overwhelming; the same mathematical processes, Schrittwieser contends, apply to AI system capabilities. While public narratives about bubbles often reference the dot-com era, he highlights a bifurcation: frontier labs sustain robust, revenue-generating trajectories, while the wider AI ecosystem might experience bubble-like effects in valuation. At the core, however, the technology itself continues to improve at a predictably exponential rate, well supported by both qualitative experience and benchmark data.
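The doubling claim Schrittwieser cites compounds quickly, which is why linear intuition underestimates it. A minimal sketch of the arithmetic (the one-hour starting horizon and the four-month cadence are illustrative assumptions, not figures from the interview):

```python
# Illustrative arithmetic for the doubling claim quoted above: if the
# task horizon a model can complete autonomously doubles every ~4 months,
# an initial 1-hour horizon compounds rapidly. The starting point and
# cadence here are assumptions for illustration only.

def task_horizon_hours(months: float, start_hours: float = 1.0,
                       doubling_months: float = 4.0) -> float:
    """Autonomous task length (hours) after `months` of doubling."""
    return start_hours * 2 ** (months / doubling_months)

for months in (0, 12, 24, 36):
    print(f"after {months:2d} months: {task_horizon_hours(months):,.0f} hours")
```

At that cadence the autonomous task horizon grows roughly 500-fold over three years, which is exactly the intuition-breaking property the COVID-19 analogy points at.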
Schrittwieser’s view, rooted in immediate, operational knowledge, is that the default expectation of a linear future is mistaken: advances in autonomy, reasoning, and productivity are compounding. This means genuinely transformative impacts—such as AI agents that function at expert level or beyond for extended, unsupervised tasks—are poised to arrive sooner than many anticipate.
Profile: Julian Schrittwieser
Julian Schrittwieser is one of the world’s leading artificial intelligence researchers, currently based at Anthropic, following a decade as a core scientist at Google DeepMind. Raised in rural Austria, he traced a path from an adolescent fascination with game programming to the vanguard of AI research, exemplifying the discipline’s blend of curiosity, mathematical rigour, and engineering prowess. He studied computer science at the Vienna University of Technology before interning at Google.
Schrittwieser was a central contributor to several historic machine learning milestones, most notably:
- AlphaGo, the first program to defeat a world champion at Go, combining deep neural networks with Monte Carlo Tree Search.
- AlphaGo Zero and AlphaZero, which generalised the approach to achieve superhuman performance without human examples, through self-play—demonstrating true generality in reinforcement learning.
- MuZero (as lead author), solving the challenge of mastering environments without even knowing the rules in advance, by enabling the system to learn its own internal, predictive world models—an innovation bringing RL closer to complex, real-world domains.
- Later work includes AlphaCode (code synthesis), AlphaTensor (algorithmic discovery), and applied advances in Gemini and AlphaProof.
At Anthropic, Schrittwieser is at the frontier of research into scaling laws, reinforcement learning, autonomous agents, and novel techniques for alignment and safety in next-generation AI. True to his pragmatic ethos, he prioritises what directly raises capability and reliability, and advocates for careful, data-led extrapolation rather than speculation.
Theoretical Backstory: Exponential AI Progress and Key Thinkers
Schrittwieser’s remarks situate him within a tradition of AI theorists and builders focused on scaling laws, reinforcement learning (RL), and emergent capabilities:
Leading Theorists and Historical Perspective
Theorists in this tradition converge on several key observations directly reflected in Schrittwieser's view:
- Exponential Capability Curves: Consistent advances in performance often surprise those outside the labs due to our poor intuitive grasp of exponentiality—what Schrittwieser terms a repeated “failure to understand the exponential”.
- Scaling Laws and Reinforcement Learning: Improvements are not just about larger models, but ever-better training, more reliable reinforcement learning, agentic architecture, and robust reward systems—developments Schrittwieser's work epitomises.
- Novelty and Emergence: Historically, theorists doubted whether neural models could go beyond sophisticated mimicry; the “Move 37” moment (AlphaGo’s unprecedented move in Go) was a touchstone for true machine creativity, a theme Schrittwieser stresses remains highly relevant today.
- Bubbles, Productivity, and Market Cycles: Mainstream financial and social narratives may oscillate dramatically, but real capability growth—observable in benchmarks and direct use—has historically marched on undeterred by speculative excesses.
Synthesis: Why the Perspective Matters
The quote foregrounds a gap between external perceptions and insider realities. Pioneers like Schrittwieser and his cohort stress that transformative change will not follow a smooth, linear or hype-driven curve, but an exponential, data-backed progression—one that may defy conventional intuition, but is already reshaping productivity and the structure of work.
This moment is not about “irrational exuberance”, but rather the compounding product of theoretical insight, algorithmic audacity, and relentless engineering: the engine behind the next wave of economic and social transformation.

“AI is so wonderful because there have been a number of seismic shifts where the entire field has suddenly looked a different way. I've maybe lived through two or three of those. I still think there will continue to be some because they come with almost surprising regularity.” - Andrej Karpathy - Ex-OpenAI, Ex-Tesla AI
Andrej Karpathy, one of the most recognisable figures in artificial intelligence, has spent his career at the epicentre of the field’s defining moments in both research and large-scale industry deployment.
Karpathy’s background is defined by deep technical expertise and a front-row seat to AI’s rapid evolution. He studied at the University of Toronto during the early surge of deep learning, when Geoffrey Hinton’s group there was reshaping the field, before completing his PhD at Stanford. His career encompasses key roles at Tesla, where he led the Autopilot vision team, and at OpenAI, contributing to some of the world’s most prominent large language models and generative AI systems. This vantage point has allowed him to participate in, and reflect upon, the discipline’s “seismic shifts”.
Karpathy’s narrative has been shaped by three inflection points:
- The emergence of deep neural networks from a niche field to mainstream AI, spearheaded by the success of AlexNet and the subsequent shift of the research community toward neural architectures.
- The drive towards agent-based systems, with early enthusiasm for reinforcement learning (RL) and game-based environments (such as Atari and Go). Karpathy himself was cautious about the utility of games as the true path to intelligence, focusing instead on agents acting within the real digital world.
- The rise of large language models (LLMs)—transformers trained on vast internet datasets, shifting the locus of AI from task-specific systems to general-purpose models with the ability to perform a broad suite of tasks, and in-context learning.
His reflection on these ‘regular’ paradigm shifts arises from lived experience: "I've maybe lived through two or three of those. I still think there will continue to be some because they come with almost surprising regularity." These moments recalibrate assumptions, redirect research priorities, and set new benchmarks for capability. Karpathy’s practical orientation—building “useful things” rather than targeting biological intelligence or pure AGI—shapes his approach to both innovation and scepticism about hype.
Context of the Quote
In his conversation with podcaster Dwarkesh Patel, Karpathy elaborates on the recurring nature of breakthroughs. He contrasts AI’s rapid, transformative leaps with other scientific fields, noting that in machine learning, scaling up data, compute, and novel architectures can yield abrupt improvements—yet each wave often triggers both excessive optimism and later recalibration. A major point he raises is the lack of linearity: the field does not “smoothly” approach AGI, but rather proceeds via discontinuities, often catalysed by new ideas or techniques that were previously out of favour or overlooked.
Karpathy relates how, early in his career, neural networks were a marginal interest and large-scale “representation learning” was only beginning to be considered viable by a minority in the community. With the advent of AlexNet, the landscape shifted overnight, rapidly making previous assumptions obsolete. Later, the pursuit of RL-driven agents led to a phase where entire research agendas were oriented toward gameplay and synthetic environments—another phase later superseded by the transformer revolution and language models. Karpathy reflects candidly on earlier missteps, as well as the discipline’s collective tendency to over- or under-predict the timetable and trajectory of progress.
Leading Theorists and Intellectual Heritage
The AI revolutions Karpathy describes are inseparable from the influential figures and ideas that have shaped each phase:
- Geoffrey Hinton: Hailed as the “godfather of AI”, Hinton was instrumental in deep learning’s breakthrough, advancing techniques for training multilayered neural networks and championing representation learning against prevailing orthodoxy.
- Yann LeCun: Developed convolutional neural networks (CNNs), foundational for computer vision and the 2010s wave of deep learning success.
- Yoshua Bengio: Co-architect of the deep learning movement and a key figure in developing unsupervised and generative models.
- Richard Sutton: Principal proponent of reinforcement learning, Sutton articulated the value of “animal-like” intelligence: learning from direct interaction with environments, reward, and adaptation. Sutton’s perspective frequently informs debates about the relationship between model architectures and living intelligence, encouraging a focus on agents and lifelong learning.
Karpathy’s own stance is partly a pragmatic response to this heritage: rather than pursuing analogues of biological brains, he views the productive path as building digital “ghosts”—entities that learn by imitation and are shaped by patterns in data, rather than evolutionary processes.
Beyond individual theorists, the field’s quantum leaps are rooted in a culture of intense rivalry and rapid intellectual cross-pollination:
- The convolutional and recurrent networks of the 2010s pushed the boundaries of what neural networks could do.
- The development and scaling of transformer-based architectures (as in Google’s “Attention is All You Need”) dramatically changed both natural language processing and the structure of the field itself.
- The introduction of algorithms for in-context learning and large-scale unsupervised pre-training marked a break with hand-crafted representation engineering.
The Architecture of Progress: Seismic Shifts and Pragmatic Tension
Karpathy’s insight is that these shifts are not just about faster hardware or bigger datasets; they reflect the field’s unique ecology—where new methods can rapidly become dominant and overturn accumulated orthodoxy. The combination of open scientific exchange, rapid deployment, and intense commercialisation creates fertile ground for frequent realignment.
His observation on the “regularity” of shifts also signals a strategic realism: each wave brings both opportunity and risk. New architectures (such as transformers or large reinforcement learning agents) frequently overshoot expectations before their real limitations become clear. Karpathy remains measured on both promise and limitation—anticipating continued progress, but cautioning against overpredictions and hype cycles that fail to reckon with the “march of nines” needed to reach true reliability and impact.
Closing Perspective
The context of Karpathy’s quote is an AI ecosystem that advances not through steady accretion, but in leaps—each driven by conceptual, technical, and organisational realignments. As such, understanding progress in AI demands both technical literacy and historical awareness: the sharp pivots that have marked past decades are likely to recur, with equally profound effects on how intelligence is conceived, built, and deployed.

“The countries that control compute will control AI. You cannot have compute without energy.” - Jonathan Ross - CEO Groq
Jonathan Ross stands at the intersection of geopolitics, energy economics, and technological determinism. As founder and CEO of Groq, the Silicon Valley firm challenging Nvidia's dominance in AI infrastructure, Ross articulated a proposition of stark clarity during his September 2025 appearance on Harry Stebbings' 20VC podcast: "The countries that control compute will control AI. You cannot have compute without energy."
This observation transcends technical architecture. Ross is describing the emergence of a new geopolitical currency—one where computational capacity, rather than traditional measures of industrial might, determines economic sovereignty and strategic advantage in the 21st century. His thesis rests on an uncomfortable reality: artificial intelligence, regardless of algorithmic sophistication or model architecture, cannot function without the physical substrate of compute. And compute, in turn, cannot exist without abundant, reliable energy.
The Architecture of Advantage
Ross's perspective derives from direct experience building the infrastructure that powers modern AI. At Google, he initiated what became the Tensor Processing Unit (TPU) project—custom silicon that allowed the company to train and deploy machine learning models at scale. This wasn't academic research; it was the foundation upon which Google's AI capabilities were built. When Amazon and Microsoft attempted to recruit him in 2016 to develop similar capabilities, Ross recognised a pattern: the concentration of advanced AI compute in too few hands represented a strategic vulnerability.
His response was to establish Groq in 2016, developing Language Processing Units optimised for inference—the phase where trained models actually perform useful work. The company has since raised over $3 billion and achieved a valuation approaching $7 billion, positioning itself as one of Nvidia's most credible challengers in the AI hardware market. But Ross's ambitions extend beyond corporate competition. He views Groq's mission as democratising access to compute—creating abundant supply where artificial scarcity might otherwise concentrate power.
The quote itself emerged during a discussion about global AI competitiveness. Ross had been explaining why European nations, despite possessing strong research talent and model development capabilities (Mistral being a prominent example), risk strategic irrelevance without corresponding investment in computational infrastructure and energy capacity. A brilliant model without compute to run it, he argued, will lose to a mediocre model backed by ten times the computational resources. This isn't theoretical—it's the lived reality of the current AI landscape, where rate limits and inference capacity constraints determine what services can scale and which markets can be served.
The Energy Calculus
The energy dimension of Ross's statement carries particular weight. Modern AI training and inference require extraordinary amounts of electrical power. The hyperscalers—Google, Microsoft, Amazon, Meta—are each committing tens of billions of dollars annually to AI infrastructure, with significant portions dedicated to data centre construction and energy provision. Microsoft recently announced it wouldn't make certain GPU clusters available through Azure because the company generated higher returns using that compute internally rather than renting it to customers. This decision, more than any strategic presentation, reveals the economic value density of AI compute.
Ross draws explicit parallels to the early petroleum industry: a period of chaotic exploration where a few "gushers" delivered extraordinary returns whilst most ventures yielded nothing. In this analogy, compute is the new oil—a fundamental input that determines economic output and strategic positioning. But unlike oil, compute demand doesn't saturate. Ross describes AI demand as "insatiable": if OpenAI or Anthropic received twice their current inference capacity, their revenue would nearly double within a month. The bottleneck isn't customer appetite; it's supply.
This creates a concerning dynamic for nations without indigenous energy abundance or the political will to develop it. Ross specifically highlighted Europe's predicament: impressive AI research capabilities undermined by insufficient energy infrastructure and regulatory hesitance around nuclear power. He contrasted this with Norway's renewable capacity (80% wind utilisation) or Japan's pragmatic reactivation of nuclear facilities—examples of countries aligning energy policy with computational ambition. The message is uncomfortable but clear: technical sophistication in model development cannot compensate for material disadvantage in energy and compute capacity.
Strategic Implications
The geopolitical dimension becomes more acute when considering China's position. Ross noted that whilst Chinese models like DeepSeek may be cheaper to train (through various optimisations and potential subsidies), they remain more expensive to run at inference—approximately ten times more costly per token generated. This matters because inference, not training, determines scalability and market viability. China can subsidise AI deployment domestically, but globally—what Ross terms the "away game"—cost structure determines competitiveness. Countries cannot simply construct nuclear plants at will; energy infrastructure takes decades to build.
This asymmetry creates opportunity for nations with existing energy advantages. The United States, despite higher nominal costs, benefits from established infrastructure and diverse energy sources. However, Ross's framework suggests this advantage is neither permanent nor guaranteed. Control over compute requires continuous investment in both silicon capability and energy generation. Nations that fail to maintain pace risk dependency—importing not just technology, but the capacity for economic and strategic autonomy.
The corporate analogy proves instructive. Ross predicts that every major AI company—OpenAI, Anthropic, Google, and others—will eventually develop proprietary chips, not necessarily to outperform Nvidia technically, but to ensure supply security and strategic control. Nvidia currently dominates not purely through superior GPU architecture, but through control of high-bandwidth memory (HBM) supply chains. Building custom silicon allows organisations to diversify supply and avoid allocation constraints that might limit their operational capacity. What applies to corporations applies equally to nations: vertical integration in compute infrastructure is increasingly a prerequisite for strategic autonomy.
The Theorists and Precedents
Ross's thesis echoes several established frameworks in economic and technological thought, though he synthesises them into a distinctly contemporary proposition.
Harold Innis, the Canadian economic historian, developed the concept of "staples theory" in the 1930s and 1940s—the idea that economies organised around the extraction and export of key commodities (fur, fish, timber, oil) develop institutional structures, trade relationships, and power dynamics shaped by those materials. Innis later extended this thinking to communication technologies in works like Empire and Communications (1950) and The Bias of Communication (1951), arguing that the dominant medium of a society shapes its political and social organisation. Ross's formulation applies Innisian logic to computational infrastructure: the nations that control the "staples" of the AI economy—energy and compute—will shape the institutional and economic order that emerges.
Carlota Perez, the Venezuelan-British economist, provided a framework for understanding technological revolutions in Technological Revolutions and Financial Capital (2002). Perez identified how major technological shifts (steam power, railways, electricity, mass production, information technology) follow predictable patterns: installation phases characterised by financial speculation and infrastructure building, followed by deployment phases where the technology becomes economically productive. Ross's observation about current AI investment—massive capital expenditure by hyperscalers, uncertain returns, experimental deployment—maps cleanly onto Perez's installation phase. The question, implicit in his quote, is which nations will control the infrastructure when the deployment phase arrives and returns become tangible.
W. Brian Arthur, economist and complexity theorist, articulated the concept of "increasing returns" in technology markets through works like Increasing Returns and Path Dependence in the Economy (1994). Arthur demonstrated how early advantages in technology sectors compound through network effects, learning curves, and complementary ecosystems—creating winner-take-most dynamics rather than the diminishing returns assumed in classical economics. Ross's emphasis on compute abundance follows this logic: early investment in computational infrastructure creates compounding advantages in AI capability, which drives economic returns, which fund further compute investment. Nations entering this cycle late face escalating barriers to entry.
Joseph Schumpeter, the Austrian-American economist, introduced the concept of "creative destruction" in Capitalism, Socialism and Democracy (1942)—the idea that economic development proceeds through radical innovation that renders existing capital obsolete. Ross explicitly invokes Schumpeterian dynamics when discussing the risk that next-generation AI chips might render current hardware unprofitable before it amortises. This uncertainty amplifies the strategic calculus: nations must invest in compute infrastructure knowing that technological obsolescence might arrive before economic returns materialise. Yet failing to invest guarantees strategic irrelevance.
William Stanley Jevons, the 19th-century English economist, observed what became known as Jevons Paradox in The Coal Question (1865): as technology makes resource use more efficient, total consumption typically increases rather than decreases because efficiency makes the resource more economically viable for new applications. Ross applies this directly to AI compute, noting that as inference becomes cheaper (through better chips or more efficient models), demand expands faster than costs decline. This means the total addressable market for compute grows continuously—making control over production capacity increasingly valuable.
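Jevons’ observation can be made precise with a toy constant-elasticity demand model: if an efficiency gain halves the effective price per unit of compute and the price elasticity of demand exceeds one, total consumption rises rather than falls. A sketch under illustrative assumptions (the elasticity values and the price halving are not figures from the text):

```python
# Toy model of Jevons Paradox: demand for compute under constant price
# elasticity `epsilon`. Efficiency gains are modelled as a drop in the
# effective price per unit. All numbers are illustrative assumptions.

def demand(price: float, epsilon: float, k: float = 100.0) -> float:
    """Quantity demanded at `price` under constant elasticity `epsilon`."""
    return k * price ** (-epsilon)

def total_consumption(price: float, epsilon: float) -> float:
    """Total resource use: quantity demanded times resource cost per unit."""
    return price * demand(price, epsilon)

# An efficiency gain halves the effective price of compute.
for eps in (0.5, 1.5):  # inelastic vs elastic demand
    before = total_consumption(1.0, eps)
    after = total_consumption(0.5, eps)
    trend = "rises" if after > before else "falls"
    print(f"elasticity {eps}: total consumption {trend} "
          f"({before:.0f} -> {after:.0f})")
```

The crossover at elasticity 1 is the crux of Ross’s “insatiable demand” point: for AI compute he is implicitly claiming an elasticity well above one, so every efficiency gain expands total consumption.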
Nicholas Georgescu-Roegen, the Romanian-American economist, pioneered bioeconomics and introduced entropy concepts to economic analysis in The Entropy Law and the Economic Process (1971). Georgescu-Roegen argued that economic activity is fundamentally constrained by thermodynamic laws—specifically, that all economic processes dissipate energy and cannot be sustained without continuous energy inputs. Ross's insistence that "you cannot have compute without energy" is pure Georgescu-Roegen: AI systems, regardless of algorithmic elegance, are bound by physical laws. Compute is thermodynamically expensive—training large models requires megawatts, inference at scale requires sustained power generation. Nations without access to abundant energy cannot sustain AI economies, regardless of their talent or capital.
Mancur Olson, the American economist and political scientist, explored collective action problems and the relationship between institutional quality and economic outcomes in works like The Rise and Decline of Nations (1982). Olson demonstrated how established interest groups can create institutional sclerosis that prevents necessary adaptation. Ross's observations about European regulatory hesitance and infrastructure underinvestment reflect Olsonian dynamics: incumbent energy interests, environmental lobbies, and risk-averse political structures prevent the aggressive nuclear or renewable expansion required for AI competitiveness. Meanwhile, nations with different institutional arrangements (or greater perceived strategic urgency) act more decisively.
Paul Romer, the American economist and Nobel laureate, developed endogenous growth theory, arguing in works like "Endogenous Technological Change" (1990) that economic growth derives from deliberate investment in knowledge and technology rather than external factors. Romer's framework emphasises the non-rivalry of ideas (knowledge can be used by multiple actors simultaneously) but the rivalry of physical inputs required to implement them. Ross's thesis fits perfectly: AI algorithms can be copied and disseminated, but the computational infrastructure to deploy them at scale cannot. This creates a fundamental asymmetry that determines economic power.
The Historical Pattern
History provides sobering precedents for resource-driven geopolitical competition. Britain's dominance in the 19th century rested substantially on coal abundance that powered industrial machinery and naval supremacy. The United States' 20th-century ascendance correlated with petroleum access and the industrial capacity to refine and deploy it. Oil-dependent economies in the Middle East gained geopolitical leverage disproportionate to their population or industrial capacity purely through energy reserves.
Ross suggests we are witnessing the emergence of a similar dynamic, but with a critical difference: AI compute is both resource-intensive (requiring enormous energy) and productivity-amplifying (making other economic activity more efficient). This creates a multiplicative effect where compute advantages compound through both direct application (better AI services) and indirect effects (more efficient production of goods and services across the economy). A nation with abundant compute doesn't just have better chatbots—it has more efficient logistics, agricultural systems, manufacturing processes, and financial services.
The "away game" concept Ross introduced during the podcast discussion adds a critical dimension. China, despite substantial domestic AI investment and capabilities, faces structural disadvantages in global competition because international customers cannot simply replicate China's energy subsidies or infrastructure. This creates opportunities for nations with more favourable cost structures or energy profiles, but only if they invest in both compute capacity and energy generation.
The Future Ross Envisions
Throughout the podcast, Ross painted a vision of AI-driven abundance that challenges conventional fears of technological unemployment. He predicts labour shortages, not mass unemployment, driven by three mechanisms: deflationary pressure (AI makes goods and services cheaper), workforce opt-out (people work less as living costs decline), and new industry creation (entirely new job categories emerge, like "vibe coding"—programming through natural language rather than formal syntax).
This optimistic scenario depends entirely on computational abundance. If compute remains scarce and concentrated, AI benefits accrue primarily to those controlling the infrastructure. Ross's mission with Groq—creating faster deployment cycles (six months versus two years for GPUs), operating globally distributed data centres, optimising for cost efficiency rather than margin maximisation—aims to prevent that concentration. But the same logic applies at the national level. Countries without indigenous compute capacity will import AI services, capturing some productivity benefits but remaining dependent on external providers for the infrastructure that increasingly underpins economic activity.
The comparison Ross offers—LLMs as "telescopes of the mind"—is deliberately chosen. Galileo's telescope revolutionised human understanding but required specific material capabilities to construct and use. Nations without optical manufacturing capacity could not participate in astronomical discovery. Similarly, nations without computational and energy infrastructure cannot participate fully in the AI economy, regardless of their algorithmic sophistication or research talent.
Conclusion
Ross's statement—"The countries that control compute will control AI. You cannot have compute without energy"—distils a complex geopolitical and economic reality into stark clarity. It combines Innisian materialism (infrastructure determines power), Schumpeterian dynamism (innovation renders existing capital obsolete), Jevonsian counterintuition (efficiency increases total consumption), and Georgescu-Roegen's thermodynamic constraints (economic activity requires energy dissipation).
The implications are uncomfortable for nations unprepared to make the necessary investments. Technical prowess in model development provides no strategic moat if the computational infrastructure to deploy those models remains controlled elsewhere. Energy abundance, or the political will to develop it, becomes a prerequisite for AI sovereignty. And AI sovereignty increasingly determines economic competitiveness across sectors.
Ross occupies a unique vantage point—neither pure academic nor disinterested observer, but an operator building the infrastructure that will determine whether his prediction proves correct. Groq's valuation and customer demand suggest the market validates his thesis. Whether nations respond with corresponding urgency remains an open question. But the framework Ross articulates will likely define strategic competition for the remainder of the decade: compute as currency, energy as prerequisite, and algorithmic sophistication as necessary but insufficient for competitive advantage.

“Be the person your dog thinks you are!” - J.W. Stephens - Author
The quote "Be the person your dog thinks you are!" represents a profound philosophical challenge wrapped in disarming simplicity. It invites us to examine the gap between our idealised selves and our everyday reality through the lens of unconditional canine devotion. This seemingly light-hearted exhortation carries surprising depth when examined within the broader context of authenticity, aspiration and the moral psychology of personal development.
The Author and the Quote's Origins
J.W. Stephens, a seventh-generation native Texan, has spent considerable time travelling and living across various locations in Texas and internationally. Whilst little biographical detail about the author is publicly available, the quote itself reveals a distinctively American sensibility—one that combines practical wisdom with accessible moral instruction. The invocation of dogs as moral exemplars reflects a cultural tradition deeply embedded in American life, where the human-canine bond serves as both comfort and conscience.
The brilliance of Stephens' formulation lies in its rhetorical structure. By positioning the dog's perception as the aspirational standard, the quote accomplishes several objectives simultaneously: it acknowledges our frequent moral shortcomings, suggests that we already possess knowledge of higher standards, and implies that achieving those standards is within reach. The dog becomes both witness and ideal reader—uncritical yet somehow capable of perceiving our better nature.
The quote functions as what philosophers might term a "regulative ideal"—not a description of what we are, but a vision of what we might become. Dogs, in their apparent inability to recognise human duplicity or moral inconsistency, treat their owners as wholly trustworthy, infinitely capable, and fundamentally good. This perception, whether accurate or illusory, creates a moral challenge: can we rise to meet it?
Philosophical Foundations: Authenticity and the Divided Self
The intellectual lineage underpinning this seemingly simple maxim extends deep into Western philosophical tradition, touching upon questions of authenticity, self-knowledge, and moral psychology that have preoccupied thinkers for millennia.
Søren Kierkegaard (1813-1855) stands as perhaps the most important theorist of authenticity in Western philosophy. The Danish philosopher argued that modern life creates a condition he termed "despair"—not necessarily experienced as anguish, but as a fundamental disconnection from one's true self. Kierkegaard distinguished between the aesthetic, ethical, and religious stages of existence, arguing that most people remain trapped in the aesthetic stage, living according to immediate gratification and social conformity rather than choosing themselves authentically. His concept of "becoming who you are" anticipates Stephens' formulation, though Kierkegaard's vision is considerably darker and more demanding. For Kierkegaard, authentic selfhood requires a "leap of faith" and acceptance of radical responsibility for one's choices. The dog's unwavering faith in its owner might serve, in Kierkegaardian terms, as a model of the absolute commitment required for authentic existence.
Jean-Paul Sartre (1905-1980) developed Kierkegaard's insights in a secular, existentialist direction. Sartre's notion of "bad faith" (mauvaise foi) describes the human tendency to deceive ourselves about our freedom and responsibility. We pretend we are determined by circumstances, social roles, or past choices when we remain fundamentally free. Sartre argued that consciousness is "condemned to be free"—we cannot escape the burden of defining ourselves through our choices. The gap between who we are and who we claim to be constitutes a form of self-deception Sartre found both universal and contemptible. Stephens' quote addresses precisely this gap: the dog sees us as we might be, whilst we often live as something less. Sartre would likely appreciate the quote's implicit demand that we accept responsibility for closing that distance.
Martin Heidegger (1889-1976) approached similar territory through his concept of "authenticity" (Eigentlichkeit) versus "inauthenticity" (Uneigentlichkeit). For Heidegger, most human existence is characterised by "fallenness"—an absorption in the everyday world of "das Man" (the "They" or anonymous public). We live according to what "one does" rather than choosing our own path. Authentic existence requires confronting our own mortality and finitude, accepting that we are "beings-toward-death" who must take ownership of our existence. The dog's perspective, unburdened by social conformity and living entirely in the present, might represent what Heidegger termed "dwelling"—a mode of being that is at home in the world without falling into inauthenticity.
The Psychology of Self-Perception and Moral Development
Moving from continental philosophy to empirical psychology, several theorists have explored the mechanisms by which we maintain multiple versions of ourselves and how we might reconcile them.
Carl Rogers (1902-1987), the founder of person-centred therapy, developed a comprehensive theory of the self that illuminates Stephens' insight. Rogers distinguished between the "real self" (who we actually are) and the "ideal self" (who we think we should be). Psychological health, for Rogers, requires "congruence"—alignment between these different self-concepts. When the gap between real and ideal becomes too wide, we experience anxiety and employ defence mechanisms to protect our self-image. Rogers believed that unconditional positive regard—accepting someone fully without judgment—was essential for psychological growth. The dog's perception of its owner represents precisely this unconditional acceptance, creating what Rogers termed "conditions of worth" that are entirely positive. Paradoxically, this complete acceptance might free us to change precisely because we feel safe enough to acknowledge our shortcomings.
Albert Bandura (1925-2021) developed social learning theory and the concept of self-efficacy, which bears directly on Stephens' formulation. Bandura argued that our beliefs about our capabilities significantly influence what we attempt and accomplish. When we believe others see us as capable (as dogs manifestly do), we are more likely to attempt difficult tasks and persist through obstacles. The dog's unwavering confidence in its owner might serve as what Bandura termed "vicarious experience"—seeing ourselves succeed through another's eyes increases our own self-efficacy beliefs. Moreover, Bandura's later work on moral disengagement explains how we rationalise behaviour that conflicts with our moral standards. The dog's perspective, by refusing such disengagement, might serve as a corrective to self-justification.
Carol Dweck (born 1946) has explored how our beliefs about human qualities affect achievement and personal development. Her distinction between "fixed" and "growth" mindsets illuminates an important dimension of Stephens' quote. A fixed mindset assumes that qualities like character, intelligence, and moral worth are static; a growth mindset sees them as developable through effort. The dog's perception suggests a growth-oriented view: it sees potential rather than limitation, possibility rather than fixed character. The quote implies that we can become what the dog already believes us to be—a quintessentially growth-minded position.
Moral Philosophy and the Ethics of Character
The quote also engages fundamental questions in moral philosophy about the nature of virtue and how character develops.
Aristotle (384-322 BCE) provides the foundational framework for understanding character development in Western thought. His concept of eudaimonia (often translated as "flourishing" or "the good life") centres on the cultivation of virtues through habituation. For Aristotle, we become virtuous by practising virtuous actions until they become second nature. The dog's perception might serve as what Aristotle termed the "great-souled man's" self-regard—not arrogance but appropriate recognition of one's potential for excellence. However, Aristotle would likely caution that merely aspiring to virtue is insufficient; one must cultivate the practical wisdom (phronesis) to know what virtue requires in specific circumstances and the habituated character to act accordingly.
Immanuel Kant (1724-1804) approached moral philosophy from a radically different angle, yet his thought illuminates Stephens' insight in unexpected ways. Kant argued that morality stems from rational duty rather than inclination or consequence. The famous categorical imperative demands that we act only according to maxims we could will to be universal laws. Kant's moral agent acts from duty, not because they feel like it or because they fear consequences. The gap between our behaviour and the dog's perception might be understood in Kantian terms as the difference between acting from inclination (doing good when convenient) and acting from duty (doing good because it is right). The dog, in its innocence, cannot distinguish these motivations—it simply expects consistent goodness. Rising to meet that expectation would require developing what Kant termed a "good will"—the disposition to do right regardless of inclination.
Lawrence Kohlberg (1927-1987) developed a stage theory of moral development that explains how moral reasoning evolves from childhood through adulthood. Kohlberg identified six stages across three levels: pre-conventional (focused on rewards and punishment), conventional (focused on social approval and law), and post-conventional (focused on universal ethical principles). The dog's expectation might be understood as operating at a pre-conventional level—it assumes goodness without complex reasoning. Yet meeting that expectation could require post-conventional thinking: choosing to be good not because others are watching but because we have internalised principles of integrity and compassion. The quote thus invites us to use a simple, pre-moral faith as leverage for developing genuine moral sophistication.
Contemporary Perspectives: Positive Psychology and Virtue Ethics
Recent decades have seen renewed interest in character and human flourishing, providing additional context for understanding Stephens' insight.
Martin Seligman (born 1942), founder of positive psychology, has shifted psychological focus from pathology to wellbeing. His PERMA model identifies five elements of flourishing: Positive emotion, Engagement, Relationships, Meaning, and Accomplishment. The human-dog relationship exemplifies several of these elements, particularly the relationship component. Seligman's research on "learned optimism" suggests that how we explain events to ourselves affects our wellbeing and achievement. The dog's relentlessly optimistic view of its owner might serve as a model of the explanatory style Seligman advocates—one that sees setbacks as temporary and successes as reflective of stable, positive qualities.
Christopher Peterson (1950-2012) and Martin Seligman collaborated to identify character strengths and virtues across cultures, resulting in the Values in Action (VIA) classification. Their research identified 24 character strengths organised under six core virtues: wisdom, courage, humanity, justice, temperance, and transcendence. The quote implicitly challenges us to develop these strengths not because doing so maximises utility or fulfils duty, but because integrity demands that our actions align with our self-understanding. The dog sees us as possessing these virtues; the challenge is to deserve that vision.
Alasdair MacIntyre (1929-2025) argued for recovering Aristotelian virtue ethics in modern life. MacIntyre contended that the Enlightenment project of grounding morality in reason alone has failed, leaving us with emotivism—the view that moral judgments merely express feelings. He advocated returning to virtue ethics situated within narrative traditions and communities of practice. The dog-owner relationship might be understood as one such practice—a context with implicit standards and goods internal to it (loyalty, care, companionship) that shape character over time. Becoming worthy of the dog's trust requires participating authentically in this practice rather than merely going through the motions.
The Human-Animal Bond as Moral Mirror
The specific invocation of dogs, rather than humans, as moral arbiters merits examination. This choice reflects both cultural realities and deeper philosophical insights about the nature of moral perception.
Dogs occupy a unique position in human society. Unlike wild animals, they have co-evolved with humans for thousands of years, developing sophisticated abilities to read human gestures, expressions, and intentions. Yet unlike humans, they appear incapable of the complex social calculations that govern human relationships—judgement tempered by self-interest, conditional approval based on social status, or critical evaluation moderated by personal advantage.
Emmanuel Levinas (1906-1995) developed an ethics based on the "face-to-face" encounter with the Other, arguing that the face of the other person makes an ethical demand on us that precedes rational calculation. Whilst Levinas focused on human faces, his insight extends to our relationships with dogs. The dog's upturned face, its evident trust and expectation, creates an ethical demand: we are called to respond to its vulnerability and faith. The dog cannot protect itself from our betrayal; it depends entirely on our goodness. This radical vulnerability and trust creates what Levinas termed the "infinite responsibility" we bear toward the Other.
The dog's perception is powerful precisely because it is not strategic. Dogs do not love us because they have calculated that doing so serves their interests (though it does). They do not withhold affection to manipulate behaviour (though behavioural conditioning certainly plays a role in the relationship). From the human perspective, the dog's devotion appears absolute and uncalculating. This creates a moral asymmetry: the dog trusts completely, whilst we retain the capacity for betrayal or manipulation. Stephens' quote leverages this asymmetry, suggesting that we should honour such trust by becoming worthy of it.
Practical Implications: From Aspiration to Action
The quote's enduring appeal lies partly in its practical accessibility. Unlike philosophical treatises on authenticity or virtue that can seem abstract and demanding, Stephens offers a concrete, imaginable standard. Most dog owners have experienced the moment of returning home to exuberant welcome, seeing themselves reflected in their dog's unconditional joy. The gap between that reflection and one's self-knowledge of moral compromise or character weakness becomes tangible.
Yet the quote's simplicity risks trivialising genuine moral development. Becoming "the person your dog thinks you are" is not achieved through positive thinking or simple willpower. It requires sustained effort, honest self-examination, and often painful acknowledgment of failure. The philosophical traditions outlined above suggest several pathways:
The existentialist approach demands radical honesty about our freedom and responsibility. We must acknowledge that we choose ourselves moment by moment, that no external circumstance determines our character, and that self-deception about this freedom represents moral failure. The dog's trust becomes a call to authentic choice.
The Aristotelian approach emphasises habituation and practice. We must identify the virtues we lack, create situations that require practising them, and persist until virtuous behaviour becomes natural. The dog's expectation provides motivation for this long-term character development.
The psychological approach focuses on congruence and self-efficacy. We must reduce the gap between real and ideal self through honest self-assessment and incremental change, using the dog's confidence as a source of belief in our capacity to change.
The virtue ethics approach situates character development within practices and traditions. The dog-owner relationship itself becomes a site for developing virtues like responsibility, patience, and compassion through daily engagement.
The Quote in Contemporary Context
Stephens' formulation resonates particularly in an era characterised by anxiety about authenticity. Social media creates pressure to curate idealised self-presentations whilst simultaneously exposing the gap between image and reality. Political and institutional leaders frequently fail to live up to professed values, creating cynicism about whether integrity is possible or even desirable. In this context, the dog's uncomplicated faith offers both comfort and challenge—comfort that somewhere we are seen as fundamentally good, challenge that we might actually become so.
The quote also speaks to contemporary concerns about meaning and purpose. In a secular age lacking consensus on ultimate values, the question "How should I live?" lacks obvious answers. Stephens bypasses theological and philosophical complexities by offering an existentially grounded response: live up to the best version of yourself as reflected in uncritical devotion. This moves the question from abstract principle to lived relationship, from theoretical ethics to embodied practice.
Moreover, the invocation of dogs rather than humans as moral mirrors acknowledges a therapeutic insight: sometimes we need non-judgmental acceptance before we can change. The dog provides that acceptance automatically, creating psychological safety within which development becomes possible. In an achievement-oriented culture that often ties worth to productivity and success, the dog's valuation based simply on existence—you are wonderful because you are you—offers profound relief and, paradoxically, motivation for growth.
The quote ultimately works because it short-circuits our elaborate mechanisms of self-justification. We know we are not as good as our dogs think we are. We know this immediately and intuitively, without needing philosophical argument. The quote simply asks: what if you were? What if you closed that gap? The question haunts precisely because the answer seems simultaneously impossible and within reach—because we have glimpsed that better self in our dog's eyes and cannot quite forget it.

“Oftentimes, if you reason about things from first principles, what's working today incredibly well — if you could reason about it from first principles and ask yourself on what foundation that first principle is built and how that would change over time — it allows you to hopefully see around corners.” - Jensen Huang - CEO Nvidia
Jensen Huang’s quote was delivered in the context of an in-depth dialogue with institutional investors on the trajectory of Nvidia, the evolution of artificial intelligence, and strategies for anticipating and shaping the technological future.
Context of the Quote
Huang made these remarks during an interview at a Citadel Securities event in October 2025, hosted by Konstantine Buhler, a partner at Sequoia Capital. The dialogue’s audience consisted of leading institutional investors, all seeking avenues for sustainable advantage or 'edge'. The conversation explored the founding moments of Nvidia in the early 1990s, through the reinvention of the graphics processing unit (GPU), the creation of new computing markets, and the subsequent rise of Nvidia as the platform underpinning the global AI boom. The question of how to ‘see around corners’ — to anticipate technology and industry shifts before they crystallise for others — was at the core of the discussion. Huang’s answer, invoking first-principles reasoning, linked Nvidia’s success to its ability to continually revisit and challenge foundational assumptions, and to methodically project how they will be redefined by progress in science and technology.
Jensen Huang: Profile and Approach
Jensen Huang, born in Tainan, Taiwan in 1963, immigrated to the United States as a child, experiencing the formative challenges of cultural dislocation, financial hardship, and adversity. He obtained his undergraduate degree in electrical engineering from Oregon State University and a master’s from Stanford University. After working at AMD and LSI Logic, he co-founded Nvidia in 1993 at the age of 30, reportedly at a Denny’s restaurant. From the outset, the company faced daunting odds: it had neither an established market nor assured funding, and it ran frequent existential risk in its initial years.
Huang is distinguished not only by technical fluency — he is deeply involved in hardware and software architecture — but also by an ability to translate complexity for diverse audiences. He eschews corporate formality in favour of trademark leather jackets and a focus on product. His leadership style is marked by humility, a willingness to bet on emerging ideas, and what he describes as “urgent innovation” born of early near-failure. This disposition has been integral to Nvidia's progress, especially as the company repeatedly “invented markets” and defined entirely new categories, such as accelerated computing and AI infrastructure.
By 2024, Nvidia had become the world’s most valuable public company, with its GPUs foundational to gaming, scientific computing, and, critically, the rise of AI. Huang’s honours, from the IEEE Founders Medal to inclusion in Time magazine’s list of the 100 most influential people, underscore his reputation as a technologist and strategic thinker. He is widely recognised for establishing technical direction well before it becomes market consensus, an approach reflected in the quote.
First-Principles Thinking: Theoretical Foundations
Huang’s endorsement of “first principles” echoes a method of problem-solving and innovation associated with thinkers as diverse as Aristotle, Isaac Newton, and, in the modern era, entrepreneurs and strategists such as Elon Musk. The essence of first-principles thinking is to break down complex systems to their most fundamental truths — concepts that cannot be deduced from anything simpler — and to reason forward from those axioms, unconstrained by traditional assumptions, analogies, or received wisdom.
- Aristotle coined the term “first principles”, distinguishing knowledge derived from irreducible foundational truths from knowledge obtained through analogy or precedent.
- René Descartes advocated for systematic doubt and logical rebuilding of knowledge from foundational elements.
- Richard Feynman, the physicist, was famous for urging students to “understand from first principles”, encouraging deep understanding and avoidance of rote memorisation or mere pattern recognition.
- Elon Musk is often cited as a contemporary example, applying first-principles thinking to industries as varied as automotive (Tesla), space (SpaceX), and energy. Musk has described the technique as “boiling things down to the most fundamental truths and then reasoning up from there,” directly influencing not just product architectures but also cost models and operational methods.
Application in Technology and AI
First-principles thinking is particularly powerful in periods of technological transition:
- In computing, first principles were invoked by Carver Mead and Lynn Conway, who reimagined the semiconductor industry in the 1970s by establishing the foundational laws for microchip design, known as Mead-Conway methodology. This approach was cited by Huang as influential for predicting the physical limitations of transistor miniaturisation and motivating Nvidia’s focus on accelerated computing.
- Clayton Christensen, cited by Huang as an influence, introduced the idea of disruptive innovation, arguing that market leaders must question incumbent logic and anticipate non-linear shifts in technology. His books on disruption and innovation strategy have shaped how leaders approach structural shifts and avoid the “innovator’s dilemma”.
- The leap from von Neumann architectures to parallel, heterogeneous, and ultimately AI-accelerated computing frameworks — as pioneered by Nvidia’s CUDA platform and deep learning libraries — was possible because leaders at Nvidia systematically revisited underlying assumptions about how computation should be structured for new workloads, rather than simply iterating on the status quo.
- The AI revolution itself was catalysed by the “deep learning” paradigm, championed by Geoffrey Hinton, Yann LeCun, and Andrew Ng. Each demonstrated that previous architectures, which had reached plateaus, could be superseded by entirely new approaches, provided there was willingness to reinterpret the problem from mathematical and computational fundamentals.
Backstory of the Leading Theorists
The ecosystem that enabled Nvidia’s transformation is shaped by a series of foundational theorists:
- Mead and Conway: Their 1979 textbook and methodologies codified the “first-principles” approach in chip design, allowing for the explosive growth of Silicon Valley’s fabless innovation model.
- Gordon Moore: Moore’s Law, while originally an empirical observation, inspired decades of innovation, but its eventual slow-down prompted leaders such as Huang to look for new “first principles” to govern progress, beyond mere transistor scaling.
- Clayton Christensen: His disruption theory is foundational in understanding why entire industries fail to see the next shift — and how those who challenge orthodoxy from first principles are able to “see around corners”.
- Geoffrey Hinton, Yann LeCun, Andrew Ng: These pioneers directly enabled the deep learning revolution by returning to first principles on how learning — both human and artificial — could function at scale. Their work with neural networks, widely doubted after earlier “AI winters”, was vindicated with landmark results like AlexNet (2012), enabled by Nvidia GPUs.
Implications
Jensen Huang’s quote is neither idle philosophy nor abstract advice; it is a methodology proven repeatedly by his own journey and by the history of technology. It is a call to scrutinise assumptions, break complex structures down to their most elemental truths, and reconstruct strategy consciously from the bedrock of what is unlikely to change, while also asking: on what foundation do these principles rest, and how will those foundations themselves evolve?
Organisations and individuals who internalise this approach are equipped not only to compete in current markets, but to invent new ones — to anticipate and shape the next paradigm, rather than reacting to it.

“What I think we have to do going forward...is figure out ways to remove some of the knowledge and to keep what I call this cognitive core. It's this intelligent entity that is stripped from knowledge but contains the algorithms and contains the magic of intelligence and problem-solving and the strategies of it and all this stuff.” - Andrej Karpathy - Ex-OpenAI, Ex-Tesla AI
Andrej Karpathy's observation about the need to "strip away knowledge whilst retaining the cognitive core" represents one of the most penetrating insights into contemporary artificial intelligence development. Speaking on Dwarkesh Patel's podcast in October 2025, Karpathy—formerly a leading figure at both OpenAI and Tesla's Autopilot programme—articulated a fundamental tension at the heart of modern AI: the current generation of large language models has become a set of prodigious memorisers, yet this very capability may be constraining their potential for genuine intelligence.
The Paradox of Pre-training
To comprehend Karpathy's thesis, one must first understand the architecture of contemporary AI systems. Large language models are trained on vast corpora—often 15 trillion tokens or more—through a process called pre-training. During this phase, models learn to predict the next token in a sequence, effectively compressing the entire internet into their neural networks. Karpathy describes this compressed representation as only "0.07 bits per token" for a model like Llama 3 70B, highlighting the extraordinary degree of compression occurring.
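The compression figure can be sanity-checked with back-of-the-envelope arithmetic. The 70-billion-parameter and 15-trillion-token figures come from the discussion above; the assumption of 16-bit weights is ours, added for illustration:

```python
params = 70e9          # Llama 3 70B parameter count
bits_per_param = 16    # assuming fp16 weights (illustrative assumption)
corpus_tokens = 15e12  # ~15 trillion training tokens

# Total information capacity of the weights, spread over the corpus
bits_per_token = params * bits_per_param / corpus_tokens
print(f"{bits_per_token:.3f} bits per training token")  # ≈ 0.075
```

The result, roughly 0.075 bits per token, is in the same ballpark as the 0.07 figure Karpathy cites, underscoring just how lossy this compression of the internet into weights really is.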
This compression serves two distinct functions, which Karpathy carefully delineates. First, models accumulate factual knowledge—the content of Wikipedia articles, the specifics of historical events, the details of scientific papers. Second, and more crucially, they develop what Karpathy terms "algorithmic patterns"—the capacity for in-context learning, the ability to recognise and complete patterns, the fundamental mechanisms of reasoning itself.
The problem, as Karpathy sees it, is that contemporary models have become too adept at the former whilst the latter remains the true seat of intelligence. When a model can regurgitate passages verbatim or recite obscure facts, it demonstrates remarkable memory. But this same capability creates what he calls a "distraction"—the model becomes reliant on its hazy recollections of training data rather than developing robust reasoning algorithms that could operate independently of specific factual knowledge.
The Cognitive Core Concept
Karpathy's proposed solution is to isolate and preserve what he terms the "cognitive core"—an intelligent entity stripped of encyclopaedic knowledge but retaining the fundamental algorithms of problem-solving, the strategies of thought, and what he describes as "the magic of intelligence." This concept represents a profound shift in how we conceptualise artificial intelligence.
Consider the analogy to human cognition. Humans are remarkably poor memorisers compared to AI systems. Present a human with a random sequence of numbers, and they'll struggle after seven or eight digits. Yet this apparent limitation forces humans to develop robust pattern-recognition capabilities and abstract reasoning skills. We're compelled to "see the forest for the trees" precisely because we cannot memorise every individual tree.
Karpathy suggests that AI systems would benefit from similar constraints. A model with less memory but stronger reasoning capabilities would be forced to look up factual information whilst maintaining sophisticated algorithms for processing that information. Such a system would more closely resemble human intelligence—not in its limitations, but in the way those limitations drive the development of generalisable cognitive strategies.
The implications extend beyond mere technical architecture. Karpathy envisions cognitive cores as compact as one billion parameters—potentially even smaller—that could operate as genuine reasoning engines rather than glorified databases. These systems would "know that they don't know" when confronted with factual questions, prompting them to retrieve information whilst applying sophisticated analysis. The result would be AI that thinks more than it remembers, that reasons rather than recites.
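The division of labour Karpathy envisions, with reasoning retained and facts externalised, can be caricatured in a few lines. Everything here (the fact store, the `cognitive_core` function) is hypothetical scaffolding for illustration, not a description of any real system:

```python
# Hypothetical external knowledge store the "core" can consult.
FACTS = {
    "capital of france": "Paris",
    "boiling point of water": "100 °C",
}

def cognitive_core(question: str) -> str:
    """A toy 'core' that decides *how* to answer but stores no facts itself.

    It 'knows that it doesn't know': a factual gap triggers a lookup
    (or an admission of ignorance) rather than a confident guess.
    """
    key = question.lower().rstrip("?")
    if key in FACTS:
        return FACTS[key]          # retrieval, not recall
    return "I don't know; I would need to look that up."

print(cognitive_core("Capital of France?"))   # answered via lookup
print(cognitive_core("Population of Mars?"))  # admits ignorance
```

The point of the sketch is the separation of concerns: the strategy for answering lives in the function, whilst the facts live outside it and can be swapped or updated without retraining the "core".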
From Evolution to Engineering: The Path Not Taken
Karpathy's perspective on AI development diverges sharply from what he calls the "Richard Sutton viewpoint"—the notion that we should build AI systems analogous to biological intelligence, learning from scratch through reinforcement learning in the manner of animals. Instead, Karpathy argues we're building what he evocatively describes as "ghosts" or "spirit entities"—ethereal intelligences that emerge from imitating human-generated text rather than evolving through environmental interaction.
This distinction illuminates a crucial divergence in AI philosophy. Biological intelligence, as embodied in animals, emerges from evolution operating over millions of years, with vast amounts of capability "baked in" to neural circuitry. A zebra foal runs within minutes of birth not through reinforcement learning but through evolutionary encoding. Humans similarly arrive with substantial cognitive machinery pre-installed, with lifetime learning representing maturation and refinement rather than learning from first principles.
By contrast, contemporary AI systems learn through what Karpathy terms "crappy evolution"—pre-training on internet documents serves as a compressed, accelerated alternative to evolutionary optimisation. This process creates entities fundamentally different from biological intelligence, optimised for different tasks through different mechanisms. The current approach imitates the products of human intelligence (text, code, conversations) rather than replicating the developmental process that creates intelligence.
The Limits of Current Learning Paradigms
Karpathy's critique extends to reinforcement learning, which he describes with characteristic bluntness as "terrible." His concerns illuminate deep problems in how AI systems currently learn from experience. In reinforcement learning, a model generates hundreds of solution attempts, and those that arrive at correct answers have every intermediate step up-weighted, whilst failed attempts are down-weighted. Karpathy calls this "sucking supervision through a straw"—extracting minimal learning signal from vast amounts of computational work.
The fundamental issue is noise. When a solution works, not every step along the way was necessarily correct or optimal. The model may have taken wrong turns, pursued dead ends, or stumbled upon the answer despite flawed reasoning. Yet reinforcement learning broadcasts the final reward across the entire trajectory, reinforcing both good and bad reasoning indiscriminately. Karpathy notes that "you may have gone down the wrong alleys until you arrived at the right solution," yet every mistaken step gets marked as something to do more of.
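The "straw" problem is trajectory-level credit assignment. A minimal toy simulation (illustrative only, not any lab's actual training code) shows how broadcasting the final reward across every step reinforces bad steps that happened to sit inside successful attempts:

```python
import random

def solve_attempt(rng):
    """A toy 'rollout': five step choices, each good or bad at random.

    The attempt 'succeeds' if enough steps happened to be good,
    regardless of the quality of any individual step.
    """
    steps = [rng.choice(["good", "bad"]) for _ in range(5)]
    success = steps.count("good") >= 3
    return steps, 1.0 if success else 0.0

rng = random.Random(0)
credit = {"good": 0.0, "bad": 0.0}
for _ in range(200):
    steps, reward = solve_attempt(rng)
    for step in steps:
        # The final reward is broadcast to EVERY step of the trajectory,
        # so bad steps inside successful attempts accumulate credit too.
        credit[step] += reward

print(credit)
```

Running this, `credit["bad"]` is substantially above zero: the wrong turns taken on the way to a correct answer are marked as "something to do more of", which is exactly the noise Karpathy objects to.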
Humans, by contrast, engage in sophisticated post-hoc analysis. After solving a problem, we reflect on which approaches worked, which didn't, and why. We don't simply label an entire problem-solving session as "good" or "bad"—we dissect our reasoning, identify productive and unproductive strategies, and refine our approach. Current AI systems lack this reflective capacity entirely.
This limitation connects to broader questions about how AI systems might achieve continual learning—the ability to genuinely learn from ongoing experience rather than requiring massive retraining. Karpathy suggests that humans engage in a nightly "distillation phase" during sleep, processing the day's experiences and integrating insights into long-term knowledge. AI systems have no equivalent mechanism. They simply restart from the same state each time, unable to evolve based on individual experiences.
Model Collapse and the Entropy Problem
A subtle but critical concern in Karpathy's analysis is what he terms "model collapse"—the tendency of AI systems to produce outputs that occupy "a very tiny manifold of the possible space of thoughts." Ask ChatGPT to tell a joke repeatedly, and you'll receive the same three jokes. Request reflection on a topic multiple times, and you'll observe striking similarity across responses. The models are "silently collapsed," lacking the entropy and diversity that characterise human thought.
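Collapse of this kind can be quantified as a loss of entropy in the output distribution. A small sketch (with imagined joke outputs, purely illustrative) comparing a collapsed sampler to a diverse one:

```python
import math
from collections import Counter

# Illustrative measure of "silent collapse": the empirical entropy of
# repeated samples. A collapsed model recycles a handful of outputs; a
# diverse one spreads probability mass widely.

def empirical_entropy(samples):
    """Shannon entropy (in bits) of the empirical output distribution."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Imagined results of asking for "a joke" ten times:
collapsed = ["joke A", "joke A", "joke B", "joke A", "joke B",
             "joke A", "joke C", "joke A", "joke B", "joke A"]
diverse = [f"joke {i}" for i in range(10)]

print(empirical_entropy(collapsed))  # low: a few jokes dominate
print(empirical_entropy(diverse))    # maximal for 10 samples: log2(10) ≈ 3.32
```

Training on the collapsed distribution and resampling would only shrink this number further, which is the feedback loop the next paragraph describes.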
This phenomenon creates profound challenges for synthetic data generation, a technique labs use to create additional training material. If models generate training data for themselves or subsequent models, this collapsed distribution gradually dominates the training corpus. Training on one's own outputs creates a dangerous feedback loop—each generation becomes less diverse, more stereotyped, more "collapsed" than the last. Karpathy suggests this may not even be a solvable problem, noting that humans similarly "collapse over time," becoming more rigid and less creative as they age, revisiting the same thoughts and patterns with decreasing learning rates.
The contrast with children is illuminating. Young minds, not yet "overfitted" to the world, produce shocking, creative, unexpected responses precisely because they haven't collapsed into standard patterns of thought. This freshness, this maintenance of high entropy in cognitive processes, may be essential to genuine intelligence. Yet our current training paradigms actively work against it, rewarding convergence towards common patterns and penalising deviation.
The Decade of Agents: Why Progress Takes Time
When Karpathy states this will be "the decade of agents" rather than "the year of agents," he draws on hard-won experience from five years leading Tesla's Autopilot programme. His insights into why artificial intelligence deployment takes far longer than demonstrations suggest carry particular weight given this background.
The central concept is what Karpathy calls "the march of nines." Getting something to work 90% of the time—the level typically showcased in demonstrations—represents merely the first nine in "99.9%." Each additional nine requires equivalent effort. During his tenure at Tesla, the team progressed through perhaps two or three nines over five years. More crucially, numerous nines remain before self-driving cars achieve true autonomy at scale.
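The arithmetic behind the march of nines is unforgiving: each nine cuts the failure rate tenfold, yet (in Karpathy's telling) costs roughly the same engineering effort as the one before. A back-of-the-envelope sketch, with a hypothetical task volume:

```python
# Each additional "nine" of reliability divides the failure rate by ten,
# but the effort to earn it stays roughly constant.

TASKS = 1_000_000  # hypothetical yearly task volume, purely illustrative

rows = []
for nines in range(1, 6):
    failure_rate = 10 ** -nines
    reliability = 1 - failure_rate
    failures = TASKS * failure_rate
    rows.append((nines, failures))
    print(f"{nines} nine(s): {reliability:.5f} reliable "
          f"-> {failures:,.0f} failures per million tasks")
```

Even at four nines, a million-task deployment still produces a hundred failures; in a safety-critical domain, that residue is what keeps humans in the loop.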
This pattern isn't unique to autonomous vehicles. Karpathy argues it applies across safety-critical domains, including software engineering. When code errors can leak millions of Social Security numbers or create critical security vulnerabilities, the cost of failure becomes prohibitively high. The demo-to-product gap widens dramatically. What works impressively in controlled conditions fails in countless edge cases when confronting reality's full complexity.
Waymo's experience illustrates this challenge. Despite providing "perfect drives" as early as 2014, the company still operates limited deployments requiring elaborate teleoperation infrastructure and supervision. Humans haven't been removed; they've been rendered invisible, beaming in remotely to handle edge cases. The technology lives in a "pulled-back future"—functional but not yet economical, capable but not yet scalable.
Contemporary AI agents face analogous challenges. Whilst Claude and GPT-5 Pro demonstrate remarkable capabilities, they remain what Karpathy characterises as "elementary school students"—savants with perfect memory but lacking robust reasoning across all necessary dimensions. They're "cognitively deficient" in ways users intuitively recognise even if they can't articulate precisely what's missing.
The Software Engineering Puzzle
Perhaps no domain better illustrates the puzzling contours of current AI capabilities than software engineering. Karpathy notes, somewhat ruefully, that whilst these systems were meant to enable "any economically valuable task," API revenue remains "dominated by coding." This supposedly general intelligence overwhelmingly excels at one specific domain.
This concentration isn't accidental. Code enjoys unique properties that make it ideal for current AI architectures. Software development has always operated through text—terminals, editors, version control systems all manipulate textual representations. LLMs, trained on internet text, encounter code as a native format. Moreover, decades of infrastructure exist for handling code textually: diff tools for showing changes, IDEs for navigation, testing frameworks for verification.
Contrast this with domains lacking such infrastructure. Creating presentations involves spatial arrangement and visual design—there's no "diff" for slides that elegantly shows modifications. Many knowledge work tasks involve physical documents, in-person interactions, or tacit knowledge that resists textual representation. These domains haven't been pre-optimised for AI interaction in the way software development has.
Yet even in coding, Karpathy remains sceptical of current capabilities for cutting-edge work. When building nanoChat, a repository implementing a complete ChatGPT clone in simplified form, he found AI tools valuable for autocomplete and handling familiar patterns but inadequate for novel architectural decisions. The models kept trying to impose standard approaches when he deliberately chose non-standard implementations. They couldn't comprehend his custom solutions, constantly suggesting deprecated APIs and bloating code with unnecessary defensive programming.
This points to a deeper truth: current models excel at reproducing common patterns from their training data but struggle with code "that has never been written before"—precisely the domain of frontier AI research itself. The recursive self-improvement that some forecast, where AI systems rapidly enhance their own capabilities, founders on this limitation. Models can accelerate work within established paradigms but cannot yet pioneer truly novel approaches.
The Trajectory of Intelligence Explosion
Karpathy's perspective on potential intelligence explosions diverges sharply from both pessimistic and optimistic extremes. He sees AI not as a discrete, alien technology but as a continuation of computing's evolution—part of an ongoing automation trend stretching back through compilers, high-level programming languages, and computer-aided design tools. From this view, the "intelligence explosion" has already been occurring for decades, visible in the exponential GDP growth curve that represents accumulated automation across countless domains.
This framing leads to counterintuitive predictions. Rather than expecting AI to suddenly accelerate economic growth from 2% annually to 20%, Karpathy suggests it will enable continued progress along the existing exponential trajectory. Just as computers, the internet, and mobile phones transformed society without producing visible discontinuities in aggregate growth statistics, AI will diffuse gradually across industries, maintaining rather than disrupting established growth patterns.
This gradualism doesn't imply insignificance. The compounding effects of sustained exponential growth produce extraordinary transformation over time. But it does suggest that simple extrapolations from impressive demonstrations to imminent superintelligence misunderstand how technology integrates into society. There will be no discrete moment when "AGI" arrives and everything changes. Instead, we'll experience continuous advancement in capabilities, continuous expansion of automation, and continuous adaptation of society to new technological possibilities.
The analogy to the Industrial Revolution proves instructive. That transformation didn't occur through a single breakthrough but through cascading improvements across multiple technologies and practices, gradually shifting society from 0.2% annual growth to 2%. Similarly, AI's impact will emerge through countless incremental deployments, each automating specific tasks, enabling new workflows, and creating feedback loops that accelerate subsequent progress.
The Human Element: Education in an AI Future
Karpathy's work on Eureka Labs, his educational initiative, reveals his deepest concerns about AI's trajectory. He fears not that AI will fail but that "humanity gets disempowered by it," relegated to the sidelines like the portly, passive citizens of WALL-E. His solution lies in radically reimagining education around the principle that "pre-AGI education is useful; post-AGI education is fun."
The analogy to fitness culture illuminates this vision. Nobody needs physical strength to manipulate heavy objects—we have machines for that. Yet gyms proliferate because exercise serves intrinsic human needs: health, aesthetics, the satisfaction of physical mastery. Similarly, even in a world where AI handles most cognitive labour, humans will pursue learning for its inherent rewards: the pleasure of understanding, the status of expertise, the deep satisfaction of mental cultivation.
But achieving this vision requires solving a technical problem: making learning genuinely easy and rewarding. Currently, most people abandon learning because they encounter material that's too difficult or too trivial, bouncing between frustration and boredom. Karpathy describes the experience of working with an expert language tutor who maintained a perfect calibration—always presenting challenges at the edge of current capability, never boring, never overwhelming. This created a state where "I was the only constraint to learning," with knowledge delivery perfectly optimised.
Replicating this experience at scale represents what Karpathy sees as education's great technical challenge. Current AI tutors, despite their sophistication, remain far from this standard. They can answer questions but cannot probe understanding, identify gaps, or sequence material to create optimal learning trajectories. The capability exists in exceptional human tutors; the challenge lies in encoding it algorithmically.
Yet Karpathy sees this challenge as tractable. Just as AI has transformed coding through autocomplete and code generation, it will eventually transform education through personalised, responsive tutoring. When learning becomes "trivial"—not in the sense of requiring no effort but in the sense of encountering no artificial obstacles—humans will pursue it enthusiastically. Not everyone will become an expert in everything, but the ceiling on human capability will rise dramatically as the floor on accessibility descends.
The Physics of Understanding: Karpathy's Pedagogical Philosophy
Karpathy's approach to teaching reveals principles applicable far beyond AI. His background in physics instilled what he describes as finding "first-order terms"—identifying the essential, dominant factors in any system whilst recognising that second and third-order effects exist but matter less. This habit of abstraction, of seeing spherical cows where others see only messy complexity, enables the creation of minimal, illustrative examples that capture phenomena's essence.
MicroGrad exemplifies this approach perfectly. In 100 lines of Python, Karpathy implements backpropagation—the fundamental algorithm underlying all neural network training. Everything else in modern deep learning frameworks, he notes, is "just efficiency"—optimisations for speed, memory management, numerical stability. But the intellectual core, the actual mechanism by which networks learn, fits in 100 comprehensible lines. This distillation makes the previously arcane accessible.
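The intellectual core fits in remarkably little code. A heavily condensed sketch in the spirit of micrograd (not the original repository's code, and supporting only addition and multiplication) shows the essential mechanism, a scalar that remembers how it was computed and applies the chain rule in reverse:

```python
class Value:
    """A scalar that records its computation graph so gradients can
    flow backwards through it (condensed micrograd-style sketch)."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():  # d(a+b)/da = d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():  # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then run the chain rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# y = a*b + a, so dy/da = b + 1 = 4 and dy/db = a = 2
a, b = Value(2.0), Value(3.0)
y = a * b + a
y.backward()
print(a.grad, b.grad)  # 4.0 2.0
```

Everything a production framework adds on top of this loop, batching, GPU kernels, numerical safeguards, is the "just efficiency" Karpathy refers to.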
The broader principle involves "untangling knowledge"—reorganising understanding so each concept depends only on what precedes it. This creates "ramps to knowledge" where learners never encounter gaps or leaps that would require them to take claims on faith. The famous transformer tutorial embodies this, beginning with a simple bigram model (literally a lookup table) and progressively adding components, each motivated by solving a specific limitation of what came before.
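The bigram starting point really is just a lookup table: for each token, a row of counts over possible next tokens. A character-level sketch (illustrative corpus, not the tutorial's actual data or code):

```python
import random
from collections import defaultdict

# A bigram "model" is a table of next-character counts, nothing more.
corpus = "hello world"
table = defaultdict(lambda: defaultdict(int))
for a, b in zip(corpus, corpus[1:]):
    table[a][b] += 1

def sample_next(ch, rng=random):
    """Sample the next character in proportion to observed bigram counts."""
    nxt = table[ch]
    chars, counts = zip(*nxt.items())
    return rng.choices(chars, weights=counts, k=1)[0]

print(dict(table['l']))  # {'l': 1, 'o': 1, 'd': 1}
print(sample_next('h'))  # always 'e': only one bigram starts with 'h'
```

Every component the tutorial adds afterwards, embeddings, attention, feed-forward layers, is motivated by a concrete failure of this table, which is exactly the "ramp to knowledge" principle in action.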
This approach contrasts sharply with the standard academic practice of presenting solutions before establishing problems, or introducing abstractions before concrete examples. Karpathy sees such approaches as, in his words, "a dick move"—they rob learners of the opportunity to grapple with challenges themselves, to develop intuition about what solutions might work, and to appreciate why particular approaches succeed where alternatives fail. The pedagogical crime isn't challenging students; it's presenting answers without first establishing questions.
Leading Theorists: The Intellectual Lineage
Richard Sutton and the Bitter Lesson
Richard Sutton, a pioneering reinforcement learning researcher, articulated what became known as "the bitter lesson"—the observation that simple, scalable methods leveraging computation consistently outperform approaches incorporating human knowledge or structural assumptions. His perspective suggests that the path to artificial general intelligence lies through learning algorithms powerful enough to discover structure from experience, much as evolution discovered biological intelligence.
Sutton's famous assertion that "if you got to the squirrel, you'd be most of the way to AGI" reflects this view. Animal intelligence, in his framework, represents the core achievement—the fundamental learning algorithms that enable organisms to navigate environments, solve problems, and adapt to challenges. Human language and culture, whilst impressive, represent relatively minor additions to this foundation.
Karpathy respectfully dissents. His "we're building ghosts, not animals" formulation captures the divergence: current AI systems don't replicate the learning processes that create biological intelligence. They imitate the products of human intelligence (text, code, reasoning traces) rather than replicating its developmental origins. This distinction matters profoundly for predicting AI's trajectory and understanding its capabilities and limitations.
Geoffrey Hinton and the Neural Network Renaissance
Geoffrey Hinton, often termed the "godfather of AI," pioneered the neural network approaches that underpin contemporary systems. His persistence through decades when neural networks were unfashionable, his co-development of backpropagation techniques, and his later work on capsule networks and other architectures established the foundation for today's large language models.
Karpathy studied under Hinton as an undergraduate at the University of Toronto, experiencing firsthand the intellectual ferment as deep learning began its ascent to dominance. Hinton's influence appears throughout Karpathy's thinking—the emphasis on learning from data rather than hand-crafted rules, the focus on representation learning, the conviction that scale and simplicity often trump elaborate architectural innovations.
Yet Karpathy's view extends beyond his mentor's. Where Hinton focused primarily on perception (particularly computer vision), Karpathy grapples with the full scope of intelligence—reasoning, planning, continual learning, multi-agent interaction. His work synthesises Hinton's foundational insights with broader questions about cognitive architecture and the nature of understanding itself.
Yann LeCun and Convolutional Networks
Yann LeCun's development of convolutional neural networks in 1989 represented one of the first successful applications of backpropagation to real-world pattern recognition. His work on handwritten digit recognition established core principles: the power of hierarchical feature learning, the importance of translation invariance, the value of specialised architectures for specific domains.
Karpathy's reconstruction of LeCun's 1989 network, retrofitting it with 33 years of algorithmic improvements, reveals his appreciation for this lineage. He found that pure algorithmic advances—modern optimisers, better architectures, regularisation techniques—could halve error rates. But achieving further gains required more data and more computation. This trinity—algorithms, data, compute—advances in lockstep, with no single factor dominating.
This lesson shapes Karpathy's predictions about AI's future. He expects continued progress across all three dimensions, with the next decade bringing better algorithms, vaster datasets, more powerful hardware, and more efficient software. But no breakthrough in any single dimension will produce discontinuous acceleration. Progress emerges from the intersection of many incremental improvements.
The Broader Intellectual Context
The debate Karpathy engages extends beyond specific individuals to fundamental questions about intelligence itself. Does intelligence arise primarily from general learning algorithms (the Sutton view) or from accumulated structure and innate mechanisms (the evolutionary perspective)? Can we build intelligence by imitating its products (the current LLM approach) or must we replicate its developmental processes? Will artificial intelligence remain fundamentally tool-like, augmenting human capability, or evolve into genuinely autonomous agents pursuing their own goals?
These questions connect to century-old debates in psychology and cognitive science between behaviourists emphasising learning and nativists emphasising innate structure. They echo discussions in evolutionary biology about the relative roles of genetic determination and developmental plasticity. They parallel arguments in philosophy of mind about whether intelligence requires embodiment or can exist as pure information processing.
Karpathy's position threads between extremes. He acknowledges both the power of learning from data and the necessity of architectural structure. He recognises both the distinctiveness of AI systems and their illuminating analogies to biological intelligence. He balances optimism about AI's potential with realism about current limitations and the difficulty of translating demonstrations into robust, deployed systems.
The Cognitive Core in Context: A New Paradigm for Intelligence
The concept of a cognitive core stripped of factual knowledge represents more than a technical proposal—it's a reconceptualisation of what intelligence fundamentally is. Rather than viewing intelligence as encompassing both reasoning algorithms and accumulated knowledge, Karpathy proposes treating these as separate, with reasoning capability as the essence and factual knowledge as external resources to be accessed rather than internalised.
This separation mirrors certain aspects of human cognition whilst diverging in others. Humans do maintain a distinction between knowing how to think and knowing specific facts—we can reason about novel situations without direct experience, applying general problem-solving strategies learned in one domain to challenges in another. Yet our factual knowledge isn't purely external; it shapes the very structure of our reasoning, creating rich semantic networks that enable rapid, intuitive judgement.
The proposal to strip AI systems down to cognitive cores involves accepting tradeoffs. Such systems would need to perform external lookups for factual information, introducing latency and dependency on knowledge bases. They would lack the pattern-matching capabilities that arise from vast memorisation, potentially missing connections between superficially unrelated domains. They might struggle with tasks requiring seamless integration of many small facts, where lookup costs would dominate processing time.
Yet the gains could prove transformative. A genuine cognitive core—compact, efficient, focused on algorithmic reasoning rather than fact retrieval—could operate in settings where current models fail. Edge deployment becomes feasible when models don't require storing terabytes of parameters. Personalisation becomes practical when core reasoning engines can be fine-tuned or adapted without retraining on entire knowledge corpora. Interpretability improves when reasoning processes aren't obscured by retrieval of memorised patterns.
Most profoundly, genuine cognitive cores might avoid the collapse and loss of entropy that plagues current models. Freed from the burden of maintaining consistency with vast memorised datasets, such systems could explore more diverse solution spaces, generate more varied outputs, and maintain the creative flexibility that characterises human cognition at its best.
Implications for the Decade Ahead
Karpathy's decade-long timeline for agentic AI reflects hard-earned wisdom about technology deployment. His experience with autonomous vehicles taught him that impressive demonstrations represent merely the beginning of a long productisation journey. Each additional "nine" of reliability—moving from 90% to 99% to 99.9% accuracy—requires comparable effort. Safety-critical domains demand many nines before deployment becomes acceptable.
This reality shapes expectations for AI's economic impact. Rather than sudden disruption, we'll witness gradual diffusion across domains with varying characteristics. Tasks that are repetitive, well-defined, purely digital, and allowing high error rates will automate first. Call centre work exemplifies this profile—short interaction horizons, clear success criteria, limited context requirements, tolerance for occasional failures that human supervisors can catch.
More complex knowledge work will resist automation longer. Radiologists, consultants, accountants—professionals whose work involves lengthy timescales, subtle judgements, extensive context, and high costs of error—will see AI augmentation before replacement. The pattern will resemble Waymo's current state: AI handling routine cases whilst humans supervise, intervene in edge cases, and maintain ultimate responsibility.
This graduated deployment creates an "autonomy slider"—a continuous spectrum from pure human operation through various degrees of AI assistance to eventual full automation. Most jobs won't flip discretely from human to machine. Instead, they'll slide along this spectrum as AI capabilities improve and organisations develop confidence in delegation. This process will unfold over years or decades, not months.
The economic implications differ from both optimistic and pessimistic extremes. We won't see overnight mass unemployment—the gradual nature of deployment, the persistence of edge cases requiring human judgement, and society's adaptation through creating new roles all mitigate disruption. But neither will we see disappointing underutilisation—the compound effect of many small automations across countless tasks will produce genuine transformation.
The Path Forward: Research Priorities
Karpathy's analysis suggests several critical research directions for developing robust, capable AI systems. First, developing methods to isolate cognitive cores from memorised knowledge whilst maintaining reasoning capability. This might involve novel training objectives that penalise rote memorisation whilst rewarding generalisation, or architectural innovations that separate knowledge storage from reasoning mechanisms.
Second, creating effective continual learning systems that can distil experience into lasting improvements without catastrophic forgetting or model collapse. This requires moving beyond simple fine-tuning toward something more akin to the reflection and consolidation humans perform during sleep—identifying patterns in experience, extracting lessons, and integrating insights whilst maintaining diversity.
Third, advancing beyond current reinforcement learning to richer forms of learning from experience. Rather than broadcasting sparse reward signals across entire trajectories, systems need sophisticated credit assignment that identifies which reasoning steps contributed to success and which didn't. This might involve explicit review processes where models analyse their own problem-solving attempts, or meta-learning approaches that learn how to learn from experience.
Fourth, developing multi-agent systems with genuine culture—shared knowledge bases that agents collectively maintain and evolve, self-play mechanisms that drive capability improvement through competition, and organisational structures that enable collaboration without centralised control. Current systems remain fundamentally solitary; genuine agent economies will require breakthroughs in coordination and communication.
Fifth, and perhaps most ambitiously, maintaining entropy in AI systems—preventing the collapse toward stereotyped outputs that currently plagues even frontier models. This might involve explicit diversity penalties, adversarial training to prevent convergence, or inspiration from biological systems that maintain variation through mechanisms like mutation and recombination.
Conclusion: Intelligence as Engineering Challenge
Andrej Karpathy's vision of the cognitive core represents a mature perspective on artificial intelligence—neither breathlessly optimistic about imminent superintelligence nor dismissively pessimistic about current limitations. He sees AI as an engineering challenge rather than a mystical threshold, requiring patient work across multiple dimensions rather than awaiting a single breakthrough.
This perspective derives from direct experience with the messy reality of deploying AI systems at scale. Self-driving cars that work perfectly in demonstrations still require years of refinement before handling edge cases reliably. Coding agents that generate impressive solutions for common problems still struggle with novel architectural challenges. Educational AI that answers questions adequately still falls far short of expert human tutors' adaptive responsiveness.
Yet within these limitations lies genuine progress. Models continue improving along multiple dimensions simultaneously. Infrastructure for deploying and managing AI systems grows more sophisticated. Understanding of these systems' capabilities and constraints becomes more nuanced. The path forward is visible, even if it stretches further than optimists anticipated.
The concept of stripping knowledge to reveal the cognitive core captures this mature vision perfectly. Rather than pursuing ever-larger models memorising ever-more data, we might achieve more capable intelligence through subtraction—removing the crutch of memorisation to force development of robust reasoning algorithms. Like humans compelled to abstract and generalise because we cannot remember everything, AI systems might benefit from similar constraints.
This vision offers hope not for sudden transformation but for steady progress—the kind that compounds over decades into revolutionary change. It suggests that the hard technical problems of intelligence remain tractable whilst acknowledging their genuine difficulty. Most importantly, it positions humans not as passive observers of AI's ascent but as active participants in shaping its development and ensuring its integration enhances rather than diminishes human flourishing.
The decade ahead will test these ideas. We'll discover whether cognitive cores can be effectively isolated, whether continual learning mechanisms can be made robust, whether the demo-to-product gap can be bridged across diverse domains. The answers will shape not just the trajectory of AI technology but the future of human society in an increasingly automated world. Karpathy's contribution lies in framing these questions with clarity, drawing on hard-won experience to guide expectations, and reminding us that the most profound challenges often yield to patient, disciplined engineering rather than waiting for miraculous breakthroughs.

“I feel like the [ AI ] problems are tractable, they're surmountable, but they're still difficult. If I just average it out, it just feels like a decade [ to AGI ] to me.” - Andrej Karpathy - Ex-OpenAI, Ex-Tesla AI
Andrej Karpathy’s reflection—“I feel like the [ AI ] problems are tractable, they're surmountable, but they're still difficult. If I just average it out, it just feels like a decade [ to AGI ] to me.”—encapsulates both a grounded optimism and a caution honed through years at the forefront of artificial intelligence research. Understanding this statement requires context about the speaker, the evolution of the field, and the intellectual landscape that shapes contemporary thinking on artificial general intelligence (AGI).
Andrej Karpathy: Technical Leadership and Shaping AI’s Trajectory
Karpathy is recognised as one of the most influential figures in modern AI. He studied at the University of Toronto, where Geoffrey Hinton, the so-called "godfather" of deep learning, was among his teachers, before completing his doctorate at Stanford under Fei-Fei Li, a path that put his early career at the confluence of academic breakthroughs and industrial deployment. At Stanford, he helped launch the seminal CS231n course, which became a training ground for a generation of practitioners. He subsequently led critical efforts at OpenAI and Tesla, where he served as Director of AI, architecting large-scale deep learning systems for both language and autonomous driving.
From the earliest days of deep learning, Karpathy has witnessed—and helped drive—several “seismic shifts” that have periodically redefined the field. He recalls, for example, the transition from neural networks being considered a niche topic to their explosive relevance with the advent of AlexNet. At OpenAI, he observed the limitations of reinforcement learning when applied too soon to general agent-building and became an early proponent of focusing on practical, useful systems rather than chasing abstract analogies with biological evolution.
Karpathy’s approach is self-consciously pragmatic. He discounts analogies between AI and animal evolution, preferring to frame current efforts as “summoning ghosts,” i.e., building digital entities trained by imitation, not evolved intelligence. His career has taught him to discount industry hype cycles and focus on the “march of nines”—the painstaking work required to close the gap between impressive demos and robust, trustworthy products. This stance runs through his entire philosophy on AI progress.
Context for the Quote: Realism amidst Exponential Hype
The statement about AGI’s timeline emerges from Karpathy’s nuanced position between the extremes of utopian accelerationism and excessive scepticism. Against a backdrop of industry figures claiming near-term transformative breakthroughs, Karpathy advocates for a middle path: current models represent significant progress, but numerous “cognitive deficits” persist. Key limitations include the lack of robust continual learning, difficulties generalising out-of-distribution, and the absence of key memory and reasoning capabilities seen in human intelligence.
Karpathy classifies present-day AI systems as “competent, but not yet capable agents”—useful in narrow domains, such as code generation, but unable to function autonomously in open-ended, real-world contexts. He highlights how models exhibit an uncanny ability to memorise, yet often lack the generalisation skills required for truly adaptive behaviour; they’re powerful, but brittle. The hard problems left are not insurmountable, but solving them—including integrating richer memory, developing agency, and building reliable, context-sensitive learning—will take sustained, multi-year effort.
AGI and the Broader Field: Dialogue with Leading Theorists
Karpathy’s thinking exists in dialogue with several foundational theorists:
- Geoffrey Hinton: Pioneered deep learning and neural network approaches that underlie all current large-scale AI. His early conviction in neural networks, once seen as fringe, is now mainstream, but Hinton remains open to new architectural breakthroughs.
- Richard Sutton: A major proponent of reinforcement learning as a route to general intelligence. Sutton’s vision focuses on “building animals”—systems capable of learning from scratch via trial and error in complex environments—whereas Karpathy now sees this as less immediately relevant than imitation-based, practically grounded approaches.
- Yann LeCun: Another deep learning pioneer, LeCun has championed the continuous push toward self-supervised learning and innovations within model architecture.
- The Scaling Optimists: The school of thought, including some in the OpenAI and DeepMind circles, who argue that simply increasing model size and data, within current paradigms, will inexorably deliver AGI. Karpathy explicitly distances himself from this view, arguing for the necessity of algorithmic innovation and socio-technical integration.
Karpathy sees the arc of AI progress as analogous to general trends in automation and computing: exponential in aggregate, but marked by periods of over-prediction, gradual integration, and non-linear deployment. He draws lessons from the slow maturation of self-driving cars—a field he led at Tesla—where early demos quickly gave way to years of incremental improvement, ironing out “the last nines” to reach real-world reliability.
He also foregrounds the human side of the equation: as AI’s technical capability increases, the question becomes as much one of organisational integration and legal and social adaptation as of raw model performance.
In Summary: Surmountable Yet Difficult
Karpathy’s “decade to AGI” estimate is anchored in a sober appreciation of both technical tractability and practical difficulty. He is neither a pessimist nor a hype-driven optimist. Instead, he projects that AGI—defined as machines able to deliver the full spectrum of knowledge work at human levels—will require another decade of systematic progress spanning model architecture, algorithmic innovation, memory, continual learning, and above all, integration with the complex realities of the real world.
His perspective stands out for its blend of technical rigour, historical awareness, and humility in the face of both engineering constraints and the unpredictability of broader socio-technical systems. In this, Karpathy situates himself in conversation with a lineage of thinkers who have repeatedly recalibrated the AI field’s ambitions—and whose own varied predictions continue to shape the ongoing march toward general intelligence.

“I don’t think venture is an 'asset class' in the sense many LPs think... You need dozens of Figma-sized outcomes every year to make that math work; I don't see that many. So the only thing that breaks is the return assumption. Venture is return-free risk, not a risk-free return.” - Roelof Botha - Senior Steward, Sequoia
Botha's mathematical argument is straightforward and devastating. With approximately $250 billion flowing annually into US venture capital and limited partners expecting net IRRs in the 12% range, the implied arithmetic requires roughly $1 trillion in annual exit value over typical fund horizons. Yet historical data reveals only about 20 companies per year achieve realised exits worth $1 billion or more. Even if we generously assume that frontier AI companies will produce larger outcomes than historical norms, the gulf between required and probable returns remains vast. The statement "you need dozens of Figma-sized outcomes every year to make that math work" underscores the sheer improbability of meeting aggregate return expectations.
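Botha's arithmetic can be sketched in a few lines. The inputs below come from the figures quoted above ($250 billion of annual inflows, a 12% net IRR, roughly 20 realised $1 billion-plus exits per year); the ~12-year fund horizon is our own illustrative assumption, and the point is the order of magnitude rather than precision.

```python
# Back-of-envelope check of Botha's venture-return arithmetic.
# Assumption (ours, not Botha's): a ~12-year horizon from commitment to exit.

annual_inflow = 250e9      # annual US venture capital inflows (from the text)
net_irr = 0.12             # LP net return expectation (from the text)
horizon_years = 12         # hypothetical fund horizon

# Exit value each vintage year's inflow must eventually produce to deliver the IRR.
required_exit_value = annual_inflow * (1 + net_irr) ** horizon_years

# Observed: roughly 20 realised $1bn+ exits per year (from the text).
# Treating each as exactly $1bn gives a conservative lower bound.
observed_exit_value = 20 * 1e9

print(f"required: ${required_exit_value / 1e12:.2f}tn per vintage year")
print(f"shortfall multiple: {required_exit_value / observed_exit_value:.0f}x")
```

The required figure comes out just under $1 trillion, matching the "roughly $1 trillion" in the text. The observed lower bound understates realised value (many exits exceed $1 billion), but even with generous assumptions about outcome sizes the gap between required and realised exit value remains an order of magnitude or more.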
This is not merely academic scepticism. Botha speaks from the vantage point of having navigated multiple market cycles whilst maintaining Sequoia's position at the apex of venture performance. His perspective is informed by personal experience across technology's most significant transitions: he took PayPal public as CFO at 28 in 2002—the first "dotcom" to go public after the crash—advocated for YouTube's acquisition two years before Google bought it for $1.65 billion, and has since been instrumental in investments spanning Instagram, Square, MongoDB, Unity and DoorDash. When someone with this track record states that venture doesn't function as an asset class in the conventional sense, it merits serious attention.
The Institutional Memory Problem
Botha's critique exposes a fundamental tension in how institutional capital thinks about venture. The asset class framework assumes diversification, scalability and predictable return distributions—characteristics that venture capital demonstrably lacks. The data consistently show extreme power law dynamics: top-decile and top-quartile performance diverge dramatically from median outcomes, and the gap has widened as more capital has entered the market. Limited partners treating venture as they would bonds or equities—allocating based on target portfolio weights and rebalancing mechanically—are applying frameworks designed for normally distributed returns to a domain where outcomes follow profoundly skewed distributions.
The historical precedent supports Botha's scepticism. When one examines the roster of leading Silicon Valley venture firms from 1990, most have ceased to exist or have faded into irrelevance. Even amongst firms that survived, maintaining top-tier performance across multiple decades and generational transitions remains vanishingly rare. Sequoia itself has institutionalised "healthy paranoia" through daily rituals—including wall-to-wall printing of "we are only as good as our next investment" in each partner's handwriting—precisely because sustained excellence is so improbable.
Cost Structure and Margin Dynamics
Botha's broader investment philosophy, evident throughout the conversation, provides essential context for understanding why he believes current capital deployment is fundamentally misaligned with probable outcomes. His emphasis on cost structure and unit economics—"cost is an advantage, not price"—reflects a disciplined focus on companies that can achieve sustainable margins rather than those burning capital to chase topline growth. This stands in sharp contrast to the behaviour incentivised when excessive capital seeks deployment: founders are encouraged to prioritise scale over efficiency, and investors compete on valuation rather than value-add partnership.
The contemporary challenge in AI applications illustrates this tension. Many AI-enabled software companies currently exhibit compressed gross margins—perhaps 40% rather than the 80% typical of pre-AI SaaS businesses—due to inference costs. Botha's view is that these margins will improve materially over time as algorithms become more efficient, open-source models compete with proprietary offerings, and founders deploy model ensembles that match use cases to cost-value ratios. However, this requires patient capital willing to underwrite margin expansion paths rather than demanding immediate profitability or hyper-growth at any cost. The current abundance of venture capital undermines this discipline.
Decision-Making Architecture and Team Composition
Sequoia's internal governance mechanisms reveal how a firm can maintain investment discipline amidst market exuberance. The partnership employs anonymous preliminary voting across approximately 12 participants per fund meeting, premortems that explicitly name cognitive biases in investment memoranda, and a culture of "front-stabbing" where dissent must be voiced directly and substantively. This architecture is designed to surface honest disagreement whilst preserving the conviction necessary for outlier bets. Critically, Sequoia has deliberately kept its investment team small—roughly 25 investors total—to maintain the trust required for candid debate. This stands in stark contrast to firms that have scaled headcount aggressively to deploy larger funds.
The personnel profile Botha describes—"pirates, not people who want to join the navy"—reflects a specific cultural DNA: competitive, irreverent, non-conformist individuals who nonetheless possess high integrity and play as a team. This is not window dressing; it's a functional requirement for maintaining the dissonance between institutional humility ("we are only as good as our next investment") and individual conviction (the willingness to champion contrarian positions). The challenge for most organisations is that these traits—competitive individualism and collaborative teamwork, paranoia and boldness—create inherent tensions that require active cultural management.
Implications for Founders and Emerging Managers
For founders, Botha's analysis suggests that the current abundance of venture capital may be more liability than asset. Excess funding often undermines the discipline required to build durable businesses with strong unit economics and sustainable margins. The historical pattern he references—spreading talent thin, similar to 1999—implies that many startups are overstaffed, over-capitalised and under-focused on the cost structures that ultimately determine competitive advantage. Founders who resist the temptation to raise at inflated valuations and instead prioritise capital efficiency may find themselves better positioned when market conditions normalise.
For emerging fund managers, the message is equally stark: network development, relationship cultivation and demonstrable value-add matter far more than deploying large pools of capital. Botha's advice to "build the network and the tributaries" reflects a business model predicated on access and partnership rather than balance sheet scale. Managers who attempt to compete by raising ever-larger funds are swimming against the arithmetic Botha outlines—there simply aren't enough outsized outcomes to justify the capital deployed.
Theoretical Foundations: Power Laws and Portfolio Construction
Botha's argument intersects with longstanding academic debates about venture capital portfolio construction and return dynamics. The seminal work by Korteweg and Sorensen (2010) on risk adjustment in venture returns demonstrated that much of venture's apparent outperformance disappears when properly accounting for risk and selection bias. Subsequent research by Ewens, Jones and Rhodes-Kropf (2013) on the price of diversification showed that venture returns exhibit extreme skewness, with top-decile funds capturing disproportionate value. Harris, Jenkinson and Kaplan (2014) found that whilst top-quartile venture funds consistently outperform public markets, median and below-median funds underperform even after adjusting for leverage and illiquidity.
The theoretical challenge is that venture capital has always been characterised by power law dynamics—Chris Dixon and others have popularised Nassim Taleb's observation that venture returns follow a power law distribution where a small number of investments generate the majority of returns. What Botha is arguing is that current capital inflows have pushed the industry beyond the point where even sophisticated portfolio construction can reliably generate attractive risk-adjusted returns for the typical investor. This is distinct from claiming that no attractive opportunities exist; rather, he's asserting that the quantum of attractive opportunities relative to deployed capital has reached unsustainable levels.
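To make the power-law intuition concrete, here is a deliberately toy illustration (our own construction, not a calibrated model of venture returns): suppose the k-th best investment in a 100-company cohort returns value proportional to 1/k, a Zipf-style distribution often used as a stand-in for power-law outcomes. The top decile then captures well over half the cohort's total value.

```python
# Toy Zipf-style illustration of power-law concentration in venture outcomes.
# Assumption (ours): the k-th best of 100 investments returns value ~ 1/k.

n = 100
returns = [1.0 / k for k in range(1, n + 1)]  # rank-ordered outcome values

total = sum(returns)
top_decile = sum(returns[:n // 10])  # the best 10 investments

share = top_decile / total
print(f"top 10% of investments capture {share:.0%} of total value")
```

Under these assumptions the top 10% of investments capture roughly 56% of total value. The implication for portfolio construction is the one Botha draws: diversifying across the bottom 90% adds little, and missing the handful of outliers is fatal, which is why mechanical, asset-class-style allocation struggles in venture.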
Historical Analogues and Market Cycles
The 1999 parallel Botha invokes is instructive. During the dotcom bubble, venture capital fundraising surged from roughly $35 billion in 1998 to $106 billion in 2000. The subsequent crash saw fundraising collapse to $10 billion by 2003. What's often forgotten is that the best-performing funds from that era—those that generated genuine alpha—tended to be smaller, more selective vehicles that maintained investment discipline even as capital availability surged. Sequoia itself raised a relatively modest $450 million fund in 1999, resisting the temptation to scale fund size aggressively.
The 2021 parallel is equally relevant. As growth-stage valuations reached unprecedented levels and tourists flooded into venture capital, established firms faced pressure to compete on valuation, deploy capital faster and compromise on diligence. Firms that maintained discipline—insisting on demonstrable unit economics, sustainable margins and realistic growth assumptions—found themselves losing competitive processes to investors willing to accept flimsier evidence of value creation. Botha's framing suggests that this dynamic represents not temporary market froth but rather structural oversupply.
The Broader Context: Technology Adoption and Market Scale
Botha's longer-term optimism about technology's impact provides important nuance. He acknowledges that the scale of technology markets has expanded dramatically—from 300 million internet users during his PayPal tenure to four billion people with high-speed mobile devices today. He's explicit that frontier technologies like AI, robotics, genomics and blockchain-based financial infrastructure will create substantial value. His scepticism is not about innovation potential but rather about the mismatch between capital deployed and capturable returns.
This distinction matters for interpreting the "return-free risk" characterisation. Botha is not arguing that venture capital cannot generate exceptional returns for skilled practitioners with disciplined processes and selective deployment. Sequoia's own record—companies it backed whilst private account for roughly 30% of NASDAQ market capitalisation—demonstrates that outlier performance remains achievable. Rather, he's asserting that treating venture as a passive, diversifiable asset class suitable for broad institutional allocation is fundamentally misconceived.
The Economics of Intelligence and Margin Evolution
The AI-specific dimension of Botha's analysis deserves separate consideration. His framework for evaluating AI application companies combines near-term pragmatism with medium-term optimism about cost curves. In the near term, many AI-enabled products exhibit compressed margins due to inference costs, and investors must assess whether unit economics or pricing power justify those margins. Over the medium term, he expects substantial margin improvement driven by algorithmic efficiency gains, open-source model competition, economies of scale and intelligent model selection (deploying frontier models only where value justifies cost).
This view has profound implications for how investors should evaluate AI companies today. Those applying conventional SaaS valuation multiples without adjusting for current margin compression may be overvaluing companies whose competitive position depends on unsustainable subsidisation of compute costs. Conversely, those dismissing AI applications entirely based on current margin profiles may be underestimating the trajectory of cost improvement. Disciplined diligence requires explicit modelling of margin evolution paths, sensitivity to underlying cost curves and realistic assessment of pricing power as intelligence commoditises.
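As a sketch of the kind of margin-evolution modelling described above, the snippet below starts from the 40% gross margin cited earlier and asks how long it takes to reach the 80% pre-AI SaaS benchmark if inference costs fall at a constant annual rate. The 50% annual cost decline is a hypothetical input of ours, not a figure from the interview.

```python
import math

# Sketch of gross-margin evolution as inference costs decline.
# Assumption (ours): prices hold steady while the inference-cost share of
# revenue shrinks by a fixed proportion each year.

def years_to_margin(m_start: float, m_target: float, annual_cost_decline: float) -> float:
    """Years until gross margin reaches m_target, if the cost share of
    revenue (1 - margin) decays by annual_cost_decline per year."""
    cost_start = 1.0 - m_start     # e.g. 0.60 of revenue at a 40% margin
    cost_target = 1.0 - m_target   # e.g. 0.20 of revenue at an 80% margin
    decay = 1.0 - annual_cost_decline
    return math.log(cost_target / cost_start) / math.log(decay)

years = years_to_margin(0.40, 0.80, 0.50)
print(f"~{years:.1f} years to move from 40% to 80% gross margin")
```

Sensitivity is the real point: at a 50% annual cost decline the transition takes under two years, whereas at a 20% annual decline the same move takes roughly five years. The assumed cost curve, not the end-state margin, is what investors are actually underwriting.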
Governance, Conflict and Confidentiality
The operational challenges Botha describes—managing portfolio conflicts, preserving confidentiality and navigating situations where portfolio companies evolve into competitive adjacencies—illuminate the practical tensions that arise when firms operate as deep business partners rather than passive capital providers. His example of Stripe and Square converging into overlapping domains, requiring recusal from certain meetings and investment memos, illustrates that even well-intentioned conflict management involves trade-offs and constraints.
This dimension connects to the broader question of whether venture capital should be structured as a relationship business or as a capital-allocation optimisation problem. Firms pursuing the former model—exemplified by Sequoia's emphasis on board service, operational partnership and long-term stewardship—necessarily accept constraints on portfolio breadth and sector coverage. Firms pursuing the latter can achieve greater diversification and sector coverage but sacrifice depth of partnership and founder alignment. Neither model is categorically superior, but they imply different return profiles and different sources of competitive advantage.
Implications for Limited Partner Strategy
For institutional investors, Botha's analysis suggests a fundamental rethinking of venture allocation strategy. The orthodox approach—establishing a target allocation to venture capital as an asset class, selecting a diversified portfolio of fund managers across vintage years and strategies, and rebalancing mechanically—is predicated on assumptions that Botha's data directly contradict. If venture exhibits power law returns at both the company level and the fund level, and if capital oversupply has pushed the industry beyond the point where diversification reliably captures attractive risk-adjusted returns, then LPs should concentrate capital with demonstrably superior managers rather than pursuing broad diversification.
This implies dramatically different behaviour: willingness to pay premium economics for access to top-decile managers, acceptance of capacity constraints and queue positions, focus on relationship development and value demonstration rather than purely financial negotiation. It also implies scepticism towards emerging managers unless they can articulate genuine edge—proprietary deal flow, differentiated value-add, or domain expertise that translates to selection advantage.
The alternative—acknowledging that venture capital allocation is effectively a form of economic development or innovation subsidy that happens to generate modest risk-adjusted returns—is intellectually honest but conflicts with fiduciary obligations. Endowments, pension funds and sovereign wealth vehicles investing primarily for financial return should perhaps treat venture capital as a satellite allocation justified by lottery-ticket optionality rather than as a core portfolio component meriting multi-billion-dollar allocations.
Roelof Botha's path to this perspective reflects an unusual combination of operating experience, investment track record and institutional leadership. Born in Pretoria in September 1973, he studied actuarial science, economics and statistics at the University of Cape Town before earning an MBA from Stanford's Graduate School of Business, where he graduated as the Henry Ford Scholar—the top student in his class. His actuarial training instilled a probabilistic framework and long-term thinking that pervades his investment philosophy.
After working as a business analyst at McKinsey in Johannesburg from 1996 to 1998, Botha joined PayPal in 2000 as director of corporate development whilst still a Stanford student. He became PayPal's chief financial officer in September 2001 at age 27, navigating the company through its February 2002 initial public offering and subsequent October 2002 acquisition by eBay. PayPal's IPO occurred during a period of profound scepticism about internet businesses—one 2001 article titled "Earth to Palo Alto" essentially ridiculed the company—yet PayPal's financial discipline and clear path to profitability vindicated the decision.
Botha joined Sequoia Capital in January 2003, working closely with Michael Moritz, who had been PayPal's lead investor. He was promoted to partner in 2007 following Google's acquisition of YouTube, an investment he had championed two years earlier. The YouTube founders were friends from his PayPal days, and Botha worked with them in Sequoia's offices iterating on the product during its formative stages. His subsequent investments include Instagram (acquired by Facebook for $1 billion in 2012), Square (public market capitalisation exceeding $40 billion at peak), MongoDB (public since 2017), Unity Technologies (public 2020-2023), Natera and numerous others.
He became head of Sequoia's US venture operations in 2010 alongside Jim Goetz, assumed sole leadership of the US business in 2017 whilst Doug Leone served as global senior steward, and was elevated to senior steward of Sequoia's global operations in July 2022. His tenure has coincided with Sequoia's organisational evolution—including the controversial 2021 introduction of the Sequoia Capital Fund, a permanent capital vehicle designed to hold positions indefinitely rather than liquidating according to traditional fund timelines—and with substantial turbulence in technology markets.
Botha's intellectual formation reflects the intersection of actuarial risk assessment, McKinsey-style structured problem-solving and the crucible of operating in a high-growth technology company during both exuberance and crisis. His repeated emphasis on cost structure, margin dynamics and unit economics reflects operating experience rather than purely financial analysis. The actuarial lens—thinking in terms of probability distributions, long time horizons and avoiding ruin—distinguishes his analytical framework from investors whose backgrounds emphasise pattern recognition or momentum-driven investing.

“If the firm grows and you expand and you can invest in other areas for growth, we’ll wind up with more jobs... we have at every step along the journey for the last forty years as technology has made us more productive. I don’t think it’s different this time [with AI].” - David Solomon - Goldman Sachs CEO
David Michael Solomon, born in 1962 in Hartsdale, New York, is an American investment banker and DJ, currently serving as the CEO and Chairman of Goldman Sachs. His journey into the financial sector began after he graduated with a BA in political science from Hamilton College. Initially, Solomon worked at Irving Trust Company and Drexel Burnham before joining Bear Stearns. In 1999, he moved to Goldman Sachs as a partner and became co-head of the High Yield and Leveraged Loan Business.
Solomon's rise within Goldman Sachs was swift and strategic. He became the co-head of the Investment Banking Division in 2006 and held this role for a decade. In 2017, he was appointed President and Chief Operating Officer, and by October 2018, he succeeded Lloyd Blankfein as CEO. He became Chairman in January 2019.
Beyond his financial career, Solomon is known for his passion for music, producing electronic dance music under the alias "DJ D-Sol". He has performed at various venues, including nightclubs and music festivals in New York, Miami, and The Bahamas.
Context of the Quote
The quote highlights Solomon's perspective on technology and job creation in the financial sector. He suggests that while technology, particularly AI, can enhance productivity and potentially lead to job reductions in certain areas, the overall growth of the firm will create more opportunities for employment. This view is rooted in his experience observing how technological advancements have historically led to increased productivity and growth for Goldman Sachs.
Leading Theorists on AI and Employment
Several leading theorists have explored the impact of AI on employment, with divergent views:
- Joseph Schumpeter is famous for his theory of "creative destruction," which suggests that technological innovations often lead to the destruction of existing jobs but also create new ones. This cycle is seen as essential for economic growth and innovation.
- Klaus Schwab, founder of the World Economic Forum, has discussed the Fourth Industrial Revolution, emphasizing how AI and automation will transform industries. However, he also highlights the potential for new job creation in emerging sectors.
- Economists Erik Brynjolfsson and Andrew McAfee have written extensively on how technology can lead to both job displacement and creation. They argue that while AI may reduce certain types of jobs, it also fosters economic growth and new opportunities.
These theorists provide a backdrop for understanding Solomon's optimistic view on AI's impact on employment, focusing on the potential for growth and innovation to offset job losses.
Conclusion
David Solomon's quote encapsulates his optimism about the interplay between technology and job creation. Focusing on the strategic growth of Goldman Sachs, he believes that technological advancements will enhance productivity and create opportunities for expansion, ultimately leading to more employment opportunities. This perspective aligns with broader discussions among economists and theorists on the transformative role of AI in the workplace.

“Markets run in cycles, and whenever we've historically had a significant acceleration in a new technology that creates a lot of capital formation and therefore lots of interesting new companies around it, you generally see the market run ahead of the potential. Are there going to be winners and losers? There are going to be winners and losers.” - David Solomon - Goldman Sachs CEO
The quote comes from a public discussion with David Solomon, CEO of Goldman Sachs, during Italian Tech Week in October 2025. The statement was made in the context of a wide-ranging interview that addressed the state of the US and global economy, the impact of fiscal stimulus and technology infrastructure spending, and, critically, the current investment climate surrounding artificial intelligence (AI) and other emergent technologies.
Solomon’s comments were prompted by questions around the record-breaking rallies in US and global equity markets and specifically the extraordinary market capitalisations reached by leading tech firms. He highlighted the familiar historical pattern: periods of market exuberance often occur when new technologies spur rapid capital formation, leading to the emergence of numerous new companies around a transformative theme. Solomon drew parallels with the Dot-com boom to underscore the cyclical nature of markets and to remind investors that dramatic phases of growth inevitably produce both outsized winners and significant casualties.
His insight reflects a seasoned banker’s view, grounded in empirical observation: while technological waves can drive periods of remarkable wealth creation and productivity gains, they also tend to attract speculative excesses. Market valuations in these periods often disconnect from underlying fundamentals, setting the stage for later corrections. The resulting market shake-outs separate enduring companies from those that fail to deliver sustainable value.
About David Solomon
David M. Solomon is one of the most prominent figures in global finance, serving as the CEO and Chairman of Goldman Sachs since 2018. Raised in New York and a graduate of Hamilton College, Solomon has built his reputation over four decades in banking—rising through leadership positions at Irving Trust, Drexel Burnham, and Bear Stearns before joining Goldman Sachs in 1999 as a partner. He subsequently became global head of the Financing Group, then co-head of the Investment Banking Division, playing a central role in shaping the firm’s capital markets strategy.
Solomon is known for his advocacy of organisational modernisation and culture change at Goldman Sachs—prioritising employee well-being, increasing agility, and investing heavily in technology. He combines traditional deal-making acumen with an openness to digital transformation. Beyond banking, Solomon has a notable side-career as a DJ under the name DJ D-Sol, performing electronic dance music at high-profile venues.
Solomon’s career reflects both the conservatism and innovative ambition associated with modern Wall Street leadership: an ability to see risk cycles clearly, and a willingness to pivot business models to suit shifts in technological and regulatory environments. His net worth in 2025 is estimated between $85 million and $200 million, owing to decades of compensation, equity, and investment performance.
Theoretical Foundations: Cycles, Disruptive Innovation, and Market Dynamics
Solomon’s perspective draws implicitly on a lineage of economic theory and market analysis concerning cycles of innovation, capital formation, and asset bubbles. Leading theorists and their contributions include:
- Joseph Schumpeter: Schumpeter's theory of creative destruction posited that economic progress is driven by cycles of innovation, where new technologies disrupt existing industries, create new market leaders, and ultimately cause the obsolescence or failure of firms unable to adapt. Schumpeter emphasised how innovation clusters drive periods of rapid growth, investment surges, and, frequently, speculative excess.
- Carlota Perez: In Technological Revolutions and Financial Capital (2002), Perez advanced a model of techno-economic paradigms, proposing that every major technological revolution (e.g., steam, electricity, information technology) proceeds through phases: an initial installation period—characterised by exuberant capital inflows, speculation, and bubble formation—followed by a recessionary correction, and, eventually, a deployment period, where productive uses of the technology diffuse more broadly, generating deep-seated economic gains and societal transformation. Perez’s work helps contextualise Solomon’s caution about markets running ahead of potential.
- Charles Kindleberger and Hyman Minsky: Both scholars examined the dynamics of financial bubbles. Kindleberger, in Manias, Panics, and Crashes, and Minsky, through his Financial Instability Hypothesis, described how debt-fuelled euphoria and positive feedback loops of speculation can drive financial markets to overshoot the intrinsic value created by innovation, inevitably resulting in busts.
- Clayton Christensen: Christensen’s concept of disruptive innovation explains how emergent technologies, initially undervalued by incumbents, can rapidly upend entire industries—creating new winners while displacing former market leaders. His framework helps clarify Solomon’s points about the unpredictability of which companies will ultimately capture value in the current AI wave.
- Benoit Mandelbrot: Applying his fractal and complexity theory to financial markets, Mandelbrot challenged the notion of equilibrium and randomness in price movement, demonstrating that markets are prone to extreme events—outlier outcomes that, while improbable under standard models, are a recurrent feature of cyclical booms and busts.
Practical Relevance in Today’s Environment
The patterns stressed by Solomon, and their theoretical antecedents, are especially resonant given the current environment: massive capital allocations into AI, cloud infrastructure, and adjacent technologies—a context reminiscent of previous eras where transformative innovations led markets both to moments of extraordinary wealth creation and subsequent corrections. These cycles remain a central lens for investors and business leaders navigating this era of technological acceleration.
By referencing both history and the future, Solomon encapsulates the balance between optimism over the potential of new technology and clear-eyed vigilance about the risks endemic to all periods of market exuberance.
