
Global Advisors | Quantified Strategy Consulting

Quote: Naval Ravikant – Venture Capitalist

“UI is pre-AI.” – Naval Ravikant – Venture Capitalist

Naval Ravikant stands as one of Silicon Valley’s most influential yet unconventional thinkers—a figure who bridges the gap between pragmatic entrepreneurship and philosophical inquiry. His observation that “UI is pre-AI” reflects a distinctive perspective on technological evolution that warrants careful examination, particularly given his track record as an early-stage investor in transformative technologies.

The Architect of Modern Startup Infrastructure

Ravikant’s influence on the technology landscape extends far beyond individual company investments. As co-founder, chairman, and former CEO of AngelList, he fundamentally altered how early-stage capital flows through the startup ecosystem. AngelList democratised access to venture funding, creating infrastructure that connected aspiring entrepreneurs with angel investors and venture capital firms on an unprecedented scale. This wasn’t merely a business achievement; it represented a structural shift in how innovation gets financed globally.

His investment portfolio reflects prescient timing and discerning judgement. Ravikant invested early in companies including Twitter, Uber, Foursquare, Postmates, Yammer, and Stack Overflow—investments that collectively generated over 70 exits and more than 10 unicorn companies. This track record positions him not as a lucky investor, but as someone with genuine pattern recognition capability regarding which technologies would matter most.

Beyond the Venture Capital Thesis

What distinguishes Ravikant from conventional venture capitalists is his deliberate rejection of the traditional founder mythology. He explicitly advocates against the “hustle mentality” that dominates startup culture, instead promoting a more holistic conception of wealth that encompasses time, freedom, and peace of mind alongside financial returns. This philosophy shapes how he evaluates opportunities and mentors founders—he considers not merely whether a business will scale, but whether it will scale without scaling stress.

His broader intellectual contributions extend through multiple channels. With more than 2.4 million followers on Twitter (X), Ravikant regularly shares aphoristic insights blending practical wisdom with Eastern philosophical traditions. His appearances on influential podcasts, particularly the Tim Ferriss Show and Joe Rogan Experience, have introduced his thinking to audiences far beyond Silicon Valley. Most notably, his “How to Get Rich (without getting lucky)” thread has become foundational reading across technology and business communities, articulating principles around leverage through code, capital, and content.

Understanding “UI is Pre-AI”

The quote “UI is pre-AI” requires interpretation within Ravikant’s broader intellectual framework and the contemporary technological landscape. The statement operates on multiple levels simultaneously.

The Literal Interpretation: User interface design and development necessarily precedes artificial intelligence implementation in most technology products. This reflects a practical observation about product development sequencing—one must typically establish how users interact with systems before embedding intelligent automation into those interactions. In this sense, the UI is the foundation upon which AI capabilities are subsequently layered.

The Philosophical Dimension: More provocatively, the statement suggests that how we structure human-computer interaction through interface design fundamentally shapes the possibilities for what artificial intelligence can accomplish. The interface isn’t merely a presentation layer; it represents the primary contact point between human intent and computational capability. Before AI can be genuinely useful, the interface must make that utility legible and accessible to end users.

The Investment Perspective: For Ravikant specifically, this observation carries investment implications. It suggests that companies solving user experience problems will likely remain valuable even as AI capabilities evolve, whereas companies that focus purely on algorithmic sophistication without considering user interaction may find their innovations trapped in laboratory conditions rather than deployed in markets.

The Theoretical Lineage

Ravikant’s observation sits within a longer intellectual tradition examining the relationship between interface, interaction, and technological capability.

Don Norman and Human-Centered Design: The foundational modern work on this subject emerged from Don Norman’s research at the University of California, San Diego, particularly his seminal book The Design of Everyday Things. Norman argued that excellent product design requires intimate understanding of human cognition, perception, and behaviour. Before any technological system—intelligent or otherwise—can create value, it must accommodate human limitations and leverage human strengths through thoughtful interface design.

Douglas Engelbart and Augmentation Philosophy: Douglas Engelbart’s mid-twentieth-century work on human-computer augmentation established that technology’s primary purpose should be extending human capability rather than replacing human judgment. His thinking implied that interfaces represent the crucial bridge between human cognition and computational power. Without well-designed interfaces, the most powerful computational systems remain inert.

Alan Kay and Dynabook Vision: Alan Kay’s vision of personal computing—articulated through concepts like the Dynabook—emphasised that technology’s democratising potential depends entirely on interface accessibility. Kay recognised that computational power matters far less than whether ordinary people can productively engage with that power through intuitive interaction models.

Contemporary HCI Research: Modern human-computer interaction research builds on these foundations, examining how interface design shapes which problems users attempt to solve and how they conceptualise solutions. Researchers like Shneiderman and Plaisant have demonstrated empirically that interface design decisions have second-order effects on what users believe is possible with technology.

The Contemporary Context

Ravikant’s statement carries particular resonance in the current artificial intelligence moment. As organisations rush to integrate large language models and other AI systems into products, many commit what might be called “technology-first” errors—embedding sophisticated algorithms into user experiences that haven’t been thoughtfully designed to accommodate them.

Meaningful user interface design for AI-powered systems requires addressing distinct challenges: How do users understand what an AI system can and cannot do? How is uncertainty communicated? How are edge cases handled? What happens when the AI makes errors? These questions cannot be answered through better algorithms alone; they require interface innovation.
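
One of these questions, how uncertainty is communicated, can be made concrete. The sketch below is a minimal illustration under assumptions of its own: the `ModelOutput` structure, the 0.7 threshold, and the wording are invented here, not drawn from Ravikant or any particular product. It simply shows an interface decision, hedging low-confidence answers and inviting verification, that better algorithms alone cannot make.

```python
# Minimal sketch: surfacing model uncertainty in the interface rather than hiding
# it behind a single confident answer. The data structure, threshold, and wording
# are illustrative assumptions, not a prescribed standard.

from dataclasses import dataclass

@dataclass
class ModelOutput:
    answer: str
    confidence: float  # assumed to be a calibrated probability in [0, 1]

def render_response(output: ModelOutput, threshold: float = 0.7) -> str:
    """Format an AI answer for display, making its uncertainty legible to the user."""
    if output.confidence >= threshold:
        return f"{output.answer} (confidence: {output.confidence:.0%})"
    # Below the threshold, hedge explicitly and invite verification
    return (
        f"I'm not certain, but my best guess is: {output.answer} "
        f"(confidence: {output.confidence:.0%}). Please verify before relying on this."
    )

print(render_response(ModelOutput("The contract renews on 1 March.", 0.92)))
print(render_response(ModelOutput("The penalty clause applies here.", 0.41)))
```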

Ravikant’s observation thus functions as a corrective to the current technological moment. It suggests that the companies genuinely transforming industries through artificial intelligence will likely be those that simultaneously innovate in both algorithmic capability and user interface design. The interface becomes pre-AI not merely chronologically but causally—shaping what artificial intelligence can accomplish in practice rather than merely in principle.

Investment Philosophy Integration

This observation aligns with Ravikant’s broader investment thesis emphasising leverage and scalability. An excellent user interface represents exactly this kind of leverage—it scales human attention and human decision-making without requiring proportional increases in effort or resources. Similarly, artificial intelligence scaled through well-designed interfaces amplifies this effect, allowing individual users or organisations to accomplish work that previously required teams.

Ravikant’s focus on investments at seed and Series A stages across media, content, cloud infrastructure, and AI reflects implicit confidence that the foundational layer of how humans interact with technology remains unsettled terrain. Rather than assuming interface design has been solved, he appears to recognise that each new technological capability—whether cloud infrastructure or artificial intelligence—creates new design challenges and opportunities.

The quote ultimately encapsulates a distinctive investment perspective: that attention to human interaction, to aesthetics, to usability, represents not secondary ornamentation but primary technological strategy. In an era of intense focus on algorithmic sophistication, Ravikant reminds us that the interface through which those algorithms engage with human needs and human judgment represents the true frontier of technological value creation.

Quote: Ilya Sutskever – Safe Superintelligence

“The robustness of people is really staggering.” – Ilya Sutskever – Safe Superintelligence

This statement, made in his November 2025 conversation with Dwarkesh Patel, comes from someone uniquely positioned to make such judgments: co-founder and Chief Scientist of Safe Superintelligence Inc., former Chief Scientist at OpenAI, and co-author of AlexNet—the 2012 paper that launched the modern deep learning era.

Sutskever’s claim about robustness points to something deeper than mere durability or fault tolerance. He is identifying a distinctive quality of human learning: the ability to function effectively across radically diverse contexts, to self-correct without explicit external signals, to maintain coherent purpose and judgment despite incomplete information and environmental volatility, and to do all this with sparse data and limited feedback. These capacities are not incidental features of human intelligence. They are central to what makes human learning fundamentally different from—and vastly superior to—current AI systems.

Understanding what Sutskever means by robustness requires examining not just human capabilities but the specific ways in which AI systems are fragile by comparison. It requires recognising what humans possess that machines do not. And it requires understanding why this gap matters profoundly for the future of artificial intelligence.

What Robustness Actually Means: Beyond Mere Reliability

In engineering and systems design, robustness typically refers to a system’s ability to continue functioning when exposed to perturbations, noise, or unexpected conditions. A robust bridge continues standing despite wind, earthquakes, or traffic loads beyond its design specifications. A robust algorithm produces correct outputs despite noisy inputs or computational errors.

But human robustness operates on an entirely different plane. It encompasses far more than mere persistence through adversity. Human robustness includes:

  1. Flexible adaptation across domains: A teenager learns to drive after ten hours of practice and then applies principles of vehicle control, spatial reasoning, and risk assessment to entirely new contexts—motorcycles, trucks, parking in unfamiliar cities. The principles transfer because they have been learned at a level of abstraction and generality that allows principled application to novel situations.
  2. Self-correction without external reward: A learner recognises when they have made an error not through explicit feedback but through an internal sense of rightness or wrongness—what Sutskever terms a “value function” and what we experience as intuition, confidence, or unease. A pianist knows immediately when they have struck a wrong note; they do not need external evaluation. This internal evaluative system enables rapid, efficient learning.
  3. Judgment under uncertainty: Humans routinely make decisions with incomplete information, tolerating ambiguity whilst maintaining coherent action. A teenager drives defensively not because they can compute precise risk probabilities but because they possess an internalised model of danger, derived from limited experience but somehow applicable to novel situations.
  4. Stability across time scales: Human goals, values, and learning integrate across vastly different temporal horizons. A person may pursue long-term education goals whilst adapting to immediate challenges, and these different time scales cohere into a unified, purposeful trajectory. This temporal integration is largely absent from current AI systems, which optimise for immediate reward signals or fixed objectives.
  5. Learning from sparse feedback: Humans learn from remarkably little data. A child sees a dog once or twice and thereafter recognises dogs in novel contexts, even in stylised drawings or unfamiliar breeds. This learning from sparse examples contrasts sharply with AI systems requiring thousands or millions of examples to achieve equivalent recognition.

This multifaceted robustness is what Sutskever identifies as “staggering”—not because it is strong but because it operates across so many dimensions simultaneously whilst remaining stable, efficient, and purposeful.

The Fragility of Current AI: Why Models Break

The contrast becomes clear when examining where current AI systems are fragile. Sutskever frequently illustrates this through the “jagged behaviour” problem: models that perform at a superhuman level on benchmarks yet fail in elementary ways during real-world deployment.

A language model can score in the 88th percentile on the bar examination yet, when asked to debug code, introduce new errors whilst fixing previous ones. It cycles between mistakes even when provided clear feedback. It lacks the internal evaluative sense that tells a human programmer, “This approach is leading nowhere; I should try something different.” The model lacks robust value functions—internal signals that guide learning and action.

This fragility manifests across multiple dimensions:

  1. Distribution shift fragility: Models trained on one distribution of data often fail dramatically when confronted with data that differs from training distribution, even slightly. A vision system trained on images with certain lighting conditions fails on images with different lighting. A language model trained primarily on Western internet text struggles with cultural contexts it has not heavily encountered. Humans, by contrast, maintain competence across remarkable variation—different languages, accents, cultural contexts, lighting conditions, perspectives.
  2. Benchmark overfitting: Contemporary AI systems achieve extraordinary performance on carefully constructed evaluation tasks yet fail at the underlying capability the benchmark purports to measure. This occurs because models have been optimised (through reinforcement learning) specifically to perform well on benchmarks rather than to develop robust understanding. Sutskever has noted that this reward hacking is often unintentional—companies genuinely seeking to improve models inadvertently create RL environments that optimise for benchmark performance rather than genuine capability.
  3. Lack of principled abstraction: Models often memorise patterns rather than developing principled understanding. This manifests as inability to apply learned knowledge to genuinely novel contexts. A model may solve thousands of addition problems yet fail on a slightly different formulation it has not encountered. A human, having understood addition as a principle, applies it to any context where addition is relevant.
  4. Absence of internal feedback mechanisms: Current reinforcement learning typically provides feedback only at the end of long trajectories. A model can pursue 1,000 steps of reasoning down an unpromising path, only to receive a training signal after the trajectory completes. Humans, by contrast, possess continuous internal feedback—emotions, intuition, confidence levels—that signal whether reasoning is productive or should be redirected. This enables far more efficient learning.

The Value Function Hypothesis: Emotions as Robust Learning Machinery

Sutskever’s analysis points toward a crucial hypothesis: human robustness depends fundamentally on value functions—internal mechanisms that provide continuous, robust evaluation of states and actions.

In machine learning, a value function is a learned estimate of expected future reward or utility from a given state. In human neurobiology, value functions are implemented, Sutskever argues, through emotions and affective states. Fear signals danger. Confidence signals competence. Boredom signals that current activity is unproductive. Satisfaction signals that effort has succeeded. These emotional states, which evolution has refined over millions of years, serve as robust evaluative signals that guide learning and behaviour.
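
In reinforcement-learning terms, the simplest concrete form of a value function is a table of state values updated by temporal-difference learning, which produces an evaluative signal at every step rather than only at the end of an episode. The sketch below is a textbook TD(0) update with invented states and parameters; it is a minimal illustration of the concept, not a description of how Sutskever proposes such mechanisms be built.

```python
# Minimal sketch: a tabular value function learned with TD(0).
# Each step yields an immediate evaluative signal (the TD error),
# loosely analogous to the continuous internal feedback described above.

from collections import defaultdict

def td0_update(V, state, reward, next_state, alpha=0.1, gamma=0.99):
    """One temporal-difference update: V(s) <- V(s) + alpha * (r + gamma*V(s') - V(s))."""
    td_error = reward + gamma * V[next_state] - V[state]
    V[state] += alpha * td_error
    return td_error  # the per-step evaluative signal

V = defaultdict(float)  # value estimates, initially zero
# Invented trajectory of (state, reward, next_state) transitions
trajectory = [("start", 0.0, "middle"), ("middle", 0.0, "goal"), ("goal", 1.0, "done")]

for state, reward, next_state in trajectory:
    error = td0_update(V, state, reward, next_state)
    print(f"{state:>6}: TD error = {error:+.3f}")
```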

Sutskever illustrates this with a striking neurological case: a person who suffered brain damage affecting emotional processing. Despite retaining normal IQ, puzzle-solving ability, and articulate cognition, this person became radically incapable of making even trivial decisions. Choosing which socks to wear would take hours. Financial decisions became catastrophically poor. This person could think but could not effectively decide or act—suggesting that emotions (and the value functions they implement) are not peripheral to human cognition but absolutely central to effective agency.

What makes human value functions particularly robust is their simplicity and stability. They are not learned during a person’s lifetime through explicit training. They are evolved, hard-coded by billions of years of biological evolution into neural structures that remain remarkably consistent across human populations and contexts. A person experiences hunger, fear, social connection, and achievement similarly whether in ancient hunter-gatherer societies or modern industrial ones—because these value functions were shaped by evolutionary pressures that remained relatively stable.

This evolutionary hardcoding of value functions may be crucial to human learning robustness. Imagine trying to teach a child through explicit reward signals alone: “Do this task and receive points; optimise for points.” This would be inefficient and brittle. Instead, humans learn through value functions that are deeply embedded, emotionally weighted, and robust across contexts. A child learns to speak not through external reward optimisation but through intrinsic motivation—social connection, curiosity, the inherent satisfaction of communication. These motivations persist across contexts and enable robust learning.

Current AI systems largely lack this. They optimise for explicitly defined reward signals or benchmark metrics. These are fragile by comparison—vulnerable to reward hacking, overfitting, distribution shift, and the brittle transfer failures Sutskever observes.

Why This Matters Now: The Transition Point

Sutskever’s observation about human robustness arrives at a precise historical moment. As of November 2025, the AI industry is transitioning from what he terms the “age of scaling” (2020–2025) to what will be the “age of research” (2026 onward). This transition is driven by recognition that scaling alone is reaching diminishing returns. The next advances will require fundamental breakthroughs in understanding how to build systems that learn and adapt robustly—like humans do.

This creates an urgent research agenda: How do you build AI systems that possess human-like robustness? This is not a question that scales with compute or data. It is a research question—requiring new architectures, learning algorithms, training procedures, and conceptual frameworks.

Sutskever’s identification of robustness as the key distinguishing feature of human learning sets the research direction for the next phase of AI development. The question is not “how do we make bigger models” but “how do we build systems with value functions that enable efficient, self-correcting, context-robust learning?”

The Research Frontier: Leading Theorists Addressing Robustness

Antonio Damasio: The Somatic Marker Hypothesis

Antonio Damasio, neuroscientist at USC and authority on emotion and decision-making, has developed the somatic marker hypothesis—a framework explaining how emotions serve as rapid evaluative signals that guide decisions and learning. Damasio’s work provides neuroscientific grounding for Sutskever’s hypothesis that value functions (implemented as emotions) are central to effective agency. Damasio’s case studies of patients with emotional processing deficits closely parallel Sutskever’s neurological example—demonstrating that emotional value functions are prerequisites for robust, adaptive decision-making.

Judea Pearl: Causal Models and Robust Reasoning

Judea Pearl, pioneer in causal inference and probabilistic reasoning, has argued that correlation-based learning has fundamental limits and that robust generalisation requires learning causal structure—the underlying relationships between variables that remain stable across contexts. Pearl’s work suggests that human robustness derives partly from learning causal models rather than mere patterns. When a human understands how something works (causally), that understanding transfers to novel contexts. Current AI systems, lacking robust causal models, fail at transfer—a key component of robustness.

Karl Friston: The Free Energy Principle

Karl Friston, neuroscientist at University College London, has developed the free energy principle—a unified framework explaining how biological systems, including humans, maintain robustness by minimising prediction error and maintaining models of their environment and themselves. The principle suggests that what makes humans robust is not fixed programming but a general learning mechanism that continuously refines internal models to reduce surprise. This framework has profound implications for building robust AI: rather than optimising for external rewards, systems should optimise for maintaining accurate models of reality, enabling principled generalisation.

Stuart Russell: Learning Under Uncertainty and Value Alignment

Stuart Russell, UC Berkeley’s leading AI safety researcher, has emphasised that robust AI systems must remain genuinely uncertain about objectives and learn from interaction rather than operating under fixed goal specifications. Russell’s work suggests that rigidity about objectives makes systems fragile—vulnerable to reward hacking and context-specific failure. Robustness requires systems that maintain epistemic humility and adapt their understanding of what matters based on continued learning. This directly parallels how human value systems are robust: they are not brittle doctrines but evolving frameworks that integrate experience.

Demis Hassabis and DeepMind’s Continual Learning Research

Demis Hassabis, CEO of DeepMind, has invested substantial effort into systems that learn continuously from environmental interaction rather than through discrete offline training phases. DeepMind’s research on continual reinforcement learning, meta-learning, and adaptive systems reflects the insight that robustness emerges not from static pre-training but from ongoing interaction with environments—enabling systems to refine their models and value functions continuously. This parallels human learning, which is fundamentally continual rather than episodic.

Yann LeCun: Self-Supervised Learning and World Models

Yann LeCun, Meta’s Chief AI Scientist, has advocated for learning approaches that enable systems to build internal models of how the world works—what he terms world models—through self-supervised learning. LeCun argues that robust generalisation requires systems that understand causal structure and dynamics, not merely correlations. His work on self-supervised learning suggests that systems trained to predict and model their environments develop more robust representations than systems optimised for specific external tasks.

The Evolutionary Basis: Why Humans Have Robust Value Functions

Understanding human robustness requires appreciating why evolution equipped humans with sophisticated, stable value function systems.

For millions of years, humans and our ancestors faced fundamentally uncertain environments. The reward signals available—immediate sensory feedback, social acceptance, achievement, safety—needed to guide learning and behaviour across vast diversity of contexts. Evolution could not hard-code specific solutions for every possible situation. Instead, it encoded general-purpose value functions—emotions and motivational states—that would guide adaptive behaviour across contexts.

Consider fear. Fear is a robust value function signal that something is dangerous. This signal evolved in environments full of predators and hazards. Yet the same fear response that protected ancestral humans from predators also keeps modern humans safe from traffic, heights, and social rejection. The value function is robust because it operates on a general principle—danger—rather than specific memorised hazards.

Similarly, social connection, curiosity, achievement, and other human motivations evolved as general-purpose signals that, across millions of years, correlated with survival and reproduction. They remain remarkably stable across radically different modern contexts—different cultures, technologies, and social structures—because they operate at a level of abstraction robust to context change.

Current AI systems, by contrast, lack this evolutionary heritage. They are trained from scratch, often on specific tasks, with reward signals explicitly engineered for those tasks. These reward signals are fragile by comparison—vulnerable to distribution shift, overfitting, and context-specificity.

Implications for Safe AI Development

Sutskever’s emphasis on human robustness carries profound implications for safe AI development. Robust systems are safer systems. A system with genuine value functions—robust internal signals about what matters—is less vulnerable to reward hacking, specification gaming, or deployment failures. A system that learns continuously and maintains epistemic humility is more likely to remain aligned as its capabilities increase.

Conversely, current AI systems’ lack of robustness is dangerous. Systems optimised for narrow metrics can fail catastrophically when deployed in novel contexts. Systems lacking robust value functions cannot self-correct or maintain appropriate caution. Systems that cannot learn from deployment feedback remain brittle.

Building AI systems with human-like robustness is therefore not merely an efficiency question—though efficiency matters greatly. It is fundamentally a safety question. The development of robust value functions, continual learning capabilities, and general-purpose evaluative mechanisms is central to ensuring that advanced AI systems remain beneficial as they become more powerful.

The Research Direction: From Scaling to Robustness

Sutskever’s observation that “the robustness of people is really staggering” reorients the entire research agenda. The question is no longer primarily “how do we scale?” but “how do we build systems with robust value functions, efficient learning, and genuine adaptability across contexts?”

This requires:

  • Architectural innovation: New neural network structures that embed or can learn robust evaluative mechanisms—value functions analogous to human emotions.
  • Training methodology: Learning procedures that enable systems to develop genuine self-correction capabilities, learn from sparse feedback, and maintain robustness across distribution shift.
  • Theoretical understanding: Deeper mathematical and conceptual frameworks explaining what makes value functions robust and how to implement them in artificial systems.
  • Integration of findings from neuroscience, evolutionary biology, and decision theory: Drawing on multiple fields to understand the principles underlying human robustness and translating them into machine learning.

Conclusion: Robustness as the Frontier

When Sutskever identifies human robustness as “staggering,” he is not offering admiration but diagnosis. He is pointing out that current AI systems fundamentally lack what makes humans effective learners: robust value functions, efficient learning from sparse feedback, genuine self-correction, and adaptive generalisation across contexts.

The next era of AI research—the age of research beginning in 2026—will be defined largely by attempts to solve this problem. The organisation or research group that successfully builds AI systems with human-like robustness will not merely have achieved technical progress. They will have moved substantially closer to systems that learn efficiently, generalise reliably, and remain aligned to human values even as they become more capable.

Human robustness is not incidental. It is fundamental—the quality that makes human learning efficient, adaptive, and safe. Replicating it in artificial systems represents the frontier of AI research and development.

Quote: Ilya Sutskever – Safe Superintelligence

“These models somehow just generalize dramatically worse than people. It’s super obvious. That seems like a very fundamental thing.” – Ilya Sutskever – Safe Superintelligence

Sutskever, as co-founder and Chief Scientist of Safe Superintelligence Inc. (SSI), has emerged as one of the most influential voices in AI strategy and research direction. His trajectory illustrates the depth of his authority: co-author of AlexNet (2012), the paper that ignited the deep learning revolution; Chief Scientist at OpenAI during the development of GPT-2 and GPT-3; and now directing a research organisation that has raised $3 billion and is explicitly committed to solving the generalisation problem rather than pursuing incremental scaling.

His assertion about generalisation deficiency is not rhetorical flourish. It represents a fundamental diagnostic claim about why current AI systems, despite superhuman performance on benchmarks, remain brittle, unreliable, and poorly suited to robust real-world deployment. Understanding this claim requires examining what generalisation actually means, why it matters, and what the gap between human and AI learning reveals about the future of artificial intelligence.

What Generalisation Means: Beyond Benchmark Performance

Generalisation, in machine learning, refers to the ability of a system to apply knowledge learned in one context to novel, unfamiliar contexts it has not explicitly encountered during training. A model that generalises well can transfer principles, patterns, and capabilities across domains. A model that generalises poorly becomes a brittle specialist—effective within narrow training distributions but fragile when confronted with variation, novelty, or real-world complexity.
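
A minimal way to make “generalises poorly” measurable is to compare a model’s accuracy on held-out data drawn from its training distribution with its accuracy on a shifted distribution. The sketch below uses synthetic Gaussian data and a scikit-learn logistic regression purely as stand-ins; the data, the shift size, and the model are assumptions chosen to illustrate the evaluation idea, not any benchmark discussed here.

```python
# Minimal sketch: quantifying a generalisation gap by evaluating one model on
# (a) held-out data from the training distribution and (b) a shifted distribution.
# Data and model are synthetic stand-ins, purely for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_data(n, shift=0.0):
    """Two Gaussian classes; `shift` moves the whole distribution away from training."""
    class_0 = rng.normal(loc=-1.0 + shift, scale=1.0, size=(n, 2))
    class_1 = rng.normal(loc=+1.0 + shift, scale=1.0, size=(n, 2))
    X = np.vstack([class_0, class_1])
    y = np.array([0] * n + [1] * n)
    return X, y

X_train, y_train = make_data(500)              # training distribution
X_iid, y_iid = make_data(500)                  # same distribution, held out
X_shift, y_shift = make_data(500, shift=1.5)   # shifted distribution

model = LogisticRegression().fit(X_train, y_train)
iid_acc = accuracy_score(y_iid, model.predict(X_iid))
shift_acc = accuracy_score(y_shift, model.predict(X_shift))
print(f"in-distribution accuracy: {iid_acc:.2f}, shifted accuracy: {shift_acc:.2f}")
print(f"generalisation gap under shift: {iid_acc - shift_acc:.2f}")
```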

The crisis Sutskever identifies is this: contemporary large language models and frontier AI systems achieve extraordinary performance on carefully curated evaluation tasks and benchmarks. GPT-4 scores in the 88th percentile of the bar exam. OpenAI’s o1 solves competition mathematics problems at elite levels. Yet these same systems, when deployed into unconstrained real-world workflows, exhibit what Sutskever terms “jagged” behaviour—they repeat errors, introduce new bugs whilst fixing previous ones, cycle between mistakes even with clear corrective feedback, and fail in ways that suggest fundamentally incomplete understanding rather than mere data scarcity.

This paradox reveals a hidden truth: benchmark performance and deployment robustness are not tightly coupled. An AI system can memorise, pattern-match, and perform well on evaluation metrics whilst failing to develop the kind of flexible, transferable understanding that enables genuine competence.

The Sample Efficiency Question: Orders of Magnitude of Difference

Underlying the generalisation crisis is a more specific puzzle: sample efficiency. Why do AI systems require vastly more training data to achieve competence in a domain than humans need?

A human child learns to recognise objects through a few thousand exposures. Contemporary vision models require millions. A teenager learns to drive in approximately ten hours of practice; AI systems struggle to achieve equivalent robustness with orders of magnitude more training. A university student learns to code, write mathematically, and reason about abstract concepts—domains that did not exist during human evolutionary history—with remarkably few examples and little explicit feedback.

This disparity points to something fundamental: humans possess not merely better priors or more specialised knowledge, but better general-purpose learning machinery. The principle underlying human learning efficiency remains largely unexpressed in mathematical or computational terms. Current AI systems lack it.

Sutskever’s diagnostic claim is that this gap reflects not engineering immaturity or the need for more compute, but the absence of a conceptual breakthrough—a missing principle of how to build systems that learn as efficiently as humans do. The implication is stark: you cannot scale your way out of this problem. More data and more compute, applied to existing methodologies, will not solve it. The bottleneck is epistemic, not computational.

Why Current Models Fail at Generalisation: The Competitive Programming Analogy

Sutskever illustrates the generalisation problem through an instructive analogy. Imagine two competitive programmers:

Student A dedicates 10,000 hours to competitive programming. They memorise every algorithm, every proof technique, every problem pattern. They become exceptionally skilled within competitive programming itself—one of the very best.

Student B spends only 100 hours on competitive programming but develops deeper, more flexible understanding. They grasp underlying principles rather than memorising solutions.

When both pursue careers in software engineering, Student B typically outperforms Student A. Why? Because Student A has optimised for a narrow domain and lacks the flexible transfer of understanding that Student B developed through lighter but more principled engagement.

Current frontier AI models, in Sutskever’s assessment, resemble Student A. They are trained on enormous quantities of narrowly curated data—competitive programming problems, benchmark evaluation tasks, reinforcement learning environments explicitly designed to optimise for measurable performance. They have been “over-trained” on carefully optimised domains but lack the flexible, generalised understanding that enables robust performance in novel contexts.

This over-optimisation problem is compounded by a subtle but crucial factor: reinforcement learning optimisation targets. Companies designing RL training environments face substantial degrees of freedom in how to construct reward signals. Sutskever observes that there is often a systematic bias: RL environments are subtly shaped to ensure models perform well on public benchmarks at release time, creating a form of unintentional reward hacking where the system becomes highly tuned to evaluation metrics rather than genuinely robust to real-world variation.

The Deeper Problem: Pre-Training’s Limits and RL’s Inefficiency

The generalisation crisis reflects deeper structural issues within contemporary AI training paradigms.

Pre-training’s opacity: Large-scale language model pre-training on internet text provides models with an enormous foundation of patterns. Yet the way models rely on this pre-training data is poorly understood. When a model fails, it is unclear whether the failure reflects insufficient statistical support in the training distribution or whether something more fundamental is missing. Pre-training provides scale but at the cost of reasoning about what has actually been learned.

RL’s inefficiency: Current reinforcement learning approaches provide training signals only at the end of long trajectories. If a model spends thousands of steps reasoning about a problem and arrives at a dead end, it receives no signal until the trajectory completes. This is computationally wasteful. A more efficient learning system would provide intermediate evaluative feedback—signals that say, “this direction of reasoning is unpromising; abandon it now rather than after 1,000 more steps.” Sutskever hypothesises that this intermediate feedback mechanism—what he terms a “value function” and what evolutionary biology has encoded as emotions—is crucial to sample-efficient learning.
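
The difference between end-of-trajectory feedback and an intermediate evaluative signal can be sketched schematically. In the toy example below a “reasoning episode” is a random walk toward a target; with terminal-only feedback the learner discovers how it did only when the episode ends, whereas a crude value estimate (a hand-written heuristic standing in for the internal signal described above) lets it abandon an unpromising path early. The task, threshold, and numbers are all invented for illustration.

```python
# Schematic sketch, not any lab's method: where does the learning signal arrive?
# Terminal-only feedback evaluates a trajectory once it finishes; a per-step value
# estimate lets an unpromising path be abandoned long before the step limit.

import random
random.seed(0)

TARGET, MAX_STEPS, ABANDON_BELOW = 10, 1000, -6

def steps_until_feedback(use_intermediate_signal: bool) -> int:
    """Return how many steps are taken before any learning signal is available."""
    position = 0
    for step in range(1, MAX_STEPS + 1):
        position += random.choice([-1, +1])            # one "reasoning step"
        if position == TARGET:
            return step                                 # success signal at the end
        if use_intermediate_signal and position <= ABANDON_BELOW:
            return step                                 # value estimate says: stop now
    return MAX_STEPS                                    # signal only after a long dead end

for label, flag in [("terminal-only feedback", False), ("intermediate value signal", True)]:
    episodes = [steps_until_feedback(flag) for _ in range(500)]
    print(f"{label:>26}: mean steps before feedback = {sum(episodes) / len(episodes):.0f}")
```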

The gap between how humans and current AI systems learn suggests that human learning operates on fundamentally different principles: continuous, intermediate evaluation; robust internal models of progress and performance; the ability to self-correct and redirect effort based on internal signals rather than external reward.

Generalisation as Proof of Concept: What Human Learning Reveals

A critical move in Sutskever’s argument is this: the fact that humans generalise vastly better than current AI systems is not merely an interesting curiosity—it is proof that better generalisation is achievable. The existence of human learners demonstrates, in principle, that a learning system can operate with orders of magnitude less data whilst maintaining superior robustness and transfer capability.

This reframes the research challenge. The question is no longer whether better generalisation is possible (humans prove it is) but rather what principle or mechanism underlies it. This principle could arise from:

  • Architectural innovations: new ways of structuring neural networks that embody better inductive biases for generalisation
  • Learning algorithms: different training procedures that more efficiently extract principles from limited data
  • Value function mechanisms: intermediate feedback systems that enable more efficient learning trajectories
  • Continual learning frameworks: systems that learn continuously from interaction rather than through discrete offline training phases

What matters is that Sutskever’s claim shifts the research agenda from “get more compute” to “discover the missing principle.”

The Strategic Implications: Why This Matters Now

Sutskever’s diagnosis, articulated in November 2025, arrives at a crucial moment. The AI industry has operated under the “age of scaling” paradigm since approximately 2020. During this period, the scaling laws discovered by OpenAI and others suggested a remarkably reliable relationship: larger models trained on more data with more compute reliably produced better performance.

This created a powerful strategic imperative: invest capital in compute, acquire data, build larger systems. The approach was low-risk from a research perspective because the outcome was relatively predictable. Companies could deploy enormous resources confident they would yield measurable returns.

By 2025, however, this model shows clear strain. Data is approaching finite limits. Computational resources, whilst vast, are not unlimited, and marginal returns diminish. Most importantly, the question has shifted: would 100 times more compute actually produce a qualitative transformation or merely incremental improvement? Sutskever’s answer is clear: the latter. This fundamentally reorients strategic thinking. If 100x scaling yields only incremental gains, the bottleneck is not compute but ideas. The competitive advantage belongs not to whoever can purchase the most GPUs but to whoever discovers the missing principle of generalisation.

Leading Theorists and Related Research Programs

Yann LeCun: World Models and Causal Learning

Yann LeCun, Meta’s Chief AI Scientist and a pioneer of deep learning, has long emphasised that current supervised learning approaches are fundamentally limited. His work on “world models”—internal representations that capture causal structure rather than mere correlation—points toward learning mechanisms that could enable better generalisation. LeCun’s argument is that humans learn causal models of how the world works, enabling robust generalisation because causal understanding is stable across contexts in a way that statistical correlation is not.

Geoffrey Hinton: Neuroscience-Inspired Learning

Geoffrey Hinton, recipient of the 2024 Nobel Prize in Physics for foundational deep learning work, has increasingly emphasised that neuroscience holds crucial clues for improving AI learning efficiency. His recent work on biological plausibility and learning mechanisms reflects a conviction that important principles of how neural systems efficiently extract generalised understanding remain undiscovered. Hinton has expressed support for Sutskever’s research agenda, recognising that the next frontier requires fundamental conceptual breakthroughs rather than incremental scaling.

Stuart Russell: Learning Under Uncertainty

Stuart Russell, UC Berkeley’s leading AI safety researcher, has articulated that robust AI alignment requires systems that remain genuinely uncertain about objectives and learn from interaction. This aligns with Sutskever’s emphasis on continual learning. Russell’s work highlights that systems designed to optimise fixed objectives without capacity for ongoing learning and adjustment tend to produce brittle, misaligned outcomes—a dynamic that improves when systems maintain epistemic humility and learn continuously.

Demis Hassabis and DeepMind’s Continual Learning Research

Demis Hassabis, CEO of DeepMind, has invested substantial research effort into systems that learn continually from environmental interaction rather than through discrete offline training phases. DeepMind’s work on continual reinforcement learning, meta-learning, and systems that adapt to new tasks reflects recognition that learning efficiency depends on how feedback is structured and integrated over time—not merely on total data quantity.

Judea Pearl: Causality and Abstraction

Judea Pearl, pioneering researcher in causal inference and probabilistic reasoning, has long argued that correlation-based learning has fundamental limits and that causal reasoning is necessary for genuine understanding and generalisation. His work on causal models and graphical representation of dependencies provides theoretical foundations for why systems that learn causal structure (rather than mere patterns) achieve better generalisation across domains.

The Research Agenda Going Forward

Sutskever’s claim that generalisation is the “very fundamental thing” reorients the entire research agenda. This shift has profound implications:

From scaling to methodology: Research emphasis moves from “how do we get more compute” to “what training procedures, architectural innovations, or learning algorithms enable human-like generalisation?”

From benchmarks to robustness: Evaluation shifts from benchmark performance to deployment reliability—how systems perform on novel, unconstrained tasks rather than carefully curated evaluations.

From monolithic pre-training to continual learning: The training paradigm shifts from discrete offline phases (pre-train, then RL, then deploy) toward systems that learn continuously from real-world interaction.

From scale as differentiator to ideas as differentiator: Competitive advantage in AI development becomes less about resource concentration and more about research insight—the organisation that discovers better generalisation principles gains asymmetric advantage.

The Deeper Question: What Humans Know That AI Doesn’t

Beneath Sutskever’s diagnostic claim lies a profound question: What do humans actually know about learning that AI systems don’t yet embody?

Humans learn efficiently because they:

  • Develop internal models of their own performance and progress (value functions)
  • Self-correct through continuous feedback rather than awaiting end-of-trajectory rewards
  • Transfer principles flexibly across domains rather than memorising domain-specific patterns
  • Learn from remarkably few examples through principled understanding rather than statistical averaging
  • Integrate feedback across time scales and contexts in ways that build robust, generalised knowledge

These capabilities do not require superhuman intelligence or extraordinary cognitive resources. A fifteen-year-old possesses them. Yet current AI systems, despite vastly larger parameter counts and more data, lack equivalent ability.

This gap is not accidental. It reflects that current AI development has optimised for the wrong targets—benchmark performance rather than genuine generalisation, scale rather than efficiency, memorisation rather than principled understanding. The next breakthrough requires not more of the same but fundamentally different approaches.

Conclusion: The Shift from Scaling to Discovery

Sutskever’s assertion that “these models somehow just generalize dramatically worse than people” is, at first glance, an observation of inadequacy. But reframed, it is actually a statement of profound optimism about what remains to be discovered. The fact that humans achieve vastly better generalisation proves that better generalisation is possible. The task ahead is not to accept poor generalisation as inevitable but to discover the principle that enables human-like learning efficiency.

This diagnostic shift—from “we need more compute” to “we need better understanding of generalisation”—represents the intellectual reorientation of AI research in 2025 and beyond. The age of scaling is ending not because scaling is impossible but because it has approached its productive limits. The age of research into fundamental learning principles is beginning. What emerges from this research agenda may prove far more consequential than any previous scaling increment.

Quote: Ilya Sutskever – Safe Superintelligence

“Is the belief really, ‘Oh, it’s so big, but if you had 100x more, everything would be so different?’ It would be different, for sure. But is the belief that if you just 100x the scale, everything would be transformed? I don’t think that’s true. So it’s back to the age of research again, just with big computers.” – Ilya Sutskever – Safe Superintelligence

Ilya Sutskever stands as one of the most influential figures in modern artificial intelligence—a scientist whose work has fundamentally shaped the trajectory of deep learning over the past decade. As co-author of the seminal 2012 AlexNet paper, he helped catalyse the deep learning revolution that transformed machine vision and launched the contemporary AI era. His influence extends through his role as Chief Scientist at OpenAI, where he played a pivotal part in developing GPT-2 and GPT-3, the models that established large-scale language model pre-training as the dominant paradigm in AI research.

In 2024, Sutskever departed OpenAI and co-founded Safe Superintelligence Inc. (SSI) alongside Daniel Gross and Daniel Levy, positioning the company as the world’s “first straight-shot SSI lab”—an organisation with a single focus: developing safe superintelligence without distraction from product development or revenue generation. The company has since raised $3 billion and reached a $32 billion valuation, reflecting investor confidence in Sutskever’s strategic vision and reputation.

The Context: The Exhaustion of Scaling

Sutskever’s quoted observation emerges from a moment of genuine inflection in AI development. For roughly five years—from 2020 to 2025—the AI industry operated under what he terms the “age of scaling.” This era was defined by a simple, powerful insight: that scaling pre-training data, computational resources, and model parameters yielded predictable improvements in model performance. Organisations could invest capital with low perceived risk, knowing that more compute plus more data plus larger models would reliably produce measurable gains.

This scaling paradigm was extraordinarily productive. It yielded GPT-3, GPT-4, and an entire generation of frontier models that demonstrated capabilities that astonished both researchers and the public. The logic was elegant: if you wanted better AI, you simply scaled the recipe. Sutskever himself was instrumental in validating this approach. The word “scaling” became conceptually magnetic, drawing resources, attention, and organisational focus toward a single axis of improvement.

Yet by 2024–2025, that era began showing clear signs of exhaustion. Data is finite—the amount of high-quality training material available on the internet is not infinite, and organisations are rapidly approaching meaningful constraints on pre-training data supply. Computational resources, whilst vast, are not unlimited, and the economic marginal returns on compute investment have become less obvious. Most critically, the empirical question has shifted: if current frontier labs have access to extraordinary computational resources, would 100 times more compute actually produce a qualitative transformation in capabilities, or merely incremental improvement?

Sutskever’s answer is direct: incremental, not transformative. This reframing is consequential because it redefines where the bottleneck actually lies. The constraint is no longer the ability to purchase more GPUs or accumulate more data. The constraint is ideas—novel technical approaches, new training methodologies, fundamentally different recipes for building AI systems.
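
A rough back-of-the-envelope calculation illustrates why 100x more compute can be real yet underwhelming. Published scaling-law fits (for example, Kaplan et al., 2020) describe pre-training loss falling roughly as a power law in compute with a small exponent; the sketch below assumes an exponent of 0.05, a value in the general range such fits report, purely to show the shape of the curve. Exact exponents vary by study and setup.

```python
# Back-of-the-envelope sketch of diminishing returns under a power-law loss curve.
# The exponent 0.05 is an assumption for illustration, loosely in the range that
# published scaling-law fits report; it is not a measured value from any lab.

def relative_loss(compute_multiplier: float, exponent: float = 0.05) -> float:
    """Loss relative to today's, if L(C) is proportional to C**(-exponent)."""
    return compute_multiplier ** (-exponent)

for multiplier in (10, 100, 1000):
    remaining = relative_loss(multiplier)
    print(f"{multiplier:>5}x compute -> loss falls to {remaining:.2f} of today's value")
# Under this assumption, 100x compute trims loss by roughly a fifth:
# a genuine improvement, but incremental rather than transformative.
```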

The Jaggedness Problem: Theory Meeting Reality

One critical observation animates Sutskever’s thinking: a profound disconnect between benchmark performance and real-world robustness. Current models achieve superhuman performance on carefully constructed evaluation tasks—yet in deployment, they exhibit what Sutskever calls “jagged” behaviour. They repeat errors, introduce new bugs whilst fixing old ones, and cycle between mistakes even when given clear corrective feedback.

This apparent paradox suggests something deeper than mere data or compute insufficiency. It points to inadequate generalisation—the inability to transfer learning from narrow, benchmark-optimised domains into the messy complexity of real-world application. Sutskever frames this through an analogy: a competitive programmer who practises 10,000 hours on competition problems will be highly skilled within that narrow domain but often fails to transfer that knowledge flexibly to broader engineering challenges. Current models, in his assessment, resemble that hyper-specialised competitor rather than the flexible, adaptive learner.

The Core Insight: Generalisation Over Scale

The central thesis animating Sutskever’s work at SSI—and implicit in his quote—is that human-like generalisation and learning efficiency rest on a fundamentally different machine-learning principle from scaling, one that has not yet been discovered or operationalised within contemporary AI systems.

Humans learn with orders of magnitude less data than large models yet generalise far more robustly to novel contexts. A teenager learns to drive in roughly ten hours of practice; current AI systems struggle to acquire equivalent robustness with vastly more training data. This is not because humans possess specialised evolutionary priors for driving (a recent activity that evolution could not have optimised for); rather, it suggests humans employ a more general-purpose learning principle that contemporary AI has not yet captured.

Sutskever hypothesises that this principle is connected to what he terms “value functions”—internal mechanisms akin to emotions that provide continuous, intermediate feedback on actions and states, enabling more efficient learning than end-of-trajectory reward signals alone. Evolution appears to have hard-coded robust value functions—emotional and evaluative systems—that make humans viable, adaptive agents across radically different environments. Whether an equivalent principle can be extracted purely from pre-training data, rather than built into learning architecture, remains uncertain.

The Leading Theorists and Related Work

Yann LeCun and Data Efficiency

Yann LeCun, Meta’s Chief AI Scientist and a pioneer of deep learning, has long emphasised the importance of learning efficiency and the role of what he terms “world models” in understanding how agents learn causal structure from limited data. His work highlights that human vision achieves remarkable robustness from developmental data scarcity—children recognise cars after seeing far fewer exemplars than AI systems require—suggesting that the brain employs inductive biases or learning principles that current architectures lack.

Geoffrey Hinton and Neuroscience-Inspired AI

Geoffrey Hinton, winner of the 2024 Nobel Prize in Physics for his work on deep learning, has articulated concerns about AI safety and expressed support for Sutskever’s emphasis on fundamentally rethinking how AI systems learn and align. Hinton’s career-long emphasis on biologically plausible learning mechanisms—from Boltzmann machines to capsule networks—reflects a conviction that important principles for efficient learning remain undiscovered and that neuroscience offers crucial guidance.

Stuart Russell and Alignment Through Uncertainty

Stuart Russell, UC Berkeley’s leading AI safety researcher, has emphasised that robust AI alignment requires systems that remain genuinely uncertain about human values and continue learning from interaction, rather than attempting to encode fixed objectives. This aligns with Sutskever’s thesis that safe superintelligence requires continual learning in deployment rather than monolithic pre-training followed by fixed RL optimisation.

Demis Hassabis and Continual Learning

Demis Hassabis, CEO of DeepMind and a co-developer of AlphaGo, has invested significant research effort into systems that learn continually rather than through discrete training phases. This work recognises that biological intelligence fundamentally involves interaction with environments over time, generating diverse signals that guide learning—a principle SSI appears to be operationalising.

The Paradigm Shift: From Offline to Online Learning

Sutskever’s thinking reflects a broader intellectual shift visible across multiple frontiers of AI research. The dominant pre-training + RL framework assumes a clean separation: a model is trained offline on fixed data, then post-trained with reinforcement learning, then deployed. Increasingly, frontier researchers are questioning whether this separation reflects how learning should actually work.

His articulation of the “age of research” signals a return to intellectual plurality and heterodox experimentation—the opposite of the monoculture that the scaling paradigm created. When everyone is racing to scale the same recipe, innovation becomes incremental. When new recipes are required, diversity of approach becomes an asset rather than a liability.

The Stakes and Implications

This reframing carries significant strategic implications. If the bottleneck is truly ideas rather than compute, then smaller, more cognitively coherent organisations with clear intellectual direction may outpace larger organisations constrained by product commitments, legacy systems, and organisational inertia. If the key innovation is a new training methodology—one that achieves human-like generalisation through different mechanisms—then the first organisation to discover and validate it may enjoy substantial competitive advantage, not through superior resources but through superior understanding.

Equally, this framing challenges the common assumption that AI capability is primarily a function of computational spend. If methodological innovation matters more than scale, the future of AI leadership becomes less a question of capital concentration and more a question of research insight—less about who can purchase the most GPUs, more about who can understand how learning actually works.

Sutskever’s quote thus represents not merely a rhetorical flourish but a fundamental reorientation of strategic thinking about AI development. The age of confident scaling is ending. The age of rigorous research into the principles of generalisation, sample efficiency, and robust learning has begun.

Quote: Warren Buffett – Investor

“Never invest in a company without understanding its finances. The biggest losses in stocks come from companies with poor balance sheets.” – Warren Buffett – Investor

This statement encapsulates Warren Buffett’s foundational conviction that a thorough understanding of a company’s financial health is essential before any investment is made. Buffett, revered as one of the world’s most successful and influential investors, has built his career—and the fortunes of Berkshire Hathaway shareholders—by analysing company financials with forensic precision and prioritising robust balance sheets. A poor balance sheet typically signals overleveraging, weak cash flows, and vulnerability to adverse market cycles, all of which heighten the risk of capital loss.
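
Two quick ratios commonly used to spot the overleveraging and weak liquidity described above are debt-to-equity and the current ratio. The sketch below uses invented figures and rough rule-of-thumb thresholds; it illustrates how such a reading works and is neither Buffett’s own screen nor investment advice.

```python
# Illustrative only: two common balance-sheet ratios with invented figures.
# The thresholds are rough rules of thumb, not Buffett's criteria.

def debt_to_equity(total_debt: float, shareholders_equity: float) -> float:
    return total_debt / shareholders_equity

def current_ratio(current_assets: float, current_liabilities: float) -> float:
    return current_assets / current_liabilities

# Hypothetical company (figures in millions)
d_e = debt_to_equity(total_debt=900.0, shareholders_equity=400.0)
c_r = current_ratio(current_assets=350.0, current_liabilities=420.0)

print(f"debt-to-equity: {d_e:.2f} -> {'heavily leveraged' if d_e > 1.0 else 'modest leverage'}")
print(f"current ratio:  {c_r:.2f} -> {'short-term liquidity strain' if c_r < 1.0 else 'adequate liquidity'}")
```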

Buffett’s approach can be traced directly to the principles of value investing: only purchase businesses trading below their intrinsic value, and rigorously avoid companies whose finances reveal underlying weakness. This discipline shields investors from the pitfalls of speculation and market fads. Paramount to this method is what Buffett calls a margin of safety—a buffer between a company’s market price and its real worth, aimed at mitigating downside risks, especially those stemming from fragile balance sheets. His preference for quality over quantity similarly reflects a bias towards investing larger sums in a select number of financially sound companies rather than spreading capital across numerous questionable prospects.
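As a purely illustrative piece of arithmetic—hypothetical figures, not investment guidance—the margin of safety can be expressed as the discount of market price to an estimated intrinsic value:

```python
# Margin of safety as a simple discount to estimated intrinsic value.
# The numbers are hypothetical; estimating intrinsic value is the hard analytical work.

def margin_of_safety(intrinsic_value: float, market_price: float) -> float:
    """Fraction by which the market price sits below the intrinsic-value estimate."""
    return (intrinsic_value - market_price) / intrinsic_value

# A share estimated to be worth 100 but offered at 65 carries a 35% buffer.
print(f"{margin_of_safety(intrinsic_value=100.0, market_price=65.0):.0%}")  # -> 35%
```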

Throughout his career, Buffett has consistently advocated for investing only in businesses that one fully understands. He famously avoids complexity and “fashionable trends,” stating that clarity and financial strength supersede cleverness or hype. His guiding mantra—“never lose money”—and its companion rule—“never forget the first rule”—further reinforce his risk-averse methodology.

Background on Warren Buffett

Born in 1930 in Omaha, Nebraska, Warren Buffett demonstrated an early fascination with business and investing. He operated as a stockbroker, bought and sold pinball machines, and eventually took over Berkshire Hathaway, transforming it from a struggling textile manufacturer into a global conglomerate. His stewardship is defined not only by outsized returns, but by a consistent, rational framework for capital allocation; he eschews speculation and prizes businesses with predictable earnings, capable leadership, and resilient competitive advantages. Buffett’s investment tenets, traced back to Benjamin Graham and refined with Charlie Munger, remain the benchmark for disciplined, risk-conscious investing.

Leading Theorists on Financial Analysis and Value Investing

The intellectual foundation of Buffett’s philosophy rests predominantly on the work of Benjamin Graham and, subsequently, David Dodd:

  • Benjamin Graham
    Often characterised as the “father of value investing,” Graham developed a rigorous framework for asset selection based on demonstrable financial solidity. His landmark work, The Intelligent Investor (1949), formalised the notions of intrinsic value and margin of safety alongside the critical analysis of financial statements. Graham’s empirical, rules-based approach sought to remove emotion from investment decision-making, placing systematic, intensive financial review at the forefront.
  • David Dodd
    Co-author of Security Analysis with Graham, Dodd expanded and codified approaches for in-depth business valuation, championing comprehensive audit of balance sheets, income statements, and cash flow reports. The Graham-Dodd method remains the global standard for security analysis.
  • Charlie Munger
    Buffett’s long-time business partner, Charlie Munger, is credited with shaping the evolution from mere statistical bargains (“cigar butt” investing) towards businesses with enduring competitive advantage. Munger advocates a broadened mental toolkit (“worldly wisdom”) integrating qualitative insights—on management, culture, and durability—with rigorous financial vetting.
  • Peter Lynch
    Known for managing the Magellan Fund at Fidelity, Lynch famously encouraged investors to “know what you own,” reinforcing the necessity of understanding a business’s financial fibre before participation. He also stressed that the gravest investing errors stem from neglecting financial fundamentals, echoing Buffett’s caution on poor balance sheets.
  • John Bogle
    As the founder of Vanguard and inventor of the index fund, Bogle’s influence stems from his advocacy of broad diversification—but he also warned sharply against investing in companies without sound financial disclosure, because broad market risks are magnified in the presence of individual corporate failure.

Conclusion of Context

Buffett’s quote is not merely a rule-of-thumb—it expresses one of the most empirically validated truths in investment history: deep analysis of company finances is indispensable to avoiding catastrophic losses. The theorists who shaped this doctrine did so by instituting rigorous standards and repeatable frameworks that continue to underpin modern investment strategy. Buffett’s risk-averse, fundamentals-rooted vision stands as a beacon of prudence in an industry rife with speculation. His enduring message—understand the finances; invest only in quality—remains the starting point for both novice and veteran investors seeking resilience and sustainable wealth.

read more
Quote: Sam Walton – American retail pioneer

Quote: Sam Walton – American retail pioneer

“Great ideas come from everywhere if you just listen and look for them. You never know who’s going to have a great idea.” – Sam Walton – American retail pioneer

This quote epitomises Sam Walton’s core leadership principle—openness to ideas from all levels of an organisation. Walton, the founder of Walmart and Sam’s Club, was known for his relentless focus on operational efficiency, cost leadership, and, crucially, a culture that actively valued contributions from employees at every tier.

Walton’s approach stemmed from his own lived experience. Born in 1918 in rural Oklahoma, he grew up during the Great Depression—a time that instilled a profound respect for hard work and creative problem-solving. After service in the US Army, he managed a series of Ben Franklin variety stores. Denied the opportunity to pilot a new discount retail model by his franchisor, Walton struck out on his own, opening the first Walmart in Rogers, Arkansas in 1962, funded chiefly through personal borrowing and considerable personal risk.

From the outset, Walton positioned himself as a learner—famously travelling across the United States to observe competitors and often spending time on the shop floor listening to the insights of front-line staff and customers. He believed valuable ideas could emerge from any source—cashiers, cleaners, managers, or suppliers—and his instinct was to capitalise on this collective intelligence.

His management style, shaped by humility and a drive to democratise innovation, helped Walmart scale from a single store to the world’s largest retailer by the early 1990s. The company’s relentless growth and robust internal culture were frequently attributed to Walton’s ability to source improvements and innovations bottom-up rather than solely relying on top-down direction.

About Sam Walton

Sam Walton (1918–1992) was an American retail pioneer who, from modest beginnings, changed global retailing. His vision for Walmart was centred on three guiding principles:

  • Offering low prices for everyday goods.
  • Maintaining empathetic customer service.
  • Cultivating a culture of shared ownership and continual improvement through employee engagement.

Despite his immense success and wealth, Walton was celebrated for his modesty—driving a used pickup, wearing simple clothes, and living in the same town where his first store opened. He ultimately built a business empire that, by 1992, encompassed over 2,000 stores and employed more than 380,000 people.

Leading Theorists Related to the Subject Matter

Walton’s quote and philosophy connect to three key schools of thought in innovation and management theory:

1. Peter Drucker
Peter Drucker, often called the father of modern management, argued that leaders should remain closely connected to their organisations—staying near the day-to-day work in a spirit later popularised as “management by walking around”—and use the intelligence of their workforce to inform decision-making. Drucker taught that innovation is an organisational discipline, not the exclusive preserve of senior leadership or R&D specialists.

2. Henry Chesbrough
Chesbrough developed the concept of open innovation, which posits that breakthrough ideas often originate outside a company’s traditional boundaries. He argued that organisations should purposefully encourage inflow and outflow of knowledge to accelerate innovation and create value, echoing Walton’s insistence that great ideas can (and should) come from anywhere.

3. Simon Sinek
In his influential work Start with Why, Sinek explores the notion that transformational leaders elicit deep engagement and innovative thinking by grounding teams in purpose (“Why”). Sinek argues that companies embed innovation in their DNA when leaders empower all employees to contribute to improvement and strategic direction.

| Theorist | Core Idea | Relevance to Walton’s Approach |
| --- | --- | --- |
| Peter Drucker | Management by walking around; broad-based engagement | Walton’s direct engagement with staff |
| Henry Chesbrough | Open innovation; ideas flow in and out of the organisation | Walton’s receptivity beyond hierarchy |
| Simon Sinek | Purpose-based leadership for innovation and loyalty | Walton’s mission-driven, inclusive ethos |

Additional Relevant Thinkers and Concepts

  • Clayton Christensen: In The Innovator’s Dilemma, he highlights how disruptive innovation is frequently initiated by those closest to the customer or the front line, not at the corporate pinnacle.
  • Eric Ries: In The Lean Startup, Ries argues it is the fast feedback and agile learning from the ground up that enables organisations to innovate ahead of competitors—a direct parallel to Walton’s method of sourcing and testing ideas rapidly in store environments.

Sam Walton’s lasting impact is not just Walmart’s size, but the conviction that listening widely—to employees, customers, and the broader community—unlocks the innovations that fuel lasting competitive advantage. This belief is increasingly echoed in modern leadership thinking and remains foundational for organisations hoping to thrive in a fast-changing world.

read more
Quote: Stephen Schwarzman – Blackstone Founder

Quote: Stephen Schwarzman – Blackstone Founder

“You have to be very gentle around people. If you’re in a leadership position, people hear your words amplified. You have to be very careful what you say and how you say it. You always have to listen to what other people have to say. I genuinely want to know what everybody else thinks.” – Stephen Schwarzman – Blackstone Founder

Stephen A. Schwarzman’s quote on gentle, thoughtful leadership encapsulates decades spent at the helm of Blackstone—the world’s largest alternative asset manager—where he forged a distinctive culture and process rooted in careful listening, respectful debate, humility, and operational excellence. The story behind this philosophy is marked by formative setbacks, institutional learning, and the broader evolution of modern leadership theory.

Stephen Schwarzman: Background and Significance

Stephen A. Schwarzman, born in 1947 in Philadelphia, rose to prominence after co-founding Blackstone in 1985 with Pete Peterson. Initially, private markets comprised a tiny fraction of institutional portfolios; under his stewardship, allocations to private assets have grown exponentially, fundamentally reshaping global investing. Schwarzman is renowned for his relentless pursuit of operational improvement, risk discipline, and market timing—his mantra, “Don’t lose money,” is enforced by multi-layered approval and rigorous debate.

Schwarzman’s experience as a leader is deeply shaped by early missteps. The Edgecomb Steel investment loss was pivotal: it catalysed the creation of Blackstone’s institutionalised investment committees, its de-risking debates, and a culture where anyone may challenge ideas so long as discussion remains fact-based and impersonal. This setback taught him accountability, humility, and the value of systemic learning—his response was not to retreat from risk, but to build a repeatable, challenge-driven process. Crucially, he narrates his own growth from a self-described “C or D executive” to a leader who values gentleness, clarity, humour, and private critique—understanding that words uttered from the top echo powerfully and can shape (or harm) culture.

Beyond technical accomplishments, Schwarzman’s legacy is one of building enduring institutions through codified values: integrity, decency, and hard work. His leadership maxim—“be gentle, clear, and high standard; always listen”—is a template for strong cultures, high performance, and sustainable growth.

The Context of the Quote

The quoted passage emerges from Schwarzman’s reflections on leadership lessons acquired over four decades. Known for candid self-assessment, he openly admits to early struggles with management style but evolved to prioritise humility, care, and active listening. At Blackstone, this meant never criticising staff in public and always seeking divergent views to inform decisions. He emphasises that a leader’s words carry amplified weight among teams and stakeholders; thus, intentional communication and genuine listening are essential for nurturing an environment of trust, engagement, and intelligent risk-taking.

This context is inseparable from Blackstone’s broader organisational playbook: institutionalised judgment, structured challenge, and brand-centred culture—all designed to accumulate wisdom, avoid repeating mistakes, and compound long-term value. Schwarzman’s leadership pathway is a case study in the power of personal evolution, open dialogue, and codified norms that outlast the founder himself.

Leading Theorists and Historical Foundations

Schwarzman’s leadership philosophy is broadly aligned with a lineage of thinkers who have shaped modern approaches to management, organisational behaviour, and culture:

  • Peter Drucker: Often called the “father of modern management,” Drucker stressed that leadership is defined by results and relationships, not positional power. His work emphasised listening, empowering employees, and the ethical responsibility of those at the top.

  • Warren Bennis: Bennis advanced concepts of authentic leadership, self-awareness, and transparency. He argued that leaders should be vulnerable, model humility, and act as facilitators of collective intelligence rather than commanders.

  • Jim Collins: In “Good to Great,” Collins describes “Level 5 Leaders” as those who combine professional will with personal humility. Collins underscores that amplifying diverse viewpoints and creating cultures of disciplined debate lead to enduring success.

  • Edgar Schein: Schein’s studies of organisational culture reveal that leaders not only set behavioural norms through their actions and words but also shape “cultural DNA” by embedding values of learning, dialogue, and respect.

  • Amy Edmondson: Her pioneering work in psychological safety demonstrates that gentle leadership—rooted in listening and respect—fosters environments where people can challenge ideas, raise concerns, and innovate without fear.

Each of these theorists contributed to the understanding that gentle, attentive leadership is not weakness, but a source of institutional strength, resilience, and competitive advantage. Their concepts mirror the systems at Blackstone: open challenge, private correction, and leadership by example.

Schwarzman’s Distinction and Industry Impact

Schwarzman’s practice stands out in several ways. He institutionalised lessons from mistakes to create robust decision processes and a genuine challenge culture. His insistence on brand-building as strategy—where every decision, hire, and visual artefact reinforces trust—reflects an awareness of the symbolic weight of leadership. Under his guidance, Blackstone’s transformation from a two-person startup into a global giant offers a living illustration of how values, process, and leadership style drive superior, sustainable outcomes.

In summary, the quoted insight is not platitude, but hard-won experience from a legendary founder whose methods echo the best modern thinking on leadership, learning, and organizational resilience. The theorists tracing this journey—from Drucker to Edmondson—affirm that the path to “enduring greatness” lies in gentle authority, careful listening, institutionalized memory, and the humility to learn from every setback.

read more
Quote: Stephen Schwarzman – Blackstone Founder

Quote: Stephen Schwarzman – Blackstone Founder

“I always felt that somebody was only capable of one super effort to create something that can really be consequential. There are so many impediments to being successful. If you’re on the field, you’re there to win, and to win requires an enormous amount of practice – pushing yourself really to the breaking point.” – Stephen Schwarzman – Blackstone Founder

Stephen A. Schwarzman is a defining figure in global finance and alternative investments. He is Chairman, CEO, and Co-Founder of Blackstone, the world’s largest alternative investment firm, overseeing over $1.2 trillion in assets.

Backstory and Context of the Quote

Stephen Schwarzman’s perspective on effort, practice, and success is rooted in over four decades building Blackstone from a two-person start-up to an institution that has shaped capital markets worldwide. The referenced quote captures his philosophy: that achieving anything truly consequential demands a singular, maximal effort—a philosophy he practised as Blackstone’s founder and architect.

Schwarzman began his career in mergers and acquisitions at Lehman Brothers in the 1970s, where he met Peter G. Peterson. Their complementary backgrounds—a combination of strategic vision and operational drive—empowered them to establish Blackstone in 1985, initially with just $400,000 in seed capital and a big ambition to build a differentiated investment firm. The mid-1980s financial environment, marked by booming M&A activity, provided fertile ground for innovation in buyouts and private markets.

From the outset, Schwarzman instilled a culture of rigorous preparation and discipline. A landmark early setback—the unsuccessful investment in Edgecomb Steel—became a pivotal learning event. It led Schwarzman to institutionalise robust investment committees, open and adversarial (yet respectful) debate, and a relentless process of due diligence. This learning loop—focused on not losing money and on a fact-based challenge culture—shaped Blackstone’s internal systems and risk culture for decades to come.

His attitude to practice, perseverance, and operating at the limit is not merely rhetorical—it is Blackstone’s operational model: selecting complex assets, professionalising management, and adding value through operational transformation before timing exits for maximum advantage. The company’s strict approval layers, multi-stage risk screening, and exacting standards demonstrate Schwarzman’s belief that only by pushing to the limits of endurance—and addressing every potential weakness—can lasting value be created.

In his own words, Schwarzman attributes success not to innate brilliance but to grit, repetition, and the ability to learn from failure. This is underscored by his leadership style, which evolved towards being gentle, clear, and principled, setting high standards while building an enduring culture based on integrity, decency, and open debate.

About Stephen A. Schwarzman

  • Born in 1947 in Philadelphia, Schwarzman studied at Yale University (where he was a member of Skull and Bones) and earned an MBA from Harvard Business School.
  • Blackstone, which he co-founded in 1985, began as an M&A boutique and now operates across private equity, real estate, credit, hedge funds, infrastructure, and life sciences, making it a recognised leader in global investment management.
  • Under Schwarzman’s leadership, Blackstone institutionalised patient, active ownership—acquiring, improving, and timing the exit from portfolio companies for optimal results while actively shaping industry standards in governance and risk management.
  • He is also known for his philanthropy, having signed The Giving Pledge and contributed significantly to education, arts, and culture.
  • His autobiography, What It Takes: Lessons in the Pursuit of Excellence, distils the philosophy underpinning his business and personal success.
  • Schwarzman’s role as a public intellectual and advisor has seen him listed among the “World’s Most Powerful People” and “Time 100 Most Influential People”.

Leading Theorists and Intellectual Currents Related to the Quote

The themes embodied in Schwarzman’s philosophy—singular effort, practice to breaking point, coping with setbacks, and building institutional culture—draw on and intersect with several influential theorists and schools of thought in management and the psychology of high achievement:

  • Anders Ericsson (Deliberate Practice): Ericsson’s research underscores that deliberate practice—extended, focused effort with ongoing feedback—is critical to acquiring expert performance in any field. Schwarzman’s stress on “enormous amount of practice” parallels Ericsson’s findings that natural talent is far less important than methodical, sustained effort.
  • Angela Duckworth (Grit): Duckworth’s work on “grit” emphasises passion and perseverance for long-term goals as key predictors of success. Her research supports Schwarzman’s belief that breaking through obstacles—and continuing after setbacks—is fundamental for consequential achievement.
  • Carol Dweck (Growth Mindset): Dweck demonstrated that embracing a “growth mindset”—seeing failures as opportunities to learn rather than as endpoints—fosters resilience and continuous improvement. Schwarzman’s approach to institutionalising learning from failure at Blackstone reflects this theoretical foundation.
  • Peter Drucker (Management by Objectives and Institutional Culture): Drucker highlighted the importance of clear organisational goals, continuous learning, and leadership by values for building enduring institutions. Schwarzman’s insistence on codifying culture, open debate, and aligning every decision with the brand reflects Drucker’s emphasis on system and culture as drivers of organisational performance.
  • Jim Collins (Built to Last, Good to Great): Collins’ research into successful companies found a common thread of fanatical discipline, a culture of humility and rigorous debate, all driven by a sense of purpose. These elements are present throughout Blackstone’s governance model and leadership ethos as steered by Schwarzman.
  • Michael Porter (Competitive Strategy): Porter’s concept of sustained competitive advantage through unique positioning and strategic differentiation is echoed in Blackstone’s approach—actively improving operations rather than simply relying on market exposure, and committing to ‘winning’ through operational and structural edge.

Summary

Schwarzman’s quote is not only a personal reflection but also a distillation of enduring principles in high achievement and institutional leadership. It is the lived experience of building Blackstone—a case study in dedication, resilience, and the institutionalisation of excellence. His story, and the theoretical underpinnings echoed in his approach, provide a template for excellence and consequence in any field marked by complexity, competition, and the need for sustained, high-conviction effort.

read more
Quote: Alex Karp – Palantir CEO

Quote: Alex Karp – Palantir CEO

“The idea that chips and ontology is what you want to short is batsh*t crazy.” – Alex Karp – Palantir CEO

Alex Karp, co-founder and CEO of Palantir Technologies, delivered the now widely-circulated statement, “The idea that chips and ontology is what you want to short is batsh*t crazy,” in response to famed investor Michael Burry’s high-profile short positions against both Palantir and Nvidia. This sharp retort came at a time when Palantir, an enterprise software and artificial intelligence (AI) powerhouse, had just reported record earnings and was under intense media scrutiny for its meteoric stock rise and valuation.

Context of the Quote

The remark was made in early November 2025 during a CNBC interview, following public disclosures that Michael Burry—of “The Big Short” fame—had taken massive short positions in Palantir and Nvidia, two companies at the heart of the AI revolution. Burry’s move, reminiscent of his contrarian bets during the 2008 financial crisis, was interpreted by the market as both a challenge to the soaring “AI trade” and a critique of the underlying economics fueling the sector’s explosive growth.

Karp’s frustration was palpable: not only was Palantir producing what he described as “anomalous” financial results—outpacing virtually all competitors in growth, cash flow, and customer retention—but it was also emerging as the backbone of data-driven operations across government and industry. For Karp, Burry’s short bet went beyond traditional market scepticism; it targeted firms, products (“chips” and “ontology”—the foundational hardware for AI and the architecture for structuring knowledge), and business models proven to be both technically indispensable and commercially robust. Karp’s rejection of the “short chips and ontology” thesis underscores his belief in the enduring centrality of the technologies underpinning the modern AI stack.

Backstory and Profile: Alex Karp

Alex Karp stands out as one of Silicon Valley’s true iconoclasts:

  • Background and Education: Born in New York City in 1967, Karp holds a philosophy degree from Haverford College, a JD from Stanford, and a PhD in social theory from Goethe University Frankfurt, where he studied under and wrote about the influential philosopher Jürgen Habermas. This rare academic pedigree—blending law, philosophy, and critical theory—deeply informs both his contrarian mindset and his focus on the societal impact of technology.
  • Professional Arc: Before founding Palantir in 2004 with Peter Thiel and others, Karp had forged a career in finance, running the London-based Caedmon Group. At Palantir, he crafted a unique culture and business model, combining a wellness-oriented, sometimes spiritual corporate environment with the hard-nosed delivery of mission-critical systems for Western security, defence, and industry.
  • Leadership and Philosophy: Karp is known for his outspoken, unconventional leadership. Unafraid to challenge both Silicon Valley’s libertarian ethos and what he views as the groupthink of academic and financial “expert” classes, he publicly identifies as progressive—yet separates himself from establishment politics, remaining both a supporter of the US military and a critic of mainstream left and right ideologies. His style is at once brash and philosophical, combining deep scepticism of market orthodoxy with a strong belief in the capacity of technology to deliver real-world, not just notional, value.
  • Palantir’s Rise: Under Karp, Palantir grew from a niche contractor to one of the world’s most important data analytics and AI companies. Palantir’s products are deeply embedded in national security, commercial analytics, and industrial operations, making the company essential infrastructure in the rapidly evolving AI economy.

Theoretical Background: ‘Chips’ and ‘Ontology’

Karp’s phrase pairs two of the foundational concepts in modern AI and data-driven enterprise:

  • Chips: Here, “chips” refers specifically to advanced semiconductors (such as Nvidia’s GPUs) that provide the computational horsepower essential for training and deploying cutting-edge machine learning models. The AI revolution is inseparable from advances in chip design, leading to historic demand for high-performance hardware.
  • Ontology: In computer and information science, “ontology” describes the formal structuring and categorising of knowledge—making data comprehensible, searchable, and actionable by algorithms. Robust ontologies enable organisations to unify disparate data sources, automate analytical reasoning, and achieve the “second order” efficiencies of AI at scale. A minimal code sketch of the idea follows this list.
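The sketch below illustrates the bare idea in code: an assumed schema of entity types and permitted relations onto which raw records from different systems are mapped as uniform triples. It is a toy example only—not Palantir’s Ontology product, and not a full standard such as OWL or RDF.

```python
# Toy ontology: a fixed schema of entity types and relations, plus data mapped onto it.
# Illustrative only; real ontologies add constraints, hierarchies, and reasoning rules.
from dataclasses import dataclass

@dataclass(frozen=True)
class EntityType:
    name: str

@dataclass(frozen=True)
class Relation:
    name: str
    source: EntityType   # the type a relation starts from
    target: EntityType   # the type it points to

Supplier = EntityType("Supplier")
Shipment = EntityType("Shipment")
Facility = EntityType("Facility")

SCHEMA = {
    "ships": Relation("ships", Supplier, Shipment),
    "delivered_to": Relation("delivered_to", Shipment, Facility),
}

# Once the schema is agreed, records from disparate systems become uniform triples
# that algorithms can search, join, and reason over.
facts = [
    ("Acme Metals", "ships", "SHP-001"),
    ("SHP-001", "delivered_to", "Plant-7"),
]

for subject, relation, obj in facts:
    assert relation in SCHEMA, f"'{relation}' is not defined in the ontology"
    print(f"{subject} --{relation}--> {obj}")
```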

Leading theorists in the domain of ontology and AI include:

  • John McCarthy: A founder of artificial intelligence, McCarthy’s foundational work on formal logic and semantics laid groundwork for modern ontological structures in AI.
  • Tim Berners-Lee: Creator of the World Wide Web, Berners-Lee proposed the Semantic Web, championing knowledge structuring via ontologies so that data becomes machine-readable—an approach now regarded as all but indispensable for AI’s next leap.
  • Thomas Gruber: Known for his widely cited definition of ontology in AI as “a specification of a conceptualisation,” Gruber’s research shaped the field’s approach to standardising knowledge representations for complex applications.

In the chip space, pioneering figures include:

  • Jensen Huang: As CEO and co-founder of Nvidia, Huang drove the company’s transformation from graphics to AI acceleration, cementing the centrality of chips as the hardware substrate for everything from generative AI to advanced analytics.
  • Gordon Moore and Robert Noyce: Their early explorations in semiconductor fabrication set the stage for the exponential hardware progress that enabled the modern AI era.

Insightful Context for the Modern Market Debate

The “chips and ontology” remark reflects a deep divide in contemporary technology investing:

  • On one side, sceptics like Burry see signs of speculative excess, reminiscent of prior bubbles, and bet against companies with high valuations—even when those companies dominate core technologies fundamental to AI.
  • On the other, leaders like Karp argue that while the broad “AI trade” risks pockets of overvaluation, the engine—the computational hardware (chips) and data-structuring logic (ontology)—are not just durable, but irreplaceable in the digital economy.

With Palantir and Nvidia at the centre of the current AI-driven transformation, Karp’s comment captures not just a rebuttal to market short-termism, but a broader endorsement of the foundational technologies that define the coming decade. The value of “chips and ontology” is, in Karp’s eyes, anchored not in market narrative but in empirical results and business necessity—a perspective rooted in a unique synthesis of philosophy, technology, and radical pragmatism.

read more
Quote: David Solomon – Goldman Sachs CEO

Quote: David Solomon – Goldman Sachs CEO

“Generally speaking people hate change. It’s human nature. But change is super important. It’s inevitable. In fact, on my desk in my office I have a little plaque that says ‘Change or die.’ As a business leader, one of the perspectives you have to have is that you’ve got to constantly evolve and change.” – David Solomon – Goldman Sachs CEO

The quoted insight comes from David M. Solomon, Chief Executive Officer and Chairman of Goldman Sachs, a role he has held since 2018. It was delivered during a high-profile interview at The Economic Club of Washington, D.C., 30 October 2025, as Solomon reflected on the necessity of adaptability both personally and as a leader within a globally significant financial institution.

“We have very smart people, and we can put these [AI] tools in their hands to make them more productive… By using AI to reimagine processes, we can create operating efficiencies that give us a scaled opportunity to reinvest in growth.” – David Solomon – Goldman Sachs CEO

David Solomon, Chairman and CEO of Goldman Sachs, delivered the quoted remarks during an interview at the HKMA Global Financial Leaders’ Investment Summit on 4 November 2025, articulating Goldman’s strategic approach to integrating artificial intelligence across its global franchise. His comments reflect both personal experience and institutional direction: leveraging new technology to drive productivity, reimagine workflows, and reinvest operational gains in sustainable growth, rather than pursuing simplistic headcount reductions or technological novelty for its own sake.

Backstory and Context of the Quote

David Solomon’s statement arises from Goldman Sachs’ current transformation—“Goldman Sachs 3.0”—centred on AI-driven process re-engineering. Rather than employing AI simply as a cost-cutting device, Solomon underscores its strategic role as an enabler for “very smart people” to magnify their productivity and impact. This perspective draws on his forty-year career in finance, where successive waves of technological disruption (from Lotus 1-2-3 spreadsheets to cloud computing) have consistently shifted how talent is leveraged, but have not diminished its central value.

The immediate business context is one of intense change: regulatory uncertainty in cross-border transactions, rebounding capital flows into China post-geopolitical tension, and a high backlog of M&A activity, particularly for large-cap US transactions. In this environment, efficiency gains from AI allow frontline teams to refocus on advisory, origination, and growth while adjusting operational models at a rapid pace. Solomon’s leadership style—pragmatic, unsentimental, and data-driven—favours process optimisation, open collaboration, and the breakdown of legacy silos.

About David Solomon

Background:

  • Born in Hartsdale, New York, in 1962; educated at Hamilton College with a BA in political science, then entered banking.
  • Career progression: Held senior roles at Irving Trust, Drexel Burnham, Bear Stearns; joined Goldman Sachs in 1999 as partner, eventually leading the Financing Group and serving as co-head of the Investment Banking Division for a decade.
  • Appointed President and COO in 2017, then CEO in October 2018 and Chairman in January 2019, succeeding Lloyd Blankfein.
  • Brought a reputation for transformative leadership, advocating modernisation, flattening hierarchies, and integrating technology across every aspect of the firm’s operations.

Leadership and Culture:

  • Solomon is credited with pushing through “One Goldman Sachs,” breaking down internal silos and incentivising cross-disciplinary collaboration.
  • He has modernised core HR and management practices: implemented real-time performance reviews, loosened dress codes, and raised compensation for programmers.
  • Personal interests—such as his sideline as DJ D-Sol—underscore his willingness to defy convention and challenge the insularity of Wall Street leadership.

Institutional Impact:

  • Under his stewardship, Goldman has accelerated its pivot to technology—automating trading operations, consolidating platforms, and committing substantial resources to digital transformation.
  • Notably, the current “GS 3.0” agenda focuses on automating six major workflows to direct freed capacity into growth, consistent with a multi-decade productivity trend.

Leading Theorists and Intellectual Lineage of AI-Driven Productivity in Business

Solomon’s vision is shaped and echoed by several foundational theorists in economics, management science, and artificial intelligence:

1. Clayton Christensen

  • Theory: Disruptive Innovation—frames how technological change transforms industries not through substitution but by enabling new business models and process efficiencies.
  • Relevance: Goldman Sachs’ approach to using AI to reimagine workflows and create new capabilities closely mirrors Christensen’s insights on sustaining versus disruptive innovation.

2. Erik Brynjolfsson & Andrew McAfee

  • Theory: Race Against the Machine, The Second Machine Age—chronicled how digital automation augments human productivity and reconfigures the labour market, not just replacing jobs but reshaping roles and enhancing output.
  • Relevance: Solomon’s argument for enabling smart people with better tools directly draws on Brynjolfsson’s proposition that the best organisational outcomes occur when firms successfully combine human and machine intelligence.

3. Michael Porter

  • Theory: Competitive Advantage—emphasised how operational efficiency and information advantage underpin sustained industry leadership.
  • Relevance: Porter’s ideas connect to Goldman’s agenda by showing that AI integration is not just about cost, but about improving information processing, strategic agility, and client service.

4. Herbert Simon

  • Theory: Bounded Rationality and Decision Support Systems—pioneered the concept that decision-making can be dramatically improved by systems that extend the cognitive capabilities of professionals.
  • Relevance: Solomon’s claim that AI puts better tools in the hands of talented staff traces its lineage to Simon’s vision of computers as skilled assistants, vital to complex modern organisations.

5. Geoffrey Hinton, Yann LeCun, Yoshua Bengio

  • Theory: Deep Learning—established the contemporary AI revolution underpinning business process automation, language models, and data analysis at enterprise scale.
  • Relevance: Without the breakthroughs made by these theorists, AI’s current generation—capable of augmenting financial analysis, risk modelling, and operational management—could not be applied as Solomon describes.

 

Synthesis and Strategic Implications

Solomon’s quote epitomises the intersection of pragmatic executive leadership and theoretical insight. His advocacy for AI-integrated productivity reinforces a management consensus: sustainable competitive advantage hinges not just on technology, but on empowering skilled individuals to unlock new modes of value creation. This approach is echoed by leading researchers who situate automation as a catalyst for role evolution, scalable efficiency, and the ability to redeploy resources into higher-value growth opportunities.

Goldman Sachs’ specific AI play is therefore neither a defensive move against headcount nor a speculative technological bet, but a calculated strategy rooted in both practical business history and contemporary academic theory—a paradigm for how large organisations can adapt, thrive, and lead in the face of continual disruption.

read more
Quote: Satya Nadella – Microsoft CEO

Quote: Satya Nadella – Microsoft CEO

“At scale, nothing is a commodity. We have to have our cost structure, supply-chain efficiency, and software efficiencies continue to compound to ensure margins. Scale – and one of the things I love about the OpenAI partnership – is it’s gotten us to scale. This is a scale game.” – Satya Nadella – Microsoft CEO

Satya Nadella has been at the helm of Microsoft since 2014, overseeing its transformation into one of the world’s most valuable technology companies. Born in Hyderabad, India, and educated in electrical engineering and computer science, Nadella joined Microsoft in 1992, quickly rising through the ranks in technical and business leadership roles. Prior to becoming CEO, he was best known for driving the rapid growth of Microsoft Azure, the company’s cloud infrastructure platform—a business now central to Microsoft’s global strategy.

Nadella’s leadership style is marked by systemic change—he has shifted Microsoft away from legacy, siloed software businesses and repositioned it as a cloud-first, AI-driven, and highly collaborative tech company. He is recognised for his ability to anticipate secular shifts—most notably, the move to hyperscale cloud computing and, more recently, the integration of advanced AI into core products such as GitHub Copilot and Microsoft 365 Copilot. His background—combining deep technical expertise with rigorous business training (MBA, University of Chicago)—enables him to bridge both the strategic and operational dimensions of global technology.

This quote was delivered in the context of Nadella’s public discussion on the scale economics of AI, hyperscale cloud, and the transformative partnership between Microsoft and OpenAI (the company behind ChatGPT, Sora, and the GPT series of models) on the BG2 podcast, 1st November 2025. In this conversation, Nadella outlines why, at the extreme end of global tech infrastructure, nothing remains a “commodity”: system costs, supply chain and manufacturing agility, and relentless software optimisation all become decisive sources of competitive advantage. He argues that scale—meaning not just size, but the compounding organisational learning and cost improvement unlocked by operating at frontier levels—determines who captures sustainable margins and market leadership.

The OpenAI partnership is, from Nadella’s perspective, a practical illustration of this thesis. By integrating OpenAI’s frontier models deeply (and at exclusive scale) within Azure, Microsoft has driven exponential increases in compute utilisation, data flows, and the learning rate of its software infrastructure. This allowed Microsoft to amortise fixed investments, rapidly reduce unit costs, and create a loop of innovation not accessible to smaller or less integrated competitors. In Nadella’s framing, scale is not a static achievement, but a perpetual game—one where the winners are those who compound advantages across the entire stack: from chip supply chains through to application software and business model design.

Theoretical Foundations and Key Thinkers

The quote’s themes intersect with multiple domains: economics of platforms, organisational learning, network effects, and innovation theory. Key theoretical underpinnings and thinkers include:

Scale Economics and Competitive Advantage

  • Alfred Chandler (1918–2007): Chandler’s work on the “visible hand” and the scale and scope of modern industrial firms remains foundational. He showed how scale, when coupled with managerial coordination, allows firms to achieve durable cost advantages and vertical integration.
  • Bruce Greenwald & Judd Kahn: In Competition Demystified (2005), they argue sustainable competitive advantage stems from barriers to entry—often reinforced by scale, especially via learning curves, supply chains, and distribution.

Network Effects and Platform Strategy

  • Jean Tirole & Marcel Boyer: Tirole’s work on platform economics shows how scale-dependent markets (like cloud and AI) naturally concentrate—network effects reinforce the value of leading platforms, and marginal cost advantage compounds alongside user and data scale.
  • Geoffrey Parker, Marshall Van Alstyne, Sangeet Paul Choudary: In their research and Platform Revolution, these thinkers elaborate how the value in digital markets accrues disproportionately to platforms that achieve scale—because transaction flows, learning, and innovation all reinforce one another.

Learning Curves and Experience Effects

  • The Boston Consulting Group (BCG): In the 1960s, Bruce Henderson’s concept of the “experience curve” formalised the insight that unit costs fall as cumulative output grows—the canonical explanation for why scale delivers persistent cost advantage (a short numerical sketch follows this list).
  • Clayton Christensen: In The Innovator’s Dilemma, Christensen illustrates how technological discontinuities and learning rates enable new entrants to upend incumbent advantage—unless those incumbents achieve scale in the new paradigm.
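A minimal numerical sketch of Henderson’s experience curve—the 80% learning rate below is an illustrative assumption, not a figure for any particular firm:

```python
# Experience curve: each doubling of cumulative output multiplies unit cost by a
# fixed fraction (here 0.8, an "80% curve"); constants are illustrative only.
import math

def unit_cost(first_unit_cost: float, cumulative_units: float, learning_rate: float = 0.8) -> float:
    """Cost of the nth unit under a constant-elasticity experience curve."""
    b = -math.log2(learning_rate)                  # curve exponent
    return first_unit_cost * cumulative_units ** (-b)

for n in (1, 2, 4, 8, 16):
    print(f"unit {n:>2}: cost {unit_cost(100.0, n):.1f}")   # 100.0, 80.0, 64.0, 51.2, 41.0
```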

Supply Chain and Operations

  • Taiichi Ohno and Eiji Toyoda (Toyota Production System): Their production system embodies the industrial logic that relentless supply chain optimisation and compounding process improvements, rather than static cost reduction, underpin long-run advantage, especially during periods of rapid demand growth or supply constraint.

Economics of Cloud and AI

  • Hal Varian (Google, UC Berkeley): Varian’s analyses of cloud economics demonstrate the massive fixed-cost base and “public utility” logic of hyperscalers. He has argued that AI and cloud converge when scale enables learning (data/usage) to drive further cost and performance improvements.
  • Andrew Ng, Yann LeCun, Geoffrey Hinton: Pioneering practitioners of deep learning whose work underpins the large-model era in which empirical “scaling laws” now drive the AI infrastructure buildout—i.e., the observation that model capability increases monotonically with the scale of data, compute, and parameter count (a toy illustration follows this list).
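The scaling-law idea referenced in the final bullet is an empirical power-law fit; the sketch below uses invented constants purely to show the shape of the relationship—smooth improvement with diminishing returns—not values from any published study:

```python
# Illustrative power-law scaling: loss falls smoothly, with diminishing returns,
# as compute grows. The constants (alpha, scale) are invented for illustration.

def predicted_loss(compute: float, alpha: float = 0.05, scale: float = 3.0) -> float:
    """Toy scaling law of the form loss = scale * compute ** (-alpha)."""
    return scale * compute ** (-alpha)

for compute in (1e20, 1e22, 1e24):          # arbitrary units, e.g. training FLOPs
    print(f"compute {compute:.0e} -> predicted loss {predicted_loss(compute):.2f}")
```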

Why This Matters Now

Organisations at the digital frontier—notably Microsoft and OpenAI—are now locked in a scale game that is reshaping both industry structure and the global economy. The cost, complexity, and learning rate needed to operate at hyperscale mean that “commodities” (compute, storage, even software itself) cease to be generic. Instead, they become deeply differentiated by embedded knowledge, utilisation efficiency, supply-chain integration, and the ability to orchestrate investments across cycles of innovation.

Nadella’s observation underscores a reality that now applies well beyond technology: the compounding of competitive advantage at scale has become the critical determinant of sector leadership and value capture. This logic is transforming industries as diverse as finance, logistics, pharmaceuticals, and manufacturing—where the ability to build, learn, and optimise at scale fundamentally redefines what was once considered “commodity” business.

In summary: Satya Nadella’s words reflect not only Microsoft’s strategy but a broader economic and technological transformation, deeply rooted in the theory and practice of scale, network effects, and organisational learning. Theorists and practitioners—from Chandler and BCG to Christensen and Varian—have analysed these effects for decades, but the age of AI and cloud has made their insights more decisive than ever. At the heart of it: scale—properly understood and operationalised—remains the ultimate competitive lever.

read more
Quote: David Solomon – Goldman Sachs CEO

Quote: David Solomon – Goldman Sachs CEO

“Generally speaking people hate change. It’s human nature. But change is super important. It’s inevitable. In fact, on my desk in my office I have a little plaque that says ‘Change or die.’ As a business leader, one of the perspectives you have to have is that you’ve got to constantly evolve and change.” – David Solomon – Goldman Sachs CEO

The quoted insight comes from David M. Solomon, Chief Executive Officer and Chairman of Goldman Sachs, a role he has held since 2018. It was delivered during a high-profile interview at The Economic Club of Washington, D.C., 30 October 2025, as Solomon reflected on the necessity of adaptability both personally and as a leader within a globally significant financial institution.

His statement is emblematic of the strategic philosophy that has defined Solomon’s executive tenure. He uses the ‘Change or die’ principle to highlight the existential imperative for renewal in business, particularly in the context of technological transformation, competitive dynamics, and economic disruption.

Solomon’s leadership at Goldman Sachs has been characterised by deliberate modernisation. He has overseen the integration of advanced technology, notably in artificial intelligence and fintech, implemented culture and process reforms, adapted workforce practices, and expanded strategic initiatives in sustainable finance. His approach blends operational rigour with entrepreneurial responsiveness – a mindset shaped both by his formative years in high-yield credit markets at Drexel Burnham and Bear Stearns, and by his rise through leadership roles at Goldman Sachs.

His remark on change was prompted by questions of business resilience and the need for constant adaptation amidst macroeconomic uncertainty, regulatory flux, and the competitive imperatives of Wall Street. For Solomon, resisting change is an instinct, but enabling it is a necessity for long-term health and relevance — especially for institutions in rapidly converging markets.

About David M. Solomon

  • Born 1962, Hartsdale, New York.
  • Hamilton College graduate (BA Political Science).
  • Early career: Irving Trust, Drexel Burnham, Bear Stearns.
  • Joined Goldman Sachs as a partner in 1999, advancing through financing and investment banking leadership.
  • CEO from October 2018, Chairman from January 2019.
  • Known for a modernisation agenda, openness to innovation and talent, commitment to client service and culture reform.
  • Outside finance: Philanthropy, board service, and a second career as electronic dance music DJ “DJ D-Sol”, underscoring a multifaceted approach to leadership and personal renewal.

Theoretical Backstory: Leading Thinkers on Change and Organisational Adaptation

Solomon’s philosophy echoes decades of foundational theory in business strategy and organisational behaviour:

Charles Darwin (1809–1882)
While not a business theorist, Darwin’s principle of “survival of the fittest” is often cited in strategic literature to emphasise the adaptive imperative — those best equipped to change, survive.

Peter Drucker (1909–2005)
Drucker, regarded as the father of modern management, wrote extensively on innovation, entrepreneurial management and the need for “planned abandonment.” He argued, “The greatest danger in times of turbulence is not the turbulence; it is to act with yesterday’s logic.” Drucker’s legacy forms a pillar of contemporary change management, advising leaders not only to anticipate change but to institutionalise it.

John Kotter (b. 1947)
Kotter’s model for Leading Change remains a classic in change management. His eight-step framework starts with establishing a sense of urgency and is grounded in the idea that successful transformation is both necessary and achievable only with decisive leadership, clear vision, and broad engagement. Kotter demonstrated that people’s resistance to change is natural, but can be overcome through structured actions and emotionally resonant leadership.

Clayton Christensen (1952–2020)
Christensen’s work on disruptive innovation clarified how incumbents often fail by ignoring, dismissing, or underinvesting in change — even when it is inevitable. His concept of the “Innovator’s Dilemma” remains seminal, showing that leaders must embrace change not as an abstract imperative but as a strategic necessity, lest they be replaced or rendered obsolete.

Rosabeth Moss Kanter
Kanter’s work focuses on the human dynamics of change, the importance of culture, empowerment, and the “innovation habit” in organisations. She holds that the secret to business success is “constant, relentless innovation” and that resistance to change is deeply psychological, calling for leaders to engineer positive environments for innovation.

Integration: The Leadership Challenge

Solomon’s ethos channels these frameworks into practical executive guidance. For business leaders, particularly in financial services and Fortune 500 firms, the lesson is clear: inertia is lethal; organisational health depends on reimagining processes, culture, and client engagement for tomorrow’s challenges. The psychological aversion to change must be managed actively at all levels — from the boardroom to the front line.

In summary, the context of Solomon’s quote reflects not only a personal credo but also the consensus of generations of theoretical and practical leadership: only those prepared to “change or die” can expect to thrive and endure in an era defined by speed, disruption, and relentless unpredictability.

read more
Quote: Andrej Karpathy – Ex-OpenAI, Ex-Tesla AI

Quote: Andrej Karpathy – Ex-OpenAI, Ex-Tesla AI

“[With AI] we’re not building animals. We’re building ghosts or spirits.” – Andrej Karpathy – Ex-OpenAI, Ex-Tesla AI

Andrej Karpathy, renowned for his leadership roles at OpenAI and Tesla’s Autopilot programme, has been at the centre of advances in deep learning, neural networks, and applied artificial intelligence. His work traverses both academic research and industrial deployment, granting him a panoramic perspective on the state and direction of AI.

When Karpathy refers to building “ghosts or spirits,” he is drawing a conceptual line between biological intelligence—the product of millions of years of evolution—and artificial intelligence as developed through data-driven, digital systems. In his view, animals are “baked in” with instincts, embodiment, and innate learning capacities shaped by evolution, a process unfolding over geological timeframes. By contrast, today’s AI models are “ghosts” in the sense that they are ethereal, fully digital artefacts, trained to imitate human-generated data rather than to evolve or learn through direct interaction with the physical world. They lack bodily instincts and the evolutionary substrate that endows animals with survival strategies and adaptation mechanisms.

Karpathy describes the pre-training process that underpins large language models as a form of “crappy evolution”—a shortcut that builds digital entities by absorbing the statistical patterns of internet-scale data without the iterative adaptation of embodied beings. Consequently, these models are not “born” into the world like animals with built-in survival machinery; instead, they are bootstrapped as “ghosts,” imitating but not experiencing life.

 

The Cognitive Core—Karpathy’s Vision for AI Intelligence

Karpathy’s thinking has advanced towards the critical notion of the “cognitive core”: the kernel of intelligence responsible for reasoning, abstraction, and problem-solving, abstracted away from encyclopaedic factual knowledge. He argues that the true magic of intelligence is not in the passive recall of data, but in the flexible, generalisable ability to manipulate ideas, solve problems, and intuit patterns—capabilities that a system exhibits even when deprived of pre-programmed facts or exhaustive memory.

He warns against confusing memorisation (the stockpiling of internet facts within a model) with general intelligence, which arises from this cognitive core. The most promising path, in his view, is to isolate and refine this core, stripping away the accretions of memorised data, thereby developing something akin to a “ghost” of reasoning and abstraction rather than an “animal” shaped by instinct and inheritance.

This approach entails significant trade-offs: a cognitive core lacks the encyclopaedic reach of today’s massive models, but gains in adaptability, transparency, and the capacity for compositional, creative thought. By foregrounding reasoning machinery, Karpathy posits that AI can begin to mirror not the inflexibility of animals, but the open-ended, reflective qualities that characterise high-level problem-solving.

 

Karpathy’s Journey and Influence

Karpathy’s influence is rooted in a career spent on the frontier of AI research and deployment. His early proximity to Geoffrey Hinton at the University of Toronto placed him at the launch-point of the convolutional neural networks revolution, which fundamentally reshaped computer vision and pattern recognition.

At OpenAI, Karpathy contributed to an early focus on training agents to master digital environments (such as Atari games)—a direction he now regards, in retrospect, as premature. He found greater promise in systems that could interact with the digital world through knowledge work—precursors to today’s agentic models—a vision he is now helping to realise through ongoing work in educational technology and AI deployment.

Later, at Tesla, he directed the transformation of autonomous vehicles from demonstration to product, gaining hard-won appreciation for the “march of nines”—the reality that progressing from system prototypes that work 90% of the time to those that work 99.999% of the time requires exponentially more effort. This experience informs his scepticism towards aggressive timelines for “AGI” and his insistence on the qualitative differences between robust system deployment and controlled demonstrations.
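The arithmetic behind the “march of nines” is simple but unforgiving: each additional nine of reliability cuts the tolerated failure rate by a factor of ten, so a 99.999% system must fail ten thousand times less often than a 90% demonstration. A quick sketch of that progression:

```python
# Each extra "nine" of reliability divides the permitted failure rate by ten.
for nines in range(1, 6):
    failure_rate = 10 ** (-nines)
    reliability = 1 - failure_rate
    print(f"{reliability:.5%} reliable -> {failure_rate * 1_000_000:>9,.0f} failures per million attempts")
```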

 

The Leading Theorists Shaping the Debate

Karpathy’s conceptual framework emerges amid vibrant discourse within the AI community, shaped by several seminal thinkers:

| Theorist | Core Idea | Relation to Karpathy’s Ghosts vs. Animals Analogy |
| --- | --- | --- |
| Richard Sutton | General intelligence emerges through learning algorithms honed by evolution (“bitter lesson”) | Sutton advocates building “animals” via RL and continual learning; Karpathy sees modern AI as ghosts—data-trained, not evolved. |
| Geoffrey Hinton | Neural networks model learning and perception as statistical pattern discovery | Hinton’s legacy underpins the digital cortex, but Karpathy stresses what’s missing: embodied instincts, continual memory. |
| Yann LeCun | Convolutional neural networks and representation learning for perceptual tasks | LeCun’s work forms part of the “cortex”, but Karpathy highlights the missing brain structures and instincts for full generality. |

Sutton’s “bitter lesson” posits that scale and generic algorithms, rather than domain-specific tricks, ultimately win—suggesting a focus on evolving animal-like intelligence. Karpathy, however, notes that current development practices, with their reliance on dataset imitation, sidestep the deep embodiment and evolutionary learning that define animal cognition. Instead, AI today creates digital ghosts—entities whose minds are not grounded in physical reality, but in the manifold of internet text and data.

Hinton and LeCun supply the neural and architectural foundations—the “cortex” and reasoning traces—while both Karpathy and their critics note the absence of rich, consolidated memory (the hippocampus analogue), instincts (amygdala), and the capacity for continual, self-motivated world interaction.

Why “Ghosts,” Not “Animals”?

The distinction is not simply philosophical. It carries direct consequences for:

  • Capabilities: AI “ghosts” excel at pattern reproduction, simulation, and surface reasoning but lack the embodied, instinctual grounding (spatial navigation, sensorimotor learning) of animals.
  • Limitations: They are subject to model collapse, producing uniform, repetitive outputs, lacking the spontaneous creativity and entropy seen in human (particularly child) cognition.
  • Future Directions: The field is now oriented towards distilling this cognitive core, seeking a scalable, adaptable reasoning engine—compact, efficient, and resilient to overfitting—rather than continuing to bloat models with ever more static memory.

This lens sharpens expectations: the way forward is not to mimic biology in its totality, but to pursue the unique strengths and affordances of a digital, disembodied intelligence—a spirit of the datasphere, not a beast evolved in the forest.

 

Broader Significance

Karpathy’s “ghosts” metaphor crystallises a critical moment in the evolution of AI as a discipline. It signals a turning point: the shift from brute-force memorisation of the internet to intelligent, creative algorithms capable of abstraction, reasoning, and adaptation.

This reframing is shaping not only the strategic priorities of the most advanced labs, but also the philosophical and practical questions underpinning the next decade of AI research and deployment. As AI becomes increasingly present in society, understanding its nature—not as an artificial animal, but as a digital ghost—will be essential to harnessing its strengths and mitigating its limitations.

Quote: Sholto Douglas – Anthropic

“People have said we’re hitting a plateau every month for three years… I look at how models are produced and every part could be improved. The training pipeline is primitive, held together by duct tape, best efforts, and late nights. There’s so much room to grow everywhere.” – Sholto Douglas – Anthropic

Sholto Douglas made the statement during a major public podcast interview in October 2025, coinciding with Anthropic’s release of Claude Sonnet 4.5—at the time, the world’s strongest and most “agentic” AI coding model. The comment specifically rebuts repeated industry and media assertions that large AI models have reached a ceiling or are slowing in progress. Douglas argues the opposite: that the field is in a phase of accelerating advancement, driven by transformative hardware investment (a “compute super-cycle”), by new algorithmic techniques (particularly reinforcement learning and test-time compute), and by the persistent “primitive” state of today’s AI engineering infrastructure.

He draws an analogy with early-stage, improvisational systems: the models are held together “by duct tape, best efforts, and late nights,” making clear that immense headroom for improvement remains at every level, from training data pipelines and distributed infrastructure to model architecture and reward design. As a result, every new benchmark and capability reveals further unrealised opportunity, with measurable progress charted month after month.

Douglas’s deeper implication is that claims of a plateau often arise from surface-level analysis or the “saturation” of public benchmarks, not from a rigorous understanding of what is technically possible or how much scale remains untapped across the technical stack.

Sholto Douglas: Career Trajectory and Perspective

Sholto Douglas is a leading member of Anthropic’s technical staff, focused on scaling reinforcement learning and agentic AI. His unconventional journey illustrates both the new talent paradigm and the nature of breakthrough AI research today:

  • Early Life and Mentorship: Douglas grew up in Australia, where he benefited from unusually strong academic and athletic mentorship. His mother, an accomplished physician frustrated by systemic barriers, instilled discipline and a systematic approach; his Olympic-level fencing coach gave him first-hand experience of how repeated, directed effort leads to world-class performance.
  • Academic Formation: He studied computer science and robotics as an undergraduate, with a focus on practical experimentation and a global mindset. A turning point was reading the “scaling hypothesis” for AGI, convincing him that progress on artificial general intelligence was feasible within a decade—and worth devoting his career to.
  • Independent Innovation: As a student, Douglas built “bedroom-scale” foundation models for robotics, working independently on large-scale data collection, simulation, and early adoption of transformer-based methods. This entrepreneurial approach—demonstrating initiative and technical depth without formal institutional backing—proved decisive.
  • Google (Gemini and DeepMind): His independent work brought him to Google, where he joined just before the release of ChatGPT, in time to witness and help drive the rapid unification and acceleration of Google’s AI efforts (Gemini, Brain, DeepMind). He co-designed new inference infrastructure that reduced costs and worked at the intersection of large-scale learning, reinforcement learning, and applied reasoning.
  • Anthropic (from 2025): Drawn by Anthropic’s focus on measurable, near-term economic impact and deep alignment work, Douglas joined to lead and scale reinforcement learning research—helping push the capability frontier for agentic models. He values a culture where every contributor understands and can articulate how their work advances both capability and safety in AI.

Douglas is distinctive for his advocacy of “taste” in AI research, favouring mechanistic understanding and simplicity over clever domain-specific tricks—a direct homage to Richard Sutton’s “bitter lesson.” This perspective shapes his belief that the greatest advances will come not from hiding complexity with hand-crafted heuristics, but from scaling general algorithms and rigorous feedback loops.

 

Intellectual and Scientific Context: The ‘Plateau’ Debate and Leading Theorists

The debate around the so-called “AI plateau” is best understood against the backdrop of core advances and recurring philosophical arguments in machine learning.

The “Bitter Lesson” and Richard Sutton

  • Richard Sutton (University of Alberta, DeepMind), one of the founding figures in reinforcement learning, crystallised the field’s “bitter lesson”: that general, scalable methods powered by increased compute will eventually outperform more elegant, hand-crafted, domain-specific approaches.
  • In practical terms, this means that the field’s recent leaps—from vision to language to coding—are powered less by clever new inductive biases, and more by architectural simplicity plus massive compute and data. Sutton has also maintained that real progress in AI will come from reinforcement learning with minimal task-specific assumptions and maximal data, computation, and feedback.

Yann LeCun and Alternative Paradigms

  • Yann LeCun (Meta, NYU), a pioneer of deep learning, has maintained that the transformer paradigm is limited and that fundamentally novel architectures are necessary for human-like reasoning and autonomy. He argues that unsupervised/self-supervised learning and new world-modelling approaches will be required.
  • LeCun’s disagreement with Sutton’s “bitter lesson” centres on the claim that scaling is not the final answer: new representation learning, memory, and planning mechanisms will be needed to reach AGI.

Shane Legg, Demis Hassabis, and DeepMind

  • DeepMind’s approach has historically been “science-first,” tackling a broad swathe of human intelligence challenges (AlphaGo, AlphaFold, science AI), promoting a research culture that takes long-horizon bets on new architectures (memory-augmented neural networks, world models, differentiable reasoning).
  • Demis Hassabis and Shane Legg (DeepMind co-founders) have advocated for testing a diversity of approaches, believing that the path to AGI is not yet clear—though they too acknowledge the value of massive scale and reinforcement learning.

The Scaling Hypothesis: Gwern’s Essay and the Modern Era

  • The so-called “scaling hypothesis”—the idea that simply making models larger and providing more compute and data will continue yielding improvements—has become the default “bet” for Anthropic, OpenAI, and others. Douglas refers directly to this intellectual lineage as the critical “hinge” moment that set his trajectory.
  • This hypothesis is now being extended into new areas, including agentic systems where long context, verification, memory, and reinforcement learning allow models to reliably pursue complex, multi-step goals semi-autonomously.
 

Summing Up: The Current Frontier

Today, researchers like Douglas are moving beyond the original transformer pre-training paradigm, leveraging multi-axis scaling (pre-training, RL, test-time compute), richer reward systems, and continuous experimentation to drive model capabilities in coding, digital productivity, and emerging physical domains (robotics and manipulation).

Douglas’s quote epitomises the view that not only has performance not plateaued—every “limitation” encountered is a signpost for further exponential improvement. The modest, “patchwork” nature of current AI infrastructure is a competitive advantage: it means there is vast room for optimisation, iteration, and compounding gains in capability.

As the field races into a new era of agentic AI and economic impact, his perspective serves as a grounded, inside-out refutation of technological pessimism and a call to action grounded in both technical understanding and relentless ambition.

Quote: Julian Schrittwieser – Anthropic

“The talk about AI bubbles seemed very divorced from what was happening in frontier labs and what we were seeing. We are not seeing any slowdown of progress.” – Julian Schrittwieser – Anthropic

Those closest to technical breakthroughs are witnessing a pattern of sustained, compounding advancement that is often underestimated by commentators and investors. This perspective underscores both the power and limitations of conventional intuitions regarding exponential technological progress.

 

Context of the Quote

Schrittwieser delivered these remarks in a 2025 interview on the MAD Podcast, prompted by widespread discourse on the so-called ‘AI bubble’. His key contention is that debate around an AI investment or hype “bubble” feels disconnected from the lived reality inside the world’s top research labs, where the practical pace of innovation remains brisk and outwardly undiminished. He outlines that, according to direct observation and internal benchmarks at labs such as Anthropic, progress remains on a highly consistent exponential curve: “every three to four months, the model is able to do a task that is twice as long as before completely on its own”.

He draws an analogy to the early days of COVID-19, where exponential growth was invisible until it became overwhelming; the same mathematical processes, Schrittwieser contends, apply to AI system capabilities. While public narratives about bubbles often reference the dot-com era, he highlights a bifurcation: frontier labs sustain robust, revenue-generating trajectories, while the wider AI ecosystem may experience bubble-like effects in valuations. At the core, however, the technology itself continues to improve at a predictably exponential rate, well supported by both qualitative experience and benchmark data.

Schrittwieser’s view, rooted in immediate, operational knowledge, is that the default expectation of a linear future is mistaken: advances in autonomy, reasoning, and productivity are compounding. This means genuinely transformative impacts—such as AI agents that function at expert level or beyond for extended, unsupervised tasks—are poised to arrive sooner than many anticipate.
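A minimal sketch of the compounding Schrittwieser describes, assuming the quoted doubling time of roughly three and a half months and a purely hypothetical one-hour starting task horizon; the specific figures are illustrative, not benchmark data.

```python
# Hypothetical projection: task horizon doubling every ~3.5 months.
# The one-hour starting point and the checkpoints below are assumptions
# chosen only to show how quickly a fixed doubling time compounds.

def projected_horizon_hours(start_hours: float, months: float,
                            doubling_months: float = 3.5) -> float:
    """Task length implied after `months`, given a constant doubling time."""
    return start_hours * 2 ** (months / doubling_months)

if __name__ == "__main__":
    for months in (0, 6, 12, 24, 36):
        hours = projected_horizon_hours(start_hours=1.0, months=months)
        print(f"after {months:>2} months: ~{hours:,.1f} hours of autonomous work")
```

Under these assumptions the horizon grows by roughly an order of magnitude per year, which is why linear intuitions about month-to-month progress can mislead.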

 

Profile: Julian Schrittwieser

Julian Schrittwieser is one of the world’s leading artificial intelligence researchers, currently based at Anthropic, following a decade as a core scientist at Google DeepMind. Raised in rural Austria, Schrittwieser’s journey from an adolescent fascination with game programming to the vanguard of AI research exemplifies the discipline’s blend of curiosity, mathematical rigour, and engineering prowess. He studied computer science at the Vienna University of Technology, before interning at Google.

Schrittwieser was a central contributor to several historic machine learning milestones, most notably:

 
  • AlphaGo, the first program to defeat a world champion at Go, combining deep neural networks with Monte Carlo Tree Search.
  • AlphaGo Zero and AlphaZero, which generalised the approach to achieve superhuman performance without human examples, through self-play—demonstrating true generality in reinforcement learning.
  • MuZero (as lead author), solving the challenge of mastering environments without even knowing the rules in advance, by enabling the system to learn its own internal, predictive world models—an innovation bringing RL closer to complex, real-world domains.
  • Later work includes AlphaCode (code synthesis), AlphaTensor (algorithmic discovery), and applied advances in Gemini and AlphaProof.

At Anthropic, Schrittwieser is at the frontier of research into scaling laws, reinforcement learning, autonomous agents, and novel techniques for alignment and safety in next-generation AI. True to his pragmatic ethos, he prioritises what directly raises capability and reliability, and advocates for careful, data-led extrapolation rather than speculation.

 

Theoretical Backstory: Exponential AI Progress and Key Thinkers

Schrittwieser’s remarks situate him within a tradition of AI theorists and builders focused on scaling laws, reinforcement learning (RL), and emergent capabilities:

Leading Theorists and Historical Perspective

  • Demis Hassabis. Notable contributions: co-founder of DeepMind and architect of the AlphaGo programme; emphasised general intelligence and the power of RL plus planning. Relevance to the quote: Schrittwieser’s mentor and DeepMind leader, who pioneered RL paradigms beyond games.
  • David Silver. Notable contributions: developed many of the breakthroughs underlying AlphaGo, AlphaZero, and MuZero; advanced RL and model-based search methods. Relevance to the quote: collaborator with Schrittwieser; together they demonstrated the practical scaling of RL.
  • Richard Sutton. Notable contributions: articulated reinforcement learning’s centrality in “The Bitter Lesson” (general methods and scalable computation, not handcrafting); advanced temporal-difference methods and RL theory. Relevance to the quote: mentioned by Schrittwieser as a thought leader shaping the RL paradigm at scale.
  • Alex Ray, Jared Kaplan, Sam McCandlish, and the OpenAI scaling team. Notable contributions: quantified AI’s “scaling laws”, the empirical tendency for model performance to improve smoothly with compute, data, and parameter scaling. Relevance to the quote: Schrittwieser echoes this data-driven, incrementalist philosophy.
  • Ilya Sutskever. Notable contributions: co-founder of OpenAI; central to deep learning breakthroughs, scaling, and forecasting emergent capabilities. Relevance to the quote: OpenAI’s work on benchmarks (GDPval) and scaling echoes these insights.

These thinkers converge on several key observations directly reflected in Schrittwieser’s view:

  • Exponential Capability Curves: Consistent advances in performance often surprise those outside the labs due to our poor intuitive grasp of exponentiality—what Schrittwieser terms a repeated “failure to understand the exponential”.
  • Scaling Laws and Reinforcement Learning: Improvements are not just about larger models, but ever-better training, more reliable reinforcement learning, agentic architecture, and robust reward systems—developments Schrittwieser’s work epitomises.
  • Novelty and Emergence: Historically, theorists doubted whether neural models could go beyond sophisticated mimicry; the “Move 37” moment (AlphaGo’s unprecedented move in Go) was a touchstone for true machine creativity, a theme Schrittwieser stresses remains highly relevant today.
  • Bubbles, Productivity, and Market Cycles: Mainstream financial and social narratives may oscillate dramatically, but real capability growth—observable in benchmarks and direct use—has historically marched on undeterred by speculative excesses.
 

Synthesis: Why the Perspective Matters

The quote foregrounds a gap between external perceptions and insider realities. Pioneers like Schrittwieser and his cohort stress that transformative change will not follow a smooth, linear or hype-driven curve, but an exponential, data-backed progression—one that may defy conventional intuition, but is already reshaping productivity and the structure of work.

This moment is not about “irrational exuberance”, but rather the compounding product of theoretical insight, algorithmic audacity, and relentless engineering: the engine behind the next wave of economic and social transformation.

Quote: Andrej Karpathy – Ex-OpenAI, Ex-Tesla AI

“AI is so wonderful because there have been a number of seismic shifts where the entire field has suddenly looked a different way. I’ve maybe lived through two or three of those. I still think there will continue to be some because they come with almost surprising regularity.” – Andrej Karpathy – Ex-OpenAI, Ex-Tesla AI

Andrej Karpathy, one of the most recognisable figures in artificial intelligence, has spent his career at the epicentre of the field’s defining moments in both research and large-scale industry deployment.

Karpathy’s background is defined by deep technical expertise and a front-row seat to AI’s rapid evolution. He studied at the University of Toronto, in close proximity to Geoffrey Hinton during the early surge of deep learning, before completing his PhD at Stanford and moving into pivotal research positions. His career encompasses key roles at Tesla, where he led the Autopilot vision team, and at OpenAI, where he contributed to some of the world’s most prominent large language models and generative AI systems. This vantage point has allowed him to participate in, and reflect upon, the discipline’s “seismic shifts”.

Karpathy’s narrative has been shaped by three inflection points:

  • The emergence of deep neural networks from a niche field to mainstream AI, spearheaded by the success of AlexNet and the subsequent shift of the research community toward neural architectures.
  • The drive towards agent-based systems, with early enthusiasm for reinforcement learning (RL) and game-based environments (such as Atari and Go). Karpathy himself was cautious about the utility of games as the true path to intelligence, focusing instead on agents acting within the real digital world.
  • The rise of large language models (LLMs)—transformers trained on vast internet datasets that shifted the locus of AI from task-specific systems to general-purpose models able to perform a broad suite of tasks and to learn in context.

His reflection on these ‘regular’ paradigm shifts arises from lived experience: “I’ve maybe lived through two or three of those. I still think there will continue to be some because they come with almost surprising regularity.” These moments recalibrate assumptions, redirect research priorities, and set new benchmarks for capability. Karpathy’s practical orientation—building “useful things” rather than targeting biological intelligence or pure AGI—shapes his approach to both innovation and scepticism about hype.

Context of the Quote

In his conversation with podcaster Dwarkesh Patel, Karpathy elaborates on the recurring nature of breakthroughs. He contrasts AI’s rapid, transformative leaps with other scientific fields, noting that in machine learning, scaling up data, compute, and novel architectures can yield abrupt improvements—yet each wave often triggers both excessive optimism and later recalibration. A major point he raises is the lack of linearity: the field does not “smoothly” approach AGI, but rather proceeds via discontinuities, often catalysed by new ideas or techniques that were previously out of favour or overlooked.

Karpathy relates how, early in his career, neural networks were a marginal interest and large-scale “representation learning” was only beginning to be considered viable by a minority in the community. With the advent of AlexNet, the landscape shifted overnight, rapidly making previous assumptions obsolete. Later, the pursuit of RL-driven agents led to a phase where entire research agendas were oriented toward gameplay and synthetic environments—another phase later superseded by the transformer revolution and language models. Karpathy reflects candidly on earlier missteps, as well as the discipline’s collective tendency to over- or under-predict the timetable and trajectory of progress.

Leading Theorists and Intellectual Heritage

The AI revolutions Karpathy describes are inseparable from the influential figures and ideas that have shaped each phase:

  • Geoffrey Hinton: Hailed as the “godfather of AI”, Hinton was instrumental in deep learning’s breakthrough, advancing techniques for training multilayered neural networks and championing representation learning against prevailing orthodoxy.
  • Yann LeCun: Developed convolutional neural networks (CNNs), foundational for computer vision and the 2010s wave of deep learning success.
  • Yoshua Bengio: Co-architect of the deep learning movement and a key figure in developing unsupervised and generative models.
  • Richard Sutton: Principal proponent of reinforcement learning, Sutton articulated the value of “animal-like” intelligence: learning from direct interaction with environments, reward, and adaptation. Sutton’s perspective frequently informs debates about the relationship between model architectures and living intelligence, encouraging a focus on agents and lifelong learning.

Karpathy’s own stance is partly a pragmatic response to this heritage: rather than pursuing analogues of biological brains, he views the productive path as building digital “ghosts”—entities that learn by imitation and are shaped by patterns in data, rather than evolutionary processes.

Beyond individual theorists, the field’s quantum leaps are rooted in a culture of intellectual rivalry and rapid cross-pollination:

  • The convolutional and recurrent networks of the 2010s pushed the boundaries of what neural networks could do.
  • The development and scaling of transformer-based architectures (as in Google’s “Attention is All You Need”) dramatically changed both natural language processing and the structure of the field itself.
  • The introduction of algorithms for in-context learning and large-scale unsupervised pre-training marked a break with hand-crafted representation engineering.

The Architecture of Progress: Seismic Shifts and Pragmatic Tension

Karpathy’s insight is that these shifts are not just about faster hardware or bigger datasets; they reflect the field’s unique ecology—where new methods can rapidly become dominant and overturn accumulated orthodoxy. The combination of open scientific exchange, rapid deployment, and intense commercialisation creates fertile ground for frequent realignment.

His observation on the “regularity” of shifts also signals a strategic realism: each wave brings both opportunity and risk. New architectures (such as transformers or large reinforcement learning agents) frequently overshoot expectations before their real limitations become clear. Karpathy remains measured on both promise and limitation—anticipating continued progress, but cautioning against overpredictions and hype cycles that fail to reckon with the “march of nines” needed to reach true reliability and impact.

Closing Perspective

The context of Karpathy’s quote is an AI ecosystem that advances not through steady accretion, but in leaps—each driven by conceptual, technical, and organisational realignments. As such, understanding progress in AI demands both technical literacy and historical awareness: the sharp pivots that have marked past decades are likely to recur, with equally profound effects on how intelligence is conceived, built, and deployed.

Quote: Jonathan Ross – CEO Groq

“The countries that control compute will control AI. You cannot have compute without energy.” – Jonathan Ross – CEO Groq

Jonathan Ross stands at the intersection of geopolitics, energy economics, and technological determinism. As founder and CEO of Groq, the Silicon Valley firm challenging Nvidia’s dominance in AI infrastructure, Ross articulated a proposition of stark clarity during his September 2025 appearance on Harry Stebbings’ 20VC podcast: “The countries that control compute will control AI. You cannot have compute without energy.”

This observation transcends technical architecture. Ross is describing the emergence of a new geopolitical currency—one where computational capacity, rather than traditional measures of industrial might, determines economic sovereignty and strategic advantage in the 21st century. His thesis rests on an uncomfortable reality: artificial intelligence, regardless of algorithmic sophistication or model architecture, cannot function without the physical substrate of compute. And compute, in turn, cannot exist without abundant, reliable energy.

The Architecture of Advantage

Ross’s perspective derives from direct experience building the infrastructure that powers modern AI. At Google, he initiated what became the Tensor Processing Unit (TPU) project—custom silicon that allowed the company to train and deploy machine learning models at scale. This wasn’t academic research; it was the foundation upon which Google’s AI capabilities were built. When Amazon and Microsoft attempted to recruit him in 2016 to develop similar capabilities, Ross recognised a pattern: the concentration of advanced AI compute in too few hands represented a strategic vulnerability.

His response was to establish Groq in 2016, developing Language Processing Units optimised for inference—the phase where trained models actually perform useful work. The company has since raised over $3 billion and achieved a valuation approaching $7 billion, positioning itself as one of Nvidia’s most credible challengers in the AI hardware market. But Ross’s ambitions extend beyond corporate competition. He views Groq’s mission as democratising access to compute—creating abundant supply where artificial scarcity might otherwise concentrate power.

The quote itself emerged during a discussion about global AI competitiveness. Ross had been explaining why European nations, despite possessing strong research talent and model development capabilities (Mistral being a prominent example), risk strategic irrelevance without corresponding investment in computational infrastructure and energy capacity. A brilliant model without compute to run it, he argued, will lose to a mediocre model backed by ten times the computational resources. This isn’t theoretical—it’s the lived reality of the current AI landscape, where rate limits and inference capacity constraints determine what services can scale and which markets can be served.

The Energy Calculus

The energy dimension of Ross’s statement carries particular weight. Modern AI training and inference require extraordinary amounts of electrical power. The hyperscalers—Google, Microsoft, Amazon, Meta—are each committing tens of billions of dollars annually to AI infrastructure, with significant portions dedicated to data centre construction and energy provision. Microsoft recently announced it wouldn’t make certain GPU clusters available through Azure because the company generated higher returns using that compute internally rather than renting it to customers. This decision, more than any strategic presentation, reveals the economic value density of AI compute.

Ross draws explicit parallels to the early petroleum industry: a period of chaotic exploration where a few “gushers” delivered extraordinary returns whilst most ventures yielded nothing. In this analogy, compute is the new oil—a fundamental input that determines economic output and strategic positioning. But unlike oil, compute demand doesn’t saturate. Ross describes AI demand as “insatiable”: if OpenAI or Anthropic received twice their current inference capacity, their revenue would nearly double within a month. The bottleneck isn’t customer appetite; it’s supply.

This creates a concerning dynamic for nations without indigenous energy abundance or the political will to develop it. Ross specifically highlighted Europe’s predicament: impressive AI research capabilities undermined by insufficient energy infrastructure and regulatory hesitance around nuclear power. He contrasted this with Norway’s renewable capacity (80% wind utilisation) or Japan’s pragmatic reactivation of nuclear facilities—examples of countries aligning energy policy with computational ambition. The message is uncomfortable but clear: technical sophistication in model development cannot compensate for material disadvantage in energy and compute capacity.

Strategic Implications

The geopolitical dimension becomes more acute when considering China’s position. Ross noted that whilst Chinese models like DeepSeek may be cheaper to train (through various optimisations and potential subsidies), they remain more expensive to run at inference—approximately ten times more costly per token generated. This matters because inference, not training, determines scalability and market viability. China can subsidise AI deployment domestically, but globally—what Ross terms the “away game”—cost structure determines competitiveness. Countries cannot simply construct nuclear plants at will; energy infrastructure takes decades to build.

This asymmetry creates opportunity for nations with existing energy advantages. The United States, despite higher nominal costs, benefits from established infrastructure and diverse energy sources. However, Ross’s framework suggests this advantage is neither permanent nor guaranteed. Control over compute requires continuous investment in both silicon capability and energy generation. Nations that fail to maintain pace risk dependency—importing not just technology, but the capacity for economic and strategic autonomy.

The corporate analogy proves instructive. Ross predicts that every major AI company—OpenAI, Anthropic, Google, and others—will eventually develop proprietary chips, not necessarily to outperform Nvidia technically, but to ensure supply security and strategic control. Nvidia currently dominates not purely through superior GPU architecture, but through control of high-bandwidth memory (HBM) supply chains. Building custom silicon allows organisations to diversify supply and avoid allocation constraints that might limit their operational capacity. What applies to corporations applies equally to nations: vertical integration in compute infrastructure is increasingly a prerequisite for strategic autonomy.

The Theorists and Precedents

Ross’s thesis echoes several established frameworks in economic and technological thought, though he synthesises them into a distinctly contemporary proposition.

Harold Innis, the Canadian economic historian, developed the concept of “staples theory” in the 1930s and 1940s—the idea that economies organised around the extraction and export of key commodities (fur, fish, timber, oil) develop institutional structures, trade relationships, and power dynamics shaped by those materials. Innis later extended this thinking to communication technologies in works like Empire and Communications (1950) and The Bias of Communication (1951), arguing that the dominant medium of a society shapes its political and social organisation. Ross’s formulation applies Innisian logic to computational infrastructure: the nations that control the “staples” of the AI economy—energy and compute—will shape the institutional and economic order that emerges.

Carlota Perez, the Venezuelan-British economist, provided a framework for understanding technological revolutions in Technological Revolutions and Financial Capital (2002). Perez identified how major technological shifts (steam power, railways, electricity, mass production, information technology) follow predictable patterns: installation phases characterised by financial speculation and infrastructure building, followed by deployment phases where the technology becomes economically productive. Ross’s observation about current AI investment—massive capital expenditure by hyperscalers, uncertain returns, experimental deployment—maps cleanly onto Perez’s installation phase. The question, implicit in his quote, is which nations will control the infrastructure when the deployment phase arrives and returns become tangible.

W. Brian Arthur, economist and complexity theorist, articulated the concept of “increasing returns” in technology markets through works like Increasing Returns and Path Dependence in the Economy (1994). Arthur demonstrated how early advantages in technology sectors compound through network effects, learning curves, and complementary ecosystems—creating winner-take-most dynamics rather than the diminishing returns assumed in classical economics. Ross’s emphasis on compute abundance follows this logic: early investment in computational infrastructure creates compounding advantages in AI capability, which drives economic returns, which fund further compute investment. Nations entering this cycle late face escalating barriers to entry.

Joseph Schumpeter, the Austrian-American economist, introduced the concept of “creative destruction” in Capitalism, Socialism and Democracy (1942)—the idea that economic development proceeds through radical innovation that renders existing capital obsolete. Ross explicitly invokes Schumpeterian dynamics when discussing the risk that next-generation AI chips might render current hardware unprofitable before it amortises. This uncertainty amplifies the strategic calculus: nations must invest in compute infrastructure knowing that technological obsolescence might arrive before economic returns materialise. Yet failing to invest guarantees strategic irrelevance.

William Stanley Jevons, the 19th-century English economist, observed what became known as Jevons Paradox in The Coal Question (1865): as technology makes resource use more efficient, total consumption typically increases rather than decreases because efficiency makes the resource more economically viable for new applications. Ross applies this directly to AI compute, noting that as inference becomes cheaper (through better chips or more efficient models), demand expands faster than costs decline. This means the total addressable market for compute grows continuously—making control over production capacity increasingly valuable.
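A toy illustration of the Jevons-style dynamic Ross invokes, using entirely hypothetical figures: unit inference cost falls, usage expands by more than the cost falls, and total spending on compute rises rather than shrinks.

```python
# All numbers are made up for illustration; the point is the shape of the
# effect, not any actual pricing or usage data.

def total_spend(cost_per_million_tokens: float, tokens_millions: float) -> float:
    return cost_per_million_tokens * tokens_millions

before = total_spend(cost_per_million_tokens=10.0, tokens_millions=100)    # $1,000
# Suppose efficiency gains cut unit cost 5x, while cheaper inference unlocks
# new applications and usage grows 20x (both figures are assumptions).
after = total_spend(cost_per_million_tokens=2.0, tokens_millions=2_000)    # $4,000

print(f"total spend before: ${before:,.0f}; after: ${after:,.0f}")
```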

Nicholas Georgescu-Roegen, the Romanian-American economist, pioneered bioeconomics and introduced entropy concepts to economic analysis in The Entropy Law and the Economic Process (1971). Georgescu-Roegen argued that economic activity is fundamentally constrained by thermodynamic laws—specifically, that all economic processes dissipate energy and cannot be sustained without continuous energy inputs. Ross’s insistence that “you cannot have compute without energy” is pure Georgescu-Roegen: AI systems, regardless of algorithmic elegance, are bound by physical laws. Compute is thermodynamically expensive—training large models requires megawatts, inference at scale requires sustained power generation. Nations without access to abundant energy cannot sustain AI economies, regardless of their talent or capital.

Mancur Olson, the American economist and political scientist, explored collective action problems and the relationship between institutional quality and economic outcomes in works like The Rise and Decline of Nations (1982). Olson demonstrated how established interest groups can create institutional sclerosis that prevents necessary adaptation. Ross’s observations about European regulatory hesitance and infrastructure underinvestment reflect Olsonian dynamics: incumbent energy interests, environmental lobbies, and risk-averse political structures prevent the aggressive nuclear or renewable expansion required for AI competitiveness. Meanwhile, nations with different institutional arrangements (or greater perceived strategic urgency) act more decisively.

Paul Romer, the American economist and Nobel laureate, developed endogenous growth theory, arguing in works like “Endogenous Technological Change” (1990) that economic growth derives from deliberate investment in knowledge and technology rather than external factors. Romer’s framework emphasises the non-rivalry of ideas (knowledge can be used by multiple actors simultaneously) but the rivalry of physical inputs required to implement them. Ross’s thesis fits perfectly: AI algorithms can be copied and disseminated, but the computational infrastructure to deploy them at scale cannot. This creates a fundamental asymmetry that determines economic power.

The Historical Pattern

History provides sobering precedents for resource-driven geopolitical competition. Britain’s dominance in the 19th century rested substantially on coal abundance that powered industrial machinery and naval supremacy. The United States’ 20th-century ascendance correlated with petroleum access and the industrial capacity to refine and deploy it. Oil-dependent economies in the Middle East gained geopolitical leverage disproportionate to their population or industrial capacity purely through energy reserves.

Ross suggests we are witnessing the emergence of a similar dynamic, but with a critical difference: AI compute is both resource-intensive (requiring enormous energy) and productivity-amplifying (making other economic activity more efficient). This creates a multiplicative effect where compute advantages compound through both direct application (better AI services) and indirect effects (more efficient production of goods and services across the economy). A nation with abundant compute doesn’t just have better chatbots—it has more efficient logistics, agricultural systems, manufacturing processes, and financial services.

The “away game” concept Ross introduced during the podcast discussion adds a critical dimension. China, despite substantial domestic AI investment and capabilities, faces structural disadvantages in global competition because international customers cannot simply replicate China’s energy subsidies or infrastructure. This creates opportunities for nations with more favourable cost structures or energy profiles, but only if they invest in both compute capacity and energy generation.

The Future Ross Envisions

Throughout the podcast, Ross painted a vision of AI-driven abundance that challenges conventional fears of technological unemployment. He predicts labour shortages, not mass unemployment, driven by three mechanisms: deflationary pressure (AI makes goods and services cheaper), workforce opt-out (people work less as living costs decline), and new industry creation (entirely new job categories emerge, like “vibe coding”—programming through natural language rather than formal syntax).

This optimistic scenario depends entirely on computational abundance. If compute remains scarce and concentrated, AI benefits accrue primarily to those controlling the infrastructure. Ross’s mission with Groq—creating faster deployment cycles (six months versus two years for GPUs), operating globally distributed data centres, optimising for cost efficiency rather than margin maximisation—aims to prevent that concentration. But the same logic applies at the national level. Countries without indigenous compute capacity will import AI services, capturing some productivity benefits but remaining dependent on external providers for the infrastructure that increasingly underpins economic activity.

The comparison Ross offers—LLMs as “telescopes of the mind”—is deliberately chosen. Galileo’s telescope revolutionised human understanding but required specific material capabilities to construct and use. Nations without optical manufacturing capacity could not participate in astronomical discovery. Similarly, nations without computational and energy infrastructure cannot participate fully in the AI economy, regardless of their algorithmic sophistication or research talent.

Conclusion

Ross’s statement—”The countries that control compute will control AI. You cannot have compute without energy”—distils a complex geopolitical and economic reality into stark clarity. It combines Innisian materialism (infrastructure determines power), Schumpeterian dynamism (innovation renders existing capital obsolete), Jevonsian counterintuition (efficiency increases total consumption), and Georgescu-Roegen’s thermodynamic constraints (economic activity requires energy dissipation).

The implications are uncomfortable for nations unprepared to make the necessary investments. Technical prowess in model development provides no strategic moat if the computational infrastructure to deploy those models remains controlled elsewhere. Energy abundance, or the political will to develop it, becomes a prerequisite for AI sovereignty. And AI sovereignty increasingly determines economic competitiveness across sectors.

Ross occupies a unique vantage point—neither pure academic nor disinterested observer, but an operator building the infrastructure that will determine whether his prediction proves correct. Groq’s valuation and customer demand suggest the market validates his thesis. Whether nations respond with corresponding urgency remains an open question. But the framework Ross articulates will likely define strategic competition for the remainder of the decade: compute as currency, energy as prerequisite, and algorithmic sophistication as necessary but insufficient for competitive advantage.

Quote: J.W. Stephens – Author

“Be the person your dog thinks you are!” – J.W. Stephens – Author

The quote “Be the person your dog thinks you are!” represents a profound philosophical challenge wrapped in disarming simplicity. It invites us to examine the gap between our idealised selves and our everyday reality through the lens of unconditional canine devotion. This seemingly light-hearted exhortation carries surprising depth when examined within the broader context of authenticity, aspiration and the moral psychology of personal development.

The Author and the Quote’s Origins

J.W. Stephens, a seventh-generation native Texan, has spent considerable time travelling and living across various locations in Texas and internationally. Whilst publicly available biographical detail about this particular author is limited, the quote itself reveals a distinctively American sensibility—one that combines practical wisdom with accessible moral instruction. The invocation of dogs as moral exemplars reflects a cultural tradition deeply embedded in American life, where the human-canine bond serves as both comfort and conscience.

The brilliance of Stephens’ formulation lies in its rhetorical structure. By positioning the dog’s perception as the aspirational standard, the quote accomplishes several objectives simultaneously: it acknowledges our frequent moral shortcomings, suggests that we already possess knowledge of higher standards, and implies that achieving those standards is within reach. The dog becomes both witness and ideal reader—uncritical yet somehow capable of perceiving our better nature.

The quote functions as what philosophers might term a “regulative ideal”—not a description of what we are, but a vision of what we might become. Dogs, in their apparent inability to recognise human duplicity or moral inconsistency, treat their owners as wholly trustworthy, infinitely capable, and fundamentally good. This perception, whether accurate or illusory, creates a moral challenge: can we rise to meet it?

Philosophical Foundations: Authenticity and the Divided Self

The intellectual lineage underpinning this seemingly simple maxim extends deep into Western philosophical tradition, touching upon questions of authenticity, self-knowledge, and moral psychology that have preoccupied thinkers for millennia.

Søren Kierkegaard (1813-1855) stands as perhaps the most important theorist of authenticity in Western philosophy. The Danish philosopher argued that modern life creates a condition he termed “despair”—not necessarily experienced as anguish, but as a fundamental disconnection from one’s true self. Kierkegaard distinguished between the aesthetic, ethical, and religious stages of existence, arguing that most people remain trapped in the aesthetic stage, living according to immediate gratification and social conformity rather than choosing themselves authentically. His concept of “becoming who you are” anticipates Stephens’ formulation, though Kierkegaard’s vision is considerably darker and more demanding. For Kierkegaard, authentic selfhood requires a “leap of faith” and acceptance of radical responsibility for one’s choices. The dog’s unwavering faith in its owner might serve, in Kierkegaardian terms, as a model of the absolute commitment required for authentic existence.

Jean-Paul Sartre (1905-1980) developed Kierkegaard’s insights in a secular, existentialist direction. Sartre’s notion of “bad faith” (mauvaise foi) describes the human tendency to deceive ourselves about our freedom and responsibility. We pretend we are determined by circumstances, social roles, or past choices when we remain fundamentally free. Sartre argued that consciousness is “condemned to be free”—we cannot escape the burden of defining ourselves through our choices. The gap between who we are and who we claim to be constitutes a form of self-deception Sartre found both universal and contemptible. Stephens’ quote addresses precisely this gap: the dog sees us as we might be, whilst we often live as something less. Sartre would likely appreciate the quote’s implicit demand that we accept responsibility for closing that distance.

Martin Heidegger (1889-1976) approached similar territory through his concept of “authenticity” (Eigentlichkeit) versus “inauthenticity” (Uneigentlichkeit). For Heidegger, most human existence is characterised by “fallenness”—an absorption in the everyday world of “das Man” (the “They” or anonymous public). We live according to what “one does” rather than choosing our own path. Authentic existence requires confronting our own mortality and finitude, accepting that we are “beings-toward-death” who must take ownership of our existence. The dog’s perspective, unburdened by social conformity and living entirely in the present, might represent what Heidegger termed “dwelling”—a mode of being that is at home in the world without falling into inauthenticity.

The Psychology of Self-Perception and Moral Development

Moving from continental philosophy to empirical psychology, several theorists have explored the mechanisms by which we maintain multiple versions of ourselves and how we might reconcile them.

Carl Rogers (1902-1987), the founder of person-centred therapy, developed a comprehensive theory of the self that illuminates Stephens’ insight. Rogers distinguished between the “real self” (who we actually are) and the “ideal self” (who we think we should be). Psychological health, for Rogers, requires “congruence”—alignment between these different self-concepts. When the gap between real and ideal becomes too wide, we experience anxiety and employ defence mechanisms to protect our self-image. Rogers believed that unconditional positive regard—accepting someone fully without judgment—was essential for psychological growth. The dog’s perception of its owner represents precisely this unconditional acceptance: positive regard offered free of what Rogers termed “conditions of worth”. Paradoxically, this complete acceptance might free us to change precisely because we feel safe enough to acknowledge our shortcomings.

Albert Bandura (1925-2021) developed social learning theory and the concept of self-efficacy, which bears directly on Stephens’ formulation. Bandura argued that our beliefs about our capabilities significantly influence what we attempt and accomplish. When we believe others see us as capable (as dogs manifestly do), we are more likely to attempt difficult tasks and persist through obstacles. The dog’s unwavering confidence in its owner might serve as what Bandura termed “vicarious experience”—seeing ourselves succeed through another’s eyes increases our own self-efficacy beliefs. Moreover, Bandura’s later work on moral disengagement explains how we rationalise behaviour that conflicts with our moral standards. The dog’s perspective, by refusing such disengagement, might serve as a corrective to self-justification.

Carol Dweck (born 1946) has explored how our beliefs about human qualities affect achievement and personal development. Her distinction between “fixed” and “growth” mindsets illuminates an important dimension of Stephens’ quote. A fixed mindset assumes that qualities like character, intelligence, and moral worth are static; a growth mindset sees them as developable through effort. The dog’s perception suggests a growth-oriented view: it sees potential rather than limitation, possibility rather than fixed character. The quote implies that we can become what the dog already believes us to be—a quintessentially growth-minded position.

Moral Philosophy and the Ethics of Character

The quote also engages fundamental questions in moral philosophy about the nature of virtue and how character develops.

Aristotle (384-322 BCE) provides the foundational framework for understanding character development in Western thought. His concept of eudaimonia (often translated as “flourishing” or “the good life”) centres on the cultivation of virtues through habituation. For Aristotle, we become virtuous by practising virtuous actions until they become second nature. The dog’s perception might serve as what Aristotle termed the “great-souled man’s” self-regard—not arrogance but appropriate recognition of one’s potential for excellence. However, Aristotle would likely caution that merely aspiring to virtue is insufficient; one must cultivate the practical wisdom (phronesis) to know what virtue requires in specific circumstances and the habituated character to act accordingly.

Immanuel Kant (1724-1804) approached moral philosophy from a radically different angle, yet his thought illuminates Stephens’ insight in unexpected ways. Kant argued that morality stems from rational duty rather than inclination or consequence. The famous categorical imperative demands that we act only according to maxims we could will to be universal laws. Kant’s moral agent acts from duty, not because they feel like it or because they fear consequences. The gap between our behaviour and the dog’s perception might be understood in Kantian terms as the difference between acting from inclination (doing good when convenient) and acting from duty (doing good because it is right). The dog, in its innocence, cannot distinguish these motivations—it simply expects consistent goodness. Rising to meet that expectation would require developing what Kant termed a “good will”—the disposition to do right regardless of inclination.

Lawrence Kohlberg (1927-1987) developed a stage theory of moral development that explains how moral reasoning evolves from childhood through adulthood. Kohlberg identified six stages across three levels: pre-conventional (focused on rewards and punishment), conventional (focused on social approval and law), and post-conventional (focused on universal ethical principles). The dog’s expectation might be understood as operating at a pre-conventional level—it assumes goodness without complex reasoning. Yet meeting that expectation could require post-conventional thinking: choosing to be good not because others are watching but because we have internalised principles of integrity and compassion. The quote thus invites us to use a simple, pre-moral faith as leverage for developing genuine moral sophistication.

Contemporary Perspectives: Positive Psychology and Virtue Ethics

Recent decades have seen renewed interest in character and human flourishing, providing additional context for understanding Stephens’ insight.

Martin Seligman (born 1942), founder of positive psychology, has shifted psychological focus from pathology to wellbeing. His PERMA model identifies five elements of flourishing: Positive emotion, Engagement, Relationships, Meaning, and Accomplishment. The human-dog relationship exemplifies several of these elements, particularly the relationship component. Seligman’s research on “learned optimism” suggests that how we explain events to ourselves affects our wellbeing and achievement. The dog’s relentlessly optimistic view of its owner might serve as a model of the explanatory style Seligman advocates—one that sees setbacks as temporary and successes as reflective of stable, positive qualities.

Christopher Peterson (1950-2012) and Martin Seligman collaborated to identify character strengths and virtues across cultures, resulting in the Values in Action (VIA) classification. Their research identified 24 character strengths organised under six core virtues: wisdom, courage, humanity, justice, temperance, and transcendence. The quote implicitly challenges us to develop these strengths not because doing so maximises utility or fulfils duty, but because integrity demands that our actions align with our self-understanding. The dog sees us as possessing these virtues; the challenge is to deserve that vision.

Alasdair MacIntyre (born 1929) has argued for recovering Aristotelian virtue ethics in modern life. MacIntyre contends that the Enlightenment project of grounding morality in reason alone has failed, leaving us with emotivism—the view that moral judgments merely express feelings. He advocates returning to virtue ethics situated within narrative traditions and communities of practice. The dog-owner relationship might be understood as one such practice—a context with implicit standards and goods internal to it (loyalty, care, companionship) that shape character over time. Becoming worthy of the dog’s trust requires participating authentically in this practice rather than merely going through the motions.

The Human-Animal Bond as Moral Mirror

The specific invocation of dogs, rather than humans, as moral arbiters merits examination. This choice reflects both cultural realities and deeper philosophical insights about the nature of moral perception.

Dogs occupy a unique position in human society. Unlike wild animals, they have co-evolved with humans for thousands of years, developing sophisticated abilities to read human gestures, expressions, and intentions. Yet unlike humans, they appear incapable of the complex social calculations that govern human relationships—judgement tempered by self-interest, conditional approval based on social status, or critical evaluation moderated by personal advantage.

Emmanuel Levinas (1906-1995) developed an ethics based on the “face-to-face” encounter with the Other, arguing that the face of the other person makes an ethical demand on us that precedes rational calculation. Whilst Levinas focused on human faces, his insight extends to our relationships with dogs. The dog’s upturned face, its evident trust and expectation, creates an ethical demand: we are called to respond to its vulnerability and faith. The dog cannot protect itself from our betrayal; it depends entirely on our goodness. This radical vulnerability and trust creates what Levinas termed the “infinite responsibility” we bear toward the Other.

The dog’s perception is powerful precisely because it is not strategic. Dogs do not love us because they have calculated that doing so serves their interests (though it does). They do not withhold affection to manipulate behaviour (though behavioural conditioning certainly plays a role in the relationship). From the human perspective, the dog’s devotion appears absolute and uncalculating. This creates a moral asymmetry: the dog trusts completely, whilst we retain the capacity for betrayal or manipulation. Stephens’ quote leverages this asymmetry, suggesting that we should honour such trust by becoming worthy of it.

Practical Implications: From Aspiration to Action

The quote’s enduring appeal lies partly in its practical accessibility. Unlike philosophical treatises on authenticity or virtue that can seem abstract and demanding, Stephens offers a concrete, imaginable standard. Most dog owners have experienced the moment of returning home to exuberant welcome, seeing themselves reflected in their dog’s unconditional joy. The gap between that reflection and one’s self-knowledge of moral compromise or character weakness becomes tangible.

Yet the quote’s simplicity risks trivialising genuine moral development. Becoming “the person your dog thinks you are” is not achieved through positive thinking or simple willpower. It requires sustained effort, honest self-examination, and often painful acknowledgment of failure. The philosophical traditions outlined above suggest several pathways:

The existentialist approach demands radical honesty about our freedom and responsibility. We must acknowledge that we choose ourselves moment by moment, that no external circumstance determines our character, and that self-deception about this freedom represents moral failure. The dog’s trust becomes a call to authentic choice.

The Aristotelian approach emphasises habituation and practice. We must identify the virtues we lack, create situations that require practising them, and persist until virtuous behaviour becomes natural. The dog’s expectation provides motivation for this long-term character development.

The psychological approach focuses on congruence and self-efficacy. We must reduce the gap between real and ideal self through honest self-assessment and incremental change, using the dog’s confidence as a source of belief in our capacity to change.

The virtue ethics approach situates character development within practices and traditions. The dog-owner relationship itself becomes a site for developing virtues like responsibility, patience, and compassion through daily engagement.

The Quote in Contemporary Context

Stephens’ formulation resonates particularly in an era characterised by anxiety about authenticity. Social media creates pressure to curate idealised self-presentations whilst simultaneously exposing the gap between image and reality. Political and institutional leaders frequently fail to live up to professed values, creating cynicism about whether integrity is possible or even desirable. In this context, the dog’s uncomplicated faith offers both comfort and challenge—comfort that somewhere we are seen as fundamentally good, challenge that we might actually become so.

The quote also speaks to contemporary concerns about meaning and purpose. In a secular age lacking consensus on ultimate values, the question “How should I live?” lacks obvious answers. Stephens bypasses theological and philosophical complexities by offering an existentially grounded response: live up to the best version of yourself as reflected in uncritical devotion. This moves the question from abstract principle to lived relationship, from theoretical ethics to embodied practice.

Moreover, the invocation of dogs rather than humans as moral mirrors acknowledges a therapeutic insight: sometimes we need non-judgmental acceptance before we can change. The dog provides that acceptance automatically, creating psychological safety within which development becomes possible. In an achievement-oriented culture that often ties worth to productivity and success, the dog’s valuation based simply on existence—you are wonderful because you are you—offers profound relief and, paradoxically, motivation for growth.

The quote ultimately works because it short-circuits our elaborate mechanisms of self-justification. We know we are not as good as our dogs think we are. We know this immediately and intuitively, without needing philosophical argument. The quote simply asks: what if you were? What if you closed that gap? The question haunts precisely because the answer seems simultaneously impossible and within reach—because we have glimpsed that better self in our dog’s eyes and cannot quite forget it.

Quote: Jensen Huang – CEO Nvidia

“Oftentimes, if you reason about things from first principles, what’s working today incredibly well — if you could reason about it from first principles and ask yourself on what foundation that first principle is built and how that would change over time — it allows you to hopefully see around corners.” – Jensen Huang – CEO Nvidia

Jensen Huang’s quote was delivered in the context of an in-depth dialogue with institutional investors on the trajectory of Nvidia, the evolution of artificial intelligence, and strategies for anticipating and shaping the technological future.

Context of the Quote

Huang made the remark during an interview at a Citadel Securities event in October 2025, hosted by Konstantine Buhler, a partner at Sequoia Capital. The audience consisted of leading institutional investors, all seeking avenues for sustainable advantage or ‘edge’. The conversation traced Nvidia’s founding in the early 1990s, the reinvention of the graphics processing unit (GPU), the creation of new computing markets, and the company’s subsequent rise as the platform underpinning the global AI boom. The question of how to ‘see around corners’ — anticipating technology and industry shifts before they crystallise for others — was at the core of the discussion. Huang’s answer, invoking first-principles reasoning, linked Nvidia’s success to its ability to continually revisit and challenge foundational assumptions, and to methodically project how those assumptions will be redefined by progress in science and technology.

Jensen Huang: Profile and Approach

Jensen Huang, born in Tainan, Taiwan in 1963, immigrated to the United States as a child, experiencing the formative challenges of cultural dislocation and financial hardship. He obtained his undergraduate degree in electrical engineering from Oregon State University and a master’s from Stanford University. After working at AMD and LSI Logic, he co-founded Nvidia in 1993 at the age of 30, reportedly at a Denny’s restaurant. From the outset, the company faced daunting odds — neither an established market nor assured funding, and frequent existential risk in its initial years.

Huang is distinguished not only by technical fluency — he is deeply involved in hardware and software architecture — but also by an ability to translate complexity for diverse audiences. He eschews corporate formality in favour of trademark leather jackets and a focus on product. His leadership style is marked by humility, a willingness to bet on emerging ideas, and what he describes as “urgent innovation” born of early near-failure. This disposition has been integral to Nvidia’s progress, especially as the company repeatedly “invented markets” and defined entirely new categories, such as accelerated computing and AI infrastructure.

By 2024, Nvidia had become the world’s most valuable public company, with its GPUs foundational to gaming, scientific computing, and, critically, the rise of AI. Huang’s awards — from the IEEE Founders Medal to inclusion in Time magazine’s list of the 100 most influential people — underscore his reputation as a technologist and strategic thinker. He is widely recognised for establishing technical direction well before it becomes market consensus, an approach reflected in the quote.

First-Principles Thinking: Theoretical Foundations

Huang’s endorsement of “first principles” echoes a method of problem-solving and innovation associated with thinkers as diverse as Aristotle, Isaac Newton, and, in the modern era, entrepreneurs and strategists such as Elon Musk. The essence of first-principles thinking is to break down complex systems to their most fundamental truths — concepts that cannot be deduced from anything simpler — and to reason forward from those axioms, unconstrained by traditional assumptions, analogies, or received wisdom.

  • Aristotle coined the term “first principles”, distinguishing knowledge derived from irreducible foundational truths from knowledge obtained through analogy or precedent.
  • René Descartes advocated for systematic doubt and logical rebuilding of knowledge from foundational elements.
  • Richard Feynman, the physicist, was famous for urging students to “understand from first principles”, encouraging deep understanding and avoidance of rote memorisation or mere pattern recognition.
  • Elon Musk is often cited as a contemporary example, applying first-principles thinking to industries as varied as automotive (Tesla), space (SpaceX), and energy. Musk has described the technique as “boiling things down to the most fundamental truths and then reasoning up from there,” directly influencing not just product architectures but also cost models and operational methods.

Application in Technology and AI

First-principles thinking is particularly powerful in periods of technological transition:

  • In computing, first principles were invoked by Carver Mead and Lynn Conway, who reimagined the semiconductor industry in the 1970s by establishing the foundational laws for microchip design, known as Mead-Conway methodology. This approach was cited by Huang as influential for predicting the physical limitations of transistor miniaturisation and motivating Nvidia’s focus on accelerated computing.
  • Clayton Christensen, cited by Huang as an influence, introduced the idea of disruptive innovation, arguing that market leaders must question incumbent logic and anticipate non-linear shifts in technology. His books on disruption and innovation strategy have shaped how leaders approach structural shifts and avoid the “innovator’s dilemma”.
  • The leap from von Neumann architectures to parallel, heterogeneous, and ultimately AI-accelerated computing frameworks — as pioneered by Nvidia’s CUDA platform and deep learning libraries — was possible because leaders at Nvidia systematically revisited underlying assumptions about how computation should be structured for new workloads, rather than simply iterating on the status quo.
  • The AI revolution itself was catalysed by the “deep learning” paradigm, championed by Geoffrey Hinton, Yann LeCun, and Andrew Ng. Each demonstrated that previous architectures, which had reached plateaus, could be superseded by entirely new approaches, provided there was willingness to reinterpret the problem from mathematical and computational fundamentals.

Backstory of the Leading Theorists

The ecosystem that enabled Nvidia’s transformation is shaped by a series of foundational theorists:

  • Mead and Conway: Their 1979 textbook and methodologies codified the “first-principles” approach in chip design, allowing for the explosive growth of Silicon Valley’s fabless innovation model.
  • Gordon Moore: Moore’s Law, while originally an empirical observation, inspired decades of innovation, but its eventual slow-down prompted leaders such as Huang to look for new “first principles” to govern progress, beyond mere transistor scaling.
  • Clayton Christensen: His disruption theory is foundational in understanding why entire industries fail to see the next shift — and how those who challenge orthodoxy from first principles are able to “see around corners”.
  • Geoffrey Hinton, Yann LeCun, Andrew Ng: These pioneers directly enabled the deep learning revolution by returning to first principles on how learning — both human and artificial — could function at scale. Their work with neural networks, widely doubted after earlier “AI winters”, was vindicated with landmark results like AlexNet (2012), enabled by Nvidia GPUs.

Implications

Jensen Huang’s quote is neither idle philosophy nor abstract advice; it is a methodology proven repeatedly by his own journey and by the history of technology. It is a call to scrutinise assumptions, break complex structures down to their most elemental truths, and reconstruct strategy consciously from the bedrock of what is unlikely to change. It is also a prompt to keep asking: on what foundation do these principles rest, and how will those foundations themselves evolve?

Organisations and individuals who internalise this approach are equipped not only to compete in current markets, but to invent new ones — to anticipate and shape the next paradigm, rather than reacting to it.

Quote: Andrej Karpathy – Ex-OpenAI, Ex-Tesla AI

“What I think we have to do going forward…is figure out ways to remove some of the knowledge and to keep what I call this cognitive core. It’s this intelligent entity that is stripped from knowledge but contains the algorithms and contains the magic of intelligence and problem-solving and the strategies of it and all this stuff.” – Andrej Karpathy – Ex-OpenAI, Ex-Tesla AI

Andrej Karpathy’s observation about the need to “strip away knowledge whilst retaining the cognitive core” represents one of the most penetrating insights into contemporary artificial intelligence development. Speaking on Dwarkesh Patel’s podcast in October 2025, Karpathy—formerly a leading figure at both OpenAI and Tesla’s Autopilot programme—articulated a fundamental tension at the heart of modern AI: today’s large language models have become prodigious memorisers, yet this very capability may be constraining their potential for genuine intelligence.

The Paradox of Pre-training

To comprehend Karpathy’s thesis, one must first understand the architecture of contemporary AI systems. Large language models are trained on vast corpora—often 15 trillion tokens or more—through a process called pre-training. During this phase, models learn to predict the next token in a sequence, effectively compressing the entire internet into their neural networks. Karpathy describes this compressed representation as only “0.07 bits per token” for a model like Llama 3 70B, highlighting the extraordinary degree of compression occurring.
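
One plausible way to arrive at a figure of that order, and this is our back-of-envelope reading rather than arithmetic Karpathy spells out, is to divide the bits required to store the model’s weights by the number of training tokens. A minimal sketch, assuming 70 billion parameters held at 16 bits each and a 15-trillion-token corpus:

```python
# Back-of-envelope reading of the "0.07 bits per token" figure.
# Assumptions (ours, not Karpathy's): 70e9 parameters stored at
# 16 bits each, and a 15-trillion-token training corpus.
params = 70e9
bits_per_param = 16
training_tokens = 15e12

bits_per_token = params * bits_per_param / training_tokens
print(f"{bits_per_token:.3f} bits of model per training token")  # ~0.075
```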

This compression serves two distinct functions, which Karpathy carefully delineates. First, models accumulate factual knowledge—the content of Wikipedia articles, the specifics of historical events, the details of scientific papers. Second, and more crucially, they develop what Karpathy terms “algorithmic patterns”—the capacity for in-context learning, the ability to recognise and complete patterns, the fundamental mechanisms of reasoning itself.

The problem, as Karpathy sees it, is that contemporary models have become too adept at the former whilst the latter remains the true seat of intelligence. When a model can regurgitate passages verbatim or recite obscure facts, it demonstrates remarkable memory. But this same capability creates what he calls a “distraction”—the model becomes reliant on its hazy recollections of training data rather than developing robust reasoning algorithms that could operate independently of specific factual knowledge.

The Cognitive Core Concept

Karpathy’s proposed solution is to isolate and preserve what he terms the “cognitive core”—an intelligent entity stripped of encyclopaedic knowledge but retaining the fundamental algorithms of problem-solving, the strategies of thought, and what he describes as “the magic of intelligence.” This concept represents a profound shift in how we conceptualise artificial intelligence.

Consider the analogy to human cognition. Humans are remarkably poor memorisers compared to AI systems. Present a human with a random sequence of numbers, and they’ll struggle after seven or eight digits. Yet this apparent limitation forces humans to develop robust pattern-recognition capabilities and abstract reasoning skills. We’re compelled to “see the forest for the trees” precisely because we cannot memorise every individual tree.

Karpathy suggests that AI systems would benefit from similar constraints. A model with less memory but stronger reasoning capabilities would be forced to look up factual information whilst maintaining sophisticated algorithms for processing that information. Such a system would more closely resemble human intelligence—not in its limitations, but in the way those limitations drive the development of generalisable cognitive strategies.

The implications extend beyond mere technical architecture. Karpathy envisions cognitive cores as compact as one billion parameters—potentially even smaller—that could operate as genuine reasoning engines rather than glorified databases. These systems would “know that they don’t know” when confronted with factual questions, prompting them to retrieve information whilst applying sophisticated analysis. The result would be AI that thinks more than it remembers, that reasons rather than recites.
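
The architectural idea can be illustrated with a deliberately crude toy, ours rather than Karpathy’s: a ‘reasoning’ function that holds no facts of its own, retrieves them from an external store, and declines to answer, rather than guessing, when a fact is missing.

```python
# Toy separation of "reasoning" from "knowledge". The reasoning function
# stores no facts; it retrieves them externally and says "I don't know"
# when a lookup fails, instead of hallucinating an answer.
FACT_STORE = {   # stands in for retrieval over external sources
    "boiling_point_c:water": 100.0,
    "boiling_point_c:ethanol": 78.4,
}

def lookup(key):
    return FACT_STORE.get(key)          # None signals "not known"

def compare_boiling_points(a, b):
    """Reasoning step: compare two retrieved values."""
    va = lookup(f"boiling_point_c:{a}")
    vb = lookup(f"boiling_point_c:{b}")
    if va is None or vb is None:
        return "I don't know; the fact is not in the store and must be looked up."
    higher = a if va > vb else b
    return f"{higher} has the higher boiling point ({max(va, vb)} C)."

print(compare_boiling_points("water", "ethanol"))
print(compare_boiling_points("water", "mercury"))
```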

From Evolution to Engineering: The Path Not Taken

Karpathy’s perspective on AI development diverges sharply from what he calls the “Richard Sutton viewpoint”—the notion that we should build AI systems analogous to biological intelligence, learning from scratch through reinforcement learning in the manner of animals. Instead, Karpathy argues we’re building what he evocatively describes as “ghosts” or “spirit entities”—ethereal intelligences that emerge from imitating human-generated text rather than evolving through environmental interaction.

This distinction illuminates a crucial divergence in AI philosophy. Biological intelligence, as embodied in animals, emerges from evolution operating over millions of years, with vast amounts of capability “baked in” to neural circuitry. A zebra foal runs within minutes of birth not through reinforcement learning but through evolutionary encoding. Humans similarly arrive with substantial cognitive machinery pre-installed, with lifetime learning representing maturation and refinement rather than learning from first principles.

By contrast, contemporary AI systems learn through what Karpathy terms “crappy evolution”—pre-training on internet documents serves as a compressed, accelerated alternative to evolutionary optimisation. This process creates entities fundamentally different from biological intelligence, optimised for different tasks through different mechanisms. The current approach imitates the products of human intelligence (text, code, conversations) rather than replicating the developmental process that creates intelligence.

The Limits of Current Learning Paradigms

Karpathy’s critique extends to reinforcement learning, which he describes with characteristic bluntness as “terrible.” His concerns illuminate deep problems in how AI systems currently learn from experience. In reinforcement learning, a model generates hundreds of solution attempts, and those that arrive at correct answers have every intermediate step up-weighted, whilst failed attempts are down-weighted. Karpathy calls this “sucking supervision through a straw”—extracting minimal learning signal from vast amounts of computational work.

The fundamental issue is noise. When a solution works, not every step along the way was necessarily correct or optimal. The model may have taken wrong turns, pursued dead ends, or stumbled upon the answer despite flawed reasoning. Yet reinforcement learning broadcasts the final reward across the entire trajectory, reinforcing both good and bad reasoning indiscriminately. Karpathy notes that “you may have gone down the wrong alleys until you arrived at the right solution,” yet every mistaken step gets marked as something to do more of.
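
A minimal sketch of this uniform credit assignment, assuming a simple tabular softmax policy updated REINFORCE-style (a caricature of the setup Karpathy describes, not any lab’s actual training code):

```python
import numpy as np

# Toy REINFORCE-style update on a tabular softmax policy. The point is
# structural: one terminal reward is broadcast to every step of the
# trajectory, up-weighting good and bad intermediate actions alike.
rng = np.random.default_rng(0)
n_states, n_actions = 4, 3
logits = np.zeros((n_states, n_actions))        # policy parameters

def policy(state):
    z = np.exp(logits[state] - logits[state].max())
    return z / z.sum()

def run_episode():
    trajectory, state = [], 0
    for _ in range(n_states):
        p = policy(state)
        action = rng.choice(n_actions, p=p)
        trajectory.append((state, action, p))
        state = (state + 1) % n_states
    # the reward depends only on the final action...
    reward = 1.0 if trajectory[-1][1] == 0 else 0.0
    return trajectory, reward

lr = 0.1
for _ in range(500):
    traj, reward = run_episode()
    for state, action, p in traj:
        grad = -p.copy()
        grad[action] += 1.0                     # grad of log softmax
        # ...yet every step receives exactly the same scalar credit
        logits[state] += lr * reward * grad
```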

Humans, by contrast, engage in sophisticated post-hoc analysis. After solving a problem, we reflect on which approaches worked, which didn’t, and why. We don’t simply label an entire problem-solving session as “good” or “bad”—we dissect our reasoning, identify productive and unproductive strategies, and refine our approach. Current AI systems lack this reflective capacity entirely.

This limitation connects to broader questions about how AI systems might achieve continual learning—the ability to genuinely learn from ongoing experience rather than requiring massive retraining. Karpathy suggests that humans engage in a nightly “distillation phase” during sleep, processing the day’s experiences and integrating insights into long-term knowledge. AI systems have no equivalent mechanism. They simply restart from the same state each time, unable to evolve based on individual experiences.

Model Collapse and the Entropy Problem

A subtle but critical concern in Karpathy’s analysis is what he terms “model collapse”—the tendency of AI systems to produce outputs that occupy “a very tiny manifold of the possible space of thoughts.” Ask ChatGPT to tell a joke repeatedly, and you’ll receive the same three jokes. Request reflection on a topic multiple times, and you’ll observe striking similarity across responses. The models are “silently collapsed,” lacking the entropy and diversity that characterises human thought.

This phenomenon creates profound challenges for synthetic data generation, a technique labs use to create additional training material. If models generate training data for themselves or subsequent models, this collapsed distribution gradually dominates the training corpus. Training on one’s own outputs creates a dangerous feedback loop—each generation becomes less diverse, more stereotyped, more “collapsed” than the last. Karpathy suggests this may not even be a solvable problem, noting that humans similarly “collapse over time,” becoming more rigid and less creative as they age, revisiting the same thoughts and patterns with decreasing learning rates.
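
A toy simulation, ours and not Karpathy’s, makes the feedback loop concrete: start with a uniform distribution over fifty distinct outputs, repeatedly fit the next ‘model’ to a finite sample of the previous one’s outputs, and watch the entropy tend to fall as a handful of outputs come to dominate.

```python
import numpy as np

# Toy "collapse" simulation: a model's output distribution over K
# distinct outputs (say, jokes). Each generation we fit the next model
# to a finite sample of the previous model's outputs (maximum
# likelihood = empirical frequencies). Diversity, measured by entropy,
# tends to shrink as rare outputs are lost and never recovered.
rng = np.random.default_rng(0)
K = 50
p = np.full(K, 1.0 / K)                       # generation 0: fully diverse

def entropy(q):
    q = q[q > 0]
    return -(q * np.log2(q)).sum()

for gen in range(20):
    print(f"gen {gen:2d}: entropy = {entropy(p):.2f} bits")
    sample = rng.choice(K, size=50, p=p)      # synthetic training data
    counts = np.bincount(sample, minlength=K)
    p = counts / counts.sum()                 # retrain on own outputs
```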

The contrast with children is illuminating. Young minds, not yet “overfitted” to the world, produce shocking, creative, unexpected responses precisely because they haven’t collapsed into standard patterns of thought. This freshness, this maintenance of high entropy in cognitive processes, may be essential to genuine intelligence. Yet our current training paradigms actively work against it, rewarding convergence towards common patterns and penalising deviation.

The Decade of Agents: Why Progress Takes Time

When Karpathy states this will be “the decade of agents” rather than “the year of agents,” he draws on hard-won experience from five years leading Tesla’s Autopilot programme. His insights into why artificial intelligence deployment takes far longer than demonstrations suggest carry particular weight given this background.

The central concept is what Karpathy calls “the march of nines.” Getting something to work 90% of the time—the level typically showcased in demonstrations—represents merely the first nine in “99.9%.” Each additional nine requires equivalent effort. During his tenure at Tesla, the team progressed through perhaps two or three nines over five years. More crucially, numerous nines remain before self-driving cars achieve true autonomy at scale.
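
To make the march of nines concrete, here is a small illustrative calculation (ours, not Karpathy’s) of what each additional nine means in absolute terms:

```python
# Failures expected per one million interactions at each reliability level.
# Each additional "nine" removes 90% of the remaining failures, which is
# why the work does not get easier as the headline percentage climbs.
for nines in range(1, 6):
    reliability = 1 - 10 ** (-nines)          # 0.9, 0.99, 0.999, ...
    failures = (1 - reliability) * 1_000_000
    print(f"{reliability:.5f} -> {failures:,.0f} failures per million")
```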

This pattern isn’t unique to autonomous vehicles. Karpathy argues it applies across safety-critical domains, including software engineering. When code errors can leak millions of Social Security numbers or create critical security vulnerabilities, the cost of failure becomes prohibitively high. The demo-to-product gap widens dramatically. What works impressively in controlled conditions fails in countless edge cases when confronting reality’s full complexity.

Waymo’s experience illustrates this challenge. Despite providing “perfect drives” as early as 2014, the company still operates limited deployments requiring elaborate teleoperation infrastructure and supervision. Humans haven’t been removed; they’ve been rendered invisible, beaming in remotely to handle edge cases. The technology lives in a “pulled-back future”—functional but not yet economical, capable but not yet scalable.

Contemporary AI agents face analogous challenges. Whilst Claude and GPT-5 Pro demonstrate remarkable capabilities, they remain what Karpathy characterises as “elementary school students”—savants with perfect memory but lacking robust reasoning across all necessary dimensions. They’re “cognitively deficient” in ways users intuitively recognise even if they can’t articulate precisely what’s missing.

The Software Engineering Puzzle

Perhaps no domain better illustrates the puzzling contours of current AI capabilities than software engineering. Karpathy notes, somewhat ruefully, that whilst these systems were meant to enable “any economically valuable task,” API revenue remains “dominated by coding.” This supposedly general intelligence overwhelmingly excels at one specific domain.

This concentration isn’t accidental. Code enjoys unique properties that make it ideal for current AI architectures. Software development has always operated through text—terminals, editors, version control systems all manipulate textual representations. LLMs, trained on internet text, encounter code as a native format. Moreover, decades of infrastructure exist for handling code textually: diff tools for showing changes, IDEs for navigation, testing frameworks for verification.

Contrast this with domains lacking such infrastructure. Creating presentations involves spatial arrangement and visual design—there’s no “diff” for slides that elegantly shows modifications. Many knowledge work tasks involve physical documents, in-person interactions, or tacit knowledge that resists textual representation. These domains haven’t been pre-optimised for AI interaction in the way software development has.

Yet even in coding, Karpathy remains sceptical of current capabilities for cutting-edge work. When building nanoChat, a repository implementing a complete ChatGPT clone in simplified form, he found AI tools valuable for autocomplete and handling familiar patterns but inadequate for novel architectural decisions. The models kept trying to impose standard approaches when he deliberately chose non-standard implementations. They couldn’t comprehend his custom solutions, constantly suggesting deprecated APIs and bloating code with unnecessary defensive programming.

This points to a deeper truth: current models excel at reproducing common patterns from their training data but struggle with code “that has never been written before”—precisely the domain of frontier AI research itself. The recursive self-improvement that some forecast, where AI systems rapidly enhance their own capabilities, founders on this limitation. Models can accelerate work within established paradigms but cannot yet pioneer truly novel approaches.

The Trajectory of Intelligence Explosion

Karpathy’s perspective on potential intelligence explosions diverges sharply from both pessimistic and optimistic extremes. He sees AI not as a discrete, alien technology but as a continuation of computing’s evolution—part of an ongoing automation trend stretching back through compilers, high-level programming languages, and computer-aided design tools. From this view, the “intelligence explosion” has already been occurring for decades, visible in the exponential GDP growth curve that represents accumulated automation across countless domains.

This framing leads to counterintuitive predictions. Rather than expecting AI to suddenly accelerate economic growth from 2% annually to 20%, Karpathy suggests it will enable continued progress along the existing exponential trajectory. Just as computers, the internet, and mobile phones transformed society without producing visible discontinuities in aggregate growth statistics, AI will diffuse gradually across industries, maintaining rather than disrupting established growth patterns.

This gradualism doesn’t imply insignificance. The compounding effects of sustained exponential growth produce extraordinary transformation over time. But it does suggest that simple extrapolations from impressive demonstrations to imminent superintelligence misunderstand how technology integrates into society. There will be no discrete moment when “AGI” arrives and everything changes. Instead, we’ll experience continuous advancement in capabilities, continuous expansion of automation, and continuous adaptation of society to new technological possibilities.
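
A simple compounding calculation, included here only as illustration, shows why the absence of a visible discontinuity is not the same as the absence of transformation:

```python
# Compounding 2% annual growth over long horizons.
growth = 1.02
for years in (10, 35, 70, 100):
    print(f"{years:3d} years at 2%: economy x{growth ** years:.1f}")
# Roughly: 35 years doubles output and a century multiplies it ~7x,
# all without any visible jump in the annual growth rate.
```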

The analogy to the Industrial Revolution proves instructive. That transformation didn’t occur through a single breakthrough but through cascading improvements across multiple technologies and practices, gradually shifting society from 0.2% annual growth to 2%. Similarly, AI’s impact will emerge through countless incremental deployments, each automating specific tasks, enabling new workflows, and creating feedback loops that accelerate subsequent progress.

The Human Element: Education in an AI Future

Karpathy’s work on Eureka, his educational initiative, reveals his deepest concerns about AI’s trajectory. He fears not that AI will fail but that “humanity gets disempowered by it,” relegated to the sidelines like the portly, passive citizens of WALL-E. His solution lies in radically reimagining education around the principle that “pre-AGI education is useful; post-AGI education is fun.”

The analogy to fitness culture illuminates this vision. Nobody needs physical strength to manipulate heavy objects—we have machines for that. Yet gyms proliferate because exercise serves intrinsic human needs: health, aesthetics, the satisfaction of physical mastery. Similarly, even in a world where AI handles most cognitive labour, humans will pursue learning for its inherent rewards: the pleasure of understanding, the status of expertise, the deep satisfaction of mental cultivation.

But achieving this vision requires solving a technical problem: making learning genuinely easy and rewarding. Currently, most people abandon learning because they encounter material that’s too difficult or too trivial, bouncing between frustration and boredom. Karpathy describes the experience of working with an expert language tutor who maintained a perfect calibration—always presenting challenges at the edge of current capability, never boring, never overwhelming. This created a state where “I was the only constraint to learning,” with knowledge delivery perfectly optimised.

Replicating this experience at scale represents what Karpathy sees as education’s great technical challenge. Current AI tutors, despite their sophistication, remain far from this standard. They can answer questions but cannot probe understanding, identify gaps, or sequence material to create optimal learning trajectories. The capability exists in exceptional human tutors; the challenge lies in encoding it algorithmically.

Yet Karpathy sees this challenge as tractable. Just as AI has transformed coding through autocomplete and code generation, it will eventually transform education through personalised, responsive tutoring. When learning becomes “trivial”—not in the sense of requiring no effort but in the sense of encountering no artificial obstacles—humans will pursue it enthusiastically. Not everyone will become an expert in everything, but the ceiling on human capability will rise dramatically as the floor on accessibility descends.

The Physics of Understanding: Karpathy’s Pedagogical Philosophy

Karpathy’s approach to teaching reveals principles applicable far beyond AI. His background in physics instilled what he describes as finding “first-order terms”—identifying the essential, dominant factors in any system whilst recognising that second and third-order effects exist but matter less. This habit of abstraction, of seeing spherical cows where others see only messy complexity, enables the creation of minimal, illustrative examples that capture phenomena’s essence.

MicroGrad exemplifies this approach perfectly. In 100 lines of Python, Karpathy implements backpropagation—the fundamental algorithm underlying all neural network training. Everything else in modern deep learning frameworks, he notes, is “just efficiency”—optimisations for speed, memory management, numerical stability. But the intellectual core, the actual mechanism by which networks learn, fits in 100 comprehensible lines. This distillation makes the previously arcane accessible.
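
For readers who have not seen micrograd, the following is a heavily condensed sketch in the same spirit; it is our compression of the idea, not Karpathy’s actual code. A scalar value records the operations applied to it and replays them in reverse, in topological order, to accumulate gradients.

```python
# Minimal scalar autodiff in the spirit of micrograd (not the actual
# repository). Each Value remembers its parents and a closure that
# propagates gradients back through the operation that produced it.
class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # topological order ensures each node's grad is complete
        # before it is pushed to its parents
        order, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    build(p)
                order.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(order):
            v._backward()

# usage: for z = x*y + y, dz/dx = y = 3 and dz/dy = x + 1 = 3
x, y = Value(2.0), Value(3.0)
z = x * y + y
z.backward()
print(x.grad, y.grad)   # 3.0 3.0
```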

The broader principle involves “untangling knowledge”—reorganising understanding so each concept depends only on what precedes it. This creates “ramps to knowledge” where learners never encounter gaps or leaps that would require them to take claims on faith. The famous transformer tutorial embodies this, beginning with a simple bigram model (literally a lookup table) and progressively adding components, each motivated by solving a specific limitation of what came before.
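
The starting point of that tutorial, a bigram model, really is just a lookup table. A rough sketch of the general idea (ours, not the tutorial’s code): count how often each character follows each other character, normalise the rows, and sample.

```python
import numpy as np

# Character-level bigram "language model": a table of counts indexed by
# the previous character. Generating text is a row lookup plus a draw
# from that row's distribution; there is no learning machinery at all.
text = "hello world, hello there"
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
itos = {i: c for c, i in stoi.items()}

counts = np.ones((len(chars), len(chars)))      # +1 smoothing
for a, b in zip(text, text[1:]):
    counts[stoi[a], stoi[b]] += 1
probs = counts / counts.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
c = stoi["h"]
out = [itos[c]]
for _ in range(20):
    c = rng.choice(len(chars), p=probs[c])
    out.append(itos[c])
print("".join(out))
```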

This approach contrasts sharply with the standard academic practice of presenting solutions before establishing problems, or introducing abstractions before concrete examples. Karpathy sees such approaches as, in his words, “a dick move”—they rob learners of the opportunity to grapple with challenges themselves, to develop intuition about what solutions might work, and to appreciate why particular approaches succeed where alternatives fail. The pedagogical crime isn’t challenging students; it’s presenting answers without first establishing questions.

Leading Theorists: The Intellectual Lineage

Richard Sutton and the Bitter Lesson

Richard Sutton, a pioneering reinforcement learning researcher, articulated what became known as “the bitter lesson”—the observation that simple, scalable methods leveraging computation consistently outperform approaches incorporating human knowledge or structural assumptions. His perspective suggests that the path to artificial general intelligence lies through learning algorithms powerful enough to discover structure from experience, much as evolution discovered biological intelligence.

Sutton’s famous assertion that “if you got to the squirrel, you’d be most of the way to AGI” reflects this view. Animal intelligence, in his framework, represents the core achievement—the fundamental learning algorithms that enable organisms to navigate environments, solve problems, and adapt to challenges. Human language and culture, whilst impressive, represent relatively minor additions to this foundation.

Karpathy respectfully dissents. His “we’re building ghosts, not animals” formulation captures the divergence: current AI systems don’t replicate the learning processes that create biological intelligence. They imitate the products of human intelligence (text, code, reasoning traces) rather than replicating its developmental origins. This distinction matters profoundly for predicting AI’s trajectory and understanding its capabilities and limitations.

Geoffrey Hinton and the Neural Network Renaissance

Geoffrey Hinton, often termed the “godfather of AI,” pioneered the neural network approaches that underpin contemporary systems. His persistence through decades when neural networks were unfashionable, his development of backpropagation techniques, and his later work on capsule networks and other architectures established the foundation for today’s large language models.

Karpathy studied directly under Hinton at the University of Toronto, experiencing firsthand the intellectual ferment as deep learning began its ascent to dominance. Hinton’s influence appears throughout Karpathy’s thinking—the emphasis on learning from data rather than hand-crafted rules, the focus on representation learning, the conviction that scale and simplicity often trump elaborate architectural innovations.

Yet Karpathy’s view extends beyond his mentor’s. Where Hinton focused primarily on perception (particularly computer vision), Karpathy grapples with the full scope of intelligence—reasoning, planning, continual learning, multi-agent interaction. His work synthesises Hinton’s foundational insights with broader questions about cognitive architecture and the nature of understanding itself.

Yann LeCun and Convolutional Networks

Yann LeCun’s development of convolutional neural networks in 1989 represented one of the first successful applications of backpropagation to real-world pattern recognition. His work on handwritten digit recognition established core principles: the power of hierarchical feature learning, the importance of translation invariance, the value of specialised architectures for specific domains.

Karpathy’s reconstruction of LeCun’s 1989 network, time-travelling 33 years of algorithmic improvements, reveals his appreciation for this lineage. He found that pure algorithmic advances—modern optimisers, better architectures, regularisation techniques—could halve error rates. But achieving further gains required more data and more computation. This trinity—algorithms, data, compute—advances in lockstep, with no single factor dominating.

This lesson shapes Karpathy’s predictions about AI’s future. He expects continued progress across all three dimensions, with the next decade bringing better algorithms, vaster datasets, more powerful hardware, and more efficient software. But no breakthrough in any single dimension will produce discontinuous acceleration. Progress emerges from the intersection of many incremental improvements.

The Broader Intellectual Context

The debate Karpathy engages extends beyond specific individuals to fundamental questions about intelligence itself. Does intelligence arise primarily from general learning algorithms (the Sutton view) or from accumulated structure and innate mechanisms (the evolutionary perspective)? Can we build intelligence by imitating its products (the current LLM approach) or must we replicate its developmental processes? Will artificial intelligence remain fundamentally tool-like, augmenting human capability, or evolve into genuinely autonomous agents pursuing their own goals?

These questions connect to century-old debates in psychology and cognitive science between behaviourists emphasising learning and nativists emphasising innate structure. They echo discussions in evolutionary biology about the relative roles of genetic determination and developmental plasticity. They parallel arguments in philosophy of mind about whether intelligence requires embodiment or can exist as pure information processing.

Karpathy’s position threads between extremes. He acknowledges both the power of learning from data and the necessity of architectural structure. He recognises both the distinctiveness of AI systems and their illuminating analogies to biological intelligence. He balances optimism about AI’s potential with realism about current limitations and the difficulty of translating demonstrations into robust, deployed systems.

The Cognitive Core in Context: A New Paradigm for Intelligence

The concept of a cognitive core stripped of factual knowledge represents more than a technical proposal—it’s a reconceptualisation of what intelligence fundamentally is. Rather than viewing intelligence as encompassing both reasoning algorithms and accumulated knowledge, Karpathy proposes treating these as separate, with reasoning capability as the essence and factual knowledge as external resources to be accessed rather than internalised.

This separation mirrors certain aspects of human cognition whilst diverging in others. Humans do maintain a distinction between knowing how to think and knowing specific facts—we can reason about novel situations without direct experience, applying general problem-solving strategies learned in one domain to challenges in another. Yet our factual knowledge isn’t purely external; it shapes the very structure of our reasoning, creating rich semantic networks that enable rapid, intuitive judgement.

The proposal to strip AI systems down to cognitive cores involves accepting tradeoffs. Such systems would need to perform external lookups for factual information, introducing latency and dependency on knowledge bases. They would lack the pattern-matching capabilities that arise from vast memorisation, potentially missing connections between superficially unrelated domains. They might struggle with tasks requiring seamless integration of many small facts, where lookup costs would dominate processing time.

Yet the gains could prove transformative. A genuine cognitive core—compact, efficient, focused on algorithmic reasoning rather than fact retrieval—could operate in settings where current models fail. Edge deployment becomes feasible when models don’t require storing terabytes of parameters. Personalisation becomes practical when core reasoning engines can be fine-tuned or adapted without retraining on entire knowledge corpora. Interpretability improves when reasoning processes aren’t obscured by retrieval of memorised patterns.

Most profoundly, genuine cognitive cores might avoid the collapse and loss of entropy that plagues current models. Freed from the burden of maintaining consistency with vast memorised datasets, such systems could explore more diverse solution spaces, generate more varied outputs, and maintain the creative flexibility that characterises human cognition at its best.

Implications for the Decade Ahead

Karpathy’s decade-long timeline for agentic AI reflects hard-earned wisdom about technology deployment. His experience with autonomous vehicles taught him that impressive demonstrations represent merely the beginning of a long productisation journey. Each additional “nine” of reliability—moving from 90% to 99% to 99.9% accuracy—requires comparable effort. Safety-critical domains demand many nines before deployment becomes acceptable.

This reality shapes expectations for AI’s economic impact. Rather than sudden disruption, we’ll witness gradual diffusion across domains with varying characteristics. Tasks that are repetitive, well-defined, purely digital, and allowing high error rates will automate first. Call centre work exemplifies this profile—short interaction horizons, clear success criteria, limited context requirements, tolerance for occasional failures that human supervisors can catch.

More complex knowledge work will resist automation longer. Radiologists, consultants, accountants—professionals whose work involves lengthy timescales, subtle judgements, extensive context, and high costs of error—will see AI augmentation before replacement. The pattern will resemble Waymo’s current state: AI handling routine cases whilst humans supervise, intervene in edge cases, and maintain ultimate responsibility.

This graduated deployment creates an “autonomy slider”—a continuous spectrum from pure human operation through various degrees of AI assistance to eventual full automation. Most jobs won’t flip discretely from human to machine. Instead, they’ll slide along this spectrum as AI capabilities improve and organisations develop confidence in delegation. This process will unfold over years or decades, not months.

The economic implications differ from both optimistic and pessimistic extremes. We won’t see overnight mass unemployment—the gradual nature of deployment, the persistence of edge cases requiring human judgement, and society’s adaptation through creating new roles all mitigate disruption. But neither will we see disappointing underutilisation—the compound effect of many small automations across countless tasks will produce genuine transformation.

The Path Forward: Research Priorities

Karpathy’s analysis suggests several critical research directions for developing robust, capable AI systems. First, developing methods to isolate cognitive cores from memorised knowledge whilst maintaining reasoning capability. This might involve novel training objectives that penalise rote memorisation whilst rewarding generalisation, or architectural innovations that separate knowledge storage from reasoning mechanisms.

Second, creating effective continual learning systems that can distil experience into lasting improvements without catastrophic forgetting or model collapse. This requires moving beyond simple fine-tuning toward something more akin to the reflection and consolidation humans perform during sleep—identifying patterns in experience, extracting lessons, and integrating insights whilst maintaining diversity.

Third, advancing beyond current reinforcement learning to richer forms of learning from experience. Rather than broadcasting sparse reward signals across entire trajectories, systems need sophisticated credit assignment that identifies which reasoning steps contributed to success and which didn’t. This might involve explicit review processes where models analyse their own problem-solving attempts, or meta-learning approaches that learn how to learn from experience.

Fourth, developing multi-agent systems with genuine culture—shared knowledge bases that agents collectively maintain and evolve, self-play mechanisms that drive capability improvement through competition, and organisational structures that enable collaboration without centralised control. Current systems remain fundamentally solitary; genuine agent economies will require breakthroughs in coordination and communication.

Fifth, and perhaps most ambitiously, maintaining entropy in AI systems—preventing the collapse toward stereotyped outputs that currently plagues even frontier models. This might involve explicit diversity penalties, adversarial training to prevent convergence, or inspiration from biological systems that maintain variation through mechanisms like mutation and recombination.

Conclusion: Intelligence as Engineering Challenge

Andrej Karpathy’s vision of the cognitive core represents a mature perspective on artificial intelligence—neither breathlessly optimistic about imminent superintelligence nor dismissively pessimistic about current limitations. He sees AI as an engineering challenge rather than a mystical threshold, requiring patient work across multiple dimensions rather than awaiting a single breakthrough.

This perspective derives from direct experience with the messy reality of deploying AI systems at scale. Self-driving cars that work perfectly in demonstrations still require years of refinement before handling edge cases reliably. Coding agents that generate impressive solutions for common problems still struggle with novel architectural challenges. Educational AI that answers questions adequately still falls far short of expert human tutors’ adaptive responsiveness.

Yet within these limitations lies genuine progress. Models continue improving along multiple dimensions simultaneously. Infrastructure for deploying and managing AI systems grows more sophisticated. Understanding of these systems’ capabilities and constraints becomes more nuanced. The path forward is visible, even if it stretches further than optimists anticipated.

The concept of stripping knowledge to reveal the cognitive core captures this mature vision perfectly. Rather than pursuing ever-larger models memorising ever-more data, we might achieve more capable intelligence through subtraction—removing the crutch of memorisation to force development of robust reasoning algorithms. Like humans compelled to abstract and generalise because we cannot remember everything, AI systems might benefit from similar constraints.

This vision offers hope not for sudden transformation but for steady progress—the kind that compounds over decades into revolutionary change. It suggests that the hard technical problems of intelligence remain tractable whilst acknowledging their genuine difficulty. Most importantly, it positions humans not as passive observers of AI’s ascent but as active participants in shaping its development and ensuring its integration enhances rather than diminishes human flourishing.

The decade ahead will test these ideas. We’ll discover whether cognitive cores can be effectively isolated, whether continual learning mechanisms can be made robust, whether the demo-to-product gap can be bridged across diverse domains. The answers will shape not just the trajectory of AI technology but the future of human society in an increasingly automated world. Karpathy’s contribution lies in framing these questions with clarity, drawing on hard-won experience to guide expectations, and reminding us that the most profound challenges often yield to patient, disciplined engineering rather than waiting for miraculous breakthroughs.
