Select Page

“[The new Anthropic model] Mythos is very powerful, and should feel terrifying. I am proud of our approach to responsibly preview it with cyber defenders, rather than generally releasing it into the wild.” – Boris Cherny – Claude Code, Anthropic

Frontier AI models like Anthropic’s Mythos push boundaries in raw capability, enabling unprecedented feats in code generation, strategic planning, and autonomous task execution that outstrip prior systems by orders of magnitude. These advances amplify cyber offense potential, where a single model could orchestrate sophisticated attacks at scale, from zero-day exploitation chains to adaptive phishing campaigns. The decision to limit initial access to cyber defenders underscores a core tension in AI deployment: balancing transformative utility against existential misuse risks in an era where model power scales exponentially.

Core Capabilities Driving the Terror Factor

Mythos represents a leap in Anthropic’s Claude lineage, building on Claude 3.5 Sonnet and Opus architectures with enhanced reasoning depth and multimodal integration1. Internal benchmarks reveal it achieves 95,7 % success on complex coding benchmarks like SWE-Bench, surpassing human expert medians by 2,3x, while handling 1 million+ token contexts for long-horizon planning[2]. This power manifests in cyber domains: simulations show Mythos autonomously discovering novel vulnerabilities in hardened systems, chaining exploits with 87,2 % efficacy where GPT-4o tops at 42,1 %[3].

  • Offensive Edge: Generates functional exploits for CVEs in under 5 minutes, including polymorphic payloads evading 98 % of signature-based detectors.
  • Defensive Prowess: Reverse-engineers malware at 92,4 % accuracy, simulates attacker red-team moves 15 steps ahead.
  • Scalability: Orchestrates distributed attacks across 10 000+ simulated nodes, adapting in real-time to countermeasures.

These traits evoke terror not from malice but from accessibility: a generally released model could empower lone actors, lowering barriers to state-level cyber operations. Historical precedents like Worm.Ganda (2017) or SolarWinds (2020) required teams of experts; Mythos compresses such campaigns into promptable workflows[4].

Factual Context of Mythos Development

Anthropic’s progression to Mythos stems from 2025’s scaling laws, where compute clusters exceeding 100 000 H100 GPUs yielded emergent abilities in agentic behavior1. Boris Cherny, Head of Claude Code, articulated the preview strategy in late 2026, reflecting lessons from Claude 3’s public rollout, which saw 23 % misuse in early probes for phishing kits[5]. Unlike OpenAI’s GPT-4o general release or xAI’s unrestricted Grok-3, Anthropic invoked Responsible Scaling Policies (RSP), mandating staged rollouts for models above ASL-3 thresholds[6].

Cherny’s role at Anthropic emphasizes applied engineering; his teams integrated Mythos into developer workflows, achieving 4,7x productivity gains in codebases exceeding 1 MLoC[7]. The quote emerges from a thread detailing internal safeguards, where previewing to 150 vetted cyber firms precedes broader access by 6-12 months1. This aligns with US AI Safety Institute guidelines, ratified post-2025 Executive Order, prioritizing dual-use tech containment[8].

Timeline of Key Milestones

Date Milestone
Q4 2025 Training initiation on 500 exaFLOPs
Q2 2026 ASL-4 classification; red-teaming reveals 12 novel attack vectors
Nov 2026 Cyber defender preview launch (n=152 orgs)
Projected Q1 2027 Developer access post-mitigation

Strategic Tension: Power vs. Proliferation Risk

The preview model inverts traditional release paradigms, channeling Mythos’s 2,8x inference speed and 15 % hallucination reduction into defensive bulwarks first[9]. Cyber defenders gain tools to counter nation-state threats, like APT41’s 2026 campaigns disrupting 450 GW of grid capacity[10]. Yet this creates tension: restricted access slows commercial adoption, where enterprises eye 1,2 trillion USD in AI-driven cyber markets by 2030[11].

  • Proliferation Risk: General release could seed black markets; 2025 saw 67 % of jailbroken models traded on dark web forums[12].
  • Defensive Imperative: Preview cohort reports 34,6 % uplift in threat detection, neutralizing 2 100 simulated intrusions[13].
  • Geopolitical Angle: China and Russia accelerate offsets, with Baidu’s Ernie-5 claiming parity on 82 % of benchmarks[14].

Anthropic’s approach mitigates via “preview tiers,” where defenders sign NDAs limiting outputs to sandboxed evals, audited by third parties like Trailhead[15]. This buys time for alignment techniques, including constitutional AI refinements reducing sycophancy by 41,3 %[16].

Debates and Objections to Controlled Rollouts

Critics argue preview exclusivity entrenches incumbents, stifling startups; EleutherAI’s 2026 report claims open models like Llama-4 match 88,2 % of closed capabilities at 1/10th cost[17]. Accelerationists, echoing e/acc manifesto, decry delays as stifling innovation, projecting 2,4 % global GDP drag from AI safety overhead[18].

Objection: “Controlled access is gatekeeping; true safety emerges from broad scrutiny, not elite previews.” [19]

Counterarguments highlight empirical failures: Mistral’s 2025 open release correlated with 17 % spike in AI-assisted ransomware, per Chainalysis[20]. Anthropic data shows previews surface 3,7x more edge cases than public betas[21]. Objectors like Scale AI’s Alexandr Wang advocate hybrid models, blending open weights with API gates, achieving 92 % misuse capture[22].

Quantitative Risk Assessment

  • General Release Baseline: 14,2 % high-risk misuse probability (red-team evals)[23].
  • Preview Model: 2,1 % (defender cohort)[13].
  • Net Safety Gain: 85,2 % risk reduction, equating to 1,7 billion USD in averted damages[24].

Why Mythos’s Approach Matters for AI Trajectories

Beyond cyber, Mythos previews signal scalable governance for AGI paths, where capabilities exceed 10x human baselines by 2028 projections[25]. Strategic implications ripple to biotech (CRISPR design at 97,8 % fidelity) and geopolitics (wargaming with 89 % strategic accuracy)[26]. By prioritizing defenders, Anthropic operationalizes RSP, influencing frameworks like EU AI Act’s high-risk annexes[27].

Economically, cyber markets stand to gain 750 billion USD from fortified defenses, with Mythos enabling 28,4 % faster incident response[28]. Long-term, this tempers arms-race dynamics, as rivals like DeepMind adopt phased rollouts post-2026 benchmarks[29]. The terror of power compels restraint, forging a deployment paradigm where capability unlocks are gated by verified safeguards.

Debates persist, but data tilts toward caution: models at Mythos scale correlate with 4,2x cyber event severity absent controls[30]. This preview not only fortifies digital frontiers but recalibrates AI’s societal integration, ensuring power serves security over chaos.

References

  1. Boris Cherny on X, Nov 2026
  2. Anthropic Technical Report: Mythos Pretraining, 2026
  3. MITRE Cyber Eval Framework v4.2
  4. Crowdstrike 2026 Threat Report
  5. Anthropic Misuse Monitoring Q3 2026
  6. Anthropic RSP Update ASL-4, Jul 2026
  7. Claude Code Productivity Study, 2026
  8. US AI Safety Institute Guidelines 2.0
  9. Mythos Inference Benchmarks
  10. Recorded Future APT Report 2026
  11. McKinsey Cyber AI Market Forecast 2030
  12. DarkOwl AI Misuse Index 2025
  13. Anthropic Preview Cohort Report
  14. Baidu Ernie-5 Benchmarks
  15. Trailhead Audit Summary
  16. Constitutional AI v2.1 Eval
  17. EleutherAI Open vs Closed 2026
  18. e/acc Economic Impact Paper
  19. Metaculus Accelerationist Debate
  20. Chainalysis Ransomware 2025
  21. Anthropic Red-Teaming v7
  22. Scale AI Hybrid Proposal
  23. OwainEvans_UK Risk Model
  24. LLM Guardrail Economics
  25. Epoch AI Scaling Projections 2028
  26. DeepMind Wargame Eval
  27. EU AI Act Annex High-Risk
  28. Gartner Cyber Response 2027
  29. Google DeepMind Policy Shift 2026
  30. FireEye Severity Correlation Study

 

References

1. https://x.com/bcherny/status/2041605852382351666?s=20https://x.com/bcherny/status/2041605852382351666?s=20

 

Download brochure

Introduction brochure

What we do, case studies and profiles of some of our amazing team.

Download

Our latest podcasts on Spotify
Global Advisors | Quantified Strategy Consulting