“Our spend went up 7x the first day [when Anthropic switched it over to token-based pricing in May] and I’m like, oh sh*t, we created a monster. [Large language model] companies have been subsidising all of our usage and now no longer. User-based pricing shelters you.” – Carter Busse – Workato chief information officer

Enterprise adoption of generative AI has exposed a structural tension between enthusiasm for ubiquitous assistants and the brute economics of large language model computation. Organisations raced to roll out AI copilots to every knowledge worker, often under seemingly generous, user-based licensing that made usage feel close to free at the margin. That changed abruptly once providers shifted to pricing rooted in the true cost driver: tokens processed per request. The sudden visibility of usage-based bills has forced chief information officers to confront whether they have built a durable productivity platform or an uncontrolled cost engine running on someone else’s balance sheet [1].

From enthusiasm to sticker shock

Workato offers a stark example of this pivot. As an integration and automation platform, it was a natural early adopter of agentic AI across internal workflows and customer-facing automation [2, 26]. Once generative tools were rolled out widely, adoption surged: Workato reported more than tenfold growth in LLM usage and a material revenue uplift tied to AI-powered features [11]. For a time, costs were muted because major model vendors, including Anthropic, relied on user- or seat-based pricing and flat enterprise arrangements that decoupled consumption from marginal spend [1, 7]. The economic signal reaching the CIO was weak: expanding access felt low-risk because the incremental cost of another prompt, conversation, or agent run was effectively zero to the business.

The turning point came when Anthropic moved its enterprise customers onto token-based billing for Claude usage. Instead of paying primarily per user, Workato began paying directly for the volume of tokens consumed across all its internal and product workloads [1, 22]. Overnight, cost visibility flipped. A pattern of generous experimentation, long conversations, and proliferating internal tools translated into a first-day cost that was roughly seven times prior levels, revealing just how much latent demand had been masked by the previous pricing model [5]. What had looked like measured adoption was, financially, a rapidly scaling compute obligation that had not been governed as such.

How user-based pricing acts as a de facto subsidy

User-based pricing can be thought of as a coarse-grained hedge against the variability of generative workloads. Under a flat per-seat model, the provider implicitly averages heavy and light users and absorbs the volatility in usage and peak demand itself. For enterprise buyers, the value proposition is predictability: once licences are purchased, finance teams can forecast AI costs with the same tools they use for other SaaS budgets, regardless of whether an individual employee sends ten prompts per month or ten thousand.

This arrangement amounts to a cross-subsidy. Heavy users and intensive automation workloads consume far more underlying tokens and compute than light users, but are billed identically so long as they fit under the plan’s qualitative usage limits. The supplier is effectively underwriting the risk that a subset of customers will exploit the flat pricing to build high-duty cycles, long-context workloads, or agent frameworks that keep models running continuously [4, 25]. For a while, competitive dynamics encouraged this behaviour: vendors prioritised adoption and growth metrics, accepting that early-stage monetisation might lag behind actual compute costs.
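The cross-subsidy can be made concrete with a small sketch. Everything numeric here is an assumption for illustration: the 30-dollar seat price and the per-prompt token counts are hypothetical, and the rates of 3 and 15 dollars per million input and output tokens are simply values inside the ranges commonly published for flagship models. The ten and ten-thousand prompt volumes echo the light and heavy users described above.

```python
# Illustrative rates in USD per million tokens; real contracts vary.
INPUT_RATE_PER_M = 3.0
OUTPUT_RATE_PER_M = 15.0

# Hypothetical assumptions, not vendor figures.
SEAT_PRICE = 30.0                 # flat USD per user per month
INPUT_TOKENS_PER_PROMPT = 1_000   # average tokens sent per prompt
OUTPUT_TOKENS_PER_PROMPT = 400    # average tokens generated per prompt


def prompt_cost() -> float:
    """Token cost in USD of one average prompt under the assumptions above."""
    return (
        INPUT_TOKENS_PER_PROMPT * INPUT_RATE_PER_M
        + OUTPUT_TOKENS_PER_PROMPT * OUTPUT_RATE_PER_M
    ) / 1_000_000


def monthly_token_cost(prompts_per_month: int) -> float:
    """What one user would cost per month if billed by tokens."""
    return prompts_per_month * prompt_cost()


def break_even_prompts() -> float:
    """Prompts per month at which token cost equals the flat seat price."""
    return SEAT_PRICE / prompt_cost()


if __name__ == "__main__":
    for label, prompts in [("light user", 10), ("heavy user", 10_000)]:
        cost = monthly_token_cost(prompts)
        print(f"{label}: {prompts:>6} prompts, token cost ${cost:,.2f}, "
              f"seat price ${SEAT_PRICE:,.2f}")
    print(f"break-even: about {break_even_prompts():,.0f} prompts per month")
```

Under these assumed numbers the light user consumes about nine cents of tokens against a 30-dollar seat, while the heavy user consumes about 90 dollars: the first overpays relative to usage and the second is subsidised.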

Once models became central to core business processes and total token volumes began to soar, the imbalance became untenable. Shifting from seat-based to token-based billing is the supplier’s way of converting an averaged, opaque cost structure into one where revenue tracks the primary cost driver: the number and type of tokens processed. Instead of subsidising heavy users through broad user categories, providers charge each organisation in line with its actual compute footprint.

The mechanics of token-based pricing shocks

Token-based billing operates on a simple arithmetic relationship: total bill equals token volume multiplied by the rate per token, differentiated between input and output. In its current enterprise API pricing, Anthropic charges separate rates for input tokens – the text, documents, and context supplied to Claude – and output tokens – the model’s generated responses. For many flagship models, published rates cluster around 3 to 5 dollars per million input tokens and 15 to 25 dollars per million output tokens, with more advanced models commanding higher prices [1, 16]. Other vendors such as OpenAI and Google follow broadly similar structures, though with widely varying rates across model tiers [3, 9, 24].
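Written out, the relationship looks as follows. The symbols are introduced here for illustration only: T_in and T_out are the input and output token counts of a request, and r_in and r_out are the rates in dollars per million tokens. The worked example assumes rates of 3 and 15 dollars, which sit at the low end of the ranges quoted above, and a hypothetical request of 2,000 input and 500 output tokens.

```latex
\[
\text{cost} = \frac{T_{\text{in}}\, r_{\text{in}} + T_{\text{out}}\, r_{\text{out}}}{10^{6}}
\]
% Worked example (illustrative values): T_in = 2000, T_out = 500,
% r_in = 3 and r_out = 15 USD per million tokens.
\[
\text{cost} = \frac{2000 \times 3 + 500 \times 15}{10^{6}}
            = \frac{6000 + 7500}{10^{6}}
            = 0.0135\ \text{USD}
\]
```

Note that the 500 output tokens cost more than the 2,000 input tokens, which is why verbose answers weigh so heavily on the bill.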

The shock arises because enterprise buyers often underestimate both the volume of tokens and how quickly compounding factors magnify usage. First, conversational use encourages verbosity. Users ask broad questions, paste large documents, and accept multi-paragraph answers. Each interaction consumes both input and output tokens, and for many models, output tokens are priced several times higher than input tokens [3, 16]. Second, long-context capabilities enable prompts that include extensive histories, knowledge bases, or email threads. Once the context window stretches into hundreds of thousands or even a million tokens, a single request can carry a cost multiple orders of magnitude larger than a simple chat, especially if premium modes for long context or fast inference are triggered [10, 16].
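A rough comparison shows the scale of the long-context effect. This is a sketch under stated assumptions: the rates are illustrative values of 3 and 15 dollars per million tokens, both request shapes are hypothetical, and no long-context or fast-mode premium is applied, so the real gap could be wider.

```python
INPUT_RATE_PER_M = 3.0    # USD per million input tokens (illustrative)
OUTPUT_RATE_PER_M = 15.0  # USD per million output tokens (illustrative)


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the flat illustrative rates above."""
    return (
        input_tokens * INPUT_RATE_PER_M + output_tokens * OUTPUT_RATE_PER_M
    ) / 1_000_000


# Hypothetical request shapes: (input tokens, output tokens).
SIMPLE_CHAT = (500, 300)
LONG_CONTEXT = (1_000_000, 1_000)

if __name__ == "__main__":
    chat = request_cost(*SIMPLE_CHAT)
    long_ctx = request_cost(*LONG_CONTEXT)
    print(f"simple chat:  ${chat:.4f}")
    print(f"long context: ${long_ctx:.4f}")
    print(f"ratio: about {long_ctx / chat:,.0f}x")
```

With these assumed shapes a single million-token request costs about 3 dollars against well under a cent for the short chat, a gap of roughly 500 times before any premium pricing.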

Third, agentic workflows – a particular focus for Workato – chain multiple model calls together. An agent tasked with, say, triaging an IT ticket will interpret the request, query knowledge bases, draft responses, perhaps call downstream tools, and refine recommendations, each step incurring additional tokens [2, 26]. Where a human sees one business action – resolve a ticket – the billing system sees a series of separate model invocations. If this pattern is replicated thousands of times per day across customer support, sales operations, and back-office processes, total monthly token usage can explode without any single interaction appearing excessive.
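The ticket-triage example can be sketched as a list of model calls whose costs add up. The step names follow the description above, but every token count, the daily ticket volume, and the rates of 3 and 15 dollars per million tokens are hypothetical values chosen for illustration, not Workato or Anthropic figures.

```python
from dataclasses import dataclass

INPUT_RATE_PER_M = 3.0    # USD per million input tokens (illustrative)
OUTPUT_RATE_PER_M = 15.0  # USD per million output tokens (illustrative)


@dataclass(frozen=True)
class Step:
    """One model invocation inside an agent run (token counts are hypothetical)."""
    name: str
    input_tokens: int
    output_tokens: int


# One "business action" as the billing system sees it: five separate calls.
TICKET_TRIAGE = [
    Step("interpret request", 1_500, 300),
    Step("query knowledge base", 4_000, 500),
    Step("draft response", 6_000, 800),
    Step("call downstream tool", 3_000, 200),
    Step("refine recommendation", 7_000, 700),
]


def workflow_cost(steps: list[Step]) -> float:
    """Cost in USD of a single agent run, summed over all its steps."""
    input_total = sum(s.input_tokens for s in steps)
    output_total = sum(s.output_tokens for s in steps)
    return (
        input_total * INPUT_RATE_PER_M + output_total * OUTPUT_RATE_PER_M
    ) / 1_000_000


def monthly_cost(steps: list[Step], runs_per_day: int, days: int = 30) -> float:
    """Cost in USD of repeating the run every day for a month."""
    return workflow_cost(steps) * runs_per_day * days


if __name__ == "__main__":
    print(f"per ticket: ${workflow_cost(TICKET_TRIAGE):.3f}")
    print(f"per month at 5,000 tickets/day: "
          f"${monthly_cost(TICKET_TRIAGE, 5_000):,.0f}")
```

At roughly ten cents per ticket no individual run looks expensive, yet the assumed volume of 5,000 tickets per day brings the monthly total to around 15,000 dollars for this one workflow.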

Under user-based pricing, these dynamics were effectively invisible. Under token-based billing, they manifest instantly in the invoice. A sevenfold jump in spend on day one is a symptom of previously hidden intensity, not a sudden behavioural change. The organisation did not dramatically alter how it used AI; it simply began paying in line with the true cost structure underpinning its workloads.

Why providers must move away from implicit subsidies

On the supplier side, there are structural reasons why AI companies can no longer sustain broad, implicit subsidies at scale. Training and serving frontier models require massive investment in specialised hardware, energy, and engineering. Reports suggest that revenues at leading LLM firms now run into tens of billions of dollars annually, but those revenues are tightly coupled to equally significant capital and operating expenditures on GPUs and data centre infrastructure [1, 13]. If enterprise customers consume millions or billions of tokens under flat-price contracts, the provider bears the risk that actual compute costs exceed the effective revenue per token.

Moreover, token pricing has become a competitive battleground. In 2026, published API prices differ by a factor of roughly six hundred between the cheapest small models and the most advanced reasoning systems [3, 9, 27]. Some entrants aggressively discount to gain share, while incumbents experiment with premium tiers for speed, long context, or jurisdiction-specific inference [10, 16]. Maintaining cross-subsidies under these conditions becomes strategically dangerous: it obscures whether a model’s economics are genuinely sustainable or propped up by temporarily cheap capital and investor tolerance for losses.

Switching to usage-based billing restores economic discipline. Revenue becomes a near-linear function of tokens processed, allowing capacity planning, data centre investment, and R&D schedules to be benchmarked against projected token volumes rather than abstract user counts. It also creates room for finely tuned price discrimination: different rates for input versus output, surcharges for fast modes or extended context, regional multipliers, and discounts for batch processing or caching, all of which Anthropic and its peers now deploy at scale [1, 10, 16].
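A minimal sketch of how such modifiers might stack on top of base token rates is shown below. Every multiplier, the long-context threshold, and the choice to apply the modifiers multiplicatively to the whole request are hypothetical placeholders; actual vendor price sheets define their own values and rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceSheet:
    """Hypothetical price sheet; none of these values are a vendor's real rates."""
    input_rate_per_m: float = 3.0          # USD per million input tokens
    output_rate_per_m: float = 15.0        # USD per million output tokens
    cached_input_multiplier: float = 0.1   # discount on input tokens served from cache
    batch_multiplier: float = 0.5          # discount for asynchronous batch processing
    fast_mode_multiplier: float = 2.0      # surcharge for fast inference
    long_context_multiplier: float = 2.0   # surcharge above the threshold below
    long_context_threshold: int = 200_000  # input tokens
    regional_multiplier: float = 1.1       # surcharge for region-specific inference


def request_cost(
    sheet: PriceSheet,
    input_tokens: int,
    output_tokens: int,
    *,
    cached_input_tokens: int = 0,
    batch: bool = False,
    fast: bool = False,
    regional: bool = False,
) -> float:
    """Cost in USD of one request after applying the assumed modifiers."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_input_tokens
    # Cached tokens are billed at a fraction of the normal input rate.
    billable_input = fresh_tokens + cached_input_tokens * sheet.cached_input_multiplier
    cost = (
        billable_input * sheet.input_rate_per_m
        + output_tokens * sheet.output_rate_per_m
    ) / 1_000_000
    if input_tokens > sheet.long_context_threshold:
        cost *= sheet.long_context_multiplier
    if fast:
        cost *= sheet.fast_mode_multiplier
    if batch:
        cost *= sheet.batch_multiplier
    if regional:
        cost *= sheet.regional_multiplier
    return cost


if __name__ == "__main__":
    sheet = PriceSheet()
    print(f"standard: ${request_cost(sheet, 2_000, 500):.5f}")
    print(f"batch:    ${request_cost(sheet, 2_000, 500, batch=True):.5f}")
    print(f"fast:     ${request_cost(sheet, 2_000, 500, fast=True):.5f}")
```

The point of the sketch is structural: the same request can land at very different prices depending on how it is submitted, which gives buyers levers such as batching and caching that did not matter under seat-based plans.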

User-based pricing as a psychological and governance shield

From the enterprise perspective, per-user pricing offers more than just financial predictability; it provides a psychological and operational shield that encourages experimentation. Employees are more likely to integrate AI into daily work when they know they are not triggering metered charges with every prompt. Citizen developers within a platform such as Workato can prototype agentic workflows, automate routine tasks, and iterate on internal tools without negotiating budget allocations for each new integration [2, 17]. The absence of visible marginal cost fosters the kind of bottom-up innovation that many digital leaders seek to cultivate.

However, that same shelter can delay the establishment of governance mechanisms commensurate with the technology’s power. When usage feels free, few teams invest in monitoring token consumption, optimising prompt length, or choosing models appropriate to each workload. Security and compliance reviews might focus on data handling and hallucination risk, while financial controls lag behind. In such an environment, the shift to token-based billing functions like a sudden exposure of hidden leverage: what looked like a manageable pilot proves to be a complex portfolio of high-throughput workloads with no cost controls.

References

1. “‘We created a monster’: companies rein in AI usage as costs strain budgets” – https://www.ft.com/content/1d37cc08-e0aa-45a4-a45d-4ad282529314

2. AI demand is inflated – only Anthropic is being realistic – CNBC – 2026-04-17 – https://www.cnbc.com/2026/04/17/ai-tokens-anthropic-openai-nvidia.html

3. Beyond the AI Hype with Workato Chief Information Officer Carter … – 2025-10-16 – https://www.youtube.com/watch?v=mPBrDLYfDqE

4. LLM API Pricing Comparison In 2026: Every Major Model, Ranked – 2026-05-11 – https://www.cloudzero.com/blog/llm-api-pricing-comparison/

5. Anthropic Kills Claude’s All-You-Can-Eat Pricing Plan – Dapta – 2026-04-08 – https://dapta.ai/blog-posts/ai-news-week-14-anthropic-claude-pricing/

6. ‘We created a monster’: companies rein in AI usage as costs strain … – 2026-06-19 – https://www.resetera.com/threads/%E2%80%98we-created-a-monster%E2%80%99-companies-rein-in-ai-usage-as-costs-strain-budgets.1555507/

7. How should AI be priced? | TSE – Toulouse School of Economics – 2026-03-20 – https://www.tse-fr.eu/how-should-ai-be-priced

8. Anthropic Pricing 2026: Plans, Costs & Real Spend – CheckThat.ai – 2026-05-22 – https://checkthat.ai/brands/anthropic/pricing

9. Empowerment With Automation and AI | Workato – https://www.workato.com/podcast/carter-busse

10. LLM API Pricing Comparison (2025): OpenAI, Gemini, Claude – 2025-10-31 – https://intuitionlabs.ai/articles/llm-api-pricing-comparison-2025

11. Explaining Anthropic billing changes in 2026: Fast mode … – LinkedIn – 2026-02-24 – https://www.linkedin.com/pulse/explaining-anthropic-billing-changes-2026-fast-mode-pricing-liveanu-iwede

12. Workato’s AI Journey: 1010% LLM Adoption & $1M Revenue Boost – 2026-04-30 – https://www.linkedin.com/posts/carterbusse_enterprisemcp-agenticai-cio-activity-7455632099157544960-w_an

13. 8 Types of API Pricing Models – Zuplo – 2026-02-26 – https://zuplo.com/blog/8-types-of-api-pricing-models

14. The AI Token Pricing Crisis Behind OpenAI and Anthropic’s … – 2026-05-22 – https://www.investing.com/analysis/the-ai-token-pricing-crisis-behind-openai-and-anthropics-revenue-race-200680777

15. Meet Carter Busse | Atomicwork’s Wall of AI Champions – 2022-01-20 – https://www.atomicwork.com/ai-champions/carter-busse

16. LLM Pricing: Top 15+ Providers Compared – AIMultiple – 2026-05-08 – https://aimultiple.com/llm-pricing

17. Anthropic API Pricing: Claude Opus 4.8 Costs Explained – Amnic – 2026-06-16 – https://amnic.com/blogs/anthropic-api-pricing

18. Empowering Citizen Developers and Reshaping Business with AI … – 2024-11-07 – https://ciopod.com/podcasts/empowering-citizen-developers-and-reshaping-business-with-ai-with-carter-busse-of-workato/

19. Pricing Models for LLM Apps – Shlok’s Substack – 2023-12-08 – https://shloked.substack.com/p/pricing-models-for-llm-apps

20. AI Prices Are About to Shock Everyone – YouTube – 2026-06-14 – https://www.youtube.com/watch?v=i9Bq076wZj8

21. CIOs Predict 2026: MCP, Agents, and AI Governance | Carter Busse … – 2025-12-18 – https://www.linkedin.com/posts/carterbusse_bigideas2026-mcp-activity-7407512880348061696-YtwZ

22. API Monetization 101: Creating Pricing Strategy with AI – YouTube – 2026-02-26 – https://www.youtube.com/watch?v=eCol0ZDRq1A

23. Anthropic Changes Pricing to Bill Firms Based on AI Use as … – 2026-04-14 – https://www.theinformation.com/articles/anthropic-changes-pricing-bill-firms-based-ai-use-amid-compute-crunch

24. Carter Busse, CIO of future of work unicorn Workato, shares why it’s … – 2023-01-19 – https://peoplereign.io/carter-busse-cio-of-future-of-work-unicorn-workato-shares-why-its-hard-to-own-technology-at-a-technology-company/

25. API Pricing – OpenAI – 2026-04-09 – https://openai.com/api/pricing/

26. How Anthropic’s Claude pricing change pushed me to find a … – Reddit – 2026-03-30 – https://www.reddit.com/r/claude/comments/1s7ttl2/how_anthropics_claude_pricing_change_pushed_me_to/

27. Workato crafts a practical playbook for agentic AI deployment – 2025-12-09 – https://www.channeldive.com/news/workato-cio-agentic-ai-automation/807458/

28. API pricing is in freefall. What’s the actual case for running local now … – 2026-01-28 – https://www.reddit.com/r/LocalLLaMA/comments/1qp6rm5/api_pricing_is_in_freefall_whats_the_actual_case/

29. Carter Busse, Author at Workato – 2021-08-01 – https://www.workato.com/the-connector/author/carter-busse/

30. Why API Monetization Is the Next Pricing Frontier in the AI Age – 2026-01-12 – https://www.lek.com/insights/tmt/us/ei/seats-calls-why-api-monetization-next-pricing-frontier-ai-age

 

Global Advisors | Quantified Strategy Consulting