Tools · Monday, April 20, 2026 · 8 min read

Claude Token Counter, now with model comparisons

AI Agents Daily · Curated by AI Agents Daily team · Source: Simon Willison

Simon Willison, writing on his personal weblog at simonwillison.net, published a short but practically significant announcement on April 20, 2026: his Claude Token Counter tool now supports side-by-side model comparisons. What sounds like a minor feature update turns out to be a window into something that directly affects developer budgets. Claude Opus 4.7 is, according to Willison, the first Claude model to ship with a changed tokenizer, and the cost implications of that change are measurable and immediate.

Why This Matters

Tokenizer changes are not cosmetic. When Anthropic quietly ships a new tokenizer in Opus 4.7, every enterprise running Claude at scale has to recalculate their cost models from scratch. Opus 4.7 carries the same pricing as Opus 4.6 at $5 per million input tokens and $25 per million output tokens, but Willison's data shows real-world text inflation at 1.46x and image inflation at 3.01x. For a company spending $50,000 a month on Claude API calls, that difference is not a rounding error. Willison's tool is now essential infrastructure for any team evaluating whether to upgrade.
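A back-of-envelope calculation makes the budget impact concrete. The sketch below (plain Python) uses the 1.46x and 3.01x multipliers reported in this article; the split between text and image spend is an illustrative assumption, not a figure from Willison's post:

```python
# Effective cost change when token counts inflate at flat per-token prices.
# Multipliers are the figures reported in the article; the spend split
# between text and image tokens is an illustrative assumption.

TEXT_MULTIPLIER = 1.46   # Opus 4.7 vs 4.6, Willison's system-prompt test
IMAGE_MULTIPLIER = 3.01  # Opus 4.7 vs 4.6, Willison's 3.7 MB PNG test

def inflated_monthly_cost(monthly_spend: float, image_share: float) -> float:
    """Estimate the new bill if token counts inflate but prices stay flat.

    monthly_spend: current bill under Opus 4.6 tokenization.
    image_share:   fraction of that spend attributable to image tokens.
    """
    text_share = 1.0 - image_share
    return monthly_spend * (text_share * TEXT_MULTIPLIER
                            + image_share * IMAGE_MULTIPLIER)

# A hypothetical $50,000/month workload that is 10% image tokens:
new_bill = inflated_monthly_cost(50_000, image_share=0.10)
# 50_000 * (0.9 * 1.46 + 0.1 * 3.01) = 80,750 per month at the same prices
```

Even a text-only workload lands at roughly $73,000 under these multipliers, which is why per-prompt measurement beats trusting a published range.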


The Full Story

Willison built the Claude Token Counter as a browser-based utility that calls Anthropic's token counting API to show exactly how many tokens a given piece of text or image will consume. The original version was useful, but it only gave you a single number for a single model. The upgraded version, shipped via a pull request on April 20, 2026, adds checkboxes for four current Claude models: Opus 4.7, Opus 4.6, Sonnet 4.6, and Haiku 4.5. You paste your content, select which models to compare, and get a results table showing token counts and a multiplier relative to the lowest count in your selection.
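The same comparison can be scripted directly against Anthropic's token counting endpoint. The sketch below uses the `anthropic` Python SDK's `client.messages.count_tokens` method; the model IDs are the ones listed in the article (not independently verified here), the multiplier logic mirrors the tool's results table, and an `ANTHROPIC_API_KEY` environment variable is assumed for the live calls:

```python
# Compare token counts for one prompt across several Claude models,
# mirroring the tool's results table. Model IDs are those cited in
# the article; live calls require ANTHROPIC_API_KEY in the environment.
import os

def multipliers(counts: dict[str, int]) -> dict[str, float]:
    """Token count per model, relative to the lowest count in the selection."""
    lowest = min(counts.values())
    return {model: round(n / lowest, 2) for model, n in counts.items()}

def count_across_models(text: str, models: list[str]) -> dict[str, int]:
    """Ask the token counting API how many tokens `text` uses per model."""
    import anthropic
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env
    return {
        model: client.messages.count_tokens(
            model=model,
            messages=[{"role": "user", "content": text}],
        ).input_tokens
        for model in models
    }

if __name__ == "__main__" and os.environ.get("ANTHROPIC_API_KEY"):
    models = ["claude-opus-4-7", "claude-opus-4-6",
              "claude-sonnet-4-6", "claude-haiku-4-5"]
    counts = count_across_models("Paste your prompt here.", models)
    for model, ratio in multipliers(counts).items():
        print(f"{model}: {counts[model]} tokens ({ratio}x)")
```

Fed Willison's reported counts, `multipliers` reproduces the 1.46x figure: 7,335 over 5,039 rounds to 1.46.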

The motivation for adding comparison support came directly from Anthropic's own documentation for Opus 4.7, which states plainly: "Opus 4.7 uses an updated tokenizer that improves how the model processes text. The tradeoff is that the same input can map to more tokens, roughly 1.0 to 1.35 times depending on the content type." Willison decided to test that claim with a real-world prompt rather than accepting the range at face value.

He used the Opus 4.7 system prompt, a publicly documented string he had previously extracted and posted to his GitHub research repository, as his test input. The result was more dramatic than Anthropic's official estimate. The Opus 4.7 tokenizer consumed 7,335 tokens for that prompt compared to 5,039 tokens under Opus 4.6, a ratio of 1.46x. That already sits above the upper bound of the 1.35x figure Anthropic cited in their migration guide, at least for that specific prompt.

The image results were starker. Willison uploaded a 3,456 by 2,234 pixel PNG file weighing 3.7 megabytes, a resolution that exceeds the new Opus 4.7 limit of 2,576 pixels on the long edge (roughly 3.75 megapixels), so the API scales oversized images down before counting. Opus 4.7 counted 4,744 tokens for that image. Opus 4.6 counted 1,578 tokens for the same file. That is a 3.01x difference, which makes sense when you consider that Opus 4.7 accepts images at more than three times the megapixel resolution of prior Claude models and needs more tokens to represent that additional visual information.
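Image counting works through the same endpoint, with the image supplied as a base64 content block. The sketch below follows the Messages API content-block shape for base64 images; the model IDs are the ones cited in the article, and `photo.png` is a placeholder path:

```python
# Count image tokens per model via the token counting API. The image
# content-block shape follows Anthropic's Messages API; model IDs are
# those cited in the article, and photo.png is a placeholder path.
import base64
import os

def image_block(path: str, media_type: str = "image/png") -> dict:
    """Build a base64 image content block for the Messages APIs."""
    with open(path, "rb") as f:
        data = base64.b64encode(f.read()).decode("ascii")
    return {"type": "image",
            "source": {"type": "base64",
                       "media_type": media_type,
                       "data": data}}

def count_image_tokens(path: str, model: str) -> int:
    """Return the input-token cost of one image under the given model."""
    import anthropic
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env
    response = client.messages.count_tokens(
        model=model,
        messages=[{"role": "user", "content": [image_block(path)]}],
    )
    return response.input_tokens

if __name__ == "__main__" and os.environ.get("ANTHROPIC_API_KEY"):
    for model in ("claude-opus-4-7", "claude-opus-4-6"):
        print(model, count_image_tokens("photo.png", model))
```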

The tool itself is straightforward and open to anyone at tools.simonwillison.net. It also displays a note when selected models share the same tokenizer, which covers Opus 4.6, Sonnet 4.6, and Haiku 4.5 in this context. Only comparisons that include Opus 4.7 produce genuinely different numbers, and the Opus 4.7 versus Opus 4.6 pairing is exactly the comparison most developers will care about right now.

Key Details

  • Willison published the upgrade on April 20, 2026 via pull request 269 on his GitHub tools repository.
  • The tool covers 4 current Claude models: claude-opus-4-7, claude-opus-4-6, claude-sonnet-4-6, and claude-haiku-4-5.
  • The Opus 4.7 system prompt test returned 7,335 tokens on Opus 4.7 versus 5,039 tokens on Opus 4.6, a 1.46x ratio.
  • The image test on a 3.7MB PNG returned 4,744 tokens on Opus 4.7 versus 1,578 tokens on Opus 4.6, a 3.01x ratio.
  • Anthropic's official migration documentation estimated a 1.0 to 1.35x token increase; Willison's text test exceeded that at 1.46x.
  • Both Opus 4.7 and Opus 4.6 carry identical pricing at $5 per million input tokens and $25 per million output tokens.
  • Opus 4.7 supports images up to 2,576 pixels on the long edge, up from the prior 1,568 pixel limit.

What's Next

Development teams running workloads heavy in images or long system prompts should run their specific prompts through Willison's tool before committing to an Opus 4.7 migration, since token inflation varies substantially by content type and may exceed Anthropic's published estimates. Anthropic will likely update its migration documentation as more developers publish real-world tokenization data that falls outside the 1.0 to 1.35x range, and Willison's tool now offers the most direct way to generate that data without guesswork. The broader pattern, developer-built measurement utilities closing gaps left by vendor documentation, is accelerating.

How This Compares

OpenAI has not changed the core tokenizer for GPT-4o between minor versions, and its tiktoken library gives developers stable, predictable token counts across model updates. That stability has been a quiet selling point for teams building cost-sensitive pipelines. Anthropic's decision to ship a new tokenizer in Opus 4.7 breaks that predictability, and unlike OpenAI's public tiktoken library, Anthropic's tokenization details require an API call to measure. Willison's tool bridges that gap, but the underlying friction remains a real consideration in model migration decisions.

Google's Gemini 2.0 series also changed its tokenization approach between versions, and the developer community reaction there was similar: confusion, followed by a wave of community-built measurement tools. The difference is that Google shipped updated documentation relatively quickly. Anthropic's 1.0 to 1.35x estimate in their Opus 4.7 announcement already appears conservative given Willison's 1.46x text result and 3.01x image result, which suggests the documentation may need revision.

What makes Willison's contribution particularly useful is that it is reproducible and transparent. He is not relying on a benchmark suite or a synthetic prompt. He is running the actual Opus 4.7 system prompt through the official token counting API and publishing the screenshot. That kind of concrete, first-party measurement is exactly what the broader developer community needs when a model vendor ships a foundational change without fully quantifying its scope.

FAQ

Q: What is a tokenizer and why does changing it matter? A: A tokenizer is the component that breaks your text into small chunks called tokens before a language model processes them. Changing the tokenizer means the same sentence produces a different number of tokens, which directly changes your API bill since providers like Anthropic charge per token. A higher token count for identical input means higher costs even when the model price per million tokens stays the same.

Q: How much more expensive is Claude Opus 4.7 compared to Opus 4.6? A: Anthropic kept the price identical at $5 per million input tokens and $25 per million output tokens. However, because Opus 4.7 uses a new tokenizer that produces more tokens from the same input, real-world costs are higher. Willison's testing found text inputs cost about 46 percent more and image inputs cost about 201 percent more under Opus 4.7 compared to Opus 4.6 at the same nominal price.

Q: How do I use Willison's Claude Token Counter tool? A: The tool is free and available at tools.simonwillison.net/claude-token-counter. You paste text or upload an image, check the model boxes you want to compare, and click the Count Tokens button. The results table shows the token count for each selected model and a multiplier column showing how each model compares to the lowest count in your selection.

Willison's update is a small tool change with large practical implications for any team currently planning a Claude Opus 4.6 to 4.7 migration. The gap between Anthropic's published estimates and real-world measurements from community testing is a signal worth paying attention to as enterprise adoption of Opus 4.7 ramps up through mid-2026.

Our Take

A tokenizer change that ships without full cost quantification moves the measurement burden onto developers, and Willison's comparison tool shows how quickly the community can shoulder it. For builders evaluating their AI stack, the practical lesson is to measure your own prompts and images before trusting a vendor's published range.
