Maki – the efficient coder (AI agent)
A developer named Tony Continton built Maki, a lightweight terminal-based AI coding agent written in Rust, after hitting token limits on existing tools. It cuts API costs by roughly 40% and runs about twice as fast as competing agents by using smart context compression techniques.
According to the project's official site at maki.sh, which surfaced on Hacker News this week, Tony Continton created Maki out of direct frustration with the token ceilings imposed by popular coding agents. The tool is open source, available on GitHub at github.com/tontinton/maki, and installable with a single curl command. It is not a SaaS product, not a browser extension, and not another Electron app. It is a native binary that runs entirely in your terminal, and it was built in Rust with a terminal user interface running at 60 frames per second.
Why This Matters
Token costs are the hidden tax on every developer using AI coding tools today, and nobody is solving it aggressively enough. Continton is claiming a 40% cost reduction through context compression, which is not a rounding error when you are running an agent across a large codebase for hours. The 2x speed improvement matters just as much, because latency kills the flow state that makes AI-assisted coding worthwhile. If these benchmarks hold up under real-world conditions, Maki represents a serious challenge to heavier, more resource-hungry agents like Cursor's background agent or the Anthropic-backed Claude Code.
The Full Story
Continton built Maki because he kept smashing into the hourly and weekly token limits that most AI coding tools impose. Those limits exist because sending large amounts of code context to a language model is expensive, and most agents are not particularly clever about what they include in each request. The standard approach is roughly: send everything that might be relevant, let the model figure it out, and bill the user for the overhead. Continton decided to attack that overhead directly.
The core innovation is something Maki calls an index. When you point Maki at a codebase, it parses source files across 15 programming languages and reduces them to structural skeletons, meaning imports, type definitions, and function signatures with their line ranges. It does not send full file contents unless it needs to. According to Continton's own measurements, this index approach adds about 59 tokens per turn in overhead but saves approximately 224 tokens per turn on file reads, netting a savings of 165 tokens per turn. Given that file reads account for roughly 65% of all token usage in a typical coding session, that math compounds quickly across a long task.
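To make the index idea concrete, here is a minimal Python sketch of reducing a source file to a structural skeleton of imports and signatures with line numbers. This is an illustration of the concept only, not Maki's actual parser (Maki is written in Rust and covers 15 languages; the toy below handles only Python-style syntax):

```python
def skeleton(source: str) -> list[str]:
    """Reduce a source file to its structural skeleton: imports,
    class definitions, and function signatures, each tagged with
    the line it starts on. Full bodies are never included."""
    out = []
    for i, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith(("import ", "from ")):
            out.append(f"L{i}: {stripped}")
        elif stripped.startswith(("def ", "class ")):
            # keep the signature, drop the trailing colon and the body
            out.append(f"L{i}: {stripped.rstrip(':')}")
    return out

sample = """import os

class Cache:
    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value
"""

for entry in skeleton(sample):
    print(entry)
```

The model sees only the signatures and line ranges, then requests specific line spans when it actually needs an implementation. That is where the per-turn arithmetic comes from: the skeleton costs some overhead tokens up front (about 59 in Continton's measurements) but avoids far more in full-file reads (about 224), netting the reported 165-token saving.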
The second major efficiency mechanism is a sandboxed Python interpreter built directly into the agent. When Maki needs to perform research tasks, such as grepping across files or chaining reads together, it runs those operations inside this interpreter using Python's asyncio to parallelize them. Crucially, the intermediate results never enter the main conversation context. Only the final, distilled answer gets passed back to the model. That alone eliminates a substantial amount of token bloat that other agents accumulate by dumping raw command output into the chat history.
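The pattern is easy to sketch: fan out the searches with asyncio, keep the raw hits local, and hand back only a one-line summary. The function names and the in-memory "files" below are hypothetical stand-ins, not Maki's API:

```python
import asyncio

async def grep_file(path: str, needle: str, files: dict) -> list[str]:
    # Simulated async read-and-search; a real agent would hit disk.
    await asyncio.sleep(0)
    return [f"{path}:{i}"
            for i, line in enumerate(files[path].splitlines(), start=1)
            if needle in line]

async def research(needle: str, files: dict) -> str:
    # Run all greps concurrently; the intermediate hits stay local
    # to the interpreter and never enter the conversation context.
    results = await asyncio.gather(*(grep_file(p, needle, files)
                                     for p in files))
    hits = [h for per_file in results for h in per_file]
    # Only this distilled line would be passed back to the model.
    return f"{needle!r} found at {len(hits)} location(s): {', '.join(hits)}"

files = {
    "a.py": "x = 1\ntotal = x + 2\n",
    "b.py": "print('total')\n",
}
print(asyncio.run(research("total", files)))
```

The key design point is the return value: however many files the interpreter touched, the model pays tokens only for the final summary string.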
Maki also implements a tiered model selection system for subagents. The agent assigns each subtask a strength level (weak, medium, or strong) and routes it to an appropriately capable model. Grep-heavy research tasks go to cheaper, faster models like Claude Haiku. Architecture decisions go to more powerful models like Claude Opus 4. This dynamic routing means you are not paying Opus prices to run a file search. Each subagent gets its own isolated chat window, and you can flip between them using keyboard shortcuts.
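A routing table like this is straightforward to express. The task kinds and model identifiers below are illustrative assumptions; Maki's actual routing rules are not published:

```python
# Hypothetical tier-to-model mapping, not Maki's real configuration.
TIERS = {
    "weak":   "claude-haiku",   # cheap, fast: grep-heavy research
    "medium": "claude-sonnet",  # balanced: routine edits
    "strong": "claude-opus",    # expensive: architecture decisions
}

def route(task_kind: str) -> str:
    """Assign a strength tier to a subtask and return the model to use."""
    strength = {
        "search": "weak",
        "summarize": "weak",
        "edit": "medium",
        "architecture": "strong",
    }.get(task_kind, "medium")  # unknown tasks default to the middle tier
    return TIERS[strength]

print(route("search"))        # routes to the weak tier
print(route("architecture"))  # routes to the strong tier
```

The economics follow directly: if most subtasks in a session are searches and summaries, most of the session runs at Haiku-tier prices.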
The user experience philosophy is unusually transparent. The status bar always displays the current token count, cumulative cost, and active model. Nothing is hidden. The tool also includes a lean system prompt that Continton has deliberately kept short, and when context grows too large, Maki automatically compacts history by stripping images, removing thinking blocks, and summarizing older turns rather than crashing or silently dropping context.
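The compaction strategy the article describes can be sketched in a few lines. This is a simplified model of the behavior, assuming a turn-list history; the summarizer is a stub where a real agent would call a cheap model:

```python
def compact(history: list[dict], keep_recent: int = 4) -> list[dict]:
    """Sketch of automatic history compaction: strip images and
    thinking blocks, keep recent turns, summarize older ones."""
    older, recent = history[:-keep_recent], history[-keep_recent:]
    # Strip non-essential content (images, thinking) from what we keep.
    cleaned = [t for t in recent if t.get("type") not in ("image", "thinking")]
    if older:
        # Stand-in for an LLM-generated summary of the older turns.
        summary = {"type": "summary",
                   "text": f"[{len(older)} earlier turns summarized]"}
        return [summary] + cleaned
    return cleaned

history = [
    {"type": "text", "text": "read main.rs"},
    {"type": "thinking", "text": "..."},
    {"type": "image", "text": "<screenshot>"},
    {"type": "text", "text": "apply the fix"},
    {"type": "text", "text": "run tests"},
    {"type": "text", "text": "tests pass"},
]
for turn in compact(history, keep_recent=4):
    print(turn["type"], "-", turn["text"])
```

The point is the failure mode it avoids: instead of crashing at the context ceiling or silently dropping turns, the history shrinks in a predictable, inspectable way.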
Key Details
- Maki reduces API token costs by approximately 40% compared to standard coding agents, per Continton's benchmarks.
- The tool runs approximately 2x faster than existing alternatives in the same benchmarks.
- The index feature covers 15 programming languages and nets a savings of 165 tokens per turn.
- File reads account for roughly 65% of all token consumption in a typical coding session, making the index optimization the highest-leverage feature.
- The TUI runs at 60 frames per second using a native Rust binary, with syntax highlighting offloaded to a background thread pool.
- The project is hosted at github.com/tontinton/maki and installable via a single shell command.
- As of the Hacker News submission, the post had 4 points and 1 comment, suggesting it is very early in its public life.
What's Next
Maki is brand new and the Hacker News traction has been minimal so far, which means the benchmarks have not yet been stress-tested by a large community of developers with diverse codebases. The key milestone to watch is whether those 40% cost reduction numbers hold across projects with different language mixes, file sizes, and task types. If Continton keeps the project active and the community starts contributing language support and additional tool integrations, Maki could build a real following among cost-conscious developers who have grown tired of burning through API budgets on bloated agents.
How This Compares
The AI coding agent space already has serious competitors. Cursor's background agent, Claude Code from Anthropic, and Aider are the most relevant points of comparison. Aider is probably the closest analog because it is also open source, terminal-native, and designed for developers who want control over their toolchain. Aider has built a substantial community and supports a wide range of models, but it does not implement the kind of aggressive context compression that Maki is centering its identity on. Claude Code, which Anthropic released earlier this year, is a direct terminal agent that also runs without a GUI. It is powerful, but it is tightly coupled to Anthropic's own infrastructure and pricing, and it does not give users the same transparency into cost and token usage that Maki builds into its status bar by default.
The broader pattern here is that as AI tools mature, a wave of efficiency-focused alternatives is emerging to challenge the first-generation heavyweights. The first generation of coding agents competed on capability. This second wave is competing on cost per useful output, and that is a smarter battlefield. Maki is betting that developers who have already accepted AI coding assistance will now start optimizing for the economics of it, and that bet looks reasonable given how quickly API costs accumulate on long-running agentic tasks. You can follow related AI news to track how this competitive dynamic evolves.
FAQ
Q: What programming languages does Maki support for code indexing? A: Maki parses codebases in 15 programming languages and reduces them to structural skeletons containing imports, type definitions, and function signatures. The project site does not list each language by name, but the 15-language claim comes directly from Continton's own documentation on the maki.sh features page.
Q: How do I install Maki on my machine? A: Installation requires running a single shell command: curl -fsSL https://maki.sh/install.sh | sh. This pulls a native binary to your system. There is no Node.js runtime, no Python environment to configure, and no package manager dependency. Check the guides section for walkthroughs on setting up terminal-based AI coding agents.
Q: Does Maki work with models other than Claude? A: The project's documentation shows Claude models in the status bar examples, including claude-opus-4-6 specifically, but the tiered subagent system references Haiku-tier and Opus-tier models generically. Full details on multi-provider support are not yet documented on the public site, and the GitHub repository would be the best place to check current model compatibility.
Maki is an early-stage project with big claims and a clear point of view, and the developer community will either validate those benchmarks or expose their limits over the coming weeks. The core idea, that coding agents should be relentlessly efficient with context rather than token-profligate, is the right idea at the right time. Subscribe to the AI Agents Daily newsletter for daily updates on AI agents, tools, and automation.
Get stories like this daily
Free briefing. Curated from 50+ sources. 5-minute read every morning.