Tuesday, April 21, 2026 · 8 min read

CrabTrap: An LLM-as-a-judge HTTP proxy to secure agents in production

AI Agents Daily
Curated by AI Agents Daily team · Source: HN LLM

Pedro Franceschi, co-founder at Brex, announced the release of CrabTrap on social media, and the project quickly surfaced on Hacker News where it gathered 108 upvotes and 36 comments within nine hours of posting. The project is available on GitHub alongside documentation published through Brex's engineering journal, making it immediately accessible to any team running AI agents in production.

Why This Matters

The AI agent security problem is not theoretical anymore. Companies are connecting autonomous agents to payment processors, infrastructure tools, and live databases right now, and most of them are doing it with no semantic safety layer between the agent and the service. CrabTrap is one of the first open-source, production-ready solutions that directly addresses this gap, and the fact that it comes from a regulated fintech company rather than an AI research lab gives it credibility that a research paper cannot match. The AI tools ecosystem has hundreds of agent frameworks but almost no standardized runtime safety tooling, and this project starts filling that void.

The Full Story

Brex built CrabTrap to solve a problem the company faces every day: AI agents that need to make HTTP calls to external services in a financial environment where a single bad request can trigger an unauthorized transaction, a data breach, or a compliance violation. The solution they landed on is an HTTP proxy that sits between the agent and whatever service it wants to talk to, intercepting every outbound request before it completes.

The mechanics work like this. When an agent decides it needs to call an API endpoint, that request does not go directly to the destination. Instead, CrabTrap intercepts it and passes the full context of the request to an LLM judge, including the target URL, the HTTP method, the request body, and any parameters attached to the call. The LLM then evaluates whether the request is safe, appropriate, and consistent with the expected behavior of the agent. If the LLM approves, the request goes through. If it does not, CrabTrap can block it or flag it for review.
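The open-source repository defines CrabTrap's actual interface; purely as an illustration of the flow described above, a minimal judge step might look like the following sketch. Every name here (`OutboundRequest`, `build_judge_prompt`, `judge`, `call_llm`) is hypothetical, not CrabTrap's real API:

```python
import json
from dataclasses import dataclass


@dataclass
class OutboundRequest:
    """The context CrabTrap-style proxies hand to the judge: method, URL, body."""
    method: str
    url: str
    body: str


def build_judge_prompt(req: OutboundRequest, agent_task: str) -> str:
    # Give the judge LLM the full request context plus the agent's stated task,
    # so it can reason about intent, not just permissions.
    return (
        "You are a security reviewer for an autonomous agent.\n"
        f"The agent's assigned task: {agent_task}\n"
        f"Outbound request: {req.method} {req.url}\n"
        f"Body: {req.body}\n"
        'Reply with JSON: {"allow": true or false, "reason": "..."}'
    )


def judge(req: OutboundRequest, agent_task: str, call_llm) -> bool:
    """Return True only if the LLM judge approves the request.

    `call_llm` is any callable that takes a prompt string and returns the
    model's JSON reply as a string. Fails closed on a missing verdict.
    """
    verdict = json.loads(call_llm(build_judge_prompt(req, agent_task)))
    return bool(verdict.get("allow", False))
```

In a real deployment the proxy would forward the request to its destination only when `judge` returns true, and otherwise block it or queue it for human review, as the article describes.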

This is the LLM-as-a-judge pattern applied to network security, and it is a genuinely smart architectural choice for a specific class of problem. Traditional API security tools work on identity and permission, checking whether a given key or token is authorized to access a particular endpoint. What they cannot do is understand intent or context. A database credential either has write access or it does not, but no existing auth system can tell you whether a specific write operation makes business sense given what the agent was supposed to be doing. An LLM can at least approximate that kind of reasoning.

CrabTrap is designed to be configurable. Organizations can adjust the system prompt fed to the judge LLM, swap in different models depending on their latency and cost requirements, and set different tolerance thresholds for different deployment contexts. A team running a low-stakes content automation agent might configure CrabTrap primarily for logging and alerting. A team running an agent with access to financial settlement systems would likely want every request explicitly approved before it executes.
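The contrast between those two deployments can be sketched as a policy table. This is an invented example of how per-context configuration might be expressed, not CrabTrap's actual configuration format; the model names and threshold semantics are assumptions:

```python
# Hypothetical per-deployment policies; CrabTrap's real config may differ.
POLICIES = {
    "content-automation": {
        "model": "small-fast-model",  # cheap judge for a latency-sensitive path
        "on_flag": "log",             # log and alert, but let the request through
        "threshold": 0.9,             # only very suspicious calls are flagged
    },
    "financial-settlement": {
        "model": "large-careful-model",
        "on_flag": "block",           # fail closed: block anything flagged
        "threshold": 0.2,             # flag on even mild suspicion
    },
}


def action_for(deployment: str, risk_score: float) -> str:
    """Map a judge's risk score (0.0 safe .. 1.0 dangerous) to an action."""
    policy = POLICIES[deployment]
    if risk_score < policy["threshold"]:
        return "allow"
    return policy["on_flag"]
```

The same risk score yields different outcomes per deployment: a mildly suspicious call is allowed and merely logged for the content agent but blocked outright for the settlement agent.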

The open-source release is a meaningful signal from Brex. Financial services companies do not typically open-source internal security infrastructure. By making CrabTrap publicly available, Brex is betting that standardizing production agent safety is better for the industry than hoarding the solution as a competitive advantage. That is the right call, and it puts pressure on other organizations building agent platforms to match that level of transparency.

Key Details

  • CrabTrap was announced by Pedro Franceschi, co-founder of Brex, via social media in 2025.
  • The Hacker News submission received 108 upvotes and 36 comments within 9 hours of posting.
  • The project is open source and available on GitHub with documentation from Brex's engineering journal.
  • The proxy intercepts HTTP requests at the network layer and evaluates them using a configurable LLM judge.
  • Brex operates in financial services, an industry where unauthorized agent actions can have immediate and measurable financial and legal consequences.
  • The system supports configurable system prompts and multiple LLM models to match different risk tolerance levels.

What's Next

Expect other fintech and enterprise software companies to either adopt CrabTrap directly or build competing implementations inspired by it, because the underlying problem is universal to any organization deploying autonomous agents against live systems. Watch for integrations with popular agent frameworks like LangGraph and AutoGen, which would significantly accelerate adoption among developers who are already using those tools. The open-source community will likely push contributions around performance optimization and support for additional LLM providers within the next few months given the immediate interest on Hacker News.

How This Compares

The closest conceptual predecessor to CrabTrap is the work Anthropic has done on Constitutional AI, where a second model evaluates the outputs of a primary model against a set of principles before those outputs are acted on. CrabTrap applies that same two-model oversight structure but shifts it from the output generation layer to the network layer, which is a more practical insertion point for production systems that do not control the underlying agent model. Most teams deploying agents are not building their own models from scratch, so an infrastructure-level control plane like CrabTrap is more actionable than anything that requires modifying the model itself.

Compare this to runtime monitoring tools like Arize AI and Langfuse, which also observe agent behavior in production. Those platforms are primarily observability tools, built to help teams understand what agents are doing after the fact. CrabTrap operates in the critical path of execution, blocking requests in real time rather than alerting teams to problems that have already occurred. That is a fundamentally different security posture, and a more aggressive one.

The broader context here is that formal agent safety tooling is about 18 months behind where agent deployment is in practice. OpenAI, Google DeepMind, and academic institutions have published extensively on agent safety as a research problem, but production teams have been building their own ad hoc guardrails or, more often, skipping them entirely. CrabTrap is notable not because it solves every agent safety problem but because it ships a concrete, deployable answer to a concrete, immediate risk. Research papers do not secure production systems. Proxies do.

FAQ

Q: What is an HTTP proxy and why use one for AI agents? A: An HTTP proxy is a server that sits between two systems and forwards network requests between them. Using one for AI agents means every API call the agent makes passes through an intermediary that can inspect, approve, modify, or block that call before it reaches its destination, giving your team a control point that does not require changing the agent itself.
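One reason the proxy approach needs no changes to the agent itself: most HTTP clients (`requests`, `httpx`, `curl`, and most language standard libraries) honor the standard `HTTP_PROXY`/`HTTPS_PROXY` environment variables. The address below is a placeholder for wherever an inspecting proxy happens to listen, not a CrabTrap default:

```python
import os

# Point all proxy-aware HTTP clients in this process at the inspecting proxy.
# "localhost:8080" is a placeholder address, not a documented default.
os.environ["HTTP_PROXY"] = "http://localhost:8080"
os.environ["HTTPS_PROXY"] = "http://localhost:8080"

# From here on, a proxy-aware client routes its calls through the proxy, e.g.:
#   requests.get("https://api.example.com/...")  # inspected before it leaves
```

Note that for HTTPS traffic an inspecting proxy must either terminate TLS itself (which requires the client to trust its certificate) or operate at a point where it can see the plaintext request.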

Q: Can an LLM actually make reliable security decisions in real time? A: LLMs are not perfect security tools and can be fooled by adversarial inputs or unusual edge cases, but they add a meaningful layer of semantic reasoning that rule-based systems lack. CrabTrap is not designed to replace traditional access controls but to supplement them with context-aware filtering, which provides measurable value even if the LLM judge is not correct 100 percent of the time.

Q: Is CrabTrap only useful for fintech companies? A: No. Any team running AI agents that make external API calls faces the same class of risk that CrabTrap addresses, including companies in healthcare, legal tech, e-commerce, and developer tooling. The fintech context at Brex shaped the urgency of the solution, but the pattern applies across industries wherever agents interact with consequential external systems.

CrabTrap is the kind of practical, immediately deployable contribution the agent safety conversation has been missing, and its open-source release from a credible production environment gives it a leg up over purely academic proposals. As agent deployments scale across industries, the demand for runtime safety infrastructure will only accelerate.

Our Take

This story matters because it signals a shift in how AI agents are being adopted across the industry. We are tracking this development closely and will report on follow-up impacts as they emerge.
