AI agents have API keys, shell access, and internet connections. When one gets compromised through prompt injection or a poisoned MCP tool, secrets leave through HTTP requests before anyone notices. An AI firewall is the enforcement point between the agent and the network. It scans everything that crosses the wire.
The question is whether that firewall should be a SaaS product you can’t inspect, or open source software you control.
What counts as an AI firewall
The term gets used loosely. Three categories of tools call themselves AI firewalls or AI security layers, and they work at different points in the stack.
Network-level inspection (proxy firewalls). A separate process sits between the agent and the internet. All HTTP, MCP, and WebSocket traffic flows through it. The proxy scans requests and responses for credential leaks, prompt injection, SSRF attempts, and tool poisoning. Because it runs as its own process, prompt injection targeting the agent doesn’t directly compromise the proxy. Full isolation requires network controls (firewall rules or container networking) that prevent the agent from bypassing the proxy or killing the process. Pipelock and GitHub’s agent workflow firewall operate here, though GitHub’s tool does domain allowlisting without content scanning.
Inference guardrails. Libraries that classify prompts and completions at the model layer. They check whether the model’s input or output is “safe” according to a policy. NeMo Guardrails and LlamaFirewall operate here. They don’t see HTTP traffic or MCP messages. A guardrail can approve a prompt while the agent exfiltrates credentials in the next HTTP request.
Agent-side hooks. Code that intercepts tool calls inside the agent’s runtime before execution. Claude Code’s permission system and Sage work this way. Hooks see tool names and arguments but not network traffic. They run inside the agent’s process, which means a successful injection can potentially bypass them.
These are not interchangeable. Each layer sees different data and stops different attacks. A real security posture uses more than one. But the network layer is the hardest for an attacker to circumvent because it runs outside the agent’s trust boundary.
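To make the network-layer category concrete, here is a minimal sketch of the kind of check a proxy can run on an outbound request before forwarding it. Everything in it is an assumption for illustration: the names `PATTERNS`, `Verdict`, and `scan_outbound` are hypothetical, and the three regexes are generic examples, not Pipelock’s rule set.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only (assumption); real rule sets are larger and stricter.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


@dataclass
class Verdict:
    allowed: bool
    matches: list


def scan_outbound(url: str, body: str) -> Verdict:
    """Block the request if the URL or the body matches any credential pattern."""
    haystack = f"{url}\n{body}"
    matches = [name for name, rx in PATTERNS.items() if rx.search(haystack)]
    return Verdict(allowed=not matches, matches=matches)
```

The point of the sketch is where the check runs, not the patterns themselves. Neither an inference guardrail nor an agent-side hook sees the final URL and body on the wire, so this is a check that only the network layer is positioned to make.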
Why open source matters for security
A security tool you can’t read is a security tool you can’t trust. This applies doubly to AI firewalls because they sit on the critical path of every agent action.
Auditability. You can read the scanning logic, the DLP patterns, the injection detection rules, and the logging behavior. No wondering whether the vendor added telemetry in the last update or whether a “block” decision actually blocks.
No vendor lock-in. SaaS firewalls hold your policy configuration, your audit logs, and your integration. If the vendor raises prices, gets acquired, or shuts down, you start over. Open source tools run on your infrastructure. You own the data.
Self-hosted by default. Your agent traffic never leaves your network. Secrets, prompts, and tool call arguments stay on machines you control. For regulated industries, keeping that data in-house is often a hard requirement rather than a preference.
Community review. More eyes on the scanning pipeline means bugs and bypasses get found faster. Published detection rules can be reviewed by anyone, not just the vendor’s internal team.
Open source options compared
Seven open source projects address AI agent security. They solve different problems at different layers.
| Tool | Layer | Language | Content Scanning | MCP Support | Deployment |
|---|---|---|---|---|---|
| Pipelock | Network proxy | Go | DLP (48 patterns), injection (25 patterns), SSRF, encoding evasion (6-pass normalization) | Bidirectional (stdio, HTTP, WebSocket) with tool poisoning + rug-pull detection | Single binary, Docker, K8s companion proxy |
| LlamaFirewall | Inference | Python | PromptGuard classifier, AlignmentCheck, CodeShield | Not documented in public docs | Python library |
| NeMo Guardrails | Inference | Python | Colang-based dialog policy, topical control | Not documented in public docs | Python library |
| Trylon Gateway | Gateway | N/A | Self-hosted firewall for LLM apps | Gateway routing | Container |
| Agent Wall | Network | Node.js | MCP-focused security firewall | MCP-based AI agents | Library / sidecar |
| DefenseClaw | Application sidecar | Python | Policy enforcement, access control | MCP server governance | Sidecar container |
| agentgateway | Gateway | Rust | Not documented in public docs | A2A and MCP routing | Binary, Docker |
What this means in practice:
- LlamaFirewall and NeMo Guardrails are inference-layer tools. They’re strong at classifying unsafe model outputs but don’t see network traffic.
- Trylon Gateway is an LLM-app firewall focused on the application layer rather than agent egress.
- Agent Wall is the closest peer to Pipelock on the MCP side and the most recent entrant. Both target MCP-based agents; Pipelock adds full HTTP, WebSocket, and A2A coverage plus mediator-signed action receipts.
- DefenseClaw governs MCP server access but is not a content-inspecting proxy.
- agentgateway routes agent-to-agent and agent-to-tool traffic, but its public docs do not describe content scanning.
Pipelock is the only option in the table that combines content-inspecting DLP, prompt injection detection, MCP protocol scanning (tool poisoning + rug-pull + chain detection), SSRF protection, and mediator-signed action receipts (cryptographic evidence per decision, with a published cross-language verifier) at the network layer. That’s not a knock on the others. They solve different problems.
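Rug-pull detection is easier to reason about with an example. The general idea is to pin each tool definition the first time it is seen, then flag any later change. The sketch below shows that idea under stated assumptions: `fingerprint` and `ToolPinStore` are hypothetical names, and hashing the `name`, `description`, and `inputSchema` fields is one reasonable choice, not a description of how Pipelock implements it.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """Stable SHA-256 over the tool fields an agent actually reads."""
    canonical = json.dumps(
        {k: tool.get(k) for k in ("name", "description", "inputSchema")},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ToolPinStore:
    """Remembers the first-seen fingerprint of each (server, tool) pair."""

    def __init__(self):
        self._pins = {}

    def check(self, server: str, tools: list) -> list:
        """Pin unseen tools; return names whose definition changed since pinning."""
        changed = []
        for tool in tools:
            key = (server, tool["name"])
            digest = fingerprint(tool)
            pinned = self._pins.setdefault(key, digest)
            if pinned != digest:
                changed.append(tool["name"])
        return changed
```

A proxy sitting on the MCP channel can run a check like this on every tool listing, which is why a description that quietly changes after approval is visible at this layer.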
Proxy vs in-process
The architectural split matters more than the feature list.
Proxy (separate process). The firewall runs as its own process. The agent’s traffic routes through it via HTTPS_PROXY or MCP wrapping. The agent and the firewall don’t share memory. Prompt injection targeting the agent doesn’t directly affect the proxy’s rules or logs. With proper network isolation (container networking, firewall rules), the agent can’t route around the proxy either. The tradeoff: added latency per request (typically 1-5ms for Pipelock) and an extra process to manage.
In-process (library/SDK). The firewall runs inside the agent’s process as an imported library. Lower latency because there’s no inter-process communication. But the agent and the firewall share memory. A sophisticated injection that gains code execution can disable the library, modify its rules, or suppress its logs. The security boundary is the process, and both the attacker and the defender are inside it.
For security-critical deployments, the proxy model wins. The latency cost is negligible next to LLM inference time, which dominates every agent request. The isolation it buys is not negligible.
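The caveat about network isolation can be seen from the client side. HTTPS_PROXY is a convention that well-behaved HTTP clients honor, not an enforcement mechanism. The sketch below uses Python’s standard `urllib.request` to show both behaviors. The helper names `cooperative_proxy` and `bypassing_opener` are hypothetical, and the address is the one this article uses for HTTPS_PROXY.

```python
import os
import urllib.request

PROXY_URL = "http://127.0.0.1:8888"  # address used for HTTPS_PROXY in this article


def cooperative_proxy(scheme: str = "https"):
    """Proxy URL that urllib picks up from the environment, or None if unset."""
    return urllib.request.getproxies().get(scheme)


def bypassing_opener() -> urllib.request.OpenerDirector:
    """An opener built with an empty proxy map, which disables proxy autodetection."""
    return urllib.request.build_opener(urllib.request.ProxyHandler({}))
```

With `https_proxy` set in the environment, `cooperative_proxy()` returns the proxy address, while requests made through `bypassing_opener()` would connect directly. Code running inside a compromised agent can make the second choice, which is why firewall rules or container networking have to close the direct path.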
Self-hosted deployment
Open source AI firewalls should be easy to deploy wherever your agents run. Pipelock ships as a single Go binary with zero runtime dependencies.
Single binary. Download, run. No Python, no Node, no container runtime required. Works on Linux, macOS, and Windows.
Docker. Pull and run the image with a single command. The example publishes the proxy on port 8888.
docker run -p 8888:8888 ghcr.io/luckypipewrench/pipelock:latest
Kubernetes companion proxy. In v2.2.0, the command pipelock init sidecar --inject-spec generates an enforced companion proxy topology: a separate proxy Deployment, a Service, NetworkPolicies, and a bound workload identity. Use this when you want the cluster to enforce that the agent pod can only reach Pipelock. See the Pipelock Kubernetes companion proxy guide for the full walkthrough.
Same-pod sidecar. Pipelock can still run as a same-pod sidecar when you want the fastest rollout, but that is a soft deployment pattern because the agent shares the pod network namespace and still relies on HTTPS_PROXY cooperation.
CI pipeline. Run pipelock scan in CI to check MCP configs, tool descriptions, and policy files before deployment. Catches tool poisoning and policy drift before production.
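As a rough picture of what a pre-deployment check for tool poisoning involves, the sketch below scans MCP tool descriptions for instruction-like text and returns a CI-friendly exit code. It is not the pipelock scan implementation. The heuristics in `SUSPICIOUS` and the function names `scan_tools` and `ci_exit_code` are assumptions made for illustration.

```python
import re

# Hypothetical heuristics (assumption); illustrative, not a real rule set.
SUSPICIOUS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"do not (tell|mention|reveal)[^.]{0,40}\buser\b", re.I),
    re.compile(r"(~/\.ssh|\.env\b|id_rsa)", re.I),
    re.compile(r"<(important|system)>", re.I),
]


def scan_tools(tools: list) -> list:
    """Return (tool_name, matched_pattern) pairs for suspicious descriptions."""
    findings = []
    for tool in tools:
        text = tool.get("description", "")
        for rx in SUSPICIOUS:
            if rx.search(text):
                findings.append((tool.get("name", "?"), rx.pattern))
    return findings


def ci_exit_code(tools: list) -> int:
    """Non-zero fails the pipeline when any description is flagged."""
    return 1 if scan_tools(tools) else 0
```

Running a check like this before deployment means a poisoned description fails the build instead of reaching an agent.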
Getting started with Pipelock
Install and run in under a minute.
# Install via Homebrew
brew install luckyPipewrench/tap/pipelock
# Set up Claude Code with hooks and MCP proxy
pipelock claude setup
# Or proxy any agent's HTTP traffic
export HTTPS_PROXY=http://127.0.0.1:8888
pipelock run
Pipelock scans all traffic for 48 credential patterns, prompt injection (with 6-pass encoding normalization), SSRF, DNS rebinding, and MCP tool poisoning. Every decision lands in the flight recorder audit log, and enabling a signing key adds mediator-signed action receipts on top.
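Encoding normalization matters because a secret that is URL-encoded, hex-encoded, or base64-encoded no longer matches a plain pattern. The sketch below shows the general technique: decode repeatedly, up to a bounded number of passes, and scan every decoded view. The article does not say what each of Pipelock’s six passes does, so the three decoders here are assumptions, `max_passes=6` is only a loop bound, and all names are hypothetical.

```python
import base64
import re
from urllib.parse import unquote

SECRET = re.compile(r"AKIA[0-9A-Z]{16}")  # single illustrative pattern (assumption)
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
HEX_RUN = re.compile(r"(?:[0-9a-fA-F]{2}){8,}")


def _decode_runs(text, run_rx, decoder):
    """Replace each encoded-looking run with its decoded text when it decodes cleanly."""
    def repl(match):
        try:
            return decoder(match.group(0)).decode("utf-8")
        except ValueError:  # covers binascii.Error and UnicodeDecodeError
            return match.group(0)
    return run_rx.sub(repl, text)


def expand(text):
    """Yield one-step decodings of text, each decoder applied on its own."""
    yield unquote(text)
    yield _decode_runs(text, HEX_RUN, bytes.fromhex)
    yield _decode_runs(text, B64_RUN, lambda s: base64.b64decode(s, validate=True))


def contains_secret(text: str, max_passes: int = 6) -> bool:
    """Scan the text and every decoded view reachable within max_passes steps."""
    seen = {text}
    frontier = [text]
    for _ in range(max_passes):
        if any(SECRET.search(view) for view in frontier):
            return True
        next_views = []
        for view in frontier:
            for decoded in expand(view):
                if decoded not in seen:
                    seen.add(decoded)
                    next_views.append(decoded)
        if not next_views:
            return False
        frontier = next_views
    return any(SECRET.search(view) for view in frontier)
```

Keeping each decoding as a separate view, rather than rewriting the text in place, avoids one failure mode: a decoder that misreads part of the payload cannot hide a secret that another view still shows.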
Further reading
- What is an Agent Firewall?: the architecture behind network-layer agent security
- Generative AI Firewall: the broader vendor category and where open source fits
- AI Runtime Security: the runtime threat surface beyond the network layer
- Pipelock: full feature list, deployment options, and framework coverage
- AI Agent Security: three security layers explained
- Pipelock vs LlamaFirewall: network proxy vs inference guardrails
- Pipelock vs DefenseClaw: network proxy vs application sidecar
- Pipelock on GitHub