Open Source AI Firewall: Self-Hosted Agent Security

Runtime agent protection you can inspect, modify, and deploy anywhere.

Ready to protect your own setup?

AI agents have API keys, shell access, and internet connections. When one gets compromised through prompt injection or a poisoned MCP tool, secrets leave through HTTP requests before anyone notices. An AI firewall is the enforcement point between the agent and the network. It scans everything that crosses the wire.

The question is whether that firewall should be a SaaS product you can’t inspect, or open source software you control.

What counts as an AI firewall

The term gets used loosely. Three categories of tools call themselves AI firewalls or AI security layers, and they work at different points in the stack.

Network-level inspection (proxy firewalls). A separate process sits between the agent and the internet. All HTTP, MCP, and WebSocket traffic flows through it. The proxy scans requests and responses for credential leaks, prompt injection, SSRF attempts, and tool poisoning. Because it runs as its own process, prompt injection targeting the agent doesn’t directly compromise the proxy. Full isolation requires network controls (firewall rules or container networking) that prevent the agent from bypassing the proxy or killing the process. Pipelock and GitHub’s agent workflow firewall operate here, though GitHub’s tool does domain allowlisting without content scanning.

Inference guardrails. Libraries that classify prompts and completions at the model layer. They check whether the model’s input or output is “safe” according to a policy. NeMo Guardrails and LlamaFirewall operate here. They don’t see HTTP traffic or MCP messages. A guardrail can approve a prompt while the agent exfiltrates credentials in the next HTTP request.

Agent-side hooks. Code that intercepts tool calls inside the agent’s runtime before execution. Claude Code’s permission system and Sage work this way. Hooks see tool names and arguments but not network traffic. They run inside the agent’s process, which means a successful injection can potentially bypass them.

These are not interchangeable. Each layer sees different data and stops different attacks. A real security posture uses more than one. But the network layer is the hardest for an attacker to circumvent because it runs outside the agent’s trust boundary.

Why open source matters for security

A security tool you can’t read is a security tool you can’t trust. This applies doubly to AI firewalls because they sit on the critical path of every agent action.

Auditability. You can read the scanning logic, the DLP patterns, the injection detection rules, and the logging behavior. No wondering whether the vendor added telemetry in the last update or whether a “block” decision actually blocks.

No vendor lock-in. SaaS firewalls hold your policy configuration, your audit logs, and your integration. If the vendor raises prices, gets acquired, or shuts down, you start over. Open source tools run on your infrastructure. You own the data.

Self-hosted by default. Your agent traffic never leaves your network. Secrets, prompts, and tool call arguments stay on machines you control. For regulated industries, this is not optional.

Community review. More eyes on the scanning pipeline means bugs and bypasses get found faster. Published detection rules can be reviewed by anyone, not just the vendor’s internal team.

Open source options compared

Seven open source projects address AI agent security. They solve different problems at different layers.

ToolLayerLanguageContent ScanningMCP SupportDeployment
PipelockNetwork proxyGoDLP (48 patterns), injection (25 patterns), SSRF, encoding evasion (6-pass normalization)Bidirectional (stdio, HTTP, WebSocket) with tool poisoning + rug-pull detectionSingle binary, Docker, K8s companion proxy
LlamaFirewallInferencePythonPromptGuard classifier, AlignmentCheck, CodeShieldNot documented in public docsPython library
NeMo GuardrailsInferencePythonColang-based dialog policy, topical controlNot documented in public docsPython library
Trylon GatewayGatewayN/ASelf-hosted firewall for LLM appsGateway routingContainer
Agent WallNetworkNode.jsMCP-focused security firewallMCP-based AI agentsLibrary / sidecar
DefenseClawApplication sidecarPythonPolicy enforcement, access controlMCP server governanceSidecar container
agentgatewayGatewayRustNot documented in public docsA2A and MCP routingBinary, Docker

What this means in practice:

  • LlamaFirewall and NeMo Guardrails are inference-layer tools. They’re strong at classifying unsafe model outputs but don’t see network traffic.
  • Trylon Gateway is an LLM-app firewall focused on the application layer rather than agent egress.
  • Agent Wall is the closest peer to Pipelock on the MCP side and the most recent entrant. Both target MCP-based agents; Pipelock adds full HTTP, WebSocket, and A2A coverage plus mediator-signed action receipts.
  • DefenseClaw governs MCP server access but is not a content-inspecting proxy.
  • agentgateway routes agent-to-agent and agent-to-tool traffic but content scanning is not documented in public docs.

Pipelock is the only option in the table that combines content-inspecting DLP, prompt injection detection, MCP protocol scanning (tool poisoning + rug-pull + chain detection), SSRF protection, and mediator-signed action receipts (cryptographic evidence per decision, with a published cross-language verifier) at the network layer. That’s not a knock on the others. They solve different problems.

Proxy vs in-process

The architectural split matters more than the feature list.

Proxy (separate process). The firewall runs as its own process. The agent’s traffic routes through it via HTTPS_PROXY or MCP wrapping. The agent and the firewall don’t share memory. Prompt injection targeting the agent doesn’t directly affect the proxy’s rules or logs. With proper network isolation (container networking, firewall rules), the agent can’t route around the proxy either. The tradeoff: added latency per request (typically 1-5ms for Pipelock) and an extra process to manage.

In-process (library/SDK). The firewall runs inside the agent’s process as an imported library. Lower latency because there’s no inter-process communication. But the agent and the firewall share memory. A sophisticated injection that gains code execution can disable the library, modify its rules, or suppress its logs. The security boundary is the process, and both the attacker and the defender are inside it.

For security-critical deployments, the proxy model wins. The latency cost is negligible compared to LLM inference time (which dominates every agent request). The isolation guarantee is not.

Self-hosted deployment

Open source AI firewalls should be easy to deploy wherever your agents run. Pipelock ships as a single Go binary with zero runtime dependencies.

Single binary. Download, run. No Python, no Node, no container runtime required. Works on Linux, macOS, and Windows.

Docker. Pull and run with two environment variables.

docker run -p 8888:8888 ghcr.io/luckypipewrench/pipelock:latest

Kubernetes companion proxy. In v2.2.0, pipelock init sidecar --inject-spec generates an enforced companion proxy topology with a separate proxy Deployment, Service, NetworkPolicies, and bound workload identity. Use this when you want the cluster to enforce that the agent pod can only reach Pipelock. The full guide is Pipelock Kubernetes companion proxy.

Same-pod sidecar. Pipelock can still run as a same-pod sidecar when you want the fastest rollout, but that is a soft deployment pattern because the agent shares the pod network namespace and still relies on HTTPS_PROXY cooperation.

CI pipeline. Run pipelock scan in CI to check MCP configs, tool descriptions, and policy files before deployment. Catches tool poisoning and policy drift before production.

Getting started with Pipelock

Install and run in under a minute.

# Install via Homebrew
brew install luckyPipewrench/tap/pipelock

# Set up Claude Code with hooks and MCP proxy
pipelock claude setup

# Or proxy any agent's HTTP traffic
export HTTPS_PROXY=http://127.0.0.1:8888
pipelock run

Pipelock scans all traffic for 48 credential patterns, prompt injection (with 6-pass encoding normalization), SSRF, DNS rebinding, and MCP tool poisoning. Every decision lands in the flight recorder audit log, and enabling a signing key adds mediator-signed action receipts on top.

Further reading

Frequently asked questions

What is an open source AI firewall?
An open source AI firewall is a self-hosted security tool that sits between AI agents and the network, scanning traffic for credential leaks, prompt injection, SSRF, and tool poisoning. Because the source code is public, you can audit exactly what it scans, how it decides, and what it logs. No vendor black box.
Why choose open source over SaaS for AI security?
SaaS AI security tools route your agent traffic through a third party’s infrastructure. That means your secrets, prompts, and tool calls pass through someone else’s servers. Open source AI firewalls run on your infrastructure. You control the data, the deployment, and the update schedule. You can audit the scanning logic and verify there are no backdoors or telemetry you didn’t consent to.
What's the difference between an AI firewall and AI guardrails?
AI guardrails operate at the inference layer, classifying prompts and completions for safety. They analyze text semantics. AI firewalls operate at the network layer, scanning actual HTTP requests, MCP tool calls, and WebSocket frames for credential leaks, injection, and SSRF. Guardrails catch unsafe model outputs. Firewalls catch unsafe agent actions on the wire. They complement each other.

Ready to protect your own setup?