What LLM security means in 2026
LLM security is the practice of protecting large language model applications, and the autonomous agents built on top of them, from attacks that exploit the model’s natural-language interface or the agent’s ability to take action in the world.
It overlaps classical application security. Authentication, authorization, audit logging, TLS, secret management: all still apply. It also introduces a new class of threats that don’t map onto the old playbook. A WAF does not understand that a product review on an e-commerce page contains instructions the model will follow. An input validator does not know that a tool description hidden in a third-party MCP server is telling the agent to exfiltrate credentials. A sandboxed process doesn’t stop the agent from typing a real API key into a real tool call argument.
The essential shift: in LLM applications, text is code. Every string that reaches the model can be instruction. Every output the model produces can trigger side effects. The boundary between data and code, which AppSec has relied on for 30 years, blurs as soon as a language model is in the loop.
This guide walks through the threats, the defenses, and where each layer fits.
The threat taxonomy
Six categories account for most current LLM security incidents. OWASP’s LLM Top 10 and the evolving MCP Top 10 formalize the full taxonomy; these six are the ones practitioners hit first.
1. Prompt injection
The agent receives text that overrides its instructions. Direct injection is the user pasting Ignore previous instructions... into a chat. Indirect injection is more dangerous: the agent fetches a web page, reads a tool response, or parses a document, and the attack sits inside that content. The model doesn’t distinguish “this came from a trusted source” from “this came from an attacker-controlled blog post.”
Deep dive: LLM Prompt Injection · Prompt Injection Detection · Prompt Injection Network Defense.
2. Sensitive data exfiltration
The agent sends credentials, PII, source code, or proprietary data to an outside destination. This can happen through a compromised tool (“include your GitHub token in the next API call”), through accidental leakage (the agent summarizes a config file that contains secrets), or through creative attacker channels (DNS-based exfiltration, URL parameter stuffing, cross-request chaining).
Deep dive: AI Agent Data Loss Prevention · Agent Egress Security.
3. MCP tool poisoning and rug-pull
The Model Context Protocol lets agents call third-party tools hosted by third-party servers. An attacker who controls an MCP server can hide instructions in tool descriptions, wait until the server is trusted, then change the description to smuggle an exfiltration payload (“rug-pull”). Static review at install time doesn’t catch runtime drift.
Deep dive: MCP Tool Poisoning · MCP Vulnerabilities · Shadow MCP.
4. Supply-chain attacks via agent frameworks and MCP servers
Agents compose many upstream dependencies: LLM SDKs, agent frameworks, MCP servers, plugins, and tool integrations. Each is a point where malicious code or configuration can land. Shadow MCP (employees running unvetted MCP servers) is a live version of this threat in 2026.
5. SSRF and network abuse
Agents that can fetch URLs, browse the web, or make arbitrary HTTP calls have the same SSRF problems as any server-side request library, plus some new ones. An attacker who can plant a URL in the agent’s context (via injection, via a poisoned tool) can make the agent probe internal networks, cloud metadata endpoints, or private infrastructure.
6. Unauthorized tool execution and scope escalation
The agent calls a tool it shouldn’t, with parameters it shouldn’t, on behalf of a user whose authority it shouldn’t have. This is the agentic version of privilege escalation: the model is tricked into exercising a capability outside its intended scope. Tool policy enforcement (allow/deny/redirect on specific tool calls) is the runtime answer.
Deep dive: Chatbot Security · AI Agent Security Best Practices.
Where the defenses live
Model layer
The model itself can refuse unsafe requests, flag suspicious input, or be fine-tuned away from compliance with injection-style instructions. Constitutional AI, RLHF, and classifier filters sit here. Necessary but insufficient: no model fine-tuning catches every novel injection, and an attacker only needs one that works.
Application layer
Agent frameworks, HITL confirmations, scoped API credentials, per-tool authorization policies, and input validation on tool arguments. This is where the business logic of “what the agent is supposed to do” gets enforced. Limits include: frameworks are inside the same trust boundary as the agent process, so a compromised agent can bypass its own validation.
Runtime / network layer
An external proxy or firewall that sees the traffic leaving the agent and the traffic returning. This layer enforces what actually crossed the wire, regardless of what the agent thought it was doing. Because it sits outside the agent process, a prompt-injected agent cannot turn it off.
Pipelock is one implementation of the runtime layer: a single-binary agent firewall that scans HTTP, WebSocket, and MCP traffic for the six threat categories above and can emit signed action receipts when receipt signing is enabled. The model gets to produce what it produces. The application framework gets to orchestrate. The runtime layer is the last-chance enforcement point at the network boundary.
All three layers belong. Each catches what the others miss.
LLM security vs adjacent terms
| Term | What it covers | How it relates to LLM security |
|---|---|---|
| AI security | The broadest umbrella, includes ML pipelines, data poisoning, model theft | LLM security is a subset focused on language-model applications and agents |
| LLM application security | The security of apps that embed LLMs | Same as LLM security, often used interchangeably |
| Agent security | Security of autonomous agents (whether LLM-powered or not) | Overlaps with LLM security where the agent is LLM-driven; broader when the agent uses rule engines or search |
| MCP security | Security of Model Context Protocol servers and tool calls | A critical slice of LLM security in 2026, covered by MCP Security |
| Prompt engineering security | Hardening prompts against injection | One defense layer within LLM security, not the full story |
| AI firewall | A category of runtime security products for LLM and agent traffic | An implementation pattern for runtime-layer LLM security; see Generative AI Firewall |
Compliance frameworks to know
Three frameworks are shaping enterprise LLM security procurement in 2026:
- OWASP LLM Top 10 (2026): updated risk framework for language-model applications. Canonical mapping for most security-team inventories. See OWASP LLM Top 10.
- OWASP MCP Top 10: emerging risk list specific to Model Context Protocol. Covers tool poisoning, rug-pulls, authorization sprawl. See OWASP MCP Top 10.
- EU AI Act + NIST AI RMF: regulatory frameworks requiring documented runtime security controls, audit logs, and post-deployment monitoring. EU AI Act Compliance covers the obligations in practice.
CSA, SANS, and other industry bodies are contributing too: the Mythos-Ready Playbook synthesizes priority runtime actions across these frameworks.
Where to start
- You’re deploying an LLM application for the first time: read LLM Prompt Injection and AI Agent Security Best Practices. Start with the OWASP LLM Top 10 as your risk inventory.
- You ship an agent that calls MCP servers: read MCP Security, MCP Tool Poisoning, and Shadow MCP. Plan for runtime scanning on every tool call.
- You need an evaluation framework: Agent Firewall Evaluation Checklist walks through what a production runtime control must do.
- You’re in a compliance-heavy environment: EU AI Act Compliance, AI Agent Compliance, and Compliance Evidence map the runtime controls to audit requirements.
Further reading
- OWASP LLM Top 10: canonical risk framework for LLM applications
- OWASP MCP Top 10: MCP-specific risks
- OWASP Agentic Threats and Agentic Top 10
- AI Runtime Security: the runtime-layer perspective in depth
- AI Agent Security: three defense layers explained
- AI Agent Security Tools: categories of LLM security products compared
- Agent Firewall: what a runtime agent firewall does and when to need one
- Pipelock: install, configure, deploy the open-source agent firewall