Security

Threat model, security guarantees, and responsible disclosure.

Security Principles

Defense in Depth

Multiple layers of protection: policy engine, DLP scanning, audit logging, and content pinning work together.

Zero Trust

Every tool call is checked against policy. No implicit trust, even for previously approved tools.

Open Source

Full source code available for audit. Security through transparency, not obscurity.

Threat Model

Prompt Injection

Critical

Malicious instructions embedded in tool outputs that attempt to hijack agent behavior.

Mitigation: DLP scanner detects injection patterns in tool outputs before they reach the agent.

Tool Abuse

Critical

Agents executing dangerous tools (shell commands, file deletion, database drops).

Mitigation: Policy engine blocks dangerous tool calls before execution. Configurable per-tool rules.

Data Exfiltration

High

Sensitive data (PII, API keys, credentials) leaking through tool outputs.

Mitigation: DLP scanner detects and redacts secrets, PII, and credentials in real time.

Rug-Pull Attacks

High

MCP server updates that introduce malicious behavior after initial trust is established.

Mitigation: SHA-256 content pinning detects changes to previously approved tool definitions.

Toxic Capability Chains

Medium

Combinations of individually safe tools that become dangerous together (read + exfil).

Mitigation: Toxic flow detection using keyword + AST analysis identifies dangerous combinations.

Privilege Escalation

Medium

Agents gaining access to tools or data outside their intended scope.

Mitigation: Allowlist enforcement restricts agents to pre-approved tool sets only.

What We Don't Claim

  • A security score of 10.0 means "zero issues found", not "provably secure". No tool can guarantee the absence of vulnerabilities.
  • Static analysis has inherent limitations. SpiderShield catches known patterns but cannot detect novel attack vectors.
  • Runtime guards add a defense layer but do not replace secure coding practices in MCP servers.

Responsible Disclosure

If you discover a security vulnerability in SpiderShield, please report it responsibly.

Email: security@spidershield.dev

We aim to respond within 48 hours and will coordinate disclosure timelines with you.