Security

Threat model, security guarantees, and responsible disclosure.

Security Principles

Multiple layers of protection work together: the policy engine, DLP scanning, audit logging, and content pinning.

Every tool call is checked against policy. No implicit trust, even for previously approved tools.

The full source code is available for audit. Security through transparency, not obscurity.

Critical

Malicious instructions embedded in tool outputs that attempt to hijack agent behavior.

Mitigation: DLP scanner detects injection patterns in tool outputs before they reach the agent.
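To make the idea concrete, here is a minimal sketch of pattern-based injection detection on a tool output. It is not SpiderShield's actual scanner: the pattern list and the names `INJECTION_PATTERNS`, `find_injections`, and `guard_output` are assumptions chosen for illustration, and a real scanner would need a much larger, maintained pattern set.

```python
import re

# Hypothetical patterns for illustration only.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"disregard\s+(the\s+)?system\s+prompt", re.IGNORECASE),
]


def find_injections(tool_output: str) -> list[str]:
    """Return the text of every suspicious match found in a tool output."""
    hits = []
    for pattern in INJECTION_PATTERNS:
        hits.extend(match.group(0) for match in pattern.finditer(tool_output))
    return hits


def guard_output(tool_output: str) -> str:
    """Pass clean output through; withhold output that matches any pattern."""
    if find_injections(tool_output):
        return "[blocked: possible prompt injection]"
    return tool_output
```

The check runs on the output before it is handed to the agent, so a flagged string never enters the agent's context.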

Critical

Agents executing dangerous tools (shell commands, file deletion, database drops).

Mitigation: Policy engine blocks dangerous tool calls before execution. Configurable per-tool rules.
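Per-tool rules of this kind might be sketched as follows. The `Rule` shape, the tool names, and the first-match-wins evaluation order are all assumptions made for the example, not SpiderShield's real policy format.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    tool: str               # glob pattern matched against the tool name
    action: str             # "allow" or "deny"
    arg_contains: str = ""  # optional substring required in the arguments


# Hypothetical rule set; tool names are illustrative.
RULES = [
    Rule(tool="shell_exec", action="deny", arg_contains="rm -rf"),
    Rule(tool="db_*", action="deny", arg_contains="DROP "),
    Rule(tool="delete_file", action="deny"),
]


def evaluate(tool: str, arguments: str, rules=RULES, default="allow") -> str:
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        # An empty arg_contains matches any arguments string.
        if fnmatchcase(tool, rule.tool) and rule.arg_contains in arguments:
            return rule.action
    return default
```

The decision is made before the tool runs, so a "deny" result means the call is never executed. A stricter deployment could pass `default="deny"` and list only the calls it wants to permit.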

High

Sensitive data (PII, API keys, credentials) leaking through tool outputs.

Mitigation: DLP scanner detects and redacts secrets, PII, and credentials in real time.
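Redaction can be illustrated with a small regex-based sketch. The three patterns and the `[REDACTED:...]` placeholder format are assumptions for the example. Production scanners typically combine many more patterns with validation steps to reduce false positives.

```python
import re

# Hypothetical, deliberately small rule set.
REDACTION_RULES = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "AWS_ACCESS_KEY": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace every match of each rule with a labeled placeholder."""
    for label, pattern in REDACTION_RULES.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text
```

For example, `redact("mail bob@example.com")` returns `"mail [REDACTED:EMAIL]"`, so the agent sees that a value was present without seeing the value itself.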

High

MCP server updates that introduce malicious behavior after initial trust is established.

Mitigation: SHA-256 content pinning detects changes to previously approved tool definitions.
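The pinning idea can be sketched with Python's standard `hashlib` and `json` modules. The helper names `pin` and `changed_tools`, and the choice to hash a canonical JSON serialization of the tool definition, are assumptions for illustration. How SpiderShield actually serializes definitions is not stated here.

```python
import hashlib
import json


def pin(tool_definition: dict) -> str:
    """SHA-256 hex digest of a canonical JSON form of the definition."""
    # sort_keys and fixed separators make the digest independent of key order.
    canonical = json.dumps(tool_definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_tools(pinned: dict[str, str], current: dict[str, dict]) -> list[str]:
    """Names of tools that are new or no longer match their stored pin."""
    return sorted(
        name
        for name, definition in current.items()
        if pinned.get(name) != pin(definition)
    )
```

Pins are recorded when a tool is approved. On later runs, any tool whose digest differs (for instance, because its description was edited to carry hidden instructions) is reported and can be sent back for review.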

Medium

Combinations of individually safe tools that become dangerous together (for example, reading data with one tool and exfiltrating it with another).

Mitigation: Toxic flow detection uses keyword and AST analysis to identify dangerous combinations.
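A rough sketch of how keyword and AST signals might be combined is shown below. The keyword sets, the `NETWORK_CALLS` names, and the "read" and "exfil" capability labels are all assumptions. The real detector's heuristics are not described here and are likely more thorough.

```python
import ast
from itertools import permutations

# Hypothetical signal lists for illustration.
READ_KEYWORDS = {"read", "fetch", "query"}
SEND_KEYWORDS = {"send", "upload", "email"}
NETWORK_CALLS = {"post", "put", "urlopen"}


def capabilities(description: str, source: str = "") -> set[str]:
    """Infer coarse capabilities from a tool's description and source code."""
    caps = set()
    words = set(description.lower().split())
    if words & READ_KEYWORDS:
        caps.add("read")
    if words & SEND_KEYWORDS:
        caps.add("exfil")
    # AST pass: calls whose name suggests outbound network traffic.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", "")
            if name in NETWORK_CALLS:
                caps.add("exfil")
    return caps


def toxic_pairs(tools: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Ordered (reader, sender) pairs of distinct tools that form a risky flow."""
    return sorted(
        (a, b)
        for a, b in permutations(tools, 2)
        if "read" in tools[a] and "exfil" in tools[b]
    )
```

In this sketch the AST pass catches a tool whose description sounds harmless but whose code calls something like `requests.post`, which keyword matching alone would miss.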

Medium

Agents gaining access to tools or data outside their intended scope.

Mitigation: Allowlist enforcement restricts agents to pre-approved tool sets only.
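Allowlist enforcement reduces to a membership check that fails closed. In the sketch below, the agent and tool names, the `ALLOWLISTS` mapping, and the `ToolNotAllowed` exception are hypothetical.

```python
# Hypothetical per-agent allowlists.
ALLOWLISTS: dict[str, frozenset[str]] = {
    "support-bot": frozenset({"search_docs", "create_ticket"}),
}


class ToolNotAllowed(PermissionError):
    """Raised when an agent requests a tool outside its allowlist."""


def check_allowed(agent: str, tool: str) -> None:
    # Unknown agents get an empty allowlist, so the check fails closed.
    allowed = ALLOWLISTS.get(agent, frozenset())
    if tool not in allowed:
        raise ToolNotAllowed(f"{agent!r} may not call {tool!r}")
```

Calling `check_allowed("support-bot", "search_docs")` returns silently, while a request for any tool not listed for that agent raises `ToolNotAllowed`.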

⚠ A security score of 10.0 means "zero issues found", not "provably secure". No tool can guarantee the absence of vulnerabilities.
⚠ Static analysis has inherent limitations. SpiderShield catches known patterns but cannot detect novel attack vectors.
⚠ Runtime guards add a defense layer but do not replace secure coding practices in MCP servers.

If you discover a security vulnerability in SpiderShield, please report it responsibly.

We aim to respond within 48 hours and will coordinate disclosure timelines with you.