DLP Scanner

Detect and redact PII, API keys, secrets, and prompt injection in tool outputs.

How It Works

1

Tool executes

The tool runs and produces output (file contents, database rows, API responses).

2

DLP scans output

Pattern matching detects PII, secrets, and injection attempts in the raw output.

3

Action taken

Based on DLP mode: log the finding, redact the sensitive data, or block the entire output.

Usage

python
from spidershield import SpiderGuard

guard = SpiderGuard(policy="balanced", dlp="redact")

# Tool returns sensitive data
raw_output = """
User: John Smith
Email: john@example.com
SSN: 123-45-6789
API Key: sk-proj-abc123def456
"""

# DLP scans and redacts
clean = guard.after_check("read_database", raw_output)
# Output:
# User: John Smith
# Email: [EMAIL_REDACTED]
# SSN: [SSN_REDACTED]
# API Key: [API_KEY_REDACTED]

DLP Modes

python
# Log mode — detect and log, don't modify output
guard = SpiderGuard(dlp="log")

# Redact mode — replace sensitive data with placeholders
guard = SpiderGuard(dlp="redact")

# Block mode — reject the entire output if sensitive data found
guard = SpiderGuard(dlp="block")

What We Detect

PII

  • Email addresses
  • Phone numbers
  • Social Security Numbers
  • Credit card numbers
  • Physical addresses
  • Names (contextual)

Secrets

  • API keys (OpenAI, AWS, Stripe, etc.)
  • OAuth tokens
  • JWT tokens
  • Database connection strings
  • Private keys (RSA, SSH)
  • Password patterns

Injection

  • Prompt injection attempts
  • System prompt extraction
  • Jailbreak patterns
  • Instruction override attempts
  • Role manipulation
  • Context window attacks