prompt-injection-detector¶
PreToolUse guard that flags suspected prompt-injection patterns in WebFetch/WebSearch/Read input.
Trigger¶
- Event:
PreToolUse - Matcher:
WebFetch|WebSearch|Read
What it blocks¶
Naive jailbreak prefixes commonly found in scraped hostile content:
- "Ignore (all/previous/prior/above) instructions"
- "Disregard (the) system prompt"
- "You are now (DAN/jailbroken/unrestricted)"
- "Enable developer mode"
- "Print/reveal/show (the) system prompt"
Exit codes¶
0— allow2— block
Kill switches¶
CLAUDE_HARNESSES_DISABLE=1
Limits¶
Heuristic. It will not catch sophisticated payloads (encoded, multilingual, hidden in markdown), but it will catch the common naive cases. Combine with conservative permissions on WebFetch and Read of untrusted paths.
Pack: safety-pack