Safety Model
claude-harnesses layers four safety mechanisms. Each is independent.
1. Permissions (settings.json)
The first line of defense is settings.json permissions. Tools listed in permissions.deny are blocked before any hook runs. The settings/{strict,default,experimental}.json presets pre-populate sensible deny rules:
- Bash(rm -rf /*) and ~ / $HOME variants
- Bash(git push --force*), Bash(git reset --hard*)
- Bash(chmod -R 777*)
- Bash(curl * | sh*), Bash(wget * | sh*)
- Read(.env), Read(.env.*), Read(**/id_rsa), Read(**/*.pem)
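As a rough illustration of where these rules live, the sketch below builds a settings dict with the rules under permissions.deny and serializes it to JSON. The rule strings are copied from the list above; the helper name build_settings is hypothetical, and the ~ and $HOME variants are omitted because the text does not spell them out. The shipped presets may contain more or differently worded rules.

```python
import json

# Deny rules copied from the list above. The "~" and "$HOME" variants
# are left out because the source does not give their exact form.
DENY_RULES = [
    "Bash(rm -rf /*)",
    "Bash(git push --force*)",
    "Bash(git reset --hard*)",
    "Bash(chmod -R 777*)",
    "Bash(curl * | sh*)",
    "Bash(wget * | sh*)",
    "Read(.env)",
    "Read(.env.*)",
    "Read(**/id_rsa)",
    "Read(**/*.pem)",
]


def build_settings(deny_rules):
    """Return a settings dict with the rules under permissions.deny."""
    return {"permissions": {"deny": list(deny_rules)}}


# Serialize in the shape a settings.json file would hold.
settings_json = json.dumps(build_settings(DENY_RULES), indent=2)
print(settings_json)
```

Generating the file from a list like this is only one option; editing the preset JSON by hand works just as well.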
2. PreToolUse guard hooks (safety-pack)
What permissions cannot pattern-match (high-entropy strings, contextual checks), the safety-pack hooks catch:
- secret-guard blocks tool calls containing OpenAI/GitHub/AWS keys, private-key blocks, .env secret assignments.
- dangerous-command-guard blocks dd of=/dev/sd*, piped curl, etc.
- branch-protection-guard blocks pushes/commits to main/master/production/release.
- prompt-injection-detector flags naive jailbreak prefixes in WebFetch/WebSearch/Read.
- mcp-tool-allowlist enforces an allowlist for MCP tool calls.
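To make the idea of a guard hook concrete, here is a minimal sketch of what a secret-guard style check could look like. It is not the pack's actual implementation: the regular expressions, the names find_secrets and run_hook, and the message format are all assumptions. It does rely on the Claude Code hook convention that a PreToolUse hook receives a JSON payload containing tool_input on stdin and that exit code 2 blocks the call.

```python
import json
import re

# Illustrative patterns only (assumptions, not the pack's real rules).
SECRET_PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "openai-key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "private-key-block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(text):
    """Return the sorted names of every pattern that matches text."""
    return sorted(
        name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)
    )


def run_hook(stdin_text):
    """Return (exit_code, message) for one PreToolUse payload."""
    payload = json.loads(stdin_text)
    # Serialize the whole tool input so nested fields are scanned too.
    haystack = json.dumps(payload.get("tool_input", {}))
    hits = find_secrets(haystack)
    if hits:
        return 2, "secret-guard: blocked, matched " + ", ".join(hits)
    return 0, ""
```

A real hook script would read sys.stdin, write the message to stderr, and exit with the returned code. Keeping the decision in a pure function makes the guard easy to unit-test without spawning a process.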
3. Stop hook verification (verification-pack)
stop-verify blocks the Stop event until scripts/verify.sh passes. It honors stop_hook_active so it does not loop forever; if Claude Code is already inside a previously-blocked Stop, the hook lets it finish.
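The loop-avoidance logic can be sketched roughly as follows. This is an assumption-laden illustration, not the pack's code: the function names and the reason text are hypothetical, and it assumes the Stop hook payload carries a boolean stop_hook_active field and that returning {"decision": "block", ...} as JSON is how a Stop is blocked.

```python
import subprocess


def verify_passes(script="scripts/verify.sh"):
    """Run the verification script; True when it exits with status 0."""
    return subprocess.run(["bash", script]).returncode == 0


def stop_decision(payload, verify=verify_passes):
    """Return the hook output for a Stop event as a dict.

    An empty dict lets the Stop proceed.
    """
    # Already inside a previously blocked Stop: let it finish so the
    # hook cannot keep Claude Code looping forever.
    if payload.get("stop_hook_active"):
        return {}
    if verify():
        return {}
    return {
        "decision": "block",
        "reason": "scripts/verify.sh failed; fix the failures and retry",
    }
```

The verify callable is injectable so the decision logic can be exercised without running the real script. A hook entry point would parse stdin as JSON, call stop_decision, and print the result as JSON.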
4. Cost ceiling (long-running-pack)
cost-ceiling-guard tracks tool call count per 24h window in ~/.claude-harnesses/cost-ledger.json and forces a stop past the threshold (default 5000). This is the guardrail against runaway loops.
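A minimal sketch of the counting logic, under stated assumptions: the ledger schema (window_start, count), the fixed rather than rolling 24h window, and the function names are all hypothetical, since the text only specifies the file location, the window length, and the default threshold of 5000.

```python
import json
import time
from pathlib import Path

WINDOW_SECONDS = 24 * 60 * 60
DEFAULT_THRESHOLD = 5000


def record_call(ledger, now, threshold=DEFAULT_THRESHOLD):
    """Count one tool call; return (updated_ledger, should_stop)."""
    start = ledger.get("window_start", now)
    count = ledger.get("count", 0)
    # Start a fresh window once the current one is 24h old.
    if now - start >= WINDOW_SECONDS:
        start, count = now, 0
    count += 1
    # "Past the threshold" is read here as strictly greater than it.
    return {"window_start": start, "count": count}, count > threshold


def update_ledger_file(path, threshold=DEFAULT_THRESHOLD):
    """Load the ledger at path, record one call, and write it back."""
    path = Path(path)
    ledger = json.loads(path.read_text()) if path.exists() else {}
    ledger, should_stop = record_call(ledger, time.time(), threshold)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(ledger))
    return should_stop
```

Separating record_call from the file handling keeps the window arithmetic testable with fixed timestamps. A production guard would also need to consider concurrent writers, which this sketch ignores.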
Kill switch
Every guard honors CLAUDE_HARNESSES_DISABLE=1. If a hook misfires and you cannot make progress, set the variable, recover, then unset it.
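One plausible way for a guard to honor the switch is to check the variable before doing anything else. The helper names below are hypothetical, and treating only the exact value "1" as "disabled" is an assumption drawn from the wording above.

```python
import os


def guards_disabled(env=None):
    """True when CLAUDE_HARNESSES_DISABLE is set to exactly "1"."""
    env = os.environ if env is None else env
    return env.get("CLAUDE_HARNESSES_DISABLE") == "1"


def guard_exit_code(would_block, env=None):
    """Exit code a guard would return: 0 allows, 2 blocks."""
    # The kill switch wins over whatever the guard itself decided.
    if guards_disabled(env):
        return 0
    return 2 if would_block else 0
```

Passing env explicitly keeps the check testable. With no argument, the functions read the real process environment.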
What this does NOT replace
- Sandboxing the runtime (containers, restricted filesystems)
- Human review of generated PRs
- Real CI checks
- Real secrets management