Warden
beta · v2.14.0
Documentation

How It Works

Warden sits between your AI assistant and your codebase. Every tool call — bash commands, file reads, file writes — passes through Warden before it reaches your environment.


The Hook Flow

When your AI agent issues a tool call:

  1. The hook fires — Claude Code, Gemini CLI, and Codex CLI support hook scripts that run before and after tool calls. Warden registers itself as those hooks during installation.
  2. Warden evaluates — the call is checked against compiled safety and quality patterns in a single fast pass. Session context is factored in.
  3. A decision is made:
    • Pass — the command proceeds silently.
    • Deny — the command is blocked with an explanation and alternative.
    • Teach — the command runs, and the agent receives a targeted hint.
    • Apply — the command is rewritten to a safer or more efficient form.
    • Require structure — the agent is asked to restructure its approach.
  4. Post-processing — after the command runs, Warden can compress verbose output, update session tracking, and detect patterns like loops or drift.
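
As a rough illustration of steps 2 and 3, the sketch below maps a command to one of the decisions listed above. It is not Warden's implementation: the names Decision, Verdict, and evaluate, and the three toy rules, are hypothetical and chosen only to show the shape of the flow. The rewrite rule also assumes ripgrep (rg) is installed.

```python
import shlex
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    PASS = "pass"
    DENY = "deny"
    TEACH = "teach"
    APPLY = "apply"
    REQUIRE_STRUCTURE = "require_structure"  # listed for completeness; unused below


@dataclass
class Verdict:
    decision: Decision
    message: str = ""                # explanation or hint returned to the agent
    rewritten: Optional[str] = None  # replacement command, set only for APPLY


def evaluate(command: str) -> Verdict:
    """Toy evaluation of a bash tool call. All rules here are hypothetical."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fall back to a plain whitespace split.
        tokens = command.split()

    if tokens[:2] == ["git", "push"] and "--force" in tokens:
        return Verdict(Decision.DENY,
                       "Force push is blocked. Consider --force-with-lease.")
    if tokens[:2] == ["grep", "-r"]:
        # Rewrite to a faster equivalent (assumes ripgrep is available).
        return Verdict(Decision.APPLY, "Rewritten to use rg.",
                       shlex.join(["rg"] + tokens[2:]))
    if tokens == ["git", "add", "."]:
        return Verdict(Decision.TEACH,
                       "Hint: stage specific paths to avoid unrelated files.")
    return Verdict(Decision.PASS)
```

Matching on parsed tokens rather than raw substrings keeps git push --force-with-lease from being mistaken for git push --force.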

Four Capabilities

Safety & Enforcement (deterministic)

The first thing Warden checks: should this be blocked?

Warden matches the actual command against compiled patterns for known-dangerous commands, hallucinated flags, credential exposure, and unsafe operations. The result is deterministic: the same input always produces the same decision.

Safety enforcement cannot be bypassed by prompt injection, context manipulation, or the agent’s own reasoning. The check happens outside the model’s context window, on the actual command.
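
One way to picture a deterministic, single-pass check is a set of rules compiled into one regular expression. The sketch below is an assumption about how such a check could look, not Warden's actual rule set: the RULES table, its three patterns, and the check function are hypothetical.

```python
import re
from typing import Optional

# Hypothetical rules for illustration; Warden's real pattern set is not shown here.
RULES = {
    # rm with a recursive flag aimed at the filesystem root
    "rm_root": r"\brm\s+-\w*r\w*\s+/(?:\s|$)",
    # a string shaped like an AWS access key ID
    "aws_key": r"\bAKIA[0-9A-Z]{16}\b",
    # git push with a bare --force flag
    "force_push": r"\bgit\s+push\b.*\s--force(?:\s|$)",
}

# Compile every rule into one alternation of named groups so that a single
# search covers all of them.
COMBINED = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in RULES.items())
)


def check(command: str) -> Optional[str]:
    """Return the name of the first rule that matches, or None if none does."""
    match = COMBINED.search(command)
    return match.lastgroup if match else None
```

Because the function depends only on the command string, repeated calls with the same input return the same rule name, which is the property the section describes.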

Session Guidance (heuristic)

Warden tracks session health: is the session drifting?

It monitors focus, loop patterns, verification debt (edits without tests), and drift from the original goal. When signals indicate degradation, Warden emits a targeted advisory. When the session is healthy, it stays completely silent.

Session guidance is heuristic — it improves sessions but degrades gracefully when signals are ambiguous. Most of this state is stored out-of-band, not in the model’s context window.
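
To make two of these signals concrete, here is a minimal sketch of verification debt and loop detection kept in out-of-band state. The thresholds, the SessionState class, and the is_test_command helper are all hypothetical; the real heuristics are not described in this document.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

DEBT_LIMIT = 5  # hypothetical: edits tolerated before a test run is suggested
LOOP_LIMIT = 3  # hypothetical: identical commands tolerated in the window
WINDOW = 10     # hypothetical: number of recent commands remembered


def is_test_command(command: str) -> bool:
    """Crude guess at whether a command runs tests (illustrative only)."""
    return command.startswith(("pytest", "npm test", "cargo test", "go test"))


@dataclass
class SessionState:
    edits_since_test: int = 0
    recent: deque = field(default_factory=lambda: deque(maxlen=WINDOW))

    def observe(self, kind: str, detail: str) -> Optional[str]:
        """Record one tool call; return an advisory, or None when healthy."""
        if kind == "edit":
            self.edits_since_test += 1
            if self.edits_since_test >= DEBT_LIMIT:
                return (f"{self.edits_since_test} edits since the last test "
                        "run; consider running tests.")
        elif kind == "bash":
            if is_test_command(detail):
                self.edits_since_test = 0  # a test run clears the debt
            self.recent.append(detail)
            if self.recent.count(detail) >= LOOP_LIMIT:
                return "Same command repeated recently; this may be a loop."
        return None
```

Returning None for healthy calls mirrors the behavior described above: no advisory is produced unless a threshold is crossed.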

Cross-Session Learning

When a session ends, Warden processes what happened: what’s worth remembering?

Repair patterns, project conventions, working-set rankings, and compact resume summaries are stored locally. When the next session starts — or after context compaction — this knowledge is injected so the agent doesn’t lose hard-won context.
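
The store-then-inject cycle can be sketched with a local JSON file. This is an assumed layout for illustration only: the field names (goal, conventions, dead_ends), the file location, and both function names are hypothetical, and Warden's actual storage format is not specified here.

```python
import json
from pathlib import Path


def save_summary(store: Path, summary: dict) -> None:
    """Write the end-of-session summary to a local JSON file."""
    store.parent.mkdir(parents=True, exist_ok=True)
    store.write_text(json.dumps(summary, indent=2), encoding="utf-8")


def load_resume_text(store: Path) -> str:
    """Render the stored summary as compact text to inject at session start."""
    if not store.exists():
        return ""  # first session: nothing to inject
    data = json.loads(store.read_text(encoding="utf-8"))
    lines = [f"Goal: {data.get('goal', 'unknown')}"]
    lines += [f"Convention: {c}" for c in data.get("conventions", [])]
    lines += [f"Dead end: {d}" for d in data.get("dead_ends", [])]
    return "\n".join(lines)
```

A caller would invoke save_summary when a session ends and load_resume_text when the next one starts or after context compaction.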

Output Efficiency (deterministic where filter rules exist)

Every tool output is evaluated: how much of this does the agent actually need?

For example, a 40,000-token build log can become a 2,000-token error summary. Passing tests are stripped, progress bars are removed, and only errors, warnings, and the final result are preserved, so the agent gets what it needs for the next decision.
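
A filter of this kind can be approximated in a few lines. The sketch below keeps lines that mention errors, warnings, or failures, plus the final non-empty line. The KEEP pattern and the compress function are hypothetical stand-ins; real filter rules would be specific to each tool's output format.

```python
import re

# Hypothetical keep-rule: lines that mention an error, warning, or failure.
KEEP = re.compile(r"\b(error|warning|failed|fail)\b", re.IGNORECASE)


def compress(output: str) -> str:
    """Drop noise from tool output, keeping problems and the final result."""
    lines = output.splitlines()
    kept = [line for line in lines if KEEP.search(line)]
    # The last non-empty line usually carries the overall result.
    final = next((line for line in reversed(lines) if line.strip()), "")
    if final and (not kept or kept[-1] != final):
        kept.append(final)
    return "\n".join(kept)
```

Passing-test lines and progress bars match nothing in KEEP, so they fall away without needing rules of their own.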


How Capabilities Compose

Not every tool call involves all four capabilities. A simple cat README.md touches only safety (it is not blocked), and that’s it. A git push --force can involve all four: safety denies it, session tracking notes the attempt, learning records it as a dead end, and, had it been allowed to run, its output would have been compressed.

In a typical session:

  • 90%+ of tool calls are silently allowed — safety check passes, no advisory needed, no compression required.
  • 5-8% get output compressed — build logs, test suites, install output.
  • 1-3% trigger an advisory — verification debt, focus drift, suboptimal tool choice.
  • <1% are denied — dangerous commands, credential exposure, hallucinated flags.

The goal is invisibility. When a session is healthy, Warden adds zero visible overhead. When a session starts degrading, it intervenes with the minimum effective correction.