Kill Switch Design for AI Agents (Stop Writes Fast) + Code

When your agent starts doing damage, you need a kill switch that actually stops it: global + per-tenant toggles, tool-level disable, and safe shutdown semantics.

Problem-first intro

Your agent is doing the wrong thing.

Not “the answer is a bit off”. Wrong thing as in:

  • sending duplicate emails
  • creating tickets in bulk
  • hammering an API until you get rate-limited

And now the important part: you don’t have time to “fix the prompt and redeploy”.

You need a kill switch that:

  • works right now
  • is auditable (who flipped it, when, why)
  • stops the side effects, not just the UI

If your kill switch only lives in the frontend, it’s not a kill switch. It’s a placebo. If your kill switch is an env var, it’s a deploy. Incidents don’t wait for deploys.
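To make "auditable" concrete: every flip should record who, when, and why next to the state change. A minimal sketch, using an in-memory stand-in for the real flag store (names like FlagStore and the record fields are illustrative, not any specific product's API):

```python
import time
from dataclasses import dataclass, field


@dataclass
class FlagStore:
    """In-memory stand-in for a real flag store (Redis, DB, feature-flag service)."""
    flags: dict = field(default_factory=dict)
    audit: list = field(default_factory=list)

    def flip(self, flag: str, value: bool, *, operator: str, reason: str) -> None:
        # State change and audit record go together; in a real store, write them atomically.
        self.flags[flag] = value
        self.audit.append({
            "flag": flag, "value": value,
            "operator": operator, "reason": reason,
            "at": time.time(),
        })


store = FlagStore()
store.flip("agent_kill_global", True,
           operator="oncall@example.com", reason="INC-1234: duplicate emails")
print(store.audit[-1]["reason"])  # INC-1234: duplicate emails
```

The point of the shape: you can answer "who flipped it and why" from the store itself, without grepping chat logs during the incident.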

Why this fails in production

1) Teams build “pause” buttons that don’t pause anything

Common anti-design:

  • UI hides the button
  • API still runs the agent loop
  • tool gateway still executes writes

If tool calls still go through, you didn’t stop the incident. You renamed it.

2) Kill switches that aren’t checked in the tool gateway leak

If you check the kill switch:

  • in one route
  • but not in background jobs
  • and not in the tool gateway

…you will miss a path.

3) “Stop the run” is not enough

In-flight tool calls exist:

  • long HTTP calls
  • browser sessions
  • queue workers already executing

Your kill switch needs semantics:

  • stop new runs
  • stop new tool calls
  • optionally force-cancel in-flight work (best-effort)
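The three semantics can be sketched in a few lines with asyncio. This is a toy, not a production cancellation framework: the event stands in for the kill flag, and "best-effort" means the in-flight coroutine decides how to unwind when cancelled.

```python
import asyncio


async def demo_kill_semantics() -> list[str]:
    stop_new = asyncio.Event()  # flipping this blocks new runs and new tool calls

    async def tool_call(name: str) -> str:
        if stop_new.is_set():
            raise RuntimeError(f"killed: {name}")  # stop NEW tool calls
        try:
            await asyncio.sleep(10)                # simulated long in-flight call
            return f"{name}: done"
        except asyncio.CancelledError:
            return f"{name}: cancelled"            # best-effort in-flight cancel

    inflight = asyncio.create_task(tool_call("slow_http"))
    await asyncio.sleep(0)   # let the call reach its await point
    stop_new.set()           # semantics 1 + 2: no new runs, no new tool calls
    inflight.cancel()        # semantics 3: force-cancel in-flight work
    results = [await inflight]
    try:
        await tool_call("new_call")
    except RuntimeError as e:
        results.append(str(e))
    return results


print(asyncio.run(demo_kill_semantics()))
```

Note the asymmetry: blocking new work is deterministic, but cancelling in-flight work depends on the worker cooperating at an await point. Long synchronous calls and external browser sessions won't see the cancel at all.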

4) Scope matters: global vs per-tenant

You don’t want to stop the whole product because one tenant is triggering a loop. You want:

  • global switch (nuclear)
  • per-tenant switch (surgical)
  • per-tool disable list (e.g., “no browser today”)

Implementation example (real code)

This pattern:

  • reads kill switch state from a shared store (pseudo)
  • checks it in two places: loop + tool gateway
  • distinguishes “stop all” vs “disable writes”
PYTHON
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class KillState:
    stop_all: bool = False
    disable_writes: bool = True  # fail closed: default to read-only
    disabled_tools: frozenset[str] = frozenset()


class Killed(RuntimeError):
    pass


def load_kill_state(*, tenant_id: str) -> KillState:
    # Pseudo: Redis/DB/feature-flag service. Must be fast + reliable.
    # Split global + per-tenant state.
    global_state = read_flag("agent_kill_global")  # (pseudo)
    tenant_state = read_flag(f"agent_kill_tenant:{tenant_id}")  # (pseudo)
    disable_writes = read_flag("agent_disable_writes")  # (pseudo)
    disabled_tools = frozenset(read_list("agent_disabled_tools"))  # (pseudo)

    return KillState(
        stop_all=bool(global_state or tenant_state),
        disable_writes=bool(disable_writes),
        disabled_tools=disabled_tools,
    )


WRITE_TOOLS = {"email.send", "db.write", "ticket.create", "ticket.close"}


def guard_tool_call(*, kill: KillState, tool: str) -> None:
    if kill.stop_all:
        raise Killed("killed: stop_all")
    if tool in kill.disabled_tools:
        raise Killed(f"killed: tool_disabled:{tool}")
    if kill.disable_writes and tool in WRITE_TOOLS:
        raise Killed(f"killed: writes_disabled:{tool}")


def run(task: str, *, tenant_id: str, tools) -> dict[str, Any]:
    for _ in range(1000):
        # Re-read every step so a flip takes effect mid-run
        # (cache for seconds, not minutes).
        kill = load_kill_state(tenant_id=tenant_id)
        if kill.stop_all:
            return {"status": "stopped", "stop_reason": "killed"}

        action = llm_decide(task)  # (pseudo)
        if action.kind != "tool":
            return {"status": "ok", "answer": action.final_answer}

        guard_tool_call(kill=kill, tool=action.name)
        obs = tools.call(action.name, action.args)  # (pseudo)
        task = update(task, action, obs)  # (pseudo)

    return {"status": "stopped", "stop_reason": "max_steps"}
JAVASCRIPT
const WRITE_TOOLS = new Set(["email.send", "db.write", "ticket.create", "ticket.close"]);

export class Killed extends Error {}

export function loadKillState({ tenantId }) {
  // Pseudo: feature-flag store. Must be fast + reliable.
  const globalStop = readFlag("agent_kill_global"); // (pseudo)
  const tenantStop = readFlag(`agent_kill_tenant:${tenantId}`); // (pseudo)
  const disableWrites = readFlag("agent_disable_writes"); // (pseudo)
  const disabledTools = new Set(readList("agent_disabled_tools")); // (pseudo)
  return { stopAll: Boolean(globalStop || tenantStop), disableWrites: Boolean(disableWrites), disabledTools };
}

export function guardToolCall({ kill, tool }) {
  if (kill.stopAll) throw new Killed("killed: stop_all");
  if (kill.disabledTools?.has(tool)) throw new Killed(`killed: tool_disabled:${tool}`);
  if (kill.disableWrites && WRITE_TOOLS.has(tool)) throw new Killed(`killed: writes_disabled:${tool}`);
}

Real failure case (incident-style, with numbers)

We had an agent that drafted and sent follow-up emails. It was behind a “send_email” tool and (oops) no approval gate yet.

A prompt change caused it to interpret “follow up” as “send now”.

Impact in 22 minutes:

  • 117 emails sent (some duplicates)
  • we spent ~4 hours doing customer damage control
  • the model wasn’t “hacked” — it was just wrong, loudly

The kill switch we thought we had was a UI toggle. Background workers ignored it.

Fix:

  1. kill switch enforced in tool gateway (writes disabled)
  2. per-tenant stop (so one tenant doesn’t nuke everyone)
  3. audit log entries when kill state blocks a tool call
  4. incident runbook: flip kill switch first, ask questions second
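Fix #3 (audit log entries when a tool call is blocked) is small but pays off in the post-mortem: you can count exactly what the switch prevented. A sketch, with `sink` standing in for whatever log pipeline you use:

```python
import json
import time


def audit_kill_block(*, tenant_id: str, tool: str, reason: str, sink=print) -> None:
    """Emit one structured record each time the kill switch blocks a call.
    `sink` is a stand-in for a real log pipeline (stdout, Kafka, etc.)."""
    sink(json.dumps({
        "event": "kill_switch_block",
        "tenant_id": tenant_id,
        "tool": tool,
        "reason": reason,
        "at": time.time(),
    }))


records = []
audit_kill_block(tenant_id="t-42", tool="email.send",
                 reason="writes_disabled", sink=records.append)
print(records[0])
```

Emit this from the same place that raises the block, so the log and the enforcement can never disagree.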

Trade-offs

  • Kill switches reduce availability during incidents. That’s better than irreversible writes.
  • You have to test the kill path. Untested kill switches fail at the worst time.
  • Shared-state reads add latency; keep it fast and cache briefly (seconds, not minutes).
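The "cache briefly" trade-off is easy to get wrong in both directions (no cache: every step pays a store round-trip; long cache: the switch stops working). A minimal TTL wrapper, assuming the loader is whatever reads your flag store:

```python
import time


def cached(loader, ttl_s: float = 2.0):
    """Wrap a kill-state loader with a short TTL: reads stay fast,
    but a flipped switch still takes effect within ttl_s."""
    state = {"value": None, "expires": 0.0}

    def read():
        now = time.monotonic()
        if now >= state["expires"]:
            state["value"] = loader()
            state["expires"] = now + ttl_s
        return state["value"]

    return read


loads = []
read = cached(lambda: loads.append(time.monotonic()) or "state", ttl_s=0.05)
read(); read()
print(len(loads))   # second read served from cache
time.sleep(0.06)
read()
print(len(loads))   # TTL expired, state re-loaded
```

Pick the TTL from your incident tolerance: a 2-second TTL means a flip is fully in effect within 2 seconds on every worker.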

When NOT to use

  • Don’t use “kill switch” as a substitute for real governance (permissions, approvals, budgets).
  • Don’t build a kill switch that’s only client-side. It will lie to you.
  • Don’t rely on kill switches for “normal” flow. They’re for stopping the bleeding.

Copy-paste checklist

  • [ ] Global kill switch (stop new runs)
  • [ ] Per-tenant kill switch (surgical stop)
  • [ ] Enforced in tool gateway (stops side effects)
  • [ ] Disable writes mode (read-only degrade)
  • [ ] Tool disable list (e.g., “no browser”)
  • [ ] Audit logs for kill blocks + operator actions
  • [ ] Tested runbook: flip, verify, drain, recover
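The last checklist item is the one teams skip. A scheduled drill does not need to be elaborate; the shape below is enough to catch a dead kill path (the flag dict is a stub — in production the check runs through your real flag store and tool gateway):

```python
flags = {"agent_kill_global": False}  # stub for the real flag store


def writes_allowed() -> bool:
    # In production this is the tool-gateway check, not a local dict read.
    return not flags["agent_kill_global"]


def drill_kill_path() -> None:
    assert writes_allowed(), "baseline: writes should start enabled"
    flags["agent_kill_global"] = True       # flip
    assert not writes_allowed(), "verify: writes must be blocked"
    flags["agent_kill_global"] = False      # recover
    assert writes_allowed(), "recover: writes restored"


drill_kill_path()
print("kill path OK")
```

Run it in staging on a schedule, and once per quarter for real: flip the production switch during a low-traffic window, verify blocks appear in the audit log, then recover.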

Safe default config snippet (YAML)

YAML
kill_switch:
  global_flag: "agent_kill_global"
  per_tenant_flag_prefix: "agent_kill_tenant:"
  mode_when_enabled: "disable_writes"
  disabled_tools_key: "agent_disabled_tools"
  cache_ttl_s: 2

FAQ

Should the kill switch stop everything or only writes?
Default to disabling writes first. Stopping everything is the nuclear option when you can’t trust the loop at all.

Where do we enforce the kill switch?
In the tool gateway and in the run loop. If it’s not enforced on tool calls, it’s not real.

Can we cache kill state?
Yes, but keep TTL in seconds. Incidents are measured in seconds, not minutes.

Do we need per-tenant kill switches?
If you’re multi-tenant: absolutely. Otherwise one customer’s incident becomes everyone’s outage.


Author

This documentation is curated and maintained by engineers who ship AI agents in production.

The content is AI-assisted, with human editorial responsibility for accuracy, clarity, and production relevance.

Patterns and recommendations are grounded in post-mortems, failure modes, and operational incidents in deployed systems, including during the development and operation of governance infrastructure for agents at OnceOnly.