Human-in-the-Loop Approvals (Write Gates) + Code

If a tool has irreversible side effects, the agent shouldn’t run it unattended. This page shows how to build approval gates that are fast, auditable, and don’t deadlock your system.
On this page
  1. Problem-first intro
  2. Why this fails in production
  3. 1) Writes have blast radius
  4. 2) “Ask the model to be careful” doesn’t scale
  5. 3) Approval systems can deadlock you if you design them badly
  6. Implementation example (real code)
  7. Real failure case (incident-style, with numbers)
  8. Trade-offs
  9. When NOT to use
  10. Copy-paste checklist
  11. Safe default config snippet (JSON/YAML)
  12. FAQ (3–5)
  13. Related pages (3–6 links)

Problem-first intro

Your agent wants to do a write:

  • send an email
  • close a ticket
  • update a CRM record

Sometimes that’s great. Sometimes it’s a 3 AM incident with a “sorry” email campaign.

Human approvals aren’t about distrusting the model. They’re about accepting reality: irreversible side effects deserve a second set of eyes. It’s usually cheaper than undoing 200 bad writes after lunch.

Why this fails in production

1) Writes have blast radius

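Reads can be wrong quietly: a bad lookup produces a bad answer that someone can ignore. Writes are wrong loudly: the email is already sent, the ticket already closed, the record already overwritten.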

2) “Ask the model to be careful” doesn’t scale

Prompts degrade. Contexts truncate. Models drift.

Approvals are enforcement, not advice.
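
One way to read "enforcement, not advice" is that the check lives in the code path that dispatches tools, so no prompt wording can skip it. The sketch below assumes a hypothetical `WRITE_TOOLS` set and a hypothetical `dispatch` function; neither comes from a real library.

```python
from typing import Any, Callable, Optional

# Hypothetical registry of tool names that have side effects.
WRITE_TOOLS = {"email.send", "ticket.close", "db.write"}


class ApprovalRequired(PermissionError):
    """Raised when a write tool is called without an approval token."""


def dispatch(
    tool: str,
    args: dict[str, Any],
    impl: Callable[[dict[str, Any]], Any],
    approval_token: Optional[str] = None,
) -> Any:
    # The gate runs in code on every call, whatever the prompt said.
    if tool in WRITE_TOOLS and not approval_token:
        raise ApprovalRequired(f"{tool} needs an approval token")
    return impl(args)
```

Read-only tools pass straight through; a write without a token fails before `impl` is ever called. Checking that the token is genuine is a separate step that this sketch leaves out.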

3) Approval systems can deadlock you if you design them badly

If approval requests:

  • never expire
  • don’t have a clear owner
  • can’t be canceled

…you’ll end up with stuck runs and angry users.

Approvals need:

  • expiry
  • cancelation
  • auditability
  • and a safe fallback when nobody approves
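
The requirements in this list can be sketched as a small state machine in which every request ends in exactly one terminal state. The class and field names below (`Approval`, `Status`, `owner`, `decided_by`) are illustrative assumptions, not part of any library.

```python
import time
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(str, Enum):
    PENDING = "pending"
    APPROVED = "approved"
    CANCELED = "canceled"
    EXPIRED = "expired"


@dataclass
class Approval:
    id: str
    owner: str          # who is expected to decide
    expires_at: float   # unix seconds
    status: Status = Status.PENDING
    decided_by: Optional[str] = None   # audit: who closed the request
    decided_at: Optional[float] = None  # audit: when it was closed

    def _close(self, status: Status, actor: str, now: float) -> bool:
        # Expire lazily: any transition attempted after the deadline lands in EXPIRED.
        if self.status is Status.PENDING and now >= self.expires_at:
            self.status, self.decided_by, self.decided_at = Status.EXPIRED, "system", self.expires_at
        if self.status is not Status.PENDING:
            return False  # terminal states never change
        self.status, self.decided_by, self.decided_at = status, actor, now
        return True

    def approve(self, actor: str, now: Optional[float] = None) -> bool:
        return self._close(Status.APPROVED, actor, time.time() if now is None else now)

    def cancel(self, actor: str, now: Optional[float] = None) -> bool:
        return self._close(Status.CANCELED, actor, time.time() if now is None else now)
```

Because a late `approve` returns `False` and leaves the request in `EXPIRED`, the caller can fall back to the safe path instead of waiting indefinitely.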

Implementation example (real code)

This pattern:

  • creates an approval request with args_hash (don’t store raw args if they contain PII)
  • requires an approval token to execute
  • expires and raises ApprovalTimeout (stop reason: approval_timeout) if nobody approves
PYTHON
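import hashlib
import json
import time
import uuid
from dataclasses import dataclass
from typing import Any

# store_request, notify_human, wait_for_token, and tool_impl are placeholders
# for your own storage, notification, and tool layers.

APPROVAL_TTL_S = 300  # one TTL shared by the request expiry and the wait


def args_hash(args: dict[str, Any]) -> str:
    # Canonical JSON (sorted keys) so equal args always produce the same hash.
    raw = json.dumps(args, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


@dataclass(frozen=True)
class ApprovalRequest:
    id: str
    tool: str
    args_sha: str
    preview: str
    expires_at: float


class ApprovalTimeout(RuntimeError):
    pass


def create_approval(
    tool: str, args: dict[str, Any], ttl_s: int = APPROVAL_TTL_S
) -> ApprovalRequest:
    # A random id: timestamp-based ids collide when two requests arrive in the same second.
    rid = f"apr_{uuid.uuid4().hex}"
    return ApprovalRequest(
        id=rid,
        tool=tool,
        args_sha=args_hash(args),
        # Argument names only, so the preview never carries raw values.
        preview=f"{tool}({', '.join(args)})",
        expires_at=time.time() + ttl_s,
    )


def require_approval(tool: str, args: dict[str, Any]) -> str:
    req = create_approval(tool, args)
    store_request(req)  # (pseudo)
    notify_human(req)  # (pseudo)

    # Wait no longer than the request itself stays valid.
    remaining_s = max(0.0, req.expires_at - time.time())
    token = wait_for_token(req.id, timeout_s=remaining_s)  # (pseudo)
    if not token:
        raise ApprovalTimeout("approval_timeout")
    return token


def call_write_tool(tool: str, args: dict[str, Any]) -> Any:
    token = require_approval(tool, args)
    return tool_impl(tool, args={**args, "approval_token": token})  # (pseudo)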
JAVASCRIPT
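import crypto from "node:crypto";

// storeRequest, notifyHuman, and waitForToken are placeholders for your own layers.

const APPROVAL_TTL_S = 300;

export class ApprovalTimeout extends Error {}

// Serialize with sorted keys so equal args hash the same regardless of key order.
// Assumes args contain only JSON-serializable values.
function canonicalJson(value) {
  if (Array.isArray(value)) return "[" + value.map(canonicalJson).join(",") + "]";
  if (value && typeof value === "object") {
    const body = Object.keys(value)
      .sort()
      .map((k) => JSON.stringify(k) + ":" + canonicalJson(value[k]));
    return "{" + body.join(",") + "}";
  }
  return JSON.stringify(value);
}

export function argsHash(args) {
  return crypto.createHash("sha256").update(canonicalJson(args), "utf8").digest("hex");
}

export function createApproval({ tool, args, ttlS = APPROVAL_TTL_S }) {
  // A random id: Date.now() collides when two requests arrive in the same millisecond.
  const id = "apr_" + crypto.randomUUID();
  return {
    id,
    tool,
    argsSha: argsHash(args),
    preview: tool + "(" + Object.keys(args).join(",") + ")",
    expiresAtMs: Date.now() + ttlS * 1000,
  };
}

export async function requireApproval({ tool, args }) {
  const req = createApproval({ tool, args });
  await storeRequest(req); // (pseudo)
  await notifyHuman(req); // (pseudo)
  // Wait no longer than the request itself stays valid.
  const timeoutS = Math.max(0, (req.expiresAtMs - Date.now()) / 1000);
  const token = await waitForToken(req.id, { timeoutS }); // (pseudo)
  if (!token) throw new ApprovalTimeout("approval_timeout");
  return token;
}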
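
The pattern above requires an approval token to execute, but it does not say how the tool should check that token. One possible approach, sketched here under assumptions, is to sign the request id, the args hash, and the expiry with HMAC, so that an approval granted for one set of arguments cannot be reused with different ones. `SECRET`, `issue_token`, and `verify_token` are hypothetical names.

```python
import hashlib
import hmac
import json
import time
from typing import Any, Optional

# Assumption: in a real system this key comes from a secret manager, not source code.
SECRET = b"replace-me"


def args_hash(args: dict[str, Any]) -> str:
    raw = json.dumps(args, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def issue_token(request_id: str, args_sha: str, expires_at: int) -> str:
    # Called when a human approves: binds the decision to one request,
    # one set of args, and one deadline.
    msg = f"{request_id}.{args_sha}.{expires_at}".encode("utf-8")
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"{request_id}.{expires_at}.{sig}"


def verify_token(token: str, args: dict[str, Any], now: Optional[float] = None) -> bool:
    # Called by the tool right before the write.
    try:
        request_id, expires_raw, _sig = token.rsplit(".", 2)
        expires_at = int(expires_raw)
    except ValueError:
        return False  # malformed token
    now = time.time() if now is None else now
    if now >= expires_at:
        return False  # approval has expired
    # Recompute the token from the args actually being executed.
    expected = issue_token(request_id, args_hash(args), expires_at)
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))
```

If the agent changes the recipient after approval, the recomputed hash no longer matches and `verify_token` returns `False`.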

Real failure case (incident-style, with numbers)

We had an agent that triaged inbound emails and sent follow-ups.

It misclassified one thread and started sending “we fixed it” replies… to the wrong customer.

Impact:

  • 12 emails sent incorrectly before someone noticed
  • ~2 hours of support time to clean up and explain
  • trust hit: people stopped using the agent for a month

Fix:

  1. approvals required for email.send
  2. the approval preview showed recipients + subject + diff
  3. approvals expired after 5 minutes; if nobody approved, the agent returned a draft only

Approvals didn’t make the model smarter. They made mistakes survivable.
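
A rough sketch of fix items 2 and 3, using the standard-library `difflib`. The function names and the email dictionary shape are hypothetical, and "diff" is assumed here to mean a line diff between the previous draft and the proposed text; the incident write-up does not specify it.

```python
import difflib
from typing import Optional


def build_preview(recipients: list[str], subject: str, old_body: str, new_body: str) -> str:
    # Show the approver who receives the email, the subject, and what changed.
    diff = difflib.unified_diff(
        old_body.splitlines(),
        new_body.splitlines(),
        fromfile="current",
        tofile="proposed",
        lineterm="",
    )
    return "\n".join([f"To: {', '.join(recipients)}", f"Subject: {subject}", *diff])


def send_or_draft(email: dict, token: Optional[str]) -> dict:
    # No token (timed out or never approved) means nothing leaves the system.
    if token is None:
        return {"status": "draft_only", "draft": email}
    return {"status": "sent", "email": email}  # a real send call would go here
```

Putting the recipients on the first line of the preview targets the failure in this incident: the wrong customer would have been visible before anything was sent.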

Trade-offs

  • Adds latency and friction (good for writes, bad for “instant automation”).
  • Requires UI/ops work (but cheaper than damage control).
  • Approval fatigue is real; scope approvals to high-risk actions only.
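
Scoping approvals to high-risk actions could look like the sketch below. The tier assignment, the `WriteBudget` type, and the `crm.add_note` tool in the usage note are assumptions for illustration; only the three high-risk tool names come from this page's config.

```python
from dataclasses import dataclass

# Hypothetical high-risk tier.
HIGH_RISK = {"email.send", "ticket.close", "db.write"}


@dataclass
class WriteBudget:
    remaining: int  # auto-approved low-risk writes left in this run


def decide(tool: str, is_write: bool, budget: WriteBudget) -> str:
    if not is_write:
        return "allow"  # reads never wait on a human
    if tool in HIGH_RISK:
        return "needs_approval"
    if budget.remaining > 0:
        budget.remaining -= 1  # low-risk writes spend from a small budget
        return "auto_approve"
    return "needs_approval"  # budget exhausted: escalate to a human
```

With `WriteBudget(remaining=1)`, a first `crm.add_note` write is auto-approved and a second one is escalated, so a runaway loop of "low-risk" writes still ends up in front of a human.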

When NOT to use

  • Don’t require approvals for read-only tools. You’ll train people to bypass the system.
  • Don’t use approvals as a substitute for permissions/budgets. You still need both.
  • Don’t build approvals without expiry/cancelation. That’s how you get stuck runs.

Copy-paste checklist

  • [ ] Approvals only for irreversible or user-visible writes
  • [ ] Preview content (who/what will change)
  • [ ] Store args hash (avoid raw PII in logs)
  • [ ] Expiry + cancelation
  • [ ] Approval token required to execute
  • [ ] Audit log: who approved what, when
  • [ ] Fallback: draft-only when not approved
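
For the audit-log item, one minimal shape is an append-only file with one JSON object per line. The field names and the `audit` helper are assumptions, not a standard schema.

```python
import io
import json
import time
from typing import Optional, TextIO


def audit(
    log: TextIO,
    event: str,
    request_id: str,
    tool: str,
    args_sha: str,
    actor: str,
    now: Optional[float] = None,
) -> dict:
    record = {
        "ts": time.time() if now is None else now,
        "event": event,        # e.g. "approved", "executed", "expired"
        "request_id": request_id,
        "tool": tool,
        "args_sha": args_sha,  # hash only, never the raw args
        "actor": actor,        # who approved, or which agent executed
    }
    log.write(json.dumps(record, sort_keys=True) + "\n")  # one JSON object per line
    return record
```

Writing a separate record for the approval and for the execution keeps the decision and the resulting action distinguishable when reconstructing an incident.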

Safe default config snippet (JSON/YAML)

YAML
approvals:
  enabled: true
  ttl_seconds: 300
  required_for: ["email.send", "ticket.close", "db.write"]
  fallback_when_not_approved: "draft_only"
logging:
  store_args_hash_only: true

FAQ (3–5)

Q: What should approvals cover?
A: Irreversible, user-visible, or high-risk writes. Reads usually don’t need approval.

Q: How do we avoid approval fatigue?
A: Use risk tiers. Auto-approve low-risk writes under tight budgets, and gate high-risk writes.

Q: Should approvals block the whole run?
A: Prefer returning partial output + a pending approval state. Blocking workers leads to deadlocks.

Q: Do approvals replace audit logs?
A: No. Approvals are a decision record. Audit logs are what actually happened.
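
The answer about not blocking the whole run can be sketched as follows: the run returns immediately with partial output and a parked write, and a separate callback resumes it once a decision arrives. `RunResult`, `run_step`, and `resume` are hypothetical names, and the strings stand in for real tool output.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class RunResult:
    status: str                                # "done" or "pending_approval"
    output: list[str] = field(default_factory=list)
    pending: Optional[dict[str, Any]] = None   # the proposed write, parked


def run_step(tool: str, args: dict[str, Any], is_write: bool, approved: bool = False) -> RunResult:
    if is_write and not approved:
        # Return immediately; no worker sits in a wait loop.
        return RunResult("pending_approval", ["drafted " + tool], {"tool": tool, "args": args})
    return RunResult("done", ["executed " + tool])


def resume(parked: RunResult, approved: bool) -> RunResult:
    # Called from the approval callback (webhook, queue consumer, and so on).
    if parked.pending is None:
        return parked
    if not approved:
        return RunResult("done", parked.output + ["kept as draft"])
    p = parked.pending
    done = run_step(p["tool"], p["args"], is_write=True, approved=True)
    return RunResult("done", parked.output + done.output)
```

The worker that produced the `pending_approval` result is free as soon as it returns; whether the write ever executes depends only on the later `resume` call.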

⏱️ 5 min read · Updated Mar 2026 · Difficulty: ★★★
Implement in OnceOnly
Budgets + permissions you can enforce at the boundary.
YAML
# onceonly guardrails (concept)
version: 1
budgets:
  max_steps: 25
  max_tool_calls: 12
  max_seconds: 60
  max_usd: 1.00
policy:
  tool_allowlist:
    - search.read
    - http.get
writes:
  require_approval: true
  idempotency: true
controls:
  kill_switch: { enabled: true }
Add guardrails to tool-calling agents
Ship this pattern with governance:
  • Budgets (steps / spend caps)
  • Tool permissions (allowlist / blocklist)
  • Kill switch & incident stop
  • Idempotency & dedupe
  • Audit logs & traceability
OnceOnly is a control layer for production agent systems.
Author

This documentation is curated and maintained by engineers who ship AI agents in production.

The content is AI-assisted, with human editorial responsibility for accuracy, clarity, and production relevance.

Patterns and recommendations are grounded in post-mortems, failure modes, and operational incidents in deployed systems, including during the development and operation of governance infrastructure for agents at OnceOnly.