Reflection Agent Pattern: one-pass quality control

Add one short review pass to catch obvious mistakes before answering, without endless rewrites.
On this page
  1. Pattern Essence
  2. Problem
  3. Solution
  4. How It Works
  5. In Code, It Looks Like This
  6. How It Looks During Runtime
  7. When It Fits - And When It Doesn't
  8. Good Fit
  9. Not a Fit
  10. How It Differs From Self-Critique
  11. When To Use Reflection Among Other Patterns
  12. How To Combine With Other Patterns
  13. In Short
  14. Pros and Cons
  15. FAQ
  16. What Next

Pattern Essence

Reflection Agent is a pattern in which, after drafting a response, the agent runs one quick quality check: it finds obvious issues (unclear wording, contradictions, overconfident claims) and fixes them once before the final answer.

Important: Reflection is a lightweight pre-send check. It is not full validation by rules or policies.

When to use it: when you need a short controlled self-check and one revision before sending.


Reflection does not replace the agent's core workflow.

It adds a short control stage:

  • quickly verify whether the answer is clear, self-consistent, and not missing critical caveats
  • find obvious risks
  • fix only what is needed
  • finish without creating a new infinite loop


Problem

Imagine an agent preparing a customer answer for a high-risk request.

The draft is generally correct but has small risks:

  • overly certain wording without basis
  • contradiction between paragraphs
  • missing caveat
  • unclear applicability boundary

Without a control pass, this draft goes directly to the user.

The most dangerous mistakes are often not blatant, but "almost invisible" in normal-looking text.

Consequences:

  • trust in the answer drops
  • false confidence appears
  • risk of wrong decisions increases

That is the problem: even a good draft without review can go out with critical inaccuracies.

Solution

Reflection adds a reflection-policy stage before final delivery.

Analogy: this is like a tailor before handing over a suit. First they check the fit, then do one precise adjustment. It improves the result, but does not turn the process into endless fittings.

Core principle: one short quality check and one revision, without endless "make it even better" loops.

The agent may propose a change, but policy defines:

  • whether a revision is needed
  • what exactly is allowed to be changed
  • when to stop/escalate

Controlled process:

  1. Draft: generate first version
  2. Review: run one structured check
  3. Decision: ok/revise/escalate
  4. Revision: perform one patch-only edit guided by the fix_plan
  5. Finalize: return final output without a new loop
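The decision step (3) can be sketched as a pure function over a review result. The names below (`Review`, `Decision`, `decide`) are illustrative, not a fixed API:

```python
from dataclasses import dataclass, field
from enum import Enum

class Decision(Enum):
    OK = "ok"
    REVISE = "revise"
    ESCALATE = "escalate"

@dataclass
class Review:
    issues: list = field(default_factory=list)    # e.g. ["overconfident claim"]
    fix_plan: list = field(default_factory=list)  # allowed edits for the revision
    high_risk: bool = False

def decide(review: Review) -> Decision:
    """Step 3 (Decision): map a review result to ok / revise / escalate."""
    if review.high_risk:
        return Decision.ESCALATE   # never auto-fix high-risk findings
    if review.issues:
        return Decision.REVISE     # exactly one patch-only revision
    return Decision.OK             # send the draft unchanged
```

Keeping the decision a pure function makes the ok/revise/escalate split easy to test independently of the model.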

This gives you:

  • removal of obvious risks before sending
  • noticeable quality gains with small overhead
  • revision control within fix_plan
  • protection from infinite self-edit

Works well if:

  • no_new_facts is enforced
  • changes are patch-only
  • high-risk cases are escalated, not endlessly rewritten

The model may "want" to edit again and again, but reflection-policy is what ends the process within predictable bounds.

How It Works

Diagram

Key rule: reflection must be bounded.

That means:

  • max 1 review pass
  • max 1 revision pass
  • no new facts added
  • revision must be patch-only and only inside fix_plan
  • escalate on high-risk issues

Full flow: Draft → Review → Revise → Finalize
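The bounds above can be made explicit in code. A minimal sketch of the full flow, assuming hypothetical `generate`/`review`/`revise` callables where `review` returns a dict:

```python
def bounded_reflection(generate, review, revise,
                       max_review_passes=1, max_revision_passes=1):
    """Draft -> Review -> Revise -> Finalize, with hard caps on every loop."""
    draft = generate()                                 # Draft
    for _ in range(max_review_passes):                 # max 1 review pass
        result = review(draft)                         # Review
        if result.get("high_risk"):
            return {"status": "escalate", "reason": result.get("reason")}
        if not result.get("fix_plan"):
            break                                      # nothing to fix: ship the draft
        for _ in range(max_revision_passes):           # max 1 revision pass
            draft = revise(draft, result["fix_plan"])  # Revise
    return {"status": "final", "text": draft}          # Finalize
```

Because the caps are loop bounds rather than prompt instructions, the process terminates even if the model keeps proposing edits.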

Draft
The agent creates the initial response from available context.

Review
A separate prompt checks the draft against a short rubric: clarity, logic, internal consistency, and overconfident claims.
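For testing the pipeline without an LLM, the review step can be stubbed with simple heuristics. A toy sketch (the phrase list is illustrative; a real reviewer would be a model call with the rubric):

```python
OVERCONFIDENT = ("guaranteed", "always", "100%", "never fails")

def toy_review(draft: str) -> dict:
    """Toy stand-in for the review prompt: flags overconfident wording."""
    text = draft.lower()
    issues = [f"overconfident wording: {w!r}" for w in OVERCONFIDENT if w in text]
    return {"ok": not issues, "fix_plan": issues}
```

A stub like this lets you exercise the decision and revision logic deterministically in unit tests.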

Revision
If issues are found, the agent performs one revision from a structured fix_plan.

Finalize
The system returns the result, or stops the process if the risk is too high.

In Code, It Looks Like This

PYTHON
draft = writer.generate(goal, context)  # 1. Draft: first version from context

review = reflector.review_once(  # 2. Review: single bounded pass
    draft=draft,
    rubric=[
        "no_new_facts",
        "preserve_uncertainty",
        "consistency_check",
    ],
)

if review.high_risk:
    # 3. Decision: high-risk findings are escalated, never auto-fixed
    return escalate_to_human(review.reason)

if review.ok:
    return draft  # 3. Decision: no issues, ship the draft as-is

revised = writer.revise_once(  # 4. Revision: one patch-only edit
    draft=draft,
    fix_plan=review.fix_plan,
    rules=["no_new_facts", "keep_scope"],
)

# 5. Finalize: optionally verify the revision stayed within `fix_plan`
approved = supervisor.review_output_patch(
    original=draft,
    revised=revised,
    allowed_changes=review.fix_plan,
)

return approved

Reflection should not run "until perfect." One review pass, one improvement via revise, and one check that revision stayed inside fix_plan.
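One lightweight way to verify that a revision stayed patch-only is to diff the draft against the revised text and cap how much changed. A sketch using Python's difflib (the line-count threshold is a simplification of a real fix_plan check):

```python
import difflib

def revision_within_bounds(original: str, revised: str,
                           max_changed_lines: int = 2) -> bool:
    """Reject revisions that rewrite more of the draft than a patch should."""
    diff = difflib.ndiff(original.splitlines(), revised.splitlines())
    # ndiff marks removed lines with "- " and added lines with "+ "
    changed = sum(1 for line in diff if line.startswith(("+ ", "- ")))
    return changed <= max_changed_lines
```

A real implementation would also check that each changed span corresponds to an item in the fix_plan, not just count lines.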

How It Looks During Runtime

TEXT
Goal: prepare an accurate answer with correct SLA wording

Draft:
"SLA for all customers is 99.99%."

Review:
- issue: too categorical claim
- issue: plan level not specified
- fix: add condition and source

Revised:
"SLA depends on plan level. For enterprise, it is 99.99% according to support policy."

Full Reflection agent example


When It Fits - And When It Doesn't

Good Fit

Situation | Why Reflection Fits
✅ Response goes outside and needs extra review | Reflection catches obvious issues before sending to user.
✅ Need quality gains without major latency penalty | One bounded pass often gives noticeable quality improvement.
✅ You have clear quality rubric and stop rules | Formalized criteria make self-review stable and predictable.
✅ Need more reliable final response without rewrite loop | Reflection adds a control step without triggering endless rewrites.

Not a Fit

Situation | Why Reflection Does Not Fit
❌ Critically low-latency path | Even one extra pass may make response too late.
❌ No clear quality rubric | Without criteria, reflection becomes chaotic and does not improve reliably.
❌ Need external fact validation via tools or humans | Reflection does not replace fact-checking with external sources or manual review.

Keep in mind that reflection always adds one more generation pass, which increases response time and cost.

How It Differs From Self-Critique

Aspect | Reflection | Self-Critique
Goal | Catch obvious issues before sending | Check answer by stricter rules and rewrite if needed
Depth | Fast quality check: is answer clear, logical, and free of obvious mistakes | Usually stricter critique and revision
Result type | ok / issues / fix plan | Detailed risks + required changes + diff orientation
Risk | Becoming a second loop without limits | Excessive rewriting and higher cost

Reflection is a quick bounded control run. Self-Critique is a deeper pass with rewriting.

When To Use Reflection Among Other Patterns

Use Reflection when you need to quickly verify an answer before sending: it should be clear, logical, and free of obvious mistakes.

Quick test:

  • if you need to "briefly review and fix before final" -> Reflection
  • if you need to "deeply critique by checklist and rewrite" -> Self-Critique Agent
Comparison with other patterns and examples

Quick cheat sheet:

If the task looks like this... | Use
Need a short check before final response | Reflection Agent
Need deep criteria-based critique and answer rewriting | Self-Critique Agent
Need to recover process after timeout, exception, or tool failure | Fallback-Recovery Agent
Need strict policy checks before risky action | Guarded-Policy Agent

Examples:

Reflection: "Before final response, quickly check logic, completeness, and obvious mistakes".

Self-Critique: "Evaluate response by checklist (accuracy, completeness, risks), then rewrite".

Fallback-Recovery: "If API does not respond, do retry -> fallback source -> escalation".

Guarded-Policy: "Before sending data externally, check policy: whether this is allowed".

How To Combine With Other Patterns

  • Reflection + RAG: Reflection checks that conclusions actually match retrieved sources.
  • Reflection + Supervisor: high-risk conclusions are not auto-fixed and are sent for human approval.
  • Reflection + ReAct: after a series of ReAct steps, the agent performs a final check before answering.

In Short

Quick take

Reflection Agent:

  • Performs one control self-check
  • Revises response once
  • Adds no new facts
  • Reduces risk of obvious production mistakes

Pros and Cons

Pros

  • checks response before sending
  • fixes obvious mistakes
  • improves clarity and structure
  • helps keep required format

Cons

  • adds one more step and latency
  • spends more tokens
  • without boundaries can overcomplicate response

FAQ

Q: Can reflection replace fact-checking and tests?
A: No. This is an additional quality layer. Fact-checking, validation, and policy controls are still required separately.

Q: Why limit number of passes?
A: Otherwise reflection becomes a second loop: latency, cost, and risk of new mistakes increase.

Q: What if reflection finds a high-risk issue?
A: Do not continue automatically. Stop the process or escalate to human review / Supervisor policy.

What Next

Reflection adds bounded quality control before final response.

What should you do when you need stricter critique and a revision process with structured change rules?

⏱️ 10 min read • Updated Mar 2026 • Difficulty: ★★★
Practical continuation

Pattern implementation examples

Continue with implementation using example projects.

Integrated: production control with OnceOnly
Add guardrails to tool-calling agents
Ship this pattern with governance:
  • Budgets (steps / spend caps)
  • Tool permissions (allowlist / blocklist)
  • Kill switch & incident stop
  • Idempotency & dedupe
  • Audit logs & traceability
Integrated mention: OnceOnly is a control layer for production agent systems.
Author

This documentation is curated and maintained by engineers who ship AI agents in production.

The content is AI-assisted, with human editorial responsibility for accuracy, clarity, and production relevance.

Patterns and recommendations are grounded in post-mortems, failure modes, and operational incidents in deployed systems, including during the development and operation of governance infrastructure for agents at OnceOnly.