Normal path: execute → tool → observe.
Quick take
- “Cite sources” is not enforceable unless citations are verified against captured tool evidence.
- Make citations refer to source_ids (snapshots), not raw URLs.
- Treat “search results” as discovery, not evidence (fetch before citing).
- Fail closed (or degrade) when citations don’t verify.
Problem-first intro
Your agent produces a “well sourced” answer.
Then someone clicks the sources.
One link 404s. Another is unrelated. A third is a PDF the agent clearly didn’t read (it’s 120 pages; the answer came back in 6 seconds).
Congrats — you’ve shipped a credibility bug.
In production this isn’t just embarrassing. It’s expensive:
- Support and trust take a hit (“your docs are fake”).
- Legal/compliance gets involved if you’re citing policies or regulations.
- Your team burns hours doing “citation archaeology” in logs that don’t exist.
This failure mode shows up the moment you ask the model for “sources” without giving it a hard constraint on what counts as a source.
Why this fails in production
Hallucinated citations aren’t magical. They’re a predictable result of how we build agents.
1) The model is optimized to look helpful, not to be auditable
If the prompt says “include sources”, the model will include sources. Even if it has none. It’ll invent something plausible:
- a domain that sounds right
- a URL path that looks real
- a document title that “should exist”
The model isn’t lying “on purpose”. It’s satisfying the shape of the output you asked for.
2) “Search results” are not “evidence”
Many agents do this:
- call search.read("x")
- get back a list of titles + URLs
- answer with citations
But the agent didn’t fetch the pages. It doesn’t know the content. It only knows what the search snippet claims the page contains.
If you accept that as evidence, you’ll cite things you never read. Because you didn’t.
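A cheap guard is to classify tool outputs at capture time, so search snippets can never be promoted to evidence by accident. A minimal sketch (the tool names are assumptions for illustration):

```python
# Tag tool outputs at capture time: only fetch-style tools, whose full content
# the agent actually read, count as evidence. Search results are discovery only.
EVIDENCE_TOOLS = {"http.get", "kb.read"}   # full page/document was fetched
DISCOVERY_TOOLS = {"search.read"}          # snippets only: never evidence

def counts_as_evidence(tool_name: str) -> bool:
    """True only for tools whose output may back a citation."""
    return tool_name in EVIDENCE_TOOLS

print(counts_as_evidence("http.get"))     # True
print(counts_as_evidence("search.read"))  # False
```

The point is that the distinction lives in code, at ingestion, rather than in a prompt the model can ignore.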
3) Evidence gets lost between steps
Even if you fetch pages, evidence often gets dropped:
- tool output isn’t stored, only summarized
- context gets truncated
- a retry reorders results
- a later step overwrites earlier sources
If you can’t trace “this sentence came from this document snapshot”, you don’t have citations. You have decoration.
4) “Cite sources” is a policy. Policies don’t enforce themselves.
You can’t prompt your way into auditability. You need enforcement in code:
- sources must come from tool outputs your system captured
- citations must reference those captured sources
- outputs without valid citations must fail (or degrade)
Here’s the pipeline you actually want: fetch → snapshot → cite by source_id → verify → render (or degrade).
Implementation example (real code)
The safest pattern we’ve found:
- treat “sources” as IDs, not URLs
- only allow citations that refer to snapshotted tool outputs
- optionally: require a short quote/excerpt hash per citation
```python
from __future__ import annotations

from dataclasses import dataclass
import hashlib
import time
from typing import Any


@dataclass(frozen=True)
class Evidence:
    source_id: str
    url: str
    fetched_at: float
    title: str
    text_sha256: str


class EvidenceStore:
    def __init__(self) -> None:
        self._items: dict[str, Evidence] = {}

    def add(self, *, url: str, title: str, text: str) -> str:
        sha = hashlib.sha256(text.encode("utf-8")).hexdigest()
        source_id = f"src_{len(self._items) + 1:03d}"
        self._items[source_id] = Evidence(
            source_id=source_id,
            url=url,
            fetched_at=time.time(),
            title=title,
            text_sha256=sha,
        )
        return source_id

    def has(self, source_id: str) -> bool:
        return source_id in self._items

    def meta(self, source_id: str) -> Evidence:
        return self._items[source_id]


def verify_citations(*, cited_source_ids: list[str], store: EvidenceStore) -> None:
    missing = [s for s in cited_source_ids if not store.has(s)]
    if missing:
        raise ValueError(f"invalid citations (unknown source_ids): {missing}")


def answer_with_citations(task: str, *, store: EvidenceStore) -> dict[str, Any]:
    # In real code: the model returns structured output.
    # Example shape:
    #   {"answer": "...", "citations": ["src_001", "src_002"]}
    out = llm_answer(task)  # (pseudo)
    verify_citations(cited_source_ids=out.get("citations", []), store=store)
    return out


def render_sources(cited_ids: list[str], store: EvidenceStore) -> list[dict[str, str]]:
    sources: list[dict[str, str]] = []
    for sid in cited_ids:
        ev = store.meta(sid)
        sources.append(
            {
                "source_id": sid,
                "title": ev.title,
                "url": ev.url,
                "sha256": ev.text_sha256[:12],
            }
        )
    return sources
```

The same pattern in JavaScript:

```javascript
import crypto from "node:crypto";

export class EvidenceStore {
  constructor() {
    this.items = new Map();
  }

  add({ url, title, text }) {
    const sha = crypto.createHash("sha256").update(text, "utf8").digest("hex");
    const sourceId = "src_" + String(this.items.size + 1).padStart(3, "0");
    this.items.set(sourceId, { sourceId, url, title, fetchedAt: Date.now(), textSha256: sha });
    return sourceId;
  }

  has(sourceId) {
    return this.items.has(sourceId);
  }

  meta(sourceId) {
    const ev = this.items.get(sourceId);
    if (!ev) throw new Error("unknown source_id: " + sourceId);
    return ev;
  }
}

export function verifyCitations({ citedSourceIds, store }) {
  const missing = citedSourceIds.filter((s) => !store.has(s));
  if (missing.length) throw new Error("invalid citations (unknown source_ids): " + missing.join(", "));
}

export function answerWithCitations(task, { store }) {
  // Real code: the model returns structured output validated by schema.
  // Example shape:
  //   { answer: "...", citations: ["src_001", "src_002"] }
  const out = llmAnswer(task); // (pseudo)
  verifyCitations({ citedSourceIds: out.citations || [], store });
  return out;
}

export function renderSources(citedIds, store) {
  return citedIds.map((sid) => {
    const ev = store.meta(sid);
    return { source_id: sid, title: ev.title, url: ev.url, sha256: ev.textSha256.slice(0, 12) };
  });
}
```

What this buys you:
- citations can’t point to imaginary URLs
- you can reproduce answers later (“here’s the snapshot hash”)
- you can fail closed when citations don’t verify
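Failing closed doesn’t have to mean erroring out: you can degrade to an honest “uncited” answer instead. A standalone sketch of that decision (a plain dict stands in for the evidence store; the shapes are illustrative, not a fixed API):

```python
# Fail-closed-or-degrade: unknown source_ids either raise (fail closed) or
# strip the citations and say so explicitly in the answer (degrade).
def finalize(answer: str, citations: list[str], store: dict, *, fail_closed: bool) -> dict:
    missing = [s for s in citations if s not in store]
    if not missing:
        return {"answer": answer, "citations": citations}
    if fail_closed:
        raise ValueError(f"invalid citations (unknown source_ids): {missing}")
    # Degrade: ship the answer, but be explicit that it is uncited.
    return {"answer": answer + " (I can't cite this reliably.)", "citations": []}

store = {"src_001": {"url": "https://example.com/a"}}
ok = finalize("Fact.", ["src_001"], store, fail_closed=True)      # passes through
soft = finalize("Fact.", ["src_999"], store, fail_closed=False)   # degrades
```

Which mode you pick per surface matters: fail closed for anything customer-facing or compliance-adjacent, degrade for internal drafts.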
If you want to go further, require an excerpt hash (or exact quote) per claim. It’s slower. It’s also harder to fake.
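A sketch of that excerpt requirement: each citation carries an exact quote plus a short hash of it, and verification recomputes the hash against the snapshot text. The shapes and names here are illustrative:

```python
import hashlib

# Per-claim excerpt check: the model must return an exact quote; we verify the
# quote actually appears in the snapshot and that its hash matches the claim.
def excerpt_hash(quote: str) -> str:
    return hashlib.sha256(quote.encode("utf-8")).hexdigest()[:12]

def verify_excerpt(quote: str, claimed_hash: str, snapshot_text: str) -> bool:
    return quote in snapshot_text and excerpt_hash(quote) == claimed_hash

snapshot = "Refunds are available within 30 days of purchase."
print(verify_excerpt("within 30 days", excerpt_hash("within 30 days"), snapshot))  # True
print(verify_excerpt("within 90 days", excerpt_hash("within 90 days"), snapshot))  # False
```

The substring check is deliberately strict: a paraphrase fails, which is exactly the point.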
Example incident (numbers are illustrative)
Example: an “internal research agent” generating weekly competitive summaries. It was asked to “include sources”.
What actually happened:
- it cited a handful of credible-looking URLs
- those URLs were not fetched by the agent
- two of the links were dead
- one was a completely unrelated press release
Impact:
- a PM forwarded the doc to a partner (yikes)
- we spent ~6 engineer-hours reconstructing which tool calls happened
- we lost trust for a month (“cool demo, but I can’t use it”)
Fix:
- sources became source_ids tied to tool snapshots
- “search results” stopped counting as evidence
- answers without verified citations degraded to: “I can’t cite this reliably”
Dry lesson: if you don’t store evidence, you don’t have citations.
Trade-offs
- Evidence snapshots cost storage and time.
- Fail-closed citation verification reduces “answer rate” early on.
- For some tasks, citations are unnecessary overhead (don’t force it everywhere).
When NOT to use
- If the output is internal-only and doesn’t need citations, don’t add them “for vibes”.
- If your system can’t fetch/store evidence safely (PII, secrets), don’t pretend citations are reliable.
- If you’re doing deterministic lookups from a single source of truth, just link the source directly.
Copy-paste checklist
- [ ] Treat citations as source_ids, not URLs
- [ ] Store tool output snapshots (URL + hash + timestamp)
- [ ] Disallow citations to unfetched URLs
- [ ] Separate “search results” from “evidence”
- [ ] Validate citations (fail closed or degrade)
- [ ] Log run_id + source_ids + snapshot hashes
- [ ] Add a retention policy for snapshots
- [ ] Add a safe-mode: “answer without sources” if evidence isn’t available
Safe default config snippet (YAML)

```yaml
citations:
  required: true
  evidence_sources: ["http.get", "kb.read"]
  allow_search_results_as_evidence: false
  fail_closed: true
  attach_snapshot_hash: true
  retention_days: 14
```
Related pages
- Foundations: How agents use tools · How LLM limits affect agents
- Failure: Prompt injection attacks · Infinite loop
- Governance: Tool permissions (allowlists)
- Production stack: Production agent stack