22 Jan 2026 · 1 min read
AI/RAG

Most AI bugs are retrieval bugs

When an AI answer feels wrong, I usually start by checking what it was given to work with.

Every team I've worked with on RAG starts in the same place: tweaking the prompt. Variants of system messages, restated instructions, careful tone-shaping. It rarely moves the needle.

The reason is structural. A generation grounded on the wrong context cannot be saved by any prompt. The model is doing exactly what it was asked to do — it's just been asked the wrong question. Fix that, and most of the apparent quality problems disappear without touching the prompt at all.

Where to actually look

Start with retrieval evals before you touch generation evals. Sample two hundred recent queries, retrieve the top-k for each, and ask whether the right segments are in there at all. If they're not, no prompt will save you. If they are but the order is wrong, you have a reranking problem, not a generation problem.
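The check described above can be sketched in a few lines. This is a toy illustration, not a real retriever: `retrieve` is a hypothetical stand-in (a hard-coded index here; in practice your vector or hybrid search), and the metric is recall@k over a small labeled set of queries.

```python
def retrieve(query: str, k: int = 5) -> list[str]:
    # Toy stand-in for a real retriever: maps a query to ranked segment IDs.
    index = {
        "how do I reset my password": ["doc-auth-2", "doc-auth-1", "doc-billing-3"],
        "what does the pro plan cost": ["doc-billing-1", "doc-auth-2"],
    }
    return index.get(query, [])[:k]

def recall_at_k(labeled_queries: dict[str, set[str]], k: int = 5) -> float:
    """Fraction of queries whose relevant segments all appear in the top-k."""
    hits = 0
    for query, relevant in labeled_queries.items():
        retrieved = set(retrieve(query, k))
        if relevant <= retrieved:  # every relevant segment was retrieved
            hits += 1
    return hits / len(labeled_queries)

# A labeled sample: query -> the segment IDs a correct answer needs.
labeled = {
    "how do I reset my password": {"doc-auth-1"},
    "what does the pro plan cost": {"doc-billing-1"},
}
print(recall_at_k(labeled, k=5))
```

Running this at two values of k separates the two failure modes: low recall at a generous k means the segments aren't being retrieved at all; good recall at k=20 but poor recall at k=3 means they're retrieved but ranked too low, which is the reranking problem.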

Most of my time on Tessact, in any given week, lives upstream of the model. Indexing strategies, hybrid retrieval, reranking, structural metadata. The prompt itself is the cheap part.
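As one example of that upstream work, hybrid retrieval needs a way to merge a keyword ranking and a vector ranking. A common approach (an assumption on my part, not necessarily what the post's author uses) is reciprocal rank fusion, which combines ranked lists without having to normalize their raw scores:

```python
def rrf_merge(rankings: list[list[str]], c: int = 60) -> list[str]:
    """Fuse ranked lists: each doc scores sum(1 / (c + rank)) across lists."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (c + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from a keyword index and a vector index.
keyword_hits = ["doc-3", "doc-1", "doc-7"]
vector_hits = ["doc-1", "doc-9", "doc-3"]
print(rrf_merge([keyword_hits, vector_hits])[:2])
```

The appeal is that rank fusion sidesteps the fact that BM25 scores and cosine similarities live on incompatible scales; only positions matter.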
