Tune retrieval quality

The most common cause of wrong AI answers is retrieval missing the right paragraph, not a weak model. This guide covers a few console settings and an evaluation checklist for improving accuracy systematically.

Key takeaways

Fragmented docs: small chunks (500 chars). Long coherent paragraphs: large chunks (1500-2000 chars).
Hybrid search + rerank is on by default; turning either off causes misses even when the keyword appears literally in the doc.
Build a list of 20-50 real questions; run it after each tuning change to avoid tuning blind.

1. Pick the right chunk size

In the KB's Retrieval Settings you can adjust chunk size. The default of 1000 chars fits most PDFs and longform content. For fragmented docs (FAQs, short policy bullets), drop to 500 for higher recall. For long coherent paragraphs (research papers, legal clauses), raise it to 1500-2000 to preserve context. A new chunk size only takes effect after the KB is reprocessed.
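To get a feel for what the chunk size setting changes, here is a minimal sketch of fixed-size character chunking. It is an illustration only: the function name chunk_text is hypothetical, and the product's actual chunker is not described in this guide (it may well split on sentence or paragraph boundaries instead of raw character counts).

```python
def chunk_text(text: str, chunk_size: int = 1000) -> list[str]:
    """Split text into consecutive pieces of at most chunk_size characters.

    Hypothetical helper for illustration; not the product's real chunker.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


# A 3000-character document under the three sizes discussed above.
doc = "x" * 3000
counts = {size: len(chunk_text(doc, size)) for size in (500, 1000, 1500)}
print(counts)  # {500: 6, 1000: 3, 1500: 2}
```

Smaller chunks give retrieval more, narrower targets to match against, which is why short FAQ entries benefit from 500. Larger chunks keep a long argument in one piece at the cost of fewer, broader targets.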

2. Verify hybrid search + rerank is enabled

By default the system runs hybrid search + rerank — combining semantic and keyword matching to find candidates, then using a dedicated reranking model to pick the most relevant ones. If you see "the keyword is literally in the doc but the AI says it can't find it", check Retrieval Settings to confirm both toggles are on. Disabling them speeds things up slightly but recall drops noticeably.
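The sketch below shows one common way to combine a keyword ranking with a semantic ranking: reciprocal rank fusion (RRF). This guide does not say which fusion method the system uses, so RRF is an assumption chosen for illustration, and the document IDs and the constant k=60 are made up for the example. The rerank step that follows fusion would need a real model and is not implemented here.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) per list it appears in; scores are
    summed and documents are returned best first (ties broken by ID).
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for one query from the two retrieval paths.
keyword_hits = ["faq_12", "policy_3", "faq_7"]     # literal term matches
semantic_hits = ["policy_3", "guide_1", "faq_12"]  # embedding similarity

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)  # ['policy_3', 'faq_12', 'guide_1', 'faq_7']
```

Note that faq_7 appears only in the keyword list. With keyword matching disabled it would never reach the candidate set, which is the "keyword is in the doc but the AI can't find it" symptom described above.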

3. Build an evaluation checklist

Collect 20-50 real questions from customers or teammates, and for each one note which document and paragraph the correct answer should come from. Run this checklist after every tuning change — chunk size, embedding model, retrieval count — and count how many hit the correct paragraph. Without an evaluation checklist, you're tuning by gut feel and often make things worse.
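A checklist like this can be scored with a few lines of code. The sketch below is a minimal version under stated assumptions: the retrieve function stands in for however you query your KB (this guide does not document an API, so a fake one is used here), and the file names, questions, and top_k value are invented for the example.

```python
from typing import Callable

# Each retrieved or expected passage is identified as (document, paragraph number).
Passage = tuple[str, int]


def hit_rate(
    checklist: list[dict],
    retrieve: Callable[[str], list[Passage]],
    top_k: int = 5,
) -> float:
    """Fraction of questions whose expected passage is in the top_k results."""
    if not checklist:
        return 0.0
    hits = sum(
        1 for item in checklist
        if item["expected"] in retrieve(item["question"])[:top_k]
    )
    return hits / len(checklist)


# Hypothetical checklist entries: question plus where the answer should come from.
checklist = [
    {"question": "How long is the refund window?", "expected": ("refund_policy.pdf", 3)},
    {"question": "Which plans include SSO?", "expected": ("pricing_faq.md", 7)},
]

# Stand-in retriever with canned results; replace with a real KB query.
fake_results = {
    "How long is the refund window?": [("refund_policy.pdf", 3), ("terms.pdf", 1)],
    "Which plans include SSO?": [("pricing_faq.md", 2)],
}


def fake_retrieve(question: str) -> list[Passage]:
    return fake_results.get(question, [])


score = hit_rate(checklist, fake_retrieve)
print(f"hit rate: {score:.0%}")  # hit rate: 50%
```

Record the score before and after each settings change. A change that lowers the hit rate on your own questions should be reverted, whatever it did for anyone else.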

4. Settings for multilingual KBs

The default embedding model is the most balanced choice for mixed Chinese + English content. If your docs are all English, switch to OpenAI's English-specific model in Retrieval Settings; recall typically improves by 5-10%. For legal or medical terminology, drop chunk size to around 600 and raise retrieval count to 8: domain jargon is often scattered across paragraphs, so more candidates need to be retrieved.
