Workflow playbook

RAG knowledge base evaluation workflow

Evaluate a RAG knowledge base by testing ingestion quality, source retrieval, answer faithfulness, and update ownership before scaling infrastructure.

Target users

  • AI builders
  • Product engineers
  • Knowledge teams

Inputs

  • Document set
  • Representative questions
  • Expected sources
  • Answer quality rules

Outputs

  • Retrieval scorecard
  • Ingestion fixes
  • RAG launch decision

Boundaries

  • Do not treat model fluency as retrieval quality.
  • Keep source documents, chunks, and metadata reviewable.
  • Do not ship RAG to production until someone owns the update and deletion rules.

Common mistakes

  • Choosing a vector database before writing real test queries.
  • Judging RAG quality only by fluent answers instead of retrieved sources.
  • Ignoring document update rules, deleted content, and metadata ownership.

Templates

  • RAG retrieval scorecard
  • Knowledge ingestion review sheet

Steps

  1. Create retrieval fixtures

    Collect real questions and mark the source passages that should answer them.

    Output: RAG evaluation fixture set.

  2. Test ingestion and retrieval

    Run retrieval tests against chunks, metadata, filters, and expected source coverage.

    Output: Retrieval quality report.

  3. Review generated answers

    Check whether answers cite the right sources, avoid unsupported claims, and handle unknowns safely.

    Output: Answer faithfulness review.
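
The first two steps can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the Fixture fields, the retrieve callable (assumed to take a question and a metadata filter and return ranked source IDs), and the choice of recall@k as the metric are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Fixture:
    """One evaluation case: a real question plus the sources that should answer it."""
    question: str
    expected_sources: set[str]
    # Hypothetical metadata filter passed through to the retriever.
    metadata_filter: dict[str, str] = field(default_factory=dict)


def recall_at_k(retrieved: list[str], expected: set[str], k: int) -> float:
    """Fraction of expected source IDs that appear in the top-k retrieved IDs."""
    if not expected:
        raise ValueError("fixture needs at least one expected source")
    return len(set(retrieved[:k]) & expected) / len(expected)


def score_fixtures(
    fixtures: list[Fixture],
    retrieve: Callable[[str, dict[str, str]], list[str]],
    k: int = 5,
) -> list[dict]:
    """Run every fixture through the retriever and build scorecard rows."""
    rows = []
    for fx in fixtures:
        retrieved = retrieve(fx.question, fx.metadata_filter)
        rows.append({
            "question": fx.question,
            "recall": recall_at_k(retrieved, fx.expected_sources, k),
            # Expected sources that never reached the top k; these usually
            # point at chunking, metadata, or filter problems.
            "missing": sorted(fx.expected_sources - set(retrieved[:k])),
        })
    return rows
```

Passing the real retriever as retrieve yields one row per question. The "missing" column is the part worth reading first, because it names the exact sources the ingestion fixes should target.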

Copyable prompts

Create a RAG evaluation set with user questions, expected sources, metadata filters, and failure cases.

Review these retrieved chunks and answers for source mismatch, unsupported claims, and missing fallback behavior.
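
Part of the review the second prompt asks for can be automated before a human or model reads the answers. The sketch below assumes answers cite sources as bracketed IDs such as [policy-7] and that a short list of fallback phrases marks a safe "unknown" response; both conventions are hypothetical. It only checks citations and fallback behavior, so claim-level faithfulness still needs a separate review.

```python
from __future__ import annotations

import re

# Assumed citation convention: source IDs in square brackets, e.g. [policy-7].
CITATION = re.compile(r"\[([\w.-]+)\]")
# Assumed phrases that signal a safe fallback answer.
FALLBACK_MARKERS = ("i don't know", "not in the knowledge base")


def review_answer(answer: str, retrieved: set[str], expected: set[str]) -> list[str]:
    """Return a list of issues found in one generated answer."""
    cited = set(CITATION.findall(answer))
    is_fallback = any(marker in answer.lower() for marker in FALLBACK_MARKERS)

    # Failure-case fixture: no source should answer this question,
    # so anything other than a fallback is a problem.
    if not expected:
        return [] if is_fallback else ["missing fallback"]

    issues = []
    if not cited and not is_fallback:
        issues.append("no citation")
    if cited - retrieved:
        issues.append("cites sources that were not retrieved")
    if cited and not (cited & expected):
        issues.append("source mismatch")
    return issues
```

An empty list means the answer passed these checks only. An answer can cite the right source and still state something that source does not say.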

Use cases

  • RAG prototype
  • Knowledge assistant
  • Internal search quality review