RAG Pipeline

The agent retrieves documentation using a hybrid search pipeline: BM25 keyword matching, vector similarity via ChromaDB, reciprocal rank fusion, and a cross-encoder reranker. The results below were measured on 40 ground-truth queries with graded relevance labels.

Retrieval Comparison

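Three retrieval modes were evaluated on the same query set. Hybrid combines BM25 and semantic retrieval with the reranker. Hybrid scores highest on MRR, Recall, and Hit Rate at rank 1, while semantic-only retrieval scores highest on NDCG and MAP.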

Metric       BM25    Semantic  Hybrid
MRR@5        0.737   0.847     0.900
NDCG@5       0.698   0.883     0.856
Recall@5     78.3%   81.2%     91.7%
Hit Rate@1   60%     72%       80%
Hit Rate@3   95%     100%      100%
MAP          0.620   0.851     0.785
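For readers who want to reproduce numbers of this kind, here is a minimal sketch of how the per-query metrics could be computed from a ranked result list and graded relevance labels. The function names, the document IDs, and the choice of linear gain in DCG are assumptions for illustration; the source does not say which NDCG variant or evaluation code was used.

```python
import math
from typing import Dict, List


def reciprocal_rank_at_k(ranked: List[str], relevance: Dict[str, int], k: int = 5) -> float:
    """Return 1/rank of the first relevant document in the top k, or 0.0 if none."""
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if relevance.get(doc_id, 0) > 0:
            return 1.0 / rank
    return 0.0


def dcg(gains: List[int]) -> float:
    """Discounted cumulative gain with linear gain (an assumed variant)."""
    return sum(gain / math.log2(i + 2) for i, gain in enumerate(gains))


def ndcg_at_k(ranked: List[str], relevance: Dict[str, int], k: int = 5) -> float:
    """DCG of the top k divided by the DCG of the ideal ordering."""
    gains = [relevance.get(doc_id, 0) for doc_id in ranked[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def recall_at_k(ranked: List[str], relevance: Dict[str, int], k: int = 5) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    relevant = {doc_id for doc_id, grade in relevance.items() if grade > 0}
    if not relevant:
        return 0.0
    return len(relevant & set(ranked[:k])) / len(relevant)


def hit_at_k(ranked: List[str], relevance: Dict[str, int], k: int) -> bool:
    """True if any relevant document appears in the top k."""
    return any(relevance.get(doc_id, 0) > 0 for doc_id in ranked[:k])


# Hypothetical single query: grades are 0 (irrelevant) to 2 (highly relevant).
example_ranked = ["d3", "d1", "d7"]
example_relevance = {"d1": 2, "d7": 1}

print("RR@5:", reciprocal_rank_at_k(example_ranked, example_relevance))
print("NDCG@5:", round(ndcg_at_k(example_ranked, example_relevance), 3))
print("Recall@5:", recall_at_k(example_ranked, example_relevance))
print("Hit@1:", hit_at_k(example_ranked, example_relevance, 1))
print("Hit@3:", hit_at_k(example_ranked, example_relevance, 3))
```

The table values would then be the mean of each per-query metric over the 40 queries (MRR is the mean of the reciprocal ranks, Hit Rate the fraction of queries with a hit).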

Validation Set

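A separate set of 10 held-out queries that were never used for tuning. The scores are close to the hybrid results on the 40-query set (MRR 0.900 vs. 0.900, NDCG 0.852 vs. 0.856), which suggests the pipeline generalizes.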

MRR@5: 0.900
NDCG@5: 0.852
Hit Rate@3: 100%

Reranker Impact

A/B comparison: hybrid search with and without the Contextual AI cross-encoder reranker.

Condition          MRR@5    Queries Affected
Without reranker   0.861
With reranker      0.900    8 improved, 6 degraded (out of 40 queries)
Lift               +0.039
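The lift is the difference between the two mean reciprocal ranks (0.900 - 0.861 = 0.039), and with 8 queries improved and 6 degraded, the remaining 26 of the 40 were unchanged. A comparison like this could be produced from per-query reciprocal ranks along the following lines. This is a sketch only: the function name, query IDs, and toy scores are hypothetical, not the actual evaluation harness or data.

```python
from statistics import mean
from typing import Dict


def compare_runs(baseline: Dict[str, float], treatment: Dict[str, float],
                 eps: float = 1e-9) -> Dict[str, float]:
    """Compare per-query reciprocal ranks of two runs over the same queries."""
    improved = degraded = unchanged = 0
    for query_id, base_rr in baseline.items():
        delta = treatment[query_id] - base_rr
        if delta > eps:
            improved += 1
        elif delta < -eps:
            degraded += 1
        else:
            unchanged += 1
    mrr_baseline = mean(baseline.values())
    mrr_treatment = mean(treatment.values())
    return {
        "mrr_baseline": mrr_baseline,
        "mrr_treatment": mrr_treatment,
        "lift": mrr_treatment - mrr_baseline,
        "improved": improved,
        "degraded": degraded,
        "unchanged": unchanged,
    }


# Toy data (hypothetical): reciprocal rank per query, without and with the reranker.
without_reranker = {"q1": 0.5, "q2": 1.0, "q3": 1.0, "q4": 0.25}
with_reranker = {"q1": 1.0, "q2": 0.5, "q3": 1.0, "q4": 1.0}

report = compare_runs(without_reranker, with_reranker)
print(report)
```

Reporting the improved and degraded counts alongside the mean matters here because a small net lift can hide a sizeable number of queries that moved in each direction.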

Index

Doc Pages: 326
Vector Chunks: 2,825
BM25 Terms: 6,891

Pipeline Components

Stage            Component               Details
Embeddings       HF Inference API        all-mpnet-base-v2 (768-dim)
Vector DB        ChromaDB Cloud          Cosine similarity, heading-prefixed embeddings
Keyword          BM25                    k1=1.2, b=0.75, stopword filtering
Query Expansion  Domain synonyms         RevenueCat-specific term expansion
Reranking        Contextual AI / HF      ctxl-rerank-v2 primary, ms-marco-MiniLM fallback
Fusion           Reciprocal Rank Fusion  k=60, 70/30 semantic/BM25 weighting
Chunking         Heading-aware           400 max words, 50-word overlap
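To illustrate the fusion stage, the following is a minimal sketch of weighted reciprocal rank fusion using the listed settings (k=60, 70/30 semantic/BM25). How the pipeline actually applies the 70/30 weighting is not stated; multiplying each list's 1/(k + rank) contribution by its weight is one common reading and is an assumption here, as are the function name and document IDs.

```python
from collections import defaultdict
from typing import Dict, List


def weighted_rrf(rankings: Dict[str, List[str]], weights: Dict[str, float],
                 k: int = 60) -> List[str]:
    """Fuse ranked lists: score(d) = sum over lists of weight / (k + rank of d)."""
    scores: Dict[str, float] = defaultdict(float)
    for name, ranked in rankings.items():
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += weights[name] / (k + rank)
    # Sort by descending fused score; break ties by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers, best match first.
rankings = {
    "semantic": ["a", "b", "c"],
    "bm25": ["c", "a", "d"],
}
weights = {"semantic": 0.7, "bm25": 0.3}  # assumed interpretation of "70/30"

fused = weighted_rrf(rankings, weights, k=60)
print(fused)
```

Because the fusion uses only ranks, BM25 scores and cosine similarities never need to be normalized onto a common scale. The fused list would then be passed to the reranker.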