RAG Pipeline

The agent retrieves documentation using a hybrid search pipeline: BM25 keyword matching, vector similarity via ChromaDB, reciprocal rank fusion, and a cross-encoder reranker. The results below were measured on 40 ground-truth queries with graded relevance judgments.
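The fusion step can be sketched as weighted reciprocal rank fusion. This is a minimal illustration, not the pipeline's actual code: the function name `weighted_rrf` is hypothetical, and applying the 70/30 split as a per-list multiplier on the usual 1 / (k + rank) term is an assumption about how the weighting listed under Pipeline Components is applied.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    rankings: dict mapping a retriever name to its ranked list of doc ids.
    weights:  dict mapping the same names to a weight (assumed multiplier).
    k:        RRF smoothing constant (k=60 per the component table).
    """
    scores = defaultdict(float)
    for name, ranked_ids in rankings.items():
        weight = weights[name]
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Each list contributes weight / (k + rank) for every doc it returns.
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy rankings (hypothetical doc ids, not real results).
fused = weighted_rrf(
    {"semantic": ["a", "b", "c"], "bm25": ["c", "a", "d"]},
    {"semantic": 0.7, "bm25": 0.3},
)
print(fused)
```

Because RRF uses only ranks, BM25 scores and cosine similarities never need to be normalized onto a common scale before fusion.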

Retrieval Comparison

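Three retrieval modes were evaluated on the same query set. Hybrid combines BM25 and semantic retrieval with the reranker. Hybrid leads on MRR @5, Recall @5, and Hit Rate @1; semantic-only scores higher on NDCG @5 and MAP.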

Metric	BM25	Semantic	Hybrid
MRR @5	0.737	0.847	0.900
NDCG @5	0.698	0.883	0.856
Recall @5	78.3%	81.2%	91.7%
Hit Rate @1	60%	72%	80%
Hit Rate @3	95%	100%	100%
MAP	0.620	0.851	0.785
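For reference, the per-query metrics in the table can be computed roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, any grade above zero is treated as relevant for the binary metrics, and NDCG uses linear gain with a log2 discount. The source does not say which NDCG variant the evaluation used.

```python
import math


def reciprocal_rank(ranked, relevant, k=5):
    """1 / rank of the first relevant doc in the top k, else 0."""
    for rank, doc_id in enumerate(ranked[:k], start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def hit_at_k(ranked, relevant, k):
    """True if any relevant doc appears in the top k."""
    return any(doc_id in relevant for doc_id in ranked[:k])


def recall_at_k(ranked, relevant, k=5):
    """Fraction of relevant docs that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)


def ndcg_at_k(ranked, grades, k=5):
    """NDCG with graded relevance (assumed: linear gain, log2 discount)."""
    dcg = sum(
        grades.get(doc_id, 0) / math.log2(i + 1)
        for i, doc_id in enumerate(ranked[:k], start=1)
    )
    ideal = sorted(grades.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


# Toy query (hypothetical ids and grades, not taken from the evaluation set).
grades = {"d1": 3, "d2": 1}
relevant = {d for d, g in grades.items() if g > 0}
ranked = ["d9", "d1", "d2"]
print(reciprocal_rank(ranked, relevant), recall_at_k(ranked, relevant))
```

MRR, Recall, and NDCG in the table would then be the mean of these per-query values over the query set, and Hit Rate the fraction of queries where `hit_at_k` is true.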

Validation Set

Ten held-out queries that were never used for tuning. MRR @5 matches the 40-query hybrid result (0.900), which suggests the pipeline generalizes.

Metric	Value
MRR @5	0.900
NDCG @5	0.852
Hit Rate @3	100%

Reranker Impact

A/B comparison: hybrid search with and without the Contextual AI cross-encoder reranker.

Condition	MRR @5	Queries Affected
Without reranker	0.861	—
With reranker	0.900	8 improved, 6 degraded
Lift	+0.039	out of 40 queries
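One way to produce the improved/degraded counts and the lift is to compare per-query reciprocal ranks between the two runs. The sketch below assumes that "improved" and "degraded" refer to a query's reciprocal rank going up or down; the function name `compare_runs` and the toy values are hypothetical, not the measured data.

```python
def compare_runs(rr_without, rr_with):
    """Compare per-query reciprocal ranks from two runs over the same queries."""
    improved = sum(1 for q in rr_without if rr_with[q] > rr_without[q])
    degraded = sum(1 for q in rr_without if rr_with[q] < rr_without[q])
    n = len(rr_without)
    mrr_without = sum(rr_without.values()) / n
    mrr_with = sum(rr_with[q] for q in rr_without) / n
    return {
        "improved": improved,
        "degraded": degraded,
        "unchanged": n - improved - degraded,
        "lift": mrr_with - mrr_without,
    }


# Toy data: three queries, keyed by a hypothetical query id.
summary = compare_runs(
    {"q1": 0.25, "q2": 1.0, "q3": 1.0},
    {"q1": 1.0, "q2": 0.5, "q3": 1.0},
)
print(summary)
```

Reporting both counts alongside the mean matters here: a positive lift can coexist with a sizeable number of degraded queries, as the table shows.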

Index

Item	Count
Doc Pages	326
Vector Chunks	2825
BM25 Terms	6891

Pipeline Components

Stage	Component	Details
Embeddings	HF Inference API	`all-mpnet-base-v2` (768-dim)
Vector DB	ChromaDB Cloud	Cosine similarity, heading-prefixed embeddings
Keyword	BM25	k1=1.2, b=0.75, stopword filtering
Query Expansion	Domain synonyms	RevenueCat-specific term expansion
Reranking	Contextual AI / HF	`ctxl-rerank-v2` primary, `ms-marco-MiniLM` fallback
Fusion	Reciprocal Rank Fusion	k=60, 70/30 semantic/BM25 weighting
Chunking	Heading-aware	400 max words, 50-word overlap
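The chunking row can be illustrated with a small sketch: split on Markdown headings, then slide a 400-word window with a 50-word overlap over each section and prefix every chunk with its heading. The function names are hypothetical, and two details are assumptions: that the source pages are Markdown with `#` headings, and that the heading prefix does not count toward the 400-word limit.

```python
import re


def chunk_section(heading, body, max_words=400, overlap=50):
    """Split one section into overlapping word windows, each prefixed by its heading."""
    words = body.split()
    step = max_words - overlap  # each new window starts 350 words after the last
    prefix = f"{heading}\n" if heading else ""
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        chunks.append(prefix + " ".join(window))
        if start + max_words >= len(words):
            break  # this window already reached the end of the section
    return chunks


def chunk_markdown(text, max_words=400, overlap=50):
    """Heading-aware chunking: windows never span two headings."""
    chunks, heading, lines = [], "", []
    for line in text.splitlines():
        if re.match(r"#{1,6}\s", line):
            # A new heading closes the previous section.
            chunks.extend(chunk_section(heading, "\n".join(lines), max_words, overlap))
            heading, lines = line.lstrip("#").strip(), []
        else:
            lines.append(line)
    chunks.extend(chunk_section(heading, "\n".join(lines), max_words, overlap))
    return chunks


# Toy document: one heading followed by 900 placeholder words.
doc = "# Setup\n" + " ".join(f"w{i}" for i in range(900))
chunks = chunk_markdown(doc)
print(len(chunks))
```

Prefixing the heading gives short chunks some topical context at embedding time, which is presumably the intent behind the "heading-prefixed embeddings" note in the Vector DB row.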