zvec_db.rerankers.fusion.rrf
Classes
RrfReranker: Reciprocal Rank Fusion (RRF) reranker with optional source weighting.
- class zvec_db.rerankers.fusion.rrf.RrfReranker(topn=10, rerank_field=None, rank_constant=60, weights=None, normalize=None, metrics=None, schema=None)
Reciprocal Rank Fusion (RRF) reranker with optional source weighting.
RRF combines results from multiple ranked lists by computing a fused score based on the reciprocal of each document’s rank:
\[\text{RRF}(d) = \sum_{r \in R} w_r \times \frac{1}{k + \text{rank}(d, r)}\]
where:
- \(k\) is the rank_constant (default: 60)
- \(w_r\) is the weight for source \(r\) (default: 1.0)
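The formula above can be sketched in a few lines of plain Python. This is an illustration only, not the library's implementation: the function name rrf_fuse is hypothetical, each source is reduced to a best-first list of document IDs, and ranks are assumed to start at 1.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, rank_constant=60, weights=None, topn=10):
    """Fuse ranked ID lists with weighted Reciprocal Rank Fusion.

    ranked_lists maps a source name to document IDs in best-first order.
    Ranks start at 1 here, which is an assumption of this sketch.
    """
    weights = weights or {}
    fused = defaultdict(float)
    for source, doc_ids in ranked_lists.items():
        w = weights.get(source, 1.0)  # sources not listed use weight 1.0
        for rank, doc_id in enumerate(doc_ids, start=1):
            fused[doc_id] += w / (rank_constant + rank)
    # Sort by descending fused score; break ties by ID for determinism.
    ordered = sorted(fused.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:topn]


fused = rrf_fuse({"bm25": ["a", "b", "c"], "dense": ["b", "c", "a"]})
for doc_id, score in fused:
    print(f"{doc_id}: {score:.6f}")
```

Document "b" comes out on top in this toy input because it sits at ranks 2 and 1, while "a" sits at ranks 1 and 3.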
By default, all sources have equal weight. Use the weights parameter to favor certain sources over others.
- Parameters:
topn (int, optional) – Number of top documents to return. Defaults to 10.
rerank_field (Optional[str], optional) – Ignored by RRF. Defaults to None.
rank_constant (int, optional) – Smoothing constant \(k\) in the RRF formula. Larger values shrink the score gap between top-ranked and lower-ranked documents, so a first-place rank in any single source counts for less. Defaults to 60.
weights (Optional[dict[str, float]], optional) – Weight per source. Sources not listed use weight 1.0. Defaults to None (equal weights).
normalize (Optional[Union[bool, str, dict]], optional) – Ignored for RRF. RRF uses ranks, not scores, so normalization has no effect. Setting this parameter will emit a warning. Defaults to None.
metrics (Optional[Union[MetricType, dict[str, Union[str, MetricType, None]]]])
schema (Optional['CollectionSchema'])
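To make the effect of rank_constant concrete, the following sketch compares what a single source contributes for a document at rank 1 versus rank 10 under different values of \(k\). The helper rrf_contribution is hypothetical and assumes ranks start at 1.

```python
def rrf_contribution(rank, rank_constant=60, weight=1.0):
    """Score contributed by one source for a document at the given rank."""
    return weight / (rank_constant + rank)


for k in (1, 60, 100):
    top = rrf_contribution(1, rank_constant=k)
    tenth = rrf_contribution(10, rank_constant=k)
    print(f"k={k:>3}: rank 1 -> {top:.4f}, rank 10 -> {tenth:.4f}, ratio {top / tenth:.2f}")
```

The ratio equals \((k + 10) / (k + 1)\): 5.5 for \(k = 1\), about 1.15 for \(k = 60\), and about 1.09 for \(k = 100\).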
Example
>>> from zvec_db.rerankers.fusion.rrf import RrfReranker
>>> # Basic RRF with default parameters
>>> reranker = RrfReranker(topn=10)
>>> results = reranker.rerank({"bm25": bm25_docs, "dense": dense_docs})

>>> # Weighted RRF: favor dense embeddings (70%) over BM25 (30%)
>>> reranker = RrfReranker(
...     topn=10,
...     weights={"dense": 0.7, "bm25": 0.3}
... )
>>> results = reranker.rerank({"bm25": bm25_docs, "dense": dense_docs})

>>> # Custom rank constant (higher = flatter scores across ranks)
>>> reranker = RrfReranker(topn=10, rank_constant=100)
>>> results = reranker.rerank({"bm25": bm25_docs, "dense": dense_docs})
Note
RRF uses only document ranks, not raw scores. This makes it robust to score scale differences between sources (e.g., BM25 scores vs. cosine similarities). Normalization is not applicable to RRF.
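The point of this note can be checked with a small standalone sketch: converting scores to ranks discards their scale, so rescaling one source changes nothing that RRF sees. The helper ranks_from_scores and the score values are made up for illustration, and ranks are assumed to start at 1.

```python
def ranks_from_scores(scores):
    """Map each document ID to its 1-based rank, highest score first."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {doc_id: rank for rank, doc_id in enumerate(ordered, start=1)}


# Illustrative values only: BM25-style scores are unbounded,
# cosine similarities lie in [-1, 1].
bm25_scores = {"a": 12.4, "b": 9.1, "c": 3.3}
cosine_scores = {"a": 0.62, "b": 0.81, "c": 0.40}

# Multiplying a source's scores by a positive factor keeps the order,
# so the ranks (the only input RRF uses) stay the same.
rescaled = {doc_id: score * 1000 for doc_id, score in bm25_scores.items()}

print(ranks_from_scores(bm25_scores))
print(ranks_from_scores(rescaled))
print(ranks_from_scores(cosine_scores))
```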
See also
WeightedReranker: For weighted fusion based on scores rather than ranks.
- __init__(topn=10, rerank_field=None, rank_constant=60, weights=None, normalize=None, metrics=None, schema=None)
- rerank(query_results, query=None)
Apply Reciprocal Rank Fusion to combine multiple query results.
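- Parameters:
query_results – Results from each source, keyed by source name (for example "bm25" and "dense", as in the examples).
query (optional) – Defaults to None.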
- Returns:
Reranked documents with RRF scores in the score field, sorted by descending score.
- Return type:
list[Doc]
Example
>>> reranker = RrfReranker(topn=5)
>>> results = reranker.rerank({
...     "bm25": bm25_results,
...     "dense": dense_results
... })
>>> print(f"Top document: {results[0].id} (score: {results[0].score:.4f})")