zvec_db.rerankers.fusion.rrf

Classes

RrfReranker([topn, rerank_field, ...])

Reciprocal Rank Fusion (RRF) reranker with optional source weighting.

class zvec_db.rerankers.fusion.rrf.RrfReranker(topn=10, rerank_field=None, rank_constant=60, weights=None, normalize=None, metrics=None, schema=None)[source]

Reciprocal Rank Fusion (RRF) reranker with optional source weighting.

RRF combines results from multiple ranked lists by computing a fused score based on the reciprocal of each document’s rank:

\[\text{RRF}(d) = \sum_{r \in R} w_r \times \frac{1}{k + \text{rank}(d, r)}\]

where:

\(k\) is the rank_constant (default: 60)
\(w_r\) is the weight for source \(r\) (default: 1.0)

By default, all sources have equal weight. Use the weights parameter to favor certain sources over others.
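The formula above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `rrf_fuse` is a hypothetical helper, documents are represented by bare ID strings rather than `Doc` objects, and ranks are assumed to start at 1 for the first document in each list.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, rank_constant=60, weights=None, topn=10):
    """Fuse ranked lists of document IDs with weighted Reciprocal Rank Fusion.

    Hypothetical helper for illustration; not part of zvec_db.
    Assumes ranks start at 1 for the first document in each list.
    """
    weights = weights or {}
    scores = defaultdict(float)
    for source, doc_ids in ranked_lists.items():
        w = weights.get(source, 1.0)  # sources not listed use weight 1.0
        for rank, doc_id in enumerate(doc_ids, start=1):
            scores[doc_id] += w / (rank_constant + rank)
    # Sort by descending fused score and keep the top `topn` documents.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:topn]


fused = rrf_fuse(
    {"bm25": ["a", "b", "c"], "dense": ["b", "d", "a"]},
    weights={"dense": 0.7, "bm25": 0.3},
)
```

With these inputs, document `b` ends up first: it is ranked first by the more heavily weighted `dense` source and second by `bm25`, which outweighs `a`'s first place in `bm25` combined with third place in `dense`.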

Parameters:

topn (int, optional) – Number of top documents to return. Defaults to 10.
rerank_field (Optional[str], optional) – Ignored by RRF. Defaults to None.
rank_constant (int, optional) – Smoothing constant \(k\) in the RRF formula. Larger values shrink the score gap between top-ranked and lower-ranked documents, so early ranks carry less weight. Defaults to 60.
weights (Optional[dict[str, float]], optional) – Weight per source. Sources not listed use weight 1.0. Defaults to None (equal weights).
normalize (Optional[Union[bool, str, dict]], optional) – Ignored by RRF. RRF uses ranks, not scores, so normalization has no effect. Setting this parameter emits a warning. Defaults to None.
metrics (Optional[Union[MetricType, dict[str, Union[str, MetricType, None]]]])
schema (Optional['CollectionSchema'])
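To make the effect of rank_constant concrete, the sketch below compares how much more a rank-1 document contributes than a rank-10 document from a single unweighted source. The helper names are hypothetical, and ranks are assumed to be 1-based.

```python
def rank_contribution(rank, rank_constant):
    """Unweighted contribution 1 / (k + rank) of one source (1-based rank assumed)."""
    return 1.0 / (rank_constant + rank)


def top_to_tenth_ratio(rank_constant):
    """How many times larger the rank-1 contribution is than the rank-10 one."""
    return rank_contribution(1, rank_constant) / rank_contribution(10, rank_constant)


# Ratio of rank-1 to rank-10 contribution for three values of k.
ratios = {k: top_to_tenth_ratio(k) for k in (1, 60, 100)}
```

Under these assumptions the ratio is 5.5 for k = 1, about 1.148 for k = 60, and about 1.089 for k = 100, so a larger constant flattens the advantage of the top positions.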

Example

>>> # Basic RRF with default parameters
>>> reranker = RrfReranker(topn=10)
>>> results = reranker.rerank({"bm25": bm25_docs, "dense": dense_docs})

>>> # Weighted RRF: favor dense embeddings (70%) over BM25 (30%)
>>> reranker = RrfReranker(
...     topn=10,
...     weights={"dense": 0.7, "bm25": 0.3}
... )
>>> results = reranker.rerank({"bm25": bm25_docs, "dense": dense_docs})

>>> # Custom rank constant (higher = smaller score gaps between ranks)
>>> reranker = RrfReranker(topn=10, rank_constant=100)
>>> results = reranker.rerank({"bm25": bm25_docs, "dense": dense_docs})

Note

RRF uses only document ranks, not raw scores. This makes it robust to score scale differences between sources (e.g., BM25 scores vs. cosine similarities). Normalization is not applicable to RRF.
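The scale robustness described in this note can be checked with a small standalone sketch. It does not use the library: `ranks_from_scores` and `rrf_scores` are hypothetical helpers, the scores are made-up illustration values, weights are omitted, and ranks are assumed to be 1-based.

```python
def ranks_from_scores(scored_docs):
    """Map doc ID -> 1-based rank, where the highest raw score gets rank 1."""
    ordered = sorted(scored_docs, key=scored_docs.get, reverse=True)
    return {doc_id: rank for rank, doc_id in enumerate(ordered, start=1)}


def rrf_scores(sources, rank_constant=60):
    """Unweighted RRF over several {doc_id: raw_score} mappings."""
    fused = {}
    for scored_docs in sources.values():
        for doc_id, rank in ranks_from_scores(scored_docs).items():
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    return fused


bm25 = {"a": 12.4, "b": 9.1, "c": 3.3}     # unbounded BM25-style scores
dense = {"b": 0.92, "c": 0.85, "a": 0.40}  # cosine-similarity-style scores

original = rrf_scores({"bm25": bm25, "dense": dense})
# Rescaling one source changes its raw scores but not the order of its documents.
rescaled_bm25 = {doc_id: score * 1000 for doc_id, score in bm25.items()}
rescaled = rrf_scores({"bm25": rescaled_bm25, "dense": dense})
```

Because multiplying the BM25 scores by 1000 leaves their ordering unchanged, `original` and `rescaled` hold identical fused scores.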

See also

WeightedReranker: For weighted fusion based on scores rather than ranks.

__init__(topn=10, rerank_field=None, rank_constant=60, weights=None, normalize=None, metrics=None, schema=None)[source]

Parameters:

topn (int)
rerank_field (Optional[str])
rank_constant (int)
weights (Optional[dict[str, float]])
normalize (Optional[Union[bool, str, dict]])
metrics (Optional[Union[MetricType, dict[str, Union[str, MetricType, None]]]])
schema (Optional['CollectionSchema'])

property rank_constant: int

property weights: dict[str, float]

property normalize: bool | str | dict | None

rerank(query_results, query=None)[source]

Apply Reciprocal Rank Fusion to combine multiple query results.

Parameters:

query_results (dict[str, list[Doc]]) – Results from one or more vector queries. Keys are source names (e.g., “bm25”, “dense”), values are ranked document lists.
query (Optional[str], optional) – Ignored. Defaults to None.

Returns:

Reranked documents with RRF scores in the score field, sorted by descending score.

Return type:

list[Doc]

Example

>>> reranker = RrfReranker(topn=5)
>>> results = reranker.rerank({
...     "bm25": bm25_results,
...     "dense": dense_results
... })
>>> print(f"Top document: {results[0].id} (score: {results[0].score:.4f})")