zvec_db.rerankers.utils.pipeline

PipelineReranker for zvec-db.

This module provides the PipelineReranker class, which chains multiple rerankers so that each one runs on the output of the previous one.

Example Usage

from zvec import create_and_open, VectorQuery
from zvec_db.rerankers import (
    RrfReranker,
    OpenAICrossEncoderReranker,
    PipelineReranker,
)

collection = create_and_open(...)

# Pipeline: RRF then Cross-Encoder
pipeline = PipelineReranker(
    rerankers=[
        RrfReranker(topn=50, rank_constant=60),
        OpenAICrossEncoderReranker(
            topn=10,
            model="gpt-4o-mini",
            query="machine learning"
        )
    ]
)

# bm25_vec and dense_vec are query vectors prepared elsewhere (not shown here)
results = collection.query(
    vectors=[
        VectorQuery(field_name="bm25", vector=bm25_vec),
        VectorQuery(field_name="dense", vector=dense_vec),
    ],
    topk=10,
    reranker=pipeline
)

Classes

PipelineReranker(rerankers[, topn, rerank_field])

Chain multiple rerankers sequentially.

class zvec_db.rerankers.utils.pipeline.PipelineReranker(rerankers, topn=10, rerank_field=None)

Chain multiple rerankers sequentially.

This reranker applies a list of rerankers in order, passing the output of each one as the input to the next. This is useful for combining reranking strategies, for example RRF to fuse and narrow the candidates, followed by a cross-encoder to rescore the documents that remain.
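
The chaining described above can be sketched without the library. The block below is only an illustration under stated assumptions: `Doc`, `KeepTop`, and `run_pipeline` are hypothetical stand-ins, and re-wrapping the intermediate list under a single `"pipeline"` key is a guess at how a list could be fed to the next stage, not a description of how PipelineReranker does it internally.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for the library's Doc type."""
    id: str
    score: float


class KeepTop:
    """Toy stage: merge all input lists and keep the `topn` best-scoring docs."""

    def __init__(self, topn):
        self.topn = topn

    def rerank(self, query_results, query=None):
        merged = {}
        for docs in query_results.values():
            for doc in docs:
                # Keep the best-scoring copy of each document id.
                if doc.id not in merged or doc.score > merged[doc.id].score:
                    merged[doc.id] = doc
        ranked = sorted(merged.values(), key=lambda d: d.score, reverse=True)
        return ranked[: self.topn]


def run_pipeline(stages, query_results, query=None, topn=10):
    """Feed each stage the previous stage's output, then truncate to topn."""
    current = query_results
    docs = []
    for stage in stages:
        docs = stage.rerank(current, query=query)
        # Assumption: re-wrap the list so the next stage receives a dict again.
        current = {"pipeline": docs}
    return docs[:topn]


results = {
    "bm25": [Doc("a", 0.9), Doc("b", 0.4)],
    "dense": [Doc("b", 0.7), Doc("c", 0.2)],
}
final = run_pipeline([KeepTop(topn=3), KeepTop(topn=2)], results)
print([d.id for d in final])  # ['a', 'b']
```

In this sketch the first stage narrows three candidates to three, the second to two, and the outer `topn` only caps what the last stage returns.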

Parameters:
  • rerankers (list) – List of rerankers to apply in order.

  • topn (int, optional) – Number of final documents to return. Defaults to 10.

  • rerank_field (Optional[str], optional) – Ignored. Defaults to None.

Example

>>> pipeline = PipelineReranker([
...     RrfReranker(topn=50, rank_constant=60),
...     SentenceTransformerReranker(model_name="ms-marco-MiniLM-L-6-v2", topn=10)
... ])
>>> results = collection.query(..., reranker=pipeline)

__init__(rerankers, topn=10, rerank_field=None)

Initialize PipelineReranker with a list of rerankers.

Parameters:
  • rerankers (list) – List of reranker instances to apply in order. Each reranker must implement the rerank() method.

  • topn (int, optional) – Number of final documents to return. Defaults to 10.

  • rerank_field (Optional[str], optional) – Ignored. Defaults to None.
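
Since each stage only needs a rerank() method, a custom stage might look like the sketch below. Everything here is hypothetical: `KeywordBoostReranker`, the stand-in `Doc` with its `text` attribute, and the `boost` parameter are invented for illustration. The method signature mirrors the rerank(query_results, query=None) form documented on this page. Whether PipelineReranker accepts a plain duck-typed object or requires a specific base class is not stated here, so treat that as an assumption to verify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Doc:
    """Hypothetical stand-in; the real Doc type may expose different attributes."""
    id: str
    score: float
    text: str = ""


class KeywordBoostReranker:
    """Hypothetical custom stage: boosts docs whose text mentions the query."""

    def __init__(self, topn: int = 10, boost: float = 0.5):
        self.topn = topn
        self.boost = boost

    def rerank(self, query_results: dict, query: Optional[str] = None) -> list:
        # Flatten every result list, keeping the first occurrence of each id.
        seen = {}
        for docs in query_results.values():
            for doc in docs:
                seen.setdefault(doc.id, doc)

        def boosted(doc):
            hit = bool(query) and query.lower() in doc.text.lower()
            return doc.score + (self.boost if hit else 0.0)

        return sorted(seen.values(), key=boosted, reverse=True)[: self.topn]


stage = KeywordBoostReranker(topn=2)
ranked = stage.rerank(
    {
        "dense": [
            Doc("a", 0.6, "deep learning"),
            Doc("b", 0.5, "machine learning basics"),
            Doc("c", 0.1, "cooking"),
        ]
    },
    query="machine learning",
)
print([d.id for d in ranked])  # ['b', 'a']
```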


rerank(query_results, query=None)

Apply rerankers sequentially.

Parameters:
  • query_results (dict[str, list[Doc]]) – Results from vector queries.

  • query (Optional[str], optional) – The search query. Passed to underlying rerankers. Defaults to None.

Returns:

Final reranked documents after all rerankers have been applied.

Return type:

list[Doc]
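
To make the input and output shapes concrete, the sketch below turns a dict of per-field result lists into one fused list using Reciprocal Rank Fusion, the kind of first stage used in the examples on this page. It is a generic textbook version of RRF, not the library's RrfReranker. `Doc` and `rrf_rerank` are hypothetical names, and only the `rank_constant` and `topn` parameter names are borrowed from the examples above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for the library's Doc type."""
    id: str
    score: float = 0.0


def rrf_rerank(query_results, rank_constant=60, topn=10):
    """Fuse per-field ranked lists into one list with Reciprocal Rank Fusion."""
    fused = defaultdict(float)
    for docs in query_results.values():
        # Only the position in each list matters; ranks start at 1.
        for rank, doc in enumerate(docs, start=1):
            fused[doc.id] += 1.0 / (rank_constant + rank)
    ranked = sorted(fused.items(), key=lambda item: item[1], reverse=True)
    return [Doc(doc_id, score) for doc_id, score in ranked[:topn]]


query_results = {
    "bm25": [Doc("a"), Doc("b"), Doc("c")],
    "dense": [Doc("b"), Doc("c"), Doc("d")],
}
fused_docs = rrf_rerank(query_results)
print([d.id for d in fused_docs])  # ['b', 'c', 'a', 'd']
```

Documents that appear in both lists ("b" and "c") accumulate two reciprocal-rank terms and therefore rise above documents that appear in only one.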