zvec_db.rerankers.base

Base classes for reranking in zvec-db.

Classes

`FusionRerankerBase`([topn, rerank_field, ...])	Base class for fusion-based rerankers combining multiple sources.
`RerankFunction`([topn, rerank_field, schema, ...])	Abstract base class for reranking search results.

class zvec_db.rerankers.base.RerankFunction(topn=10, rerank_field=None, schema=None, metrics=None)[source]

Abstract base class for reranking search results.

Rerankers refine the output of one or more vector queries by applying a secondary scoring strategy. They are used in the query() method of Collection via the reranker parameter.

Parameters:

topn (int, optional) – Number of top documents to return after reranking. Defaults to 10.
rerank_field (str | None, optional) – Field name used as input for reranking (e.g., document title or body). Defaults to None.
schema (CollectionSchema | None, optional) – Collection schema to automatically extract metrics from. If provided and no explicit metrics are given, metric types are inferred from the schema. Defaults to None.
metrics (str | MetricType | dict[str, str | MetricType | None] | None, optional) – Metric type(s) for converting distances to similarities. Can be:
- A single MetricType (e.g., MetricType.COSINE) applied to all sources
- A dict mapping source names to their metric type (use None or MetricType.IP for sources that don’t need conversion, e.g., BM25 scores)
- If None and schema is provided, metrics are inferred from the schema (defaults to IP if not specified)
- If None and no schema, defaults to IP (no conversion needed)
Defaults to None.
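The resolution order described for metrics can be sketched roughly as follows. This is only an illustration, not the library's code: `resolve_metric` is a hypothetical helper, metric types are modeled as plain strings, the schema is modeled as a plain dict of field name to metric, and falling back to IP for a source missing from the dict is an assumption.

```python
from typing import Dict, Optional, Union

# Hypothetical simplification: metric types are plain strings such as "COSINE", "L2", "IP".
MetricSpec = Union[str, Dict[str, Optional[str]], None]


def resolve_metric(
    source: str,
    metrics: MetricSpec = None,
    schema_metrics: Optional[Dict[str, Optional[str]]] = None,
) -> str:
    """Return the metric name used to convert scores coming from `source`."""
    if isinstance(metrics, str):
        # A single metric applies to every source.
        return metrics
    if isinstance(metrics, dict):
        # A None entry means "no conversion", which is the same as IP.
        # Assumption: a source missing from the dict also falls back to IP.
        return metrics.get(source) or "IP"
    if schema_metrics is not None:
        # No explicit metrics: infer from the schema, defaulting to IP.
        return schema_metrics.get(source) or "IP"
    # Neither metrics nor schema: IP, so no conversion is needed.
    return "IP"


print(resolve_metric("dense", "COSINE"))
print(resolve_metric("bm25", {"dense": "L2", "bm25": None}))
print(resolve_metric("dense", None, {"dense": "L2"}))
print(resolve_metric("dense"))
```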

Note

Subclasses must implement the rerank() method.

__init__(topn=10, rerank_field=None, schema=None, metrics=None)[source]

Parameters:

topn (int)
rerank_field (str | None)
schema (CollectionSchema | None)
metrics (str | MetricType | dict[str, str | MetricType | None] | None)

property topn: int

Number of top documents to return after reranking.

Type: int

property rerank_field: str | None

Field name used as reranking input.

Type: str | None

property schema: CollectionSchema | None

The collection schema if provided.

Type: CollectionSchema | None

property metrics: dict[str, str | MetricType | None]

Per-source metric types.

Type: dict[str, str | MetricType | None]

abstractmethod rerank(query_results, query=None)[source]

Rerank documents from one or more vector queries.

Parameters:

query_results (dict[str, list[Doc]]) – Mapping from vector field name to list of retrieved documents (sorted by relevance).
query (str | None, optional) – The search query. Some rerankers may require this (e.g., CrossEncoder). Defaults to None.

Returns:

Reranked list of documents (length <= topn), with updated score fields.

Return type:

list[Doc]
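A minimal sketch of what implementing rerank() in a subclass might look like. To keep it runnable without zvec-db installed, `Doc` and `RerankFunction` below are simplified stand-ins written for this example (the real `Doc` may expose different fields, and schema and metrics are left out). `MaxScoreReranker` is a hypothetical reranker, not one shipped by the library, and it assumes scores are already "higher is better".

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for the library's Doc; only id and score are modeled."""
    id: str
    score: float


class RerankFunction(ABC):
    """Stand-in mirroring part of the documented constructor and the abstract method."""

    def __init__(self, topn: int = 10, rerank_field: Optional[str] = None):
        self._topn = topn
        self._rerank_field = rerank_field

    @property
    def topn(self) -> int:
        return self._topn

    @abstractmethod
    def rerank(
        self, query_results: Dict[str, List[Doc]], query: Optional[str] = None
    ) -> List[Doc]:
        ...


class MaxScoreReranker(RerankFunction):
    """Keeps each document's best score across sources (assumes higher is better)."""

    def rerank(self, query_results, query=None):
        best: Dict[str, Doc] = {}
        for docs in query_results.values():
            for doc in docs:
                current = best.get(doc.id)
                if current is None or doc.score > current.score:
                    # Copy so the input documents are not mutated.
                    best[doc.id] = Doc(doc.id, doc.score)
        ranked = sorted(best.values(), key=lambda d: d.score, reverse=True)
        return ranked[: self.topn]  # at most topn documents


results = {
    "dense": [Doc("a", 0.9), Doc("b", 0.4)],
    "sparse": [Doc("b", 0.7), Doc("c", 0.2)],
}
top = MaxScoreReranker(topn=2).rerank(results)
print([(d.id, d.score) for d in top])
```

With the library installed, the subclass would derive from zvec_db.rerankers.base.RerankFunction instead of the stand-in.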

class zvec_db.rerankers.base.FusionRerankerBase(topn=10, rerank_field=None, schema=None, metrics=None)[source]

Base class for fusion-based rerankers combining multiple sources.

This class provides shared functionality for rerankers that fuse scores from multiple retrieval sources, including metric conversion and normalization.

Conversion formulas (ensure higher=better):
- COSINE: (2 - score) / 2, which maps distance [0, 2] to similarity [0, 1]
- L2: -score, which inverts the order
- IP: no conversion, since it is already “higher=better” (also applies to BM25/non-vector scores)
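The conversion formulas can be written out as a small function. This is a sketch under stated assumptions rather than the library's implementation: `to_similarity` is a hypothetical name and metric types are modeled as plain strings.

```python
from typing import Optional


def to_similarity(score: float, metric: Optional[str]) -> float:
    """Convert a raw score so that higher always means more relevant."""
    if metric == "COSINE":
        # Cosine distance in [0, 2] becomes similarity in [0, 1].
        return (2.0 - score) / 2.0
    if metric == "L2":
        # Smaller distance is better, so negate to invert the order.
        return -score
    # IP, None, BM25 and other non-vector scores are already higher=better.
    return score


print(to_similarity(0.0, "COSINE"))  # identical vectors
print(to_similarity(2.0, "COSINE"))  # opposite vectors
print(to_similarity(3.0, "L2"))
print(to_similarity(1.5, "IP"))
```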

Normalization:
- COSINE: never normalized (conversion already produces [0, 1])
- Others: optional normalization (bayes, minmax, percentile, atan, etc.)
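As an illustration of the normalization rule, the sketch below applies a generic min-max rescaling to already-converted scores and leaves COSINE sources untouched. The function name `normalize_minmax`, the string metric names, and the handling of the all-equal case are assumptions made for this example; the library's own minmax option may differ in detail.

```python
from typing import List, Optional


def normalize_minmax(scores: List[float], metric: Optional[str]) -> List[float]:
    """Rescale converted scores to [0, 1], skipping COSINE sources."""
    if metric == "COSINE" or not scores:
        # COSINE conversion already yields values in [0, 1].
        return list(scores)
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: when all scores are equal, map them all to 1.0.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# L2 distances 3, 1, 2 after conversion (-score) become -3, -1, -2.
print(normalize_minmax([-3.0, -1.0, -2.0], "L2"))
print(normalize_minmax([0.9, 0.5], "COSINE"))
```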

Parameters:

topn (int)
rerank_field (str | None)
schema (CollectionSchema | None)
metrics (str | MetricType | dict[str, str | MetricType | None] | None)