zvec_db.rerankers.base

Base classes for reranking in zvec-db.

Classes

FusionRerankerBase([topn, rerank_field, ...])

Base class for fusion-based rerankers combining multiple sources.

RerankFunction([topn, rerank_field, schema, ...])

Abstract base class for reranking search results.

class zvec_db.rerankers.base.RerankFunction(topn=10, rerank_field=None, schema=None, metrics=None)

Abstract base class for reranking search results.

Rerankers refine the output of one or more vector queries by applying a secondary scoring strategy. They are used in the query() method of Collection via the reranker parameter.

Parameters:
  • topn (int, optional) – Number of top documents to return after reranking. Defaults to 10.

  • rerank_field (str | None, optional) – Field name used as input for reranking (e.g., document title or body). Defaults to None.

  • schema (CollectionSchema | None, optional) – Collection schema to automatically extract metrics from. If provided and no explicit metrics are given, metric types are inferred from the schema. Defaults to None.

  • metrics (str | MetricType | dict[str, str | MetricType | None] | None, optional) –

    Metric type(s) for converting distances to similarities. Can be:

    • A single MetricType (e.g., MetricType.COSINE) applied to all sources

    • A dict mapping source names to their metric type (use None or MetricType.IP for sources that don’t need conversion, e.g., BM25 scores)

    • If None and schema is provided, metrics are inferred from the schema (defaults to IP if not specified)

    • If None and no schema, defaults to IP (no conversion needed)

    Defaults to None.
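The precedence described for metrics can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: resolve_metric and schema_metrics are hypothetical names, plain strings stand in for MetricType members, and treating a missing dict entry as IP is an assumption.

```python
from typing import Mapping, Optional, Union

# Plain strings stand in for zvec-db's MetricType members in this sketch.
Metric = Optional[str]
MetricsArg = Union[str, Mapping[str, Metric], None]


def resolve_metric(
    metrics: MetricsArg,
    source: str,
    schema_metrics: Optional[Mapping[str, Metric]] = None,
) -> str:
    """Hypothetical helper: choose the metric for one source."""
    if isinstance(metrics, str):
        # A single metric applies to every source.
        return metrics
    if metrics is not None:
        # Dict form: None means "no conversion", treated here as IP.
        # Assumption: a source missing from the dict is also treated as IP.
        return metrics.get(source) or "IP"
    if schema_metrics is not None:
        # No explicit metrics: fall back to what the schema declares.
        return schema_metrics.get(source) or "IP"
    # No metrics and no schema: IP, so no conversion is needed.
    return "IP"
```

For example, resolve_metric({"dense": "L2", "bm25": None}, "bm25") yields "IP" under these assumptions, matching the note that BM25 scores need no conversion.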

Note

Subclasses must implement the rerank() method.

__init__(topn=10, rerank_field=None, schema=None, metrics=None)
Parameters:
  • topn (int)

  • rerank_field (str | None)

  • schema (CollectionSchema | None)

  • metrics (str | MetricType | dict[str, str | MetricType | None] | None)

property topn: int

Number of top documents to return after reranking.

Type:

int

property rerank_field: str | None

Field name used as reranking input.

Type:

str | None

property schema: CollectionSchema | None

The collection schema if provided.

Type:

CollectionSchema | None

property metrics: dict[str, str | MetricType | None]

Per-source metric types.

Type:

dict[str, str | MetricType | None]

abstractmethod rerank(query_results, query=None)

Rerank documents from one or more vector queries.

Parameters:
  • query_results (dict[str, list[Doc]]) – Mapping from vector field name to list of retrieved documents (sorted by relevance).

  • query (str | None, optional) – The search query. Some rerankers may require this (e.g., CrossEncoder). Defaults to None.

Returns:

Reranked list of documents (length <= topn), with updated score fields.

Return type:

list[Doc]
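The rerank() contract above can be illustrated with a small self-contained sketch. The Doc dataclass and MaxScoreReranker below are stand-ins invented for this example; a real reranker would subclass RerankFunction and operate on zvec-db's own Doc objects, whose attributes may differ. The sketch also assumes scores are already "higher is better".

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Doc:
    """Stand-in for zvec-db's Doc; only `id` and `score` are assumed."""
    id: str
    score: float


class MaxScoreReranker:
    """Illustrative reranker: keep each document's best score across sources."""

    def __init__(self, topn: int = 10) -> None:
        self.topn = topn

    def rerank(
        self,
        query_results: Dict[str, List[Doc]],
        query: Optional[str] = None,  # unused here; some rerankers need it
    ) -> List[Doc]:
        best: Dict[str, Doc] = {}
        for docs in query_results.values():
            for doc in docs:
                current = best.get(doc.id)
                if current is None or doc.score > current.score:
                    best[doc.id] = doc
        # Sort by score, highest first, and truncate to at most topn documents.
        ranked = sorted(best.values(), key=lambda d: d.score, reverse=True)
        return ranked[: self.topn]


results = {
    "dense": [Doc("a", 0.9), Doc("b", 0.4)],
    "sparse": [Doc("b", 0.7), Doc("c", 0.1)],
}
top = MaxScoreReranker(topn=2).rerank(results)
```

Here the input maps each source name to its own ranked list, and the output is a single merged list no longer than topn.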

class zvec_db.rerankers.base.FusionRerankerBase(topn=10, rerank_field=None, schema=None, metrics=None)

Base class for fusion-based rerankers combining multiple sources.

This class provides shared functionality for rerankers that fuse scores from multiple retrieval sources, including metric conversion and normalization.

Conversion formulas (ensure higher=better):

  • COSINE: (2 - score) / 2. Maps distance [0, 2] to similarity [0, 1].

  • L2: -score. Inverts the order.

  • IP: no conversion. Already “higher=better” (also applies to BM25/non-vector scores).
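The conversion formulas above can be written as a short function. This is a sketch only: to_similarity is a hypothetical name, and plain strings stand in for MetricType members.

```python
def to_similarity(score: float, metric: str) -> float:
    """Convert a raw score so that higher always means more relevant."""
    metric = metric.upper()
    if metric == "COSINE":
        # Cosine distance in [0, 2] becomes similarity in [0, 1].
        return (2.0 - score) / 2.0
    if metric == "L2":
        # Smaller distance is better, so negate to invert the order.
        return -score
    if metric == "IP":
        # Already "higher is better"; also used for BM25-style scores.
        return score
    raise ValueError(f"unsupported metric: {metric!r}")
```

A cosine distance of 0.5 therefore maps to a similarity of 0.75, while an L2 distance of 3.0 maps to -3.0.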

Normalization:

  • COSINE: NEVER normalized (conversion already produces [0, 1])

  • Others: optional normalization (bayes, minmax, percentile, atan, etc.)
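As one illustration of the normalization rule, a min-max variant might look like the sketch below. The function name is hypothetical, plain strings stand in for MetricType members, and mapping a constant score list to 1.0 is an assumption made here to avoid division by zero; the library's own minmax behavior may differ.

```python
from typing import List


def minmax_normalize(scores: List[float], metric: str) -> List[float]:
    """Rescale converted scores to [0, 1], leaving cosine scores untouched."""
    if metric.upper() == "COSINE" or not scores:
        # Cosine conversion already yields values in [0, 1].
        return list(scores)
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: when all scores are equal, map each to 1.0.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]
```

With converted L2 scores of [-4.0, -2.0, 0.0], this returns [0.0, 0.5, 1.0], so the closest document ends up with the highest normalized score.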

Parameters:
  • topn (int)

  • rerank_field (str | None)

  • schema (CollectionSchema | None)

  • metrics (str | MetricType | dict[str, str | MetricType | None] | None)