zvec_db.rerankers.utils.base_utils

Common utilities for rerankers in zvec-db.

This module provides shared functionality used across multiple reranker implementations, including score extraction, document text extraction, and other common operations.

Functions

extract_field_score(doc, field_name)

Extract score from a specific document field.

extract_score(doc)

Extract score from a document, handling various numeric types.

get_document_text(doc[, rerank_field])

Extract document text for scoring or embedding.

zvec_db.rerankers.utils.base_utils.extract_score(doc)

Extract score from a document, handling various numeric types.

Parameters:

doc (Doc) – Document with a score attribute.

Returns:

Score as a float, or 0.0 if score is None or invalid.

Return type:

float

Example

>>> doc = Doc(id="1", score=0.8)
>>> extract_score(doc)
0.8
>>> doc_no_score = Doc(id="2", score=None)
>>> extract_score(doc_no_score)
0.0
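
The behavior described above can be approximated with the sketch below. It is not the library's implementation: the `Doc` dataclass is a hypothetical stand-in that carries only the attributes used here, `extract_score_sketch` is an illustrative name, and treating anything that `float()` rejects as "invalid" is an assumption, since the documentation does not define the term.

```python
from dataclasses import dataclass
from decimal import Decimal
from fractions import Fraction
from typing import Any


@dataclass
class Doc:
    """Hypothetical stand-in for the library's Doc (assumption)."""
    id: str
    score: Any = None


def extract_score_sketch(doc: Doc) -> float:
    """Return doc.score as a float, or 0.0 if it is None or not convertible."""
    raw = getattr(doc, "score", None)
    if raw is None:
        return 0.0
    try:
        # float() accepts int, float, Decimal, Fraction and any object
        # that defines __float__ (for example NumPy scalars).
        return float(raw)
    except (TypeError, ValueError):
        # Assumption: values float() cannot convert count as "invalid".
        return 0.0


print(extract_score_sketch(Doc(id="1", score=0.8)))
print(extract_score_sketch(Doc(id="2", score=None)))
print(extract_score_sketch(Doc(id="3", score=Decimal("0.25"))))
print(extract_score_sketch(Doc(id="4", score=Fraction(1, 2))))
```

Note that under this assumption a numeric string such as "0.5" would also be converted, which the real function may or may not do.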

zvec_db.rerankers.utils.base_utils.extract_field_score(doc, field_name)

Extract score from a specific document field.

Parameters:
  • doc (Doc) – Document with fields attribute.

  • field_name (str) – Name of the field to extract score from.

Returns:

Field score as a float, or 0.0 if field is missing or non-numeric.

Return type:

float

Example

>>> doc = Doc(id="1", fields={"title_score": 0.9, "content_score": 0.7})
>>> extract_field_score(doc, "title_score")
0.9
>>> extract_field_score(doc, "missing_field")
0.0
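
A minimal sketch of the documented contract follows, assuming `fields` is a plain dictionary. The `Doc` dataclass is a hypothetical stand-in, `extract_field_score_sketch` is an illustrative name, and excluding booleans from "numeric" is a choice made for this sketch, not something the documentation states.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Doc:
    """Hypothetical stand-in for the library's Doc (assumption)."""
    id: str
    fields: Dict[str, Any] = field(default_factory=dict)


def extract_field_score_sketch(doc: Doc, field_name: str) -> float:
    """Return fields[field_name] as a float, or 0.0 if missing or non-numeric."""
    fields = getattr(doc, "fields", None) or {}
    raw = fields.get(field_name)
    # bool is a subclass of int; rejecting it is an assumption of this sketch.
    if isinstance(raw, bool) or not isinstance(raw, (int, float)):
        return 0.0
    return float(raw)


doc = Doc(id="1", fields={"title_score": 0.9, "content_score": 0.7, "label": "high"})
print(extract_field_score_sketch(doc, "title_score"))
print(extract_field_score_sketch(doc, "missing_field"))
print(extract_field_score_sketch(doc, "label"))
```

Returning 0.0 instead of raising lets a caller combine several per-field scores without guarding each lookup.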

zvec_db.rerankers.utils.base_utils.get_document_text(doc, rerank_field=None)

Extract document text for scoring or embedding.

This function attempts to extract text content from a document using the following strategy:

  1. If rerank_field is specified and the document has that field, use it.

  2. Otherwise, try common field names in order: "content", "text", "body", "passage".

  3. If none of those fields is present, concatenate the values of all fields.

  4. As a last resort, return the document ID as a string.

Parameters:
  • doc (Doc) – Document to extract text from.

  • rerank_field (Optional[str]) – Name of the field to use. If None, or if the document does not have that field, the fallback strategy (steps 2 to 4) applies. Defaults to None.

Returns:

Extracted document text.

Return type:

str

Example

>>> doc = Doc(id="1", fields={"content": "Hello world", "title": "Test"})
>>> get_document_text(doc)
'Hello world'
>>> get_document_text(doc, rerank_field="title")
'Test'
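
The four-step fallback strategy can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the `Doc` dataclass is a hypothetical stand-in, `get_document_text_sketch` and `COMMON_TEXT_FIELDS` are illustrative names, and joining field values with a single space in step 3 is an assumption, since the documentation does not specify a separator.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Field names tried in step 2, in the documented order.
COMMON_TEXT_FIELDS = ("content", "text", "body", "passage")


@dataclass
class Doc:
    """Hypothetical stand-in for the library's Doc (assumption)."""
    id: str
    fields: Dict[str, Any] = field(default_factory=dict)


def get_document_text_sketch(doc: Doc, rerank_field: Optional[str] = None) -> str:
    fields = doc.fields or {}
    # Step 1: the explicitly requested field, if the document has it.
    if rerank_field is not None and rerank_field in fields:
        return str(fields[rerank_field])
    # Step 2: the first common field name that is present.
    for name in COMMON_TEXT_FIELDS:
        if name in fields:
            return str(fields[name])
    # Step 3: concatenate all field values (space separator is an assumption).
    if fields:
        return " ".join(str(value) for value in fields.values())
    # Step 4: last resort, the document ID as a string.
    return str(doc.id)


doc = Doc(id="1", fields={"content": "Hello world", "title": "Test"})
print(get_document_text_sketch(doc))
print(get_document_text_sketch(doc, rerank_field="title"))
print(get_document_text_sketch(Doc(id="2", fields={"title": "A", "summary": "B"})))
print(get_document_text_sketch(Doc(id="3")))
```

In this sketch a `rerank_field` that the document lacks is not an error: the lookup falls through to step 2, matching the wording of step 1.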