zvec_db.rerankers.utils.base_utils

Common utilities for rerankers in zvec-db.

This module provides shared functionality used across multiple reranker implementations, including score extraction, document text extraction, and other common operations.

Functions

`extract_field_score`(doc, field_name)	Extract score from a specific document field.
`extract_score`(doc)	Extract score from a document, handling various numeric types.
`get_document_text`(doc[, rerank_field])	Extract document text for scoring or embedding.

zvec_db.rerankers.utils.base_utils.extract_score(doc)

Extract score from a document, handling various numeric types.

Parameters:

doc (Doc) – Document with a score attribute.

Returns:

Score as a float, or 0.0 if score is None or invalid.

Return type:

float

Example

>>> doc = Doc(id="1", score=0.8)
>>> extract_score(doc)
0.8
>>> doc_no_score = Doc(id="2", score=None)
>>> extract_score(doc_no_score)
0.0
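
The behaviour described above can be approximated as follows. This is only a sketch under stated assumptions, not the library's implementation: the `Doc` dataclass is a hypothetical stand-in for the real zvec-db type, `extract_score_sketch` is an illustrative name, and "invalid" is assumed to mean any value that `float()` cannot convert.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Any


@dataclass
class Doc:
    """Hypothetical stand-in for the zvec-db Doc type (assumption)."""
    id: str
    score: Any = None


def extract_score_sketch(doc: Any) -> float:
    """Return doc.score as a float, or 0.0 when it is None or not convertible."""
    raw = getattr(doc, "score", None)
    if raw is None:
        return 0.0
    try:
        # float() accepts int, float, Decimal, and NumPy scalar types.
        return float(raw)
    except (TypeError, ValueError):
        # Assumption: anything float() rejects counts as "invalid".
        return 0.0


int_score = extract_score_sketch(Doc(id="3", score=3))
decimal_score = extract_score_sketch(Doc(id="4", score=Decimal("0.25")))
bad_score = extract_score_sketch(Doc(id="5", score="not a number"))
```

Normalising to a plain float at this boundary lets downstream reranking code compare and combine scores without checking types again.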

zvec_db.rerankers.utils.base_utils.extract_field_score(doc, field_name)

Extract score from a specific document field.

Parameters:

doc (Doc) – Document with fields attribute.
field_name (str) – Name of the field to extract score from.

Returns:

Field score as a float, or 0.0 if field is missing or non-numeric.

Return type:

float

Example

>>> doc = Doc(id="1", fields={"title_score": 0.9, "content_score": 0.7})
>>> extract_field_score(doc, "title_score")
0.9
>>> extract_field_score(doc, "missing_field")
0.0
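
Because missing or non-numeric fields fall back to 0.0, the result can be used directly as a sort key. The sketch below illustrates this with a hypothetical `Doc` stand-in and an illustrative `extract_field_score_sketch` helper that mirrors the documented behaviour. Neither is the library's own code, and treating booleans as non-numeric is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Doc:
    """Hypothetical stand-in for the zvec-db Doc type (assumption)."""
    id: str
    fields: Dict[str, Any] = field(default_factory=dict)


def extract_field_score_sketch(doc: Any, field_name: str) -> float:
    """Return fields[field_name] as a float, or 0.0 if missing or non-numeric."""
    fields = getattr(doc, "fields", None) or {}
    value = fields.get(field_name)
    # Assumption: bool is rejected even though it is a subclass of int.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return 0.0
    return float(value)


docs = [
    Doc(id="1", fields={"title_score": 0.9}),
    Doc(id="2", fields={"title_score": "n/a"}),  # non-numeric value
    Doc(id="3", fields={}),                      # field missing
    Doc(id="4", fields={"title_score": 2}),      # int is converted to float
]

# Highest field score first; sorted() is stable, so ties keep input order.
ranked = sorted(
    docs,
    key=lambda d: extract_field_score_sketch(d, "title_score"),
    reverse=True,
)
ranked_ids = [d.id for d in ranked]
```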

zvec_db.rerankers.utils.base_utils.get_document_text(doc, rerank_field=None)

Extract document text for scoring or embedding.

This function attempts to extract text content from a document using the following strategy:

1. If rerank_field is specified and the document has that field, use it.
2. Otherwise, try common field names: "content", "text", "body", "passage".
3. If no field matches, concatenate all fields.
4. As a last resort, return the document ID as a string.

Parameters:

doc (Doc) – Document to extract text from.
rerank_field (Optional[str]) – Specific field name to use. If None, uses the fallback strategy. Defaults to None.

Returns:

Extracted document text.

Return type:

str

Example

>>> doc = Doc(id="1", fields={"content": "Hello world", "title": "Test"})
>>> get_document_text(doc)
'Hello world'
>>> get_document_text(doc, rerank_field="title")
'Test'
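
The fallback strategy described above might look roughly like the following. This is a sketch, not the library's source: the `Doc` dataclass and the name `get_document_text_sketch` are hypothetical, and the single-space separator and `str()` conversion used when concatenating fields are assumptions that the documentation does not specify.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Common field names tried in order, as listed in the strategy above.
COMMON_TEXT_FIELDS = ("content", "text", "body", "passage")


@dataclass
class Doc:
    """Hypothetical stand-in for the zvec-db Doc type (assumption)."""
    id: str
    fields: Dict[str, Any] = field(default_factory=dict)


def get_document_text_sketch(doc: Any, rerank_field: Optional[str] = None) -> str:
    """Pick the text of a document using the documented fallback order."""
    fields = getattr(doc, "fields", None) or {}

    # Use the explicitly requested field when the document has it.
    if rerank_field is not None and rerank_field in fields:
        return str(fields[rerank_field])

    # Otherwise try the common field names in order.
    for name in COMMON_TEXT_FIELDS:
        if name in fields:
            return str(fields[name])

    # Otherwise concatenate every field value (separator is an assumption).
    if fields:
        return " ".join(str(value) for value in fields.values())

    # Last resort: the document ID as a string.
    return str(doc.id)


no_common = get_document_text_sketch(Doc(id="2", fields={"title": "Test", "summary": "Short"}))
id_only = get_document_text_sketch(Doc(id="3"))
```

In this sketch a `rerank_field` that the document lacks does not raise; the lookup simply continues with the common field names.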