zvec_db.rerankers.utils.base_utils
Common utilities for rerankers in zvec-db.
This module provides shared functionality used across multiple reranker implementations, including score extraction, document text extraction, and other common operations.
Functions
- extract_field_score(doc, field_name): Extract score from a specific document field.
- extract_score(doc): Extract score from a document, handling various numeric types.
- get_document_text(doc, rerank_field=None): Extract document text for scoring or embedding.
- zvec_db.rerankers.utils.base_utils.extract_score(doc)
Extract score from a document, handling various numeric types.
- Parameters:
doc (Doc) – Document with a score attribute.
- Returns:
Score as a float, or 0.0 if score is None or invalid.
- Return type:
float
Example
>>> doc = Doc(id="1", score=0.8)
>>> extract_score(doc)
0.8
>>> doc_no_score = Doc(id="2", score=None)
>>> extract_score(doc_no_score)
0.0
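The documented behavior (a float result, with 0.0 for a missing or invalid score) can be approximated as below. This is only a sketch, not the library's implementation: the `Doc` dataclass is a hypothetical stand-in for the real type, `extract_score_sketch` is an illustrative name, and treating NaN or infinite values as invalid is an assumption.

```python
import math
from dataclasses import dataclass
from typing import Any


@dataclass
class Doc:
    """Hypothetical stand-in for the library's Doc type."""
    id: str
    score: Any = None


def extract_score_sketch(doc: Doc) -> float:
    """Return the document score as a float, or 0.0 if unusable."""
    score = getattr(doc, "score", None)
    if score is None:
        return 0.0
    try:
        # float() accepts ints, floats, Decimal, and NumPy scalars alike.
        value = float(score)
    except (TypeError, ValueError):
        return 0.0
    # Assumption: NaN and infinities count as "invalid" scores.
    return value if math.isfinite(value) else 0.0
```

Coercing through `float()` keeps the helper agnostic about which numeric type a backend happens to return.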
- zvec_db.rerankers.utils.base_utils.extract_field_score(doc, field_name)
Extract score from a specific document field.
- Parameters:
doc (Doc) – Document with fields attribute.
field_name (str) – Name of the field to extract score from.
- Returns:
Field score as a float, or 0.0 if field is missing or non-numeric.
- Return type:
float
Example
>>> doc = Doc(id="1", fields={"title_score": 0.9, "content_score": 0.7})
>>> extract_field_score(doc, "title_score")
0.9
>>> extract_field_score(doc, "missing_field")
0.0
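A minimal sketch of the field lookup described above, under stated assumptions: `Doc` is again a hypothetical stand-in, `extract_field_score_sketch` is an illustrative name, and "numeric" is taken to mean `int` or `float` (excluding `bool`), which the source does not specify.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Doc:
    """Hypothetical stand-in for the library's Doc type."""
    id: str
    fields: Dict[str, Any] = field(default_factory=dict)


def extract_field_score_sketch(doc: Doc, field_name: str) -> float:
    """Return a numeric field as a float, or 0.0 if missing or non-numeric."""
    fields = getattr(doc, "fields", None) or {}
    value = fields.get(field_name)
    # Assumption: bools are not scores, even though bool subclasses int.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return 0.0
    return float(value)
```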
- zvec_db.rerankers.utils.base_utils.get_document_text(doc, rerank_field=None)
Extract document text for scoring or embedding.
This function attempts to extract text content from a document using the following strategy:
1. If rerank_field is specified and the document has that field, use it.
2. Otherwise, try common field names: "content", "text", "body", "passage".
3. If no field matches, concatenate all fields.
4. As a last resort, return the document ID as a string.
- Parameters:
doc (Doc) – Document to extract text from.
rerank_field (Optional[str]) – Specific field name to use. If None, uses the fallback strategy. Defaults to None.
- Returns:
Extracted document text.
- Return type:
str
Example
>>> doc = Doc(id="1", fields={"content": "Hello world", "title": "Test"})
>>> get_document_text(doc)
'Hello world'
>>> get_document_text(doc, rerank_field="title")
'Test'
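The fallback strategy can be sketched as a chain of early returns. This is an illustration rather than the library's code: `Doc` is a hypothetical stand-in, `get_document_text_sketch` and `COMMON_TEXT_FIELDS` are illustrative names, and joining fields with a single space and coercing values with `str()` are assumptions the source does not confirm.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Field names tried in order when no rerank_field matches.
COMMON_TEXT_FIELDS = ("content", "text", "body", "passage")


@dataclass
class Doc:
    """Hypothetical stand-in for the library's Doc type."""
    id: str
    fields: Dict[str, Any] = field(default_factory=dict)


def get_document_text_sketch(doc: Doc, rerank_field: Optional[str] = None) -> str:
    fields = doc.fields or {}
    # 1. An explicitly requested field wins if the document has it.
    if rerank_field is not None and rerank_field in fields:
        return str(fields[rerank_field])
    # 2. Otherwise try the common text field names in order.
    for name in COMMON_TEXT_FIELDS:
        if name in fields:
            return str(fields[name])
    # 3. No match: concatenate every field value (separator is an assumption).
    if fields:
        return " ".join(str(value) for value in fields.values())
    # 4. Last resort: the document ID.
    return str(doc.id)
```

Note that in this sketch a `rerank_field` the document lacks does not raise; it simply falls through to the common field names.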