zvec-db Documentation
Welcome to the zvec-db documentation!
zvec-db is a utility suite for sparse vectorization and document reranking, designed to work with zvec.
Quick Start
Sparse Embedding
from zvec_db.embedders import BM25Embedder
# Training (documents is assumed here to be a list of text strings, i.e. your corpus)
embedder = BM25Embedder(max_features=4096)
embedder.fit(documents)
# Embedding
vector = embedder.embed("search query")
print(vector) # {42: 0.523, 108: 0.312, ...}
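The printed output above suggests that a sparse vector is a plain mapping from feature index to weight. Assuming that format, a query vector can be scored against a document vector with a dot product over their shared indices. The sketch below is illustrative only: `sparse_dot` is a hypothetical helper and not part of zvec-db, and the example weights are made up.

```python
def sparse_dot(query_vec, doc_vec):
    """Dot product of two sparse vectors stored as {index: weight} dicts."""
    # Iterate over the smaller vector so the cost depends on the shorter one.
    if len(query_vec) > len(doc_vec):
        query_vec, doc_vec = doc_vec, query_vec
    # Indices missing from the other vector contribute zero.
    return sum(weight * doc_vec.get(index, 0.0)
               for index, weight in query_vec.items())


# Hypothetical vectors in the {index: weight} shape shown above.
query_vec = {42: 0.5, 108: 0.25}
doc_vec = {42: 2.0, 7: 1.0, 108: 4.0}

score = sparse_dot(query_vec, doc_vec)
print(score)
```

Only indices 42 and 108 appear in both vectors, so index 7 does not affect the score.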
Reranking
from zvec_db.rerankers import RrfReranker
from zvec.model.doc import Doc
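# bm25_docs and dense_docs are assumed to be ranked lists of Doc objects,
# one list per retriever.
reranker = RrfReranker(topn=10)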
results = reranker.rerank({
"bm25": bm25_docs,
"dense": dense_docs
})
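For intuition, reciprocal rank fusion (RRF) in its standard formulation gives each document a score of 1 / (k + rank) in every list where it appears and sums those scores across lists. The sketch below works on plain document ids instead of Doc objects. The `rrf_fuse` function, the constant `k=60` (a commonly used default), and the tie-breaking rule are assumptions for illustration; this is not necessarily how `RrfReranker` is implemented.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, topn=10):
    """Fuse ranked lists of document ids with reciprocal rank fusion.

    rankings maps a retriever name to a list of ids, best match first.
    """
    scores = defaultdict(float)
    for ranked_ids in rankings.values():
        # Ranks start at 1, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:topn]


# Hypothetical result lists from two retrievers.
fused = rrf_fuse({
    "bm25": ["a", "b", "c"],
    "dense": ["b", "c", "d"],
})
print([doc_id for doc_id, _ in fused])  # ['b', 'c', 'a', 'd']
```

Documents that appear in both lists ("b" and "c") outrank "a", even though "a" is the top BM25 hit, because RRF rewards agreement between retrievers.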
Features
6 Sparse Embedders: Count, BM25, BM25L, BM25+, DisMax, TF-IDF
3 Rerankers: RRF, Weighted, MultiField
Normalization: Standard and Bayesian
zvec-compatible: Sparse vector formats compatible with zvec
Tests: 100+ tests with ~95% coverage
Note
For more examples and guides, see the Installation, Sparse and Dense Embedding, and Reranking sections.