Fused Triton kernels for late-interaction (MaxSim) scoring
Updated Jun 10, 2026 - Python
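For readers unfamiliar with the term, late-interaction (MaxSim) scoring compares every query token embedding with every document token embedding, keeps the best match per query token, and sums those maxima. The following is a minimal pure-Python sketch of that scoring rule only. It is not the fused Triton kernel from the repository above, and the names `dot` and `maxsim_score` are hypothetical, chosen for illustration.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the maximum similarity to any document token.

    Hypothetical helper: assumes embeddings are already normalized if
    cosine similarity is intended.
    """
    if not doc_tokens:
        raise ValueError("doc_tokens must contain at least one embedding")
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy 2-dimensional embeddings (illustrative values, not model output).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc_b = [[0.5, 0.5]]

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
print(score_a, score_b)
```

A fused kernel would typically compute the same quantity without materializing the full query-by-document similarity matrix in memory, which is the part this sketch does not attempt to show.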
Convert Hugging Face ColBERT retrieval models (ModernBERT & BERT) to pg_colbert GGUF, and export standard llama.cpp embedding models.