Releases: lightonai/pylate
1.3.3
PyLate 1.3.3 – The compatible-with-every-model version - Release Notes 🥳🚀
- Upgraded the Sentence Transformers dependency to the new major version (>= 5.x) @NohTow
- Made PyLate compatible with more models, including LLMs (e.g., Gemma) and models with multiple dense layers @NohTow
- Added the option to use residual connections in dense layers @NohTow @bclavie
- Accelerated the CI and fixed a cache issue @raphaelsty
1.3.2
This 1.3.2 release fixes reproducibility issues with the PLAID and Fast-Plaid indexes (@NohTow, @raphaelsty, @Samoed), adds support for Python 3.13 alongside the Fast-Plaid index (@raphaelsty), improves typing and logging (@Samoed), and brings a multi-GPU training fix (@NohTow): PyLate was not properly scaling the cached contrastive loss. 🚀
1.3.0
PyLate 1.3.0 – The ModernColBERT and FastPlaid version - Release Notes 🥳🚀
New Features
- FastPlaid is the new default backend for the PLAID index. It provides a significant speed-up with the same accuracy. The previous Stanford index can still be used by setting `use_fast=False` when creating the index.
- The PLAID index now supports filtering. You can provide a subset of document IDs to score, which will significantly accelerate search performance.
- Added optimized default parameters for ModernColBERT, ensuring good results right out of the box.
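The filtering feature can be pictured as restricting scoring to a caller-supplied subset of document IDs, so the rest of the index is skipped entirely. A toy plain-Python sketch; the `filtered_search` helper and the dictionary of precomputed scores are hypothetical stand-ins, not PyLate's actual PLAID API:

```python
def filtered_search(scores_by_doc_id, subset_ids, k=2):
    """Score only the documents whose IDs appear in subset_ids,
    then return the top-k IDs by score (toy sketch, not PyLate's API)."""
    allowed = set(subset_ids)
    candidates = {d: s for d, s in scores_by_doc_id.items() if d in allowed}
    return sorted(candidates, key=candidates.get, reverse=True)[:k]

# A toy "index" of precomputed scores for one query:
scores = {"doc1": 0.9, "doc2": 0.4, "doc3": 0.7, "doc4": 0.2}

# Restricting scoring to a subset skips doc1 entirely:
top = filtered_search(scores, subset_ids=["doc2", "doc3", "doc4"])
# top == ["doc3", "doc2"]
```

The speed-up comes from never touching candidates outside the subset, which matters when the subset is much smaller than the full index.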
Improvements
- Expanded the supported version range for the `transformers` library to increase compatibility.
Breaking Change
- The new FastPlaid backend is not compatible with indexes created using the previous Stanford backend. To load an existing index created with the Stanford backend, add `use_fast=False` to your index initialization.
1.2.0
PyLate 1.2.0 – The PLAID version - Release Notes 🥳🚀
New
• PLAID index — a fresh, fast index option.
Improvements
• Fixed Voyager index loading on Windows.
• CachedContrastive loss now works on local queries and gathers document-side gradients only.
• Added word-level scaling during gradient gathering.
• Contrastive losses now take a temperature hyper-parameter.
• Modernized setup & CI (updated GitHub Actions, etc.).
Breaking change:
All index files have moved to a common indexes/ sub-folder when using PyLate indexes.
Update any hard-coded paths to avoid loading errors.
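The temperature hyper-parameter added to the contrastive losses divides the similarity scores before the softmax: lower values sharpen the distribution over in-batch candidates. A minimal, self-contained sketch of an InfoNCE-style loss, illustrative only and not PyLate's implementation:

```python
import math

def contrastive_loss(scores, positive_idx, temperature=1.0):
    """Negative log-softmax probability of the positive candidate,
    with scores divided by the temperature first (toy sketch)."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -(scaled[positive_idx] - log_sum_exp)

# One query scored against its positive (index 0) and two negatives:
loss_soft = contrastive_loss([0.9, 0.7, 0.2], positive_idx=0, temperature=1.0)
loss_sharp = contrastive_loss([0.9, 0.7, 0.2], positive_idx=0, temperature=0.05)
# loss_sharp < loss_soft: a low temperature rewards the model more
# when the positive already outscores the negatives.
```

Tuning this value trades off how aggressively the loss separates the positive from near-miss negatives.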
1.1.7
PyLate 1.1.7 - The Big Batches Update - is out! 🚀
Contrastive learning relies heavily on big batches to provide the best possible negative samples and the best learning signal, so we introduce two new features to reach those sweet batch sizes:
- Addition of the CachedContrastive loss, which implements GradCache to scale the batch size without requiring more memory (it can be seen as gradient accumulation, but for contrastive learning).
- Addition of a cross-GPU gathering option for the (Cached)Contrastive losses, to leverage the representations computed by the other GPUs in multi-GPU settings and increase the effective per-GPU batch size.
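Cross-GPU gathering can be pictured as pooling every device's document representations so that each query is scored against all of them as in-batch negatives. A toy sketch with plain lists standing in for per-device tensors; real training would use `torch.distributed` collectives, and the names here are hypothetical:

```python
def gather_negatives(per_device_docs):
    """Concatenate the document representations held by each device,
    mimicking an all-gather (toy sketch, not PyLate's implementation)."""
    return [doc for device_docs in per_device_docs for doc in device_docs]

# Two "GPUs", each holding 3 document representations:
local = [["d0", "d1", "d2"], ["d3", "d4", "d5"]]
all_docs = gather_negatives(local)
# Each query now scores against 6 candidates instead of 3,
# doubling the effective per-GPU pool of negatives.
```

Combined with the cached loss, this lets the effective batch grow with the number of devices without increasing per-device memory for the forward pass.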
1.1.6
PyLate 1.1.6 is available 🚀🥁🔥
- Addition of the NanoBEIREvaluator, giving quick signal about learning during training.
- Bump of the transformers/Sentence Transformers versions, allowing the use of ModernBERT in PyLate and fixing an issue when loading models after training them with `trust_remote_code=True`.
- Reading of Stanford-NLP model configurations (markers, attending to expansion tokens, ...), allowing models such as Jina-ColBERT to be loaded with good default parameters.
- Support for Python 3.9.
- Fix for the 1.1.5 error when loading Stanford model metadata.
1.1.4
This release aims to extend compatibility with existing models on the Hugging Face Hub and to load them properly.
1.1.3
PyLate Update
This release introduces several new features and improvements:
1. Native Stanford-NLP Model Support
PyLate now supports loading Stanford-NLP models directly, without requiring manual weight conversion. This includes models like Jina-ColBERTv2 and local models. Use the model name when creating a PyLate model.
2. FastAPI Integration
PyLate now allows serving embeddings via a FastAPI server. The server supports dynamic batch processing to handle multiple requests efficiently. See the documentation for details.
3. DictDataset Added
DictDataset has been introduced for handling datasets more effectively during training and inference.
4. Model Card Generation
Trained models now include a generated Model Card containing metadata about the model and training setup.
Fixes and Enhancements
- Fixed an issue where dataset processing during training could become unresponsive.
- Improved performance and reliability for training and inference.
1.0.0
Release of PyLate 1.0.0.
- ColBERT training: contrastive, knowledge distillation.
- ColBERT retrieval, ranking.
- Documentation and Readme.
- Tests.
- Model loading.
- Various features.
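The ColBERT retrieval and ranking listed above rest on late-interaction MaxSim scoring: each query token embedding is matched to its best document token embedding, and the per-token maxima are summed. A minimal plain-Python sketch, illustrative rather than PyLate's actual API:

```python
def maxsim_score(query_embs, doc_embs):
    """Sum over query tokens of the maximum dot product with any
    document token (ColBERT-style late interaction, toy sketch)."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_embs) for q in query_embs)

query = [[1.0, 0.0], [0.0, 1.0]]   # two query token embeddings
doc_a = [[1.0, 0.0], [0.5, 0.5]]   # strong match for the first query token
doc_b = [[0.0, 1.0], [0.0, 0.9]]   # strong match for the second query token

score_a = maxsim_score(query, doc_a)  # 1.0 + 0.5 = 1.5
score_b = maxsim_score(query, doc_b)  # 0.0 + 1.0 = 1.0
```

Because every query token contributes its own best match, doc_a outranks doc_b here even though doc_b has the single strongest token pair tie.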