1. Matryoshka loss function: you can now train & perform inference on 🪆 Matryoshka Embedding models. See also our blog post: https://huggingface.co/blog/matryoshka
2. CoSENTLoss & AnglELoss: State-of-the-art loss functions. They work as drop-in replacements for CosineSimilarityLoss and outperform it on nearly all benchmarks! See also the docs: https://sbert.net/docs/package_reference/losses.html#cosentloss
3. Prompt templates: Many popular models, such as intfloat/multilingual-e5-large and BAAI/bge-large-en-v1.5, prefix their texts with prompts, so this release adds configuration options to include prompts automatically using
model.encode(..., prompt_name="query")
which will include a prompt with the name "query". More info in the docs: https://sbert.net/examples/applications/computing-embeddings/README.html#prompt-templates
4. Instructor support: Support for the INSTRUCTOR line of models, such as hkunlp/instructor-large. Learn how to use them here: https://sbert.net/docs/pretrained_models.html#instructor-models
5. Removed NLTK & sentencepiece dependencies: Should allow for a smaller installation & a slightly faster import!
6. Updated documentation: a new Loss Overview section: https://sbert.net/docs/training/loss_overview.html and more detailed loss function documentation: https://sbert.net/docs/package_reference/losses.html
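
To make the prompt templates in item 3 a bit more concrete, here is a minimal plain-Python sketch of the idea: a prompt name is looked up in a mapping and the matching prefix is prepended to each text before it gets embedded. The `PROMPTS` mapping and the `apply_prompt` helper are hypothetical names made up for illustration, not the library's internals, and the "query: " / "passage: " prefixes are the ones used by the E5 family of models.

```python
# Illustrative sketch only: PROMPTS and apply_prompt are hypothetical names,
# not part of sentence-transformers. They mimic what a named prompt does.

# Mapping from prompt name to the text prefix (E5-style prefixes).
PROMPTS = {
    "query": "query: ",
    "passage": "passage: ",
}


def apply_prompt(texts, prompt_name=None, prompts=PROMPTS):
    """Return the texts with the named prompt prepended to each one.

    With prompt_name=None the texts are returned unchanged.
    """
    if prompt_name is None:
        return list(texts)
    if prompt_name not in prompts:
        raise ValueError(f"Unknown prompt name: {prompt_name!r}")
    prefix = prompts[prompt_name]
    return [prefix + text for text in texts]


if __name__ == "__main__":
    # These prefixed strings are what the model would actually see.
    print(apply_prompt(["what is a matryoshka embedding?"], prompt_name="query"))
    print(apply_prompt(["Matryoshka embeddings can be truncated."], prompt_name="passage"))
```

In the library itself, the mapping would typically be passed as `prompts={...}` when constructing the `SentenceTransformer`, after which `model.encode(texts, prompt_name="query")` handles the prefixing; check the prompt templates docs linked in item 3 for the exact options.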
And much more! See the full release notes here: https://github.com/UKPLab/sentence-transformers/releases/tag/v2.4.0
Some more very exciting updates are still on the horizon!