Tom Aarsen
committed on
Commit · 32dd40a
1 Parent(s): 7a9d516
Link to the blogpost
README.md CHANGED
@@ -8464,6 +8464,8 @@ model-index:
 
 This is a [sentence-transformers](https://www.SBERT.net) model trained on the [gooaq](https://huggingface.co/datasets/sentence-transformers/gooaq), [msmarco](https://huggingface.co/datasets/sentence-transformers/msmarco-co-condenser-margin-mse-sym-mnrl-mean-v1), [squad](https://huggingface.co/datasets/sentence-transformers/squad), [s2orc](https://huggingface.co/datasets/sentence-transformers/s2orc), [allnli](https://huggingface.co/datasets/sentence-transformers/all-nli), [paq](https://huggingface.co/datasets/sentence-transformers/paq), [trivia_qa](https://huggingface.co/datasets/sentence-transformers/trivia-qa), [msmarco_10m](https://huggingface.co/datasets/bclavie/msmarco-10m-triplets), [swim_ir](https://huggingface.co/datasets/nthakur/swim-ir-monolingual), [pubmedqa](https://huggingface.co/datasets/sentence-transformers/pubmedqa), [miracl](https://huggingface.co/datasets/sentence-transformers/miracl), [mldr](https://huggingface.co/datasets/sentence-transformers/mldr) and [mr_tydi](https://huggingface.co/datasets/sentence-transformers/mr-tydi) datasets. It maps sentences & paragraphs to a 1024-dimensional dense vector space and is designed to be used for semantic search.
 
+Read our [Static Embeddings blogpost](https://huggingface.co/blog/static-embeddings) to learn more about this model and how it was trained.
+
 * **0 Active Parameters:** This model does not use any active parameters, instead consisting exclusively of averaging pre-computed token embeddings.
 * **100x to 400x faster:** On CPU, this model is 100x to 400x faster than common options like [all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). On GPU, it's 10x to 25x faster.
 * **Matryoshka:** This model was trained with a [Matryoshka loss](https://huggingface.co/blog/matryoshka), allowing you to truncate the embeddings for faster retrieval at minimal performance costs.
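The README bullets in this diff describe two ideas: an embedding is just the average of pre-computed token vectors, and a Matryoshka-trained embedding can be truncated to its leading dimensions. The sketch below illustrates both in plain Python. It is not the model's implementation: the vocabulary, the random 8-dimensional table, the whitespace tokenizer, and the names `TABLE`, `embed`, and `cosine` are all hypothetical stand-ins for the real tokenizer and the trained 1024-dimensional table.

```python
import math
import random

# Hypothetical toy setup: a tiny vocabulary and a random embedding table.
# The real model uses its own tokenizer and a trained 1024-dimensional table.
DIM = 8
VOCAB = ["how", "fast", "is", "static", "embedding", "search"]
rng = random.Random(0)
TABLE = {tok: [rng.gauss(0.0, 1.0) for _ in range(DIM)] for tok in VOCAB}


def embed(text, dim=DIM):
    """Average the pre-computed vectors of known tokens, then keep the first `dim` values."""
    vectors = [TABLE[tok] for tok in text.lower().split() if tok in TABLE]
    if not vectors:
        # No known tokens: fall back to a zero vector of the requested size.
        return [0.0] * dim
    mean = [sum(column) / len(vectors) for column in zip(*vectors)]
    # Matryoshka-style truncation: keep only the leading dimensions.
    return mean[:dim]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


full = embed("static embedding search")
short = embed("static embedding search", dim=4)
```

Here `short` is simply the first four values of `full`: no forward pass is involved, only a table lookup and a mean. That lookup-and-average structure is what the "0 Active Parameters" bullet refers to. With a random table, truncation preserves no particular quality. In the actual model, it is the Matryoshka loss that makes the leading dimensions useful on their own.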