---
base_model: microsoft/deberta-v3-small
datasets:
- tals/vitaminc
language:
- en
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
- cosine_accuracy
- cosine_accuracy_threshold
- cosine_f1
- cosine_f1_threshold
- cosine_precision
- cosine_recall
- cosine_ap
- dot_accuracy
- dot_accuracy_threshold
- dot_f1
- dot_f1_threshold
- dot_precision
- dot_recall
- dot_ap
- manhattan_accuracy
- manhattan_accuracy_threshold
- manhattan_f1
- manhattan_f1_threshold
- manhattan_precision
- manhattan_recall
- manhattan_ap
- euclidean_accuracy
- euclidean_accuracy_threshold
- euclidean_f1
- euclidean_f1_threshold
- euclidean_precision
- euclidean_recall
- euclidean_ap
- max_accuracy
- max_accuracy_threshold
- max_f1
- max_f1_threshold
- max_precision
- max_recall
- max_ap
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:225247
- loss:CachedGISTEmbedLoss
widget:
- source_sentence: what is exfo toolbox
  sentences:
  - Eye dilation from eye drops used for examination of the eye usually lasts from
    4 to 24 hours, depending upon the strength of the drop and upon the individual
    patient.
  - Garden Grove is a city in northern Orange County in the U.S. state of California,
    34 miles (55 km) south of Los Angeles. The population was 170,883 at the 2010
    United States Census. State Route 22, also known as the Garden Grove Freeway,
    passes through the city in an east-west direction.
  - EXFO ToolBox Office is a product that offers you a collection of viewers and analyzers.
    It enables you to manage and analyze results acquired from fiber optic test modules
    and instruments.
- source_sentence: More than 273 people have died from the 2019-20 coronavirus outside
    mainland China .
  sentences:
  - 'More than 3,700 people have died : around 3,100 in mainland China and around
    550 in all other countries combined .'
  - 'More than 3,200 people have died : almost 3,000 in mainland China and around
    275 in other countries .'
  - more than 4,900 deaths have been attributed to COVID-19 .
- source_sentence: Ultrasound, a diagnostic technology, uses high-frequency vibrations
    transmitted into any tissue in contact with the transducer.
  sentences:
  - What diagnostic technology uses high-frequency vibrations transmitted into any
    tissue in contact with the transducer?
  - The abnormal cells cannot carry oxygen properly and can get stuck where?
  - What type of organism is a bacteria?
- source_sentence: When you add moles of gas to a baloon by blowing it up, the volume
    increases.
  sentences:
  - What shape is the lens of the eye?
  - What happens to the volume of a balloon when you add moles of gas to it by blowing
    up?
  - Most turtle bodies are covered by a special bony or cartilaginous shell developed
    from their what?
- source_sentence: What was the name of eleven rulers of the 19th and 20th Egyptian
    dynasties?
  sentences:
  - 'Airlines Yugoslavia 1968 - 1968 Renamed ^ Comments : Aviogenex was formed on
    21May1968 as Genex Airlines. Restarted under current name on 30Apr1969 & liquidated
    in Feb2015 ^ Genealogy : Genex Airlines >Aviogenex 1968 - 1986 Renamed ^ Comments
    : Adria Airways was formed on 14Mar1961 & operations started on 30Jun1961 as Adria
    Airways, renamed to Inex in 1968 and back to Adria again in 1986. National airline
    of Slovenia ^ Genealogy : Adria Airways >Inex Adria Airways >Adria Airways JAT
    (Jugoslovenski Aerotransport) 1947 - 2003 Renamed ^ Comments : Air Serbia was
    founded as Aeroput on 17Jun1927, renamed to JAT on 01Apr1947. Started ops on 15Apr1947,
    Renamed again on 08Aug2003 to JAT Airways & reformed as Air Serbia on 26Oct2013
    ^ Genealogy : Aeroput >JAT (Jugoslovenski Aerotransport) >JAT Airways >Air Serbia
    Jugoslovenski Aerotransport'
  - List of Rulers of Ancient Egypt and Nubia | Lists of Rulers | Heilbrunn Timeline
    of Art History | The Metropolitan Museum of Art The Metropolitan Museum of Art
    List of Rulers of Ancient Egypt and Nubia See works of art 30.8.234 52.127.4 Our
    knowledge of the succession of Egyptian kings is based on kinglists kept by the
    ancient Egyptians themselves. The most famous are the Palermo Stone, which covers
    the period from the earliest dynasties to the middle of Dynasty 5; the Abydos
    Kinglist, which Seti I had carved on his temple at Abydos; and the Turin Canon,
    a papyrus that covers the period from the earliest dynasties to the reign of Ramesses
    II. All are incomplete or fragmentary. We also rely on the History of Egypt written
    by Manetho in the third century B.C. A priest in the temple at Heliopolis, Manetho
    had access to many original sources and it was he who divided the kings into the
    thirty dynasties we use today. It is to this structure of dynasties and listed
    kings that we now attempt to link an absolute chronology of dates in terms of
    our own calendrical system. The process is made difficult by the fragmentary condition
    of the kinglists and by differences in the calendrical years used at various times.
    Some astronomical observations from the ancient Egyptians have survived, allowing
    us to calculate absolute dates within a margin of error. Synchronisms with the
    other civilizations of the ancient world are also of limited use.
  - 'What is the "Jack Sprat" nursery rhyme? | Reference.com What is the "Jack Sprat"
    nursery rhyme? A: Quick Answer "Jack Sprat" is a traditional English nursery rhyme
    whose main verse says, "Jack Sprat could eat no fat. His wife could eat no lean.
    And so between them both, you see, they licked the platter clean." Though it was
    likely sung by children long before, "Jack Sprat" was first published around 1765
    in the compilation "Mother Goose''s Melody." Full Answer According to Rhymes.org,
    a U.K. website devoted to nursery rhyme lyrics and origins, the "Jack Sprat" nursery
    rhyme has its origins in British history. In one interpretation, Jack Sprat was
    King Charles I, who ruled England in the early part of the 17th century, and his
    wife was Queen Henrietta Maria. Parliament refused to finance the king''s war
    with Spain, which made him lean. However, the queen fattened the coffers by levying
    an illegal war tax. In an alternative version, the "Jack Sprat" nursery rhyme
    is linked to King Richard and his brother John of the Robin Hood legend. Jack
    Sprat was King John, the usurper who tried to take over the crown when King Richard
    went off to fight in the Crusades in the 12th century. When King Richard was captured,
    John had to raise a ransom to rescue him, leaving the country lean. The wife was
    Joan, daughter of the Earl of Gloucester, the greedy wife of King John. However,
    after King Richard died and John became king, he had his marriage with Joan annulled.'
model-index:
- name: SentenceTransformer based on microsoft/deberta-v3-small
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts test
      type: sts-test
    metrics:
    - type: pearson_cosine
      value: 0.7673854808079448
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.7776198286738142
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.782368447545155
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.7720687033298573
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.7882638792170585
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.7775073687564514
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.7669147371310585
      name: Pearson Dot
    - type: spearman_dot
      value: 0.7762894632049069
      name: Spearman Dot
    - type: pearson_max
      value: 0.7882638792170585
      name: Pearson Max
    - type: spearman_max
      value: 0.7776198286738142
      name: Spearman Max
  - task:
      type: binary-classification
      name: Binary Classification
    dataset:
      name: allNLI dev
      type: allNLI-dev
    metrics:
    - type: cosine_accuracy
      value: 0.708984375
      name: Cosine Accuracy
    - type: cosine_accuracy_threshold
      value: 0.8714957237243652
      name: Cosine Accuracy Threshold
    - type: cosine_f1
      value: 0.5913043478260869
      name: Cosine F1
    - type: cosine_f1_threshold
      value: 0.7768557071685791
      name: Cosine F1 Threshold
    - type: cosine_precision
      value: 0.4738675958188153
      name: Cosine Precision
    - type: cosine_recall
      value: 0.7861271676300579
      name: Cosine Recall
    - type: cosine_ap
      value: 0.5644305887001508
      name: Cosine Ap
    - type: dot_accuracy
      value: 0.7109375
      name: Dot Accuracy
    - type: dot_accuracy_threshold
      value: 674.426025390625
      name: Dot Accuracy Threshold
    - type: dot_f1
      value: 0.5913043478260869
      name: Dot F1
    - type: dot_f1_threshold
      value: 603.435302734375
      name: Dot F1 Threshold
    - type: dot_precision
      value: 0.4738675958188153
      name: Dot Precision
    - type: dot_recall
      value: 0.7861271676300579
      name: Dot Recall
    - type: dot_ap
      value: 0.5664868031504724
      name: Dot Ap
    - type: manhattan_accuracy
      value: 0.7109375
      name: Manhattan Accuracy
    - type: manhattan_accuracy_threshold
      value: 294.4728088378906
      name: Manhattan Accuracy Threshold
    - type: manhattan_f1
      value: 0.5935483870967742
      name: Manhattan F1
    - type: manhattan_f1_threshold
      value: 401.1482849121094
      name: Manhattan F1 Threshold
    - type: manhattan_precision
      value: 0.4726027397260274
      name: Manhattan Precision
    - type: manhattan_recall
      value: 0.7976878612716763
      name: Manhattan Recall
    - type: manhattan_ap
      value: 0.5642688421649988
      name: Manhattan Ap
    - type: euclidean_accuracy
      value: 0.7109375
      name: Euclidean Accuracy
    - type: euclidean_accuracy_threshold
      value: 14.565500259399414
      name: Euclidean Accuracy Threshold
    - type: euclidean_f1
      value: 0.5913043478260869
      name: Euclidean F1
    - type: euclidean_f1_threshold
      value: 18.60409164428711
      name: Euclidean F1 Threshold
    - type: euclidean_precision
      value: 0.4738675958188153
      name: Euclidean Precision
    - type: euclidean_recall
      value: 0.7861271676300579
      name: Euclidean Recall
    - type: euclidean_ap
      value: 0.5645557227019772
      name: Euclidean Ap
    - type: max_accuracy
      value: 0.7109375
      name: Max Accuracy
    - type: max_accuracy_threshold
      value: 674.426025390625
      name: Max Accuracy Threshold
    - type: max_f1
      value: 0.5935483870967742
      name: Max F1
    - type: max_f1_threshold
      value: 603.435302734375
      name: Max F1 Threshold
    - type: max_precision
      value: 0.4738675958188153
      name: Max Precision
    - type: max_recall
      value: 0.7976878612716763
      name: Max Recall
    - type: max_ap
      value: 0.5664868031504724
      name: Max Ap
  - task:
      type: binary-classification
      name: Binary Classification
    dataset:
      name: Qnli dev
      type: Qnli-dev
    metrics:
    - type: cosine_accuracy
      value: 0.6796875
      name: Cosine Accuracy
    - type: cosine_accuracy_threshold
      value: 0.7726649045944214
      name: Cosine Accuracy Threshold
    - type: cosine_f1
      value: 0.6925675675675677
      name: Cosine F1
    - type: cosine_f1_threshold
      value: 0.7317887544631958
      name: Cosine F1 Threshold
    - type: cosine_precision
      value: 0.5758426966292135
      name: Cosine Precision
    - type: cosine_recall
      value: 0.8686440677966102
      name: Cosine Recall
    - type: cosine_ap
      value: 0.7302564198016936
      name: Cosine Ap
    - type: dot_accuracy
      value: 0.67578125
      name: Dot Accuracy
    - type: dot_accuracy_threshold
      value: 598.0419921875
      name: Dot Accuracy Threshold
    - type: dot_f1
      value: 0.6912751677852348
      name: Dot F1
    - type: dot_f1_threshold
      value: 565.4718017578125
      name: Dot F1 Threshold
    - type: dot_precision
      value: 0.5722222222222222
      name: Dot Precision
    - type: dot_recall
      value: 0.8728813559322034
      name: Dot Recall
    - type: dot_ap
      value: 0.7300462025003271
      name: Dot Ap
    - type: manhattan_accuracy
      value: 0.6796875
      name: Manhattan Accuracy
    - type: manhattan_accuracy_threshold
      value: 404.8309020996094
      name: Manhattan Accuracy Threshold
    - type: manhattan_f1
      value: 0.6933333333333332
      name: Manhattan F1
    - type: manhattan_f1_threshold
      value: 444.99224853515625
      name: Manhattan F1 Threshold
    - type: manhattan_precision
      value: 0.5714285714285714
      name: Manhattan Precision
    - type: manhattan_recall
      value: 0.8813559322033898
      name: Manhattan Recall
    - type: manhattan_ap
      value: 0.7369214156436785
      name: Manhattan Ap
    - type: euclidean_accuracy
      value: 0.6796875
      name: Euclidean Accuracy
    - type: euclidean_accuracy_threshold
      value: 18.790739059448242
      name: Euclidean Accuracy Threshold
    - type: euclidean_f1
      value: 0.6934306569343065
      name: Euclidean F1
    - type: euclidean_f1_threshold
      value: 19.35132598876953
      name: Euclidean F1 Threshold
    - type: euclidean_precision
      value: 0.6089743589743589
      name: Euclidean Precision
    - type: euclidean_recall
      value: 0.8050847457627118
      name: Euclidean Recall
    - type: euclidean_ap
      value: 0.7307381840067684
      name: Euclidean Ap
    - type: max_accuracy
      value: 0.6796875
      name: Max Accuracy
    - type: max_accuracy_threshold
      value: 598.0419921875
      name: Max Accuracy Threshold
    - type: max_f1
      value: 0.6934306569343065
      name: Max F1
    - type: max_f1_threshold
      value: 565.4718017578125
      name: Max F1 Threshold
    - type: max_precision
      value: 0.6089743589743589
      name: Max Precision
    - type: max_recall
      value: 0.8813559322033898
      name: Max Recall
    - type: max_ap
      value: 0.7369214156436785
      name: Max Ap
---

# SentenceTransformer based on microsoft/deberta-v3-small

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small) <!-- at revision a36c739020e01763fe789b4b85e2df55d6180012 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** en
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model
  (1): AdvancedWeightedPooling(
    (linear_cls): Linear(in_features=768, out_features=768, bias=True)
    (linear_mean): Linear(in_features=768, out_features=768, bias=True)
    (mha): MultiheadAttention(
      (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
    )
    (layernorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (layernorm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (layernorm_cls): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (layernorm_mean): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bobox/DeBERTa3-s-CustomPoolin-v3-step1")
# Run inference
sentences = [
    'What was the name of eleven rulers of the 19th and 20th Egyptian dynasties?',
    'List of Rulers of Ancient Egypt and Nubia | Lists of Rulers | Heilbrunn Timeline of Art History | The Metropolitan Museum of Art The Metropolitan Museum of Art List of Rulers of Ancient Egypt and Nubia See works of art 30.8.234 52.127.4 Our knowledge of the succession of Egyptian kings is based on kinglists kept by the ancient Egyptians themselves. The most famous are the Palermo Stone, which covers the period from the earliest dynasties to the middle of Dynasty 5; the Abydos Kinglist, which Seti I had carved on his temple at Abydos; and the Turin Canon, a papyrus that covers the period from the earliest dynasties to the reign of Ramesses II. All are incomplete or fragmentary. We also rely on the History of Egypt written by Manetho in the third century B.C. A priest in the temple at Heliopolis, Manetho had access to many original sources and it was he who divided the kings into the thirty dynasties we use today. It is to this structure of dynasties and listed kings that we now attempt to link an absolute chronology of dates in terms of our own calendrical system. The process is made difficult by the fragmentary condition of the kinglists and by differences in the calendrical years used at various times. Some astronomical observations from the ancient Egyptians have survived, allowing us to calculate absolute dates within a margin of error. Synchronisms with the other civilizations of the ancient world are also of limited use.',
    'What is the "Jack Sprat" nursery rhyme? | Reference.com What is the "Jack Sprat" nursery rhyme? A: Quick Answer "Jack Sprat" is a traditional English nursery rhyme whose main verse says, "Jack Sprat could eat no fat. His wife could eat no lean. And so between them both, you see, they licked the platter clean." Though it was likely sung by children long before, "Jack Sprat" was first published around 1765 in the compilation "Mother Goose\'s Melody." Full Answer According to Rhymes.org, a U.K. website devoted to nursery rhyme lyrics and origins, the "Jack Sprat" nursery rhyme has its origins in British history. In one interpretation, Jack Sprat was King Charles I, who ruled England in the early part of the 17th century, and his wife was Queen Henrietta Maria. Parliament refused to finance the king\'s war with Spain, which made him lean. However, the queen fattened the coffers by levying an illegal war tax. In an alternative version, the "Jack Sprat" nursery rhyme is linked to King Richard and his brother John of the Robin Hood legend. Jack Sprat was King John, the usurper who tried to take over the crown when King Richard went off to fight in the Crusades in the 12th century. When King Richard was captured, John had to raise a ransom to rescue him, leaving the country lean. The wife was Joan, daughter of the Earl of Gloucester, the greedy wife of King John. However, after King Richard died and John became king, he had his marriage with Joan annulled.',
]
# encode() returns a NumPy array by default: one 768-dimensional row per sentence
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the pairwise similarity scores for the embeddings (a torch.Tensor)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```

## Evaluation

### Metrics

#### Semantic Similarity
* Dataset: `sts-test`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | Value      |
|:--------------------|:-----------|
| pearson_cosine      | 0.7674     |
| **spearman_cosine** | **0.7776** |
| pearson_manhattan   | 0.7824     |
| spearman_manhattan  | 0.7721     |
| pearson_euclidean   | 0.7883     |
| spearman_euclidean  | 0.7775     |
| pearson_dot         | 0.7669     |
| spearman_dot        | 0.7763     |
| pearson_max         | 0.7883     |
| spearman_max        | 0.7776     |
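
As a rough illustration of what `spearman_cosine` and `pearson_cosine` measure, the sketch below scores a few sentence pairs by cosine similarity and correlates those scores with gold similarity labels. The toy 2-dimensional vectors and gold scores are made up, the rank helper assumes there are no ties, and this is not the `EmbeddingSimilarityEvaluator` implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def ranks(values):
    """Rank positions (0 = smallest); assumes no tied values, for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: three sentence pairs (as toy embeddings) with gold scores.
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),  # nearly the same direction
    ([1.0, 0.0], [1.0, 1.0]),  # 45 degrees apart
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal
]
gold = [4.8, 2.5, 0.2]

predicted = [cosine(u, v) for u, v in pairs]
pearson_cosine = pearson(predicted, gold)
spearman_cosine = spearman(predicted, gold)
```

Spearman only looks at the ordering of the scores, so it reaches 1.0 here even though the cosine values are not a linear function of the gold scores. The `*_max` rows in the table are the best value across the cosine, Manhattan, Euclidean, and dot variants.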

#### Binary Classification
* Dataset: `allNLI-dev`
* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)

| Metric                       | Value      |
|:-----------------------------|:-----------|
| cosine_accuracy              | 0.709      |
| cosine_accuracy_threshold    | 0.8715     |
| cosine_f1                    | 0.5913     |
| cosine_f1_threshold          | 0.7769     |
| cosine_precision             | 0.4739     |
| cosine_recall                | 0.7861     |
| cosine_ap                    | 0.5644     |
| dot_accuracy                 | 0.7109     |
| dot_accuracy_threshold       | 674.426    |
| dot_f1                       | 0.5913     |
| dot_f1_threshold             | 603.4353   |
| dot_precision                | 0.4739     |
| dot_recall                   | 0.7861     |
| dot_ap                       | 0.5665     |
| manhattan_accuracy           | 0.7109     |
| manhattan_accuracy_threshold | 294.4728   |
| manhattan_f1                 | 0.5935     |
| manhattan_f1_threshold       | 401.1483   |
| manhattan_precision          | 0.4726     |
| manhattan_recall             | 0.7977     |
| manhattan_ap                 | 0.5643     |
| euclidean_accuracy           | 0.7109     |
| euclidean_accuracy_threshold | 14.5655    |
| euclidean_f1                 | 0.5913     |
| euclidean_f1_threshold       | 18.6041    |
| euclidean_precision          | 0.4739     |
| euclidean_recall             | 0.7861     |
| euclidean_ap                 | 0.5646     |
| max_accuracy                 | 0.7109     |
| max_accuracy_threshold       | 674.426    |
| max_f1                       | 0.5935     |
| max_f1_threshold             | 603.4353   |
| max_precision                | 0.4739     |
| max_recall                   | 0.7977     |
| **max_ap**                   | **0.5665** |

#### Binary Classification
* Dataset: `Qnli-dev`
* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)

| Metric                       | Value      |
|:-----------------------------|:-----------|
| cosine_accuracy              | 0.6797     |
| cosine_accuracy_threshold    | 0.7727     |
| cosine_f1                    | 0.6926     |
| cosine_f1_threshold          | 0.7318     |
| cosine_precision             | 0.5758     |
| cosine_recall                | 0.8686     |
| cosine_ap                    | 0.7303     |
| dot_accuracy                 | 0.6758     |
| dot_accuracy_threshold       | 598.042    |
| dot_f1                       | 0.6913     |
| dot_f1_threshold             | 565.4718   |
| dot_precision                | 0.5722     |
| dot_recall                   | 0.8729     |
| dot_ap                       | 0.73       |
| manhattan_accuracy           | 0.6797     |
| manhattan_accuracy_threshold | 404.8309   |
| manhattan_f1                 | 0.6933     |
| manhattan_f1_threshold       | 444.9922   |
| manhattan_precision          | 0.5714     |
| manhattan_recall             | 0.8814     |
| manhattan_ap                 | 0.7369     |
| euclidean_accuracy           | 0.6797     |
| euclidean_accuracy_threshold | 18.7907    |
| euclidean_f1                 | 0.6934     |
| euclidean_f1_threshold       | 19.3513    |
| euclidean_precision          | 0.609      |
| euclidean_recall             | 0.8051     |
| euclidean_ap                 | 0.7307     |
| max_accuracy                 | 0.6797     |
| max_accuracy_threshold       | 598.042    |
| max_f1                       | 0.6934     |
| max_f1_threshold             | 565.4718   |
| max_precision                | 0.609      |
| max_recall                   | 0.8814     |
| **max_ap**                   | **0.7369** |
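
The binary classification tables report one threshold for accuracy and a different one for F1. The sketch below shows, on made-up scores and labels, how sweeping candidate thresholds can select different cut-offs for the two metrics. It is a brute-force illustration with hypothetical helper names, not the code used by `BinaryClassificationEvaluator`.

```python
def accuracy_at(scores, labels, threshold):
    """Fraction of pairs classified correctly when score >= threshold means 'positive'."""
    correct = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return correct / len(labels)


def f1_at(scores, labels, threshold):
    """F1 of the positive class at the given threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scores, labels, metric):
    """Try every observed score as a threshold; ties go to the higher threshold."""
    return max((metric(scores, labels, t), t) for t in sorted(set(scores)))


# Hypothetical cosine similarities and labels (1 = positive pair, 0 = negative pair).
scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.60, 0.50]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

best_acc, acc_threshold = best_threshold(scores, labels, accuracy_at)
best_f1, f1_threshold = best_threshold(scores, labels, f1_at)
```

In this toy data the accuracy-optimal threshold (0.90) is stricter than the F1-optimal one (0.80), the same direction as the cosine thresholds reported for `allNLI-dev` (0.8715 vs. 0.7769). The reported precision and recall are consistent with being measured at the F1 threshold: for `allNLI-dev`, 2 × 0.4739 × 0.7861 / (0.4739 + 0.7861) ≈ 0.5913. For distance-based scores such as Manhattan and Euclidean, smaller values mean more similar pairs, so the comparison would be flipped.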
## Training Details

### Evaluation Dataset

#### vitaminc-pairs

* Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
* Size: 128 evaluation samples
* Columns: <code>claim</code> and <code>evidence</code>
* Approximate statistics based on the first 128 samples:
  |         | claim | evidence |
  |:--------|:------|:---------|
  | type    | string | string |
  | details | <ul><li>min: 9 tokens</li><li>mean: 21.42 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 35.55 tokens</li><li>max: 79 tokens</li></ul> |
* Samples:
  | claim | evidence |
  |:------|:---------|
  | <code>Dragon Con had over 5000 guests .</code> | <code>Among the more than 6000 guests and musical performers at the 2009 convention were such notables as Patrick Stewart , William Shatner , Leonard Nimoy , Terry Gilliam , Bruce Boxleitner , James Marsters , and Mary McDonnell .</code> |
  | <code>COVID-19 has reached more than 185 countries .</code> | <code>As of , more than cases of COVID-19 have been reported in more than 190 countries and 200 territories , resulting in more than deaths .</code> |
  | <code>In March , Italy had 3.6x times more cases of coronavirus than China .</code> | <code>As of 12 March , among nations with at least one million citizens , Italy has the world 's highest per capita rate of positive coronavirus cases at 206.1 cases per million people ( 3.6x times the rate of China ) and is the country with the second-highest number of positive cases as well as of deaths in the world , after China .</code> |
* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
  ```json
  {'guide': SentenceTransformer(
    (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
    (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
    (2): Normalize()
  ), 'temperature': 0.025}
  ```
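
To make the role of the `guide` model and the `temperature` more concrete, here is a minimal sketch of the guided in-batch-negative idea behind GISTEmbed-style losses. Candidates that the guide scores as more similar to the anchor than the labelled positive are treated as likely false negatives and dropped before a temperature-scaled softmax cross-entropy. The function name and the toy similarity matrices are hypothetical, and the sketch only covers anchor-to-candidate similarities; the library loss also compares anchor-anchor and positive-positive pairs and adds gradient caching, which are omitted here.

```python
import math


def guided_in_batch_loss(student_sim, guide_sim, temperature=0.025):
    """Mean cross-entropy over anchors; the positive for row i is column i.

    student_sim / guide_sim: square lists of similarities between each anchor
    (row) and each in-batch candidate (column), from the model being trained
    and from the guide model respectively.
    """
    losses = []
    for i, (s_row, g_row) in enumerate(zip(student_sim, guide_sim)):
        # Keep the positive, plus negatives the guide does not rate above it.
        kept = [s / temperature for s, g in zip(s_row, g_row) if g <= g_row[i]]
        positive = s_row[i] / temperature
        # Numerically stable log-sum-exp over the kept logits.
        top = max(kept)
        log_norm = top + math.log(sum(math.exp(k - top) for k in kept))
        losses.append(log_norm - positive)
    return sum(losses) / len(losses)


# Hypothetical similarities for a batch of three (anchor, positive) pairs.
student = [[0.80, 0.75, 0.10],
           [0.20, 0.90, 0.15],
           [0.05, 0.10, 0.85]]
# The guide rates candidate 1 above the labelled positive for anchor 0.
guide = [[0.70, 0.95, 0.05],
         [0.10, 0.90, 0.10],
         [0.05, 0.05, 0.80]]
# A guide that never masks anything, for comparison (plain in-batch negatives).
no_filter = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

loss_guided = guided_in_batch_loss(student, guide)
loss_plain = guided_in_batch_loss(student, no_filter)
```

With the guide's mask applied, the hard candidate for anchor 0 no longer counts as a negative, so `loss_guided` comes out lower than `loss_plain`. A small temperature such as 0.025 sharpens the softmax, so even modest similarity gaps translate into large logit differences.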

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 100
- `per_device_eval_batch_size`: 256
- `gradient_accumulation_steps`: 2
- `lr_scheduler_type`: cosine_with_min_lr
- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 1.6666666666666667e-05}
- `warmup_ratio`: 0.33
- `save_safetensors`: False
- `fp16`: True
- `push_to_hub`: True
- `hub_model_id`: bobox/DeBERTa3-s-CustomPoolin-v3-step1-checkpoints-tmp
- `hub_strategy`: all_checkpoints
- `batch_sampler`: no_duplicates
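
The scheduler settings above describe a linear warmup over the first 33% of steps followed by a half-cosine decay that bottoms out at `min_lr` rather than zero. The function below is a sketch of that shape, written to approximate the `cosine_with_min_lr` scheduler in `transformers` rather than copied from it. `total_steps = 1000` is a made-up value, since the real step count is not stated in this section, and the peak rate of `5e-05` is the `learning_rate` from the full hyperparameter list.

```python
import math


def lr_at(step, total_steps, base_lr=5e-05, min_lr=1.6666666666666667e-05,
          warmup_ratio=0.33, num_cycles=0.5):
    """Approximate learning rate at a given optimizer step."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress))
    # Rescale so the curve decays from base_lr down to min_lr instead of 0.
    min_rate = min_lr / base_lr
    return base_lr * (cosine * (1.0 - min_rate) + min_rate)


total_steps = 1000  # hypothetical; not taken from the training run
schedule = [lr_at(s, total_steps) for s in (0, 165, 330, 665, 1000)]
```

In this sketch the rate climbs to `5e-05` at step 330, is two thirds of the way down at the midpoint of the decay, and ends at `min_lr`, which is one third of the peak. Separately, `per_device_train_batch_size` 100 with `gradient_accumulation_steps` 2 means each optimizer step covers 200 examples per device.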

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 100
- `per_device_eval_batch_size`: 256
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 2
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: cosine_with_min_lr
- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 1.6666666666666667e-05}
- `warmup_ratio`: 0.33
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: False
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: True
- `resume_from_checkpoint`: None
- `hub_model_id`: bobox/DeBERTa3-s-CustomPoolin-v3-step1-checkpoints-tmp
- `hub_strategy`: all_checkpoints
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `eval_use_gather_object`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs |
|
<details><summary>Click to expand</summary> |
|
|
|
| Epoch | Step | Training Loss | vitaminc-pairs loss | negation-triplets loss | scitail-pairs-pos loss | scitail-pairs-qa loss | xsum-pairs loss | sciq pairs loss | qasc pairs loss | openbookqa pairs loss | msmarco pairs loss | nq pairs loss | trivia pairs loss | gooaq pairs loss | paws-pos loss | global dataset loss | sts-test_spearman_cosine | allNLI-dev_max_ap | Qnli-dev_max_ap | |
|
|:------:|:----:|:-------------:|:-------------------:|:----------------------:|:----------------------:|:---------------------:|:---------------:|:---------------:|:---------------:|:---------------------:|:------------------:|:-------------:|:-----------------:|:----------------:|:-------------:|:-------------------:|:------------------------:|:-----------------:|:---------------:| |
|
| 0.0168 | 8 | 10.2928 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.0336 | 16 | 9.2166 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.0504 | 24 | 9.4858 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.0672 | 32 | 10.6143 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.0840 | 40 | 8.7553 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.1008 | 48 | 10.9939 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.1176 | 56 | 7.6039 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.1345 | 64 | 5.9498 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.1513 | 72 | 7.3051 | 3.2988 | 3.9604 | 1.9818 | 2.1997 | 6.0515 | 0.6095 | 6.3199 | 4.8391 | 6.4886 | 6.6406 | 6.4894 | 6.1527 | 2.0082 | 4.9577 | 0.3066 | 0.3444 | 0.5627 | |
|
| 0.1681 | 80 | 8.3034 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.1849 | 88 | 7.6669 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.2017 | 96 | 6.6415 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.2185 | 104 | 5.7797 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.2353 | 112 | 5.8361 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.2521 | 120 | 5.3339 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.2689 | 128 | 5.5908 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.2857 | 136 | 5.3209 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.3025 | 144 | 5.5359 | 3.3310 | 3.8580 | 1.4769 | 1.6994 | 5.4819 | 0.5385 | 5.2021 | 4.4410 | 5.3419 | 5.5506 | 5.6972 | 5.3376 | 1.4170 | 3.9169 | 0.2954 | 0.3795 | 0.6317 | |
|
| 0.3193 | 152 | 5.4713 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.3361 | 160 | 4.9368 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.3529 | 168 | 4.6594 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.3697 | 176 | 4.8392 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.3866 | 184 | 4.414 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.4034 | 192 | 4.891 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.4202 | 200 | 4.4553 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.4370 | 208 | 3.9729 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.4538 | 216 | 3.7705 | 3.2468 | 3.6435 | 0.7890 | 0.7356 | 3.9327 | 0.4082 | 3.7175 | 3.5404 | 3.5351 | 4.0506 | 3.9953 | 3.6074 | 0.4195 | 2.4726 | 0.3791 | 0.4133 | 0.6779 | |
|
| 0.4706 | 224 | 3.8409 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.4874 | 232 | 3.7894 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.5042 | 240 | 3.3523 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.5210 | 248 | 3.2407 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.5378 | 256 | 3.3203 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.5546 | 264 | 2.8457 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.5714 | 272 | 2.4181 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.5882 | 280 | 3.4589 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.6050 | 288 | 2.8203 | 3.1119 | 3.1485 | 0.4531 | 0.2652 | 2.6895 | 0.2656 | 2.5542 | 2.7523 | 2.6600 | 3.1773 | 3.2099 | 2.7316 | 0.2006 | 1.6342 | 0.5257 | 0.4717 | 0.7078 | |
|
| 0.6218 | 296 | 2.4697 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.6387 | 304 | 2.4654 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.6555 | 312 | 2.4236 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.6723 | 320 | 2.2879 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.6891 | 328 | 2.2145 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.7059 | 336 | 1.8464 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.7227 | 344 | 2.0086 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.7395 | 352 | 2.0635 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.7563 | 360 | 1.8584 | 3.3202 | 2.5793 | 0.3434 | 0.1618 | 1.6759 | 0.1834 | 1.6454 | 2.1257 | 2.1938 | 2.5316 | 2.4558 | 2.0596 | 0.0984 | 1.2206 | 0.6610 | 0.5199 | 0.7119 | |
|
| 0.7731 | 368 | 2.0286 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.7899 | 376 | 1.9389 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.8067 | 384 | 1.7453 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.8235 | 392 | 1.6629 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.8403 | 400 | 1.2724 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.8571 | 408 | 1.7824 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.8739 | 416 | 1.5826 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.8908 | 424 | 1.1971 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.9076 | 432 | 1.5228 | 3.3624 | 2.1952 | 0.3006 | 0.1223 | 1.1091 | 0.1582 | 1.2383 | 1.8664 | 1.7434 | 2.3959 | 2.0697 | 1.7563 | 0.0766 | 1.0193 | 0.7292 | 0.5194 | 0.7126 | |
|
| 0.9244 | 440 | 1.3323 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.9412 | 448 | 1.5124 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.9580 | 456 | 1.5565 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.9748 | 464 | 1.3672 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 0.9916 | 472 | 1.0382 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.0084 | 480 | 1.0626 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.0252 | 488 | 1.3539 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.0420 | 496 | 1.1723 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.0588 | 504 | 1.4235 | 3.4031 | 1.9759 | 0.2554 | 0.0814 | 0.9034 | 0.1378 | 1.1603 | 1.7589 | 1.5608 | 2.1230 | 1.7719 | 1.6633 | 0.0720 | 0.9380 | 0.7523 | 0.5297 | 0.7129 | |
|
| 1.0756 | 512 | 1.2283 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.0924 | 520 | 1.2455 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.1092 | 528 | 1.4265 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.1261 | 536 | 1.296 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.1429 | 544 | 0.8763 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.1597 | 552 | 1.5678 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.1765 | 560 | 1.2548 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.1933 | 568 | 1.3731 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.2101 | 576 | 1.3023 | 3.3815 | 1.8740 | 0.2373 | 0.0769 | 0.7711 | 0.1237 | 0.9432 | 1.6871 | 1.5070 | 1.9947 | 1.6041 | 1.5579 | 0.0721 | 0.8661 | 0.7642 | 0.5412 | 0.7159 | |
|
| 1.2269 | 584 | 0.8135 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.2437 | 592 | 1.0259 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.2605 | 600 | 1.1896 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.2773 | 608 | 1.0532 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.2941 | 616 | 1.3221 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.3109 | 624 | 1.3136 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.3277 | 632 | 1.2238 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.3445 | 640 | 1.2407 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.3613 | 648 | 1.2245 | 3.4717 | 1.7962 | 0.2242 | 0.0488 | 0.7472 | 0.1108 | 0.9272 | 1.6692 | 1.3845 | 1.9117 | 1.3410 | 1.4387 | 0.0701 | 0.8505 | 0.7680 | 0.5471 | 0.7227 | |
|
| 1.3782 | 656 | 1.0428 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.3950 | 664 | 1.1391 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.4118 | 672 | 1.2632 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.4286 | 680 | 0.9403 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.4454 | 688 | 0.7571 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.4622 | 696 | 0.9436 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.4790 | 704 | 1.1239 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.4958 | 712 | 0.9499 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.5126 | 720 | 1.0945 | 3.6495 | 1.6693 | 0.2157 | 0.0492 | 0.6830 | 0.1049 | 0.9140 | 1.5967 | 1.4397 | 1.7394 | 1.3303 | 1.4334 | 0.0603 | 0.8185 | 0.7815 | 0.5606 | 0.7098 | |
|
| 1.5294 | 728 | 1.1161 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.5462 | 736 | 1.0056 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.5630 | 744 | 1.1743 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.5798 | 752 | 0.9153 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.5966 | 760 | 1.1589 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.6134 | 768 | 0.9187 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.6303 | 776 | 0.6937 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.6471 | 784 | 0.9704 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.6639 | 792 | 0.7343 | 3.5442 | 1.6493 | 0.2208 | 0.0249 | 0.6152 | 0.0969 | 0.7111 | 1.5369 | 1.4058 | 1.7066 | 1.2784 | 1.3419 | 0.0585 | 0.7827 | 0.7749 | 0.5627 | 0.7284 | |
|
| 1.6807 | 800 | 1.2878 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.6975 | 808 | 0.9898 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.7143 | 816 | 0.7613 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.7311 | 824 | 0.9612 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.7479 | 832 | 1.1524 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.7647 | 840 | 0.827 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.7815 | 848 | 1.1898 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.7983 | 856 | 1.0117 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.8151 | 864 | 0.7019 | 3.4544 | 1.6149 | 0.2035 | 0.0181 | 0.5525 | 0.0999 | 0.6641 | 1.5456 | 1.3911 | 1.7188 | 1.2547 | 1.3517 | 0.0562 | 0.7473 | 0.7684 | 0.5697 | 0.7329 | |
|
| 1.8319 | 872 | 0.8352 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.8487 | 880 | 0.7836 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.8655 | 888 | 1.0187 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.8824 | 896 | 0.74 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.8992 | 904 | 0.7263 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.9160 | 912 | 0.8073 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.9328 | 920 | 0.8185 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.9496 | 928 | 1.0992 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 1.9664 | 936 | 0.9973 | 3.5110 | 1.5776 | 0.2035 | 0.0250 | 0.5881 | 0.0934 | 0.6719 | 1.5059 | 1.2970 | 1.6186 | 1.1815 | 1.2714 | 0.0564 | 0.7213 | 0.7799 | 0.5544 | 0.7341 | |
|
| 1.9832 | 944 | 0.6662 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.0 | 952 | 0.533 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.0168 | 960 | 0.7712 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.0336 | 968 | 0.6879 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.0504 | 976 | 0.7975 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.0672 | 984 | 0.873 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.0840 | 992 | 0.7995 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.1008 | 1000 | 1.0119 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.1176 | 1008 | 0.6317 | 3.6778 | 1.5845 | 0.2102 | 0.0228 | 0.5851 | 0.0977 | 0.6411 | 1.4752 | 1.2992 | 1.6314 | 1.1260 | 1.2683 | 0.0556 | 0.7329 | 0.7693 | 0.5614 | 0.7274 | |
|
| 2.1345 | 1016 | 0.72 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.1513 | 1024 | 0.9418 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.1681 | 1032 | 0.7848 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.1849 | 1040 | 0.6965 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.2017 | 1048 | 1.0447 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.2185 | 1056 | 0.6361 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.2353 | 1064 | 0.6837 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.2521 | 1072 | 0.5713 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.2689 | 1080 | 0.8193 | 3.6399 | 1.5565 | 0.2069 | 0.0213 | 0.5440 | 0.0904 | 0.6057 | 1.4815 | 1.2856 | 1.6441 | 1.1469 | 1.2540 | 0.0543 | 0.7216 | 0.7765 | 0.5599 | 0.7322 | |
|
| 2.2857 | 1088 | 0.9754 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.3025 | 1096 | 0.8932 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.3193 | 1104 | 0.8716 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.3361 | 1112 | 0.8787 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.3529 | 1120 | 0.9529 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.3697 | 1128 | 0.775 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.3866 | 1136 | 0.6178 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.4034 | 1144 | 0.8384 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.4202 | 1152 | 0.9425 | 3.5672 | 1.5244 | 0.2111 | 0.0162 | 0.5593 | 0.0893 | 0.5759 | 1.4933 | 1.2703 | 1.5815 | 1.1202 | 1.2132 | 0.0531 | 0.7058 | 0.7730 | 0.5635 | 0.7350 | |
|
| 2.4370 | 1160 | 0.4551 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.4538 | 1168 | 0.6392 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.4706 | 1176 | 0.8341 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.4874 | 1184 | 0.7392 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.5042 | 1192 | 0.7646 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.5210 | 1200 | 0.8613 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.5378 | 1208 | 0.7585 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.5546 | 1216 | 1.0611 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.5714 | 1224 | 0.6506 | 3.6439 | 1.5040 | 0.2125 | 0.0162 | 0.5282 | 0.0863 | 0.5858 | 1.5073 | 1.2444 | 1.5493 | 1.1014 | 1.2073 | 0.0532 | 0.7022 | 0.7774 | 0.5647 | 0.7328 | |
|
| 2.5882 | 1232 | 0.8525 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.6050 | 1240 | 0.6304 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.6218 | 1248 | 0.6354 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.6387 | 1256 | 0.6583 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.6555 | 1264 | 0.5964 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.6723 | 1272 | 0.818 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.6891 | 1280 | 0.8635 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.7059 | 1288 | 0.6389 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.7227 | 1296 | 0.6819 | 3.6131 | 1.5104 | 0.2084 | 0.0148 | 0.5229 | 0.0854 | 0.5588 | 1.4963 | 1.2766 | 1.5679 | 1.0982 | 1.2203 | 0.0529 | 0.7059 | 0.7762 | 0.5659 | 0.7355 | |
|
| 2.7395 | 1304 | 0.7878 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.7563 | 1312 | 0.7638 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.7731 | 1320 | 0.8885 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.7899 | 1328 | 0.8184 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.8067 | 1336 | 0.7472 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.8235 | 1344 | 0.7012 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.8403 | 1352 | 0.4622 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.8571 | 1360 | 0.846 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.8739 | 1368 | 0.8308 | 3.6224 | 1.5088 | 0.2084 | 0.0148 | 0.5118 | 0.0858 | 0.5523 | 1.4941 | 1.2756 | 1.5808 | 1.0925 | 1.2114 | 0.0521 | 0.7022 | 0.7765 | 0.5662 | 0.7366 | |
|
| 2.8908 | 1376 | 0.5334 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.9076 | 1384 | 0.7893 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.9244 | 1392 | 0.6897 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.9412 | 1400 | 0.7803 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.9580 | 1408 | 0.841 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.9748 | 1416 | 0.787 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 2.9916 | 1424 | 0.5861 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | |
|
| 3.0 | 1428 | - | 3.6139 | 1.5071 | 0.2084 | 0.0150 | 0.5124 | 0.0862 | 0.5532 | 1.4924 | 1.2700 | 1.5806 | 1.0905 | 1.2081 | 0.0519 | 0.6997 | 0.7776 | 0.5665 | 0.7369 | |
|
|
|
</details> |
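
The log above interleaves two kinds of rows: training-loss rows logged every 8 steps, and evaluation rows (every 72 steps, plus the final step 1428) that also fill in the per-dataset losses and the three evaluation metrics. As a minimal sketch, assuming the table has been copied into a string, the pure-Python helper below separates the evaluation rows so that a metric such as `sts-test_spearman_cosine` can be followed over training. The names `parse_log_rows` and `SAMPLE` are illustrative and not part of any library, and `SAMPLE` keeps only four of the table's columns for brevity.

```python
from typing import Dict, List, Optional

# A few rows copied from the table above, trimmed to four columns for brevity.
SAMPLE = """
| Epoch | Step | Training Loss | sts-test_spearman_cosine |
|:------:|:----:|:-------------:|:------------------------:|
| 0.1345 | 64 | 5.9498 | - |
| 0.1513 | 72 | 7.3051 | 0.3066 |
| 2.9916 | 1424 | 0.5861 | - |
| 3.0 | 1428 | - | 0.7776 |
"""


def parse_log_rows(table: str) -> List[Dict[str, Optional[float]]]:
    """Parse a markdown log table into dicts; '-' cells become None."""
    lines = [ln.strip() for ln in table.strip().splitlines() if ln.strip()]
    header = [c.strip() for c in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[2:]:  # skip the header row and the alignment row
        cells = [c.strip() for c in ln.strip("|").split("|")]
        rows.append(
            {k: (None if v == "-" else float(v)) for k, v in zip(header, cells)}
        )
    return rows


rows = parse_log_rows(SAMPLE)

# Evaluation rows are the ones where the metric column is filled in.
eval_rows = [r for r in rows if r["sts-test_spearman_cosine"] is not None]
for r in eval_rows:
    print(int(r["Step"]), r["sts-test_spearman_cosine"])
```

The same filter works on the full table, since every evaluation row fills all metric columns while plain training rows leave them as `-`.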
|
|
|
### Framework Versions |
|
- Python: 3.10.12 |
|
- Sentence Transformers: 3.2.0 |
|
- Transformers: 4.44.2 |
|
- PyTorch: 2.4.1+cu121 |
|
- Accelerate: 0.34.2 |
|
- Datasets: 3.0.1 |
|
- Tokenizers: 0.19.1 |
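
To check a local environment against the versions listed above, a small standard-library sketch along these lines may help. The dictionary keys are assumed to be the PyPI distribution names under which the packages were installed, and `compare_versions` is an illustrative helper, not part of any of the listed libraries.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card; keys are assumed PyPI distribution names.
EXPECTED = {
    "sentence-transformers": "3.2.0",
    "transformers": "4.44.2",
    "torch": "2.4.1+cu121",
    "accelerate": "0.34.2",
    "datasets": "3.0.1",
    "tokenizers": "0.19.1",
}


def compare_versions(expected):
    """Return {package: (expected, installed or None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            have = None  # package is not installed in this environment
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


for name, (want, have) in compare_versions(EXPECTED).items():
    print(f"{name}: card lists {want}, found {have or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; the listing only records the environment used for training.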
|
|
|
## Citation |
|
|
|
### BibTeX |
|
|
|
#### Sentence Transformers |
|
```bibtex |
|
@inproceedings{reimers-2019-sentence-bert, |
|
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks", |
|
author = "Reimers, Nils and Gurevych, Iryna", |
|
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing", |
|
month = "11", |
|
year = "2019", |
|
publisher = "Association for Computational Linguistics", |
|
url = "https://arxiv.org/abs/1908.10084", |
|
} |
|
``` |
|
|
|