rbhatia46 committed on
Commit 0133738 · verified · 1 Parent(s): 34d0121

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,1156 @@
+ ---
+ base_model: sentence-transformers/all-mpnet-base-v2
+ datasets: []
+ language:
+ - en
+ library_name: sentence-transformers
+ license: apache-2.0
+ metrics:
+ - cosine_accuracy@1
+ - cosine_accuracy@3
+ - cosine_accuracy@5
+ - cosine_accuracy@10
+ - cosine_precision@1
+ - cosine_precision@3
+ - cosine_precision@5
+ - cosine_precision@10
+ - cosine_recall@1
+ - cosine_recall@3
+ - cosine_recall@5
+ - cosine_recall@10
+ - cosine_ndcg@10
+ - cosine_mrr@10
+ - cosine_map@100
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:169213
+ - loss:MatryoshkaLoss
+ - loss:MultipleNegativesRankingLoss
+ widget:
+ - source_sentence: This is bullshit. The US government requires taxes to be paid in
+     USD. There's your intrinsic value. If you want to be compliant with the federal
+     law, your business and you as an individual are required to convert assets or
+     labor into USD to pay them.
+   sentences:
+   - we love face paint melbourne
+   - how long to pay off debt
+   - what is the difference between us tax and mls
+ - source_sentence: '> There''s always another fresh-faced new grad with dollar
+     signs in his eyes who doesn''t know enough to ask about outstanding shares, dilution,
+     or preferences. They''ll learn soon enough. > Very few startups are looking
+     for penny-ante ''investor'' employees who can only put <$100k. You''ll probably
+     find that the majority of tech startups are looking for under $100k to get going.
+     Check out kickstarter.com sometime. > Actual employees are lucky if they can
+     properly value their options, let alone control how much it ends up being worth
+     in the end. If you''re asked to put in work without being fully compensated,
+     you are no longer an employee. You''re an investor. You need to change your way
+     of thinking.'
+   sentences:
+   - how much money is needed to start a company
+   - capital one interest rate
+   - can you transfer abc tax directly to a customer
+ - source_sentence: Let's suppose your friend gave your $100 and you invested all of
+     it (plus your own money, $500) into one stock. Therefore, the total investment
+     becomes $100 + $500 = $600. After few months, when you want to sell the stock
+     or give back the money to your friend, check the percentage of profit/loss. So,
+     let's assume you get 10% return on total investment of $600. Now, you have two
+     choices. Either you exit the stock entirely, OR you just sell his portion. If
+     you want to exit, sell everything and go home with $600 + 10% of 600 = $660. Out
+     of $660, give you friend his initial capital + 10% of initial capital. Therefore,
+     your friend will get $100 + 10% of $100 = $110. If you choose the later, to sell
+     his portion, then you'll need to work everything opposite. Take his initial capital
+     and add 10% of initial capital to it; which is $100 + 10% of $100 = $110. Sell
+     the stocks that would be worth equivalent to that money and that's it. Similarly,
+     you can apply the same logic if you broke his $100 into parts. Do the maths.
+   sentences:
+   - what do people think about getting a good job
+   - how to tell how much to sell a stock after buying one
+   - how to claim rrsp room allowance
+ - source_sentence: '"You''re acting like my comments are inconsistent. They''re not. I
+     think bitcoin''s price is primarily due to Chinese money being moved outside of
+     China. I don''t think you can point to a price chart and say ""Look, that''s the
+     Chinese money right there, and look, that part isn''t Chinese money"". That''s
+     what I said already."'
+   sentences:
+   - bitcoin price in china
+   - can i use tax act to file a spouse's tax
+   - what to look at if house sells for an appraiser?
+ - source_sentence: 'It''s simple, really: Practice. Fiscal responsibility is not a
+     trick you can learn look up on Google, or a service you can buy from your accountant. Being
+     responsible with your money is a skill that is learned over a lifetime. The only
+     way to get better at it is to practice, and not get discouraged when you make
+     mistakes.'
+   sentences:
+   - how long does it take for a loan to get paid interest
+   - whatsapp to use with a foreigner
+   - why do people have to be fiscally responsible
+ model-index:
+ - name: mpnet-base-financial-rag-matryoshka
+   results:
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 768
+       type: dim_768
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.1809635722679201
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.4935370152761457
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.5734430082256169
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.663924794359577
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.1809635722679201
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.1645123384253819
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.11468860164512337
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.06639247943595769
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.1809635722679201
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.4935370152761457
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.5734430082256169
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.663924794359577
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.41746626575107176
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.33849252979687783
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.3464380043472146
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 512
+       type: dim_512
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.19036427732079905
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.4900117508813161
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.5687426556991775
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.6533490011750881
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.19036427732079905
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.16333725029377202
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.11374853113983546
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.06533490011750881
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.19036427732079905
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.4900117508813161
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.5687426556991775
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.6533490011750881
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.4174472433498665
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.3417030384421691
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.35038294448729146
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 256
+       type: dim_256
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.1797884841363102
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.47473560517038776
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.54524089306698
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.6439482961222092
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.1797884841363102
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.15824520172346257
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.10904817861339598
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.06439482961222091
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.1797884841363102
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.47473560517038776
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.54524089306698
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.6439482961222092
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.4067526935952037
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.3308208829947965
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.33951940009649473
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 128
+       type: dim_128
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.18566392479435959
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.4535840188014101
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.5240893066980024
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.6216216216216216
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.18566392479435959
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.15119467293380337
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.10481786133960047
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.06216216216216215
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.18566392479435959
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.4535840188014101
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.5240893066980024
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.6216216216216216
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.39600584846785714
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.324298211254733
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.33327512340163784
+       name: Cosine Map@100
+   - task:
+       type: information-retrieval
+       name: Information Retrieval
+     dataset:
+       name: dim 64
+       type: dim_64
+     metrics:
+     - type: cosine_accuracy@1
+       value: 0.16333725029377202
+       name: Cosine Accuracy@1
+     - type: cosine_accuracy@3
+       value: 0.42420681551116335
+       name: Cosine Accuracy@3
+     - type: cosine_accuracy@5
+       value: 0.491186839012926
+       name: Cosine Accuracy@5
+     - type: cosine_accuracy@10
+       value: 0.5781433607520564
+       name: Cosine Accuracy@10
+     - type: cosine_precision@1
+       value: 0.16333725029377202
+       name: Cosine Precision@1
+     - type: cosine_precision@3
+       value: 0.14140227183705445
+       name: Cosine Precision@3
+     - type: cosine_precision@5
+       value: 0.09823736780258518
+       name: Cosine Precision@5
+     - type: cosine_precision@10
+       value: 0.05781433607520563
+       name: Cosine Precision@10
+     - type: cosine_recall@1
+       value: 0.16333725029377202
+       name: Cosine Recall@1
+     - type: cosine_recall@3
+       value: 0.42420681551116335
+       name: Cosine Recall@3
+     - type: cosine_recall@5
+       value: 0.491186839012926
+       name: Cosine Recall@5
+     - type: cosine_recall@10
+       value: 0.5781433607520564
+       name: Cosine Recall@10
+     - type: cosine_ndcg@10
+       value: 0.36616361619562976
+       name: Cosine Ndcg@10
+     - type: cosine_mrr@10
+       value: 0.2984467386641303
+       name: Cosine Mrr@10
+     - type: cosine_map@100
+       value: 0.3078022299669783
+       name: Cosine Map@100
+ ---
+
+ # mpnet-base-financial-rag-matryoshka
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) <!-- at revision 84f2bcc00d77236f9e89c8a360a00fb1139bf47d -->
+ - **Maximum Sequence Length:** 384 tokens
+ - **Output Dimensionality:** 768 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ - **Language:** en
+ - **License:** apache-2.0
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
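The architecture above ends with mean pooling (`pooling_mode_mean_tokens: True`) followed by `Normalize()`. As a rough illustration of what those two stages do to the token vectors, here is a minimal pure-Python sketch. The function names `mean_pool` and `l2_normalize` and the toy 4-dimensional vectors are assumptions made for this example, not part of the library; the real modules operate on batched tensors of width 768.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m == 1]
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy input: three token vectors of dimension 4; the last one is padding.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 2.0, 0.0, 4.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the output is unit length, the cosine similarity between two sentence embeddings reduces to their dot product.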
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("rbhatia46/mpnet-base-financial-rag-matryoshka")
+ # Run inference
+ sentences = [
+     "It's simple, really: Practice. Fiscal responsibility is not a trick you can learn look up on Google, or a service you can buy from your accountant. Being responsible with your money is a skill that is learned over a lifetime. The only way to get better at it is to practice, and not get discouraged when you make mistakes.",
+     'why do people have to be fiscally responsible',
+     'how long does it take for a loan to get paid interest',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 768]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
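Since the model was trained with `MatryoshkaLoss` over the dimensions 768, 512, 256, 128 and 64, the leading components of each embedding are meant to be usable on their own. The sketch below shows the idea with plain Python lists: keep the first `dim` components, re-normalize, then compare. The helper names and the 4-dimensional stand-in vectors are assumptions for illustration; with the real model, recent versions of Sentence Transformers also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which should have a similar effect.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

def cosine(a, b):
    """Cosine similarity of two unit vectors, i.e. their dot product."""
    return sum(x * y for x, y in zip(a, b))

# Stand-in vectors; in practice these would be rows of `model.encode(...)`.
doc = [0.6, 0.0, 0.8, 0.0]
query = [0.6, 0.8, 0.0, 0.0]

full = cosine(truncate_and_normalize(doc, 4), truncate_and_normalize(query, 4))
short = cosine(truncate_and_normalize(doc, 2), truncate_and_normalize(query, 2))
print(full, short)
```

Truncation changes the scores, so an index and its queries should use the same dimension. The evaluation tables in this card suggest the trade-off: smaller dimensions cost some retrieval quality in exchange for less storage.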
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Information Retrieval
+ * Dataset: `dim_768`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric | Value |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1 | 0.181 |
+ | cosine_accuracy@3 | 0.4935 |
+ | cosine_accuracy@5 | 0.5734 |
+ | cosine_accuracy@10 | 0.6639 |
+ | cosine_precision@1 | 0.181 |
+ | cosine_precision@3 | 0.1645 |
+ | cosine_precision@5 | 0.1147 |
+ | cosine_precision@10 | 0.0664 |
+ | cosine_recall@1 | 0.181 |
+ | cosine_recall@3 | 0.4935 |
+ | cosine_recall@5 | 0.5734 |
+ | cosine_recall@10 | 0.6639 |
+ | cosine_ndcg@10 | 0.4175 |
+ | cosine_mrr@10 | 0.3385 |
+ | **cosine_map@100** | **0.3464** |
+
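In the table above, accuracy@k and recall@k coincide, and precision@k equals accuracy@k divided by k (for example 0.4935 / 3 ≈ 0.1645). That pattern is what one would expect if every query has exactly one relevant document, although the card does not state this. The sketch below computes the per-query quantities under that assumption; the function names and the document ids are hypothetical, and the reported numbers are averages of such values over all queries.

```python
def metrics_at_k(ranked_ids, relevant_ids, k):
    """Accuracy, precision and recall over the top-k results of one query."""
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return {
        "accuracy": 1.0 if hits > 0 else 0.0,   # at least one relevant hit in the top k
        "precision": hits / k,                  # share of the top k that is relevant
        "recall": hits / len(relevant_ids),     # share of relevant documents retrieved
    }

def reciprocal_rank(ranked_ids, relevant_ids, k):
    """1 / rank of the first relevant result within the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

# Hypothetical query whose single relevant document is ranked second.
ranking = ["d7", "d3", "d9", "d1"]
relevant = {"d3"}

top1 = metrics_at_k(ranking, relevant, 1)
top3 = metrics_at_k(ranking, relevant, 3)
rr = reciprocal_rank(ranking, relevant, 10)
print(top1, top3, rr)
```

With a single relevant document per query, a hit makes accuracy and recall both 1 while precision is 1/k, which reproduces the relationship seen in the table.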
+ #### Information Retrieval
+ * Dataset: `dim_512`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric | Value |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1 | 0.1904 |
+ | cosine_accuracy@3 | 0.49 |
+ | cosine_accuracy@5 | 0.5687 |
+ | cosine_accuracy@10 | 0.6533 |
+ | cosine_precision@1 | 0.1904 |
+ | cosine_precision@3 | 0.1633 |
+ | cosine_precision@5 | 0.1137 |
+ | cosine_precision@10 | 0.0653 |
+ | cosine_recall@1 | 0.1904 |
+ | cosine_recall@3 | 0.49 |
+ | cosine_recall@5 | 0.5687 |
+ | cosine_recall@10 | 0.6533 |
+ | cosine_ndcg@10 | 0.4174 |
+ | cosine_mrr@10 | 0.3417 |
+ | **cosine_map@100** | **0.3504** |
+
+ #### Information Retrieval
+ * Dataset: `dim_256`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric | Value |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1 | 0.1798 |
+ | cosine_accuracy@3 | 0.4747 |
+ | cosine_accuracy@5 | 0.5452 |
+ | cosine_accuracy@10 | 0.6439 |
+ | cosine_precision@1 | 0.1798 |
+ | cosine_precision@3 | 0.1582 |
+ | cosine_precision@5 | 0.109 |
+ | cosine_precision@10 | 0.0644 |
+ | cosine_recall@1 | 0.1798 |
+ | cosine_recall@3 | 0.4747 |
+ | cosine_recall@5 | 0.5452 |
+ | cosine_recall@10 | 0.6439 |
+ | cosine_ndcg@10 | 0.4068 |
+ | cosine_mrr@10 | 0.3308 |
+ | **cosine_map@100** | **0.3395** |
+
+ #### Information Retrieval
+ * Dataset: `dim_128`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric | Value |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1 | 0.1857 |
+ | cosine_accuracy@3 | 0.4536 |
+ | cosine_accuracy@5 | 0.5241 |
+ | cosine_accuracy@10 | 0.6216 |
+ | cosine_precision@1 | 0.1857 |
+ | cosine_precision@3 | 0.1512 |
+ | cosine_precision@5 | 0.1048 |
+ | cosine_precision@10 | 0.0622 |
+ | cosine_recall@1 | 0.1857 |
+ | cosine_recall@3 | 0.4536 |
+ | cosine_recall@5 | 0.5241 |
+ | cosine_recall@10 | 0.6216 |
+ | cosine_ndcg@10 | 0.396 |
+ | cosine_mrr@10 | 0.3243 |
+ | **cosine_map@100** | **0.3333** |
+
+ #### Information Retrieval
+ * Dataset: `dim_64`
+ * Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)
+
+ | Metric | Value |
+ |:--------------------|:-----------|
+ | cosine_accuracy@1 | 0.1633 |
+ | cosine_accuracy@3 | 0.4242 |
+ | cosine_accuracy@5 | 0.4912 |
+ | cosine_accuracy@10 | 0.5781 |
+ | cosine_precision@1 | 0.1633 |
+ | cosine_precision@3 | 0.1414 |
+ | cosine_precision@5 | 0.0982 |
+ | cosine_precision@10 | 0.0578 |
+ | cosine_recall@1 | 0.1633 |
+ | cosine_recall@3 | 0.4242 |
+ | cosine_recall@5 | 0.4912 |
+ | cosine_recall@10 | 0.5781 |
+ | cosine_ndcg@10 | 0.3662 |
+ | cosine_mrr@10 | 0.2984 |
+ | **cosine_map@100** | **0.3078** |
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+
+ * Size: 169,213 training samples
+ * Columns: <code>positive</code> and <code>anchor</code>
+ * Approximate statistics based on the first 1000 samples:
+   | | positive | anchor |
+   |:--------|:------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+   | type | string | string |
+   | details | <ul><li>min: 7 tokens</li><li>mean: 158.03 tokens</li><li>max: 384 tokens</li></ul> | <ul><li>min: 4 tokens</li><li>mean: 10.16 tokens</li><li>max: 30 tokens</li></ul> |
+ * Samples:
+   | positive | anchor |
+   |:---------|:-------------------------------------------------------------|
+   | <code>International Trade, the exchange of goods and services between nations. “Goods” can be defined as finished products, as intermediate goods used in producing other goods, or as raw materials such as minerals, agricultural products, and other such commodities. International trade commerce enables a nation to specialize in those goods it can produce most cheaply and efficiently, and sell those that are surplus to its requirements. Trade also enables a country to consume more than it would be able to produce if it depended only on its own resources. Finally, trade encourages economic development by increasing the size of the market to which products can be sold. Trade has always been the major force behind the economic relations among nations; it is a measure of national strength.</code> | <code>what does international trade</code> |
+   | <code>My wife and I meet in the first few days of each month to create a budget for the coming month. During that meeting we reconcile any spending for the previous month and make sure the amount money in our accounts matches the amount of money in our budget record to the penny. (We use an excel spreadsheet, how you track it matters less than the need to track it and see how much you spent in each category during the previous month.) After we have have reviewed the previous month's spending, we allocate money we made during that previous month to each of the categories. What categories you track and how granular you are is less important than regularly seeing how much you spend so that you can evaluate whether your spending is really matching your priorities. We keep a running total for each category so if we go over on groceries one month, then the following month we have to add more to bring the category back to black as well as enough for our anticipated needs in the coming month. If there is one category that we are consistently underestimating (or overestimating) we talk about why. If there are large purchases that we are planning in the coming month, or even in a few months, we talk about them, why we want them, and we talk about how much we're planning to spend. If we want a new TV or to go on a trip, we may start adding money to the category with no plans to spend in the coming month. The biggest benefit to this process has been that we don't make a lot of impulse purchases, or if we do, they are for small dollar amounts. The simple need to explain what I want and why means I have to put the thought into it myself, and I talk myself out of a lot of purchases during that train of thought. The time spent regularly evaluating what we get for our money has cut waste that wasn't really bringing much happiness. We still buy what we want, but we agree that we want it first.</code> | <code>how to make a budget</code> |
+   | <code>I just finished my bachelor and I'm doing my masters in Computer Science at a french school in Quebec. I consider myself being in the top 5% and I have an excellent curriculum, having studied abroad, learned 4 languages, participated in student committees, etc. I'm leaning towards IT or business strategy/development...but I'm not sure yet. I guess I'm not that prepared, that's why I wanted a little help.</code> | <code>what school do you want to attend for a masters</code> |
+ * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
+   ```json
+   {
+       "loss": "MultipleNegativesRankingLoss",
+       "matryoshka_dims": [
+           768,
+           512,
+           256,
+           128,
+           64
+       ],
+       "matryoshka_weights": [
+           1,
+           1,
+           1,
+           1,
+           1
+       ],
+       "n_dims_per_step": -1
+   }
+   ```
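The parameters above say that `MultipleNegativesRankingLoss` is applied at each of the listed dimensions with equal weight. As a conceptual sketch only (not the library's implementation), the code below computes an in-batch ranking loss on truncated embedding prefixes and sums the results. The function names, the toy 4-dimensional batch, and the `scale=20.0` cosine scaling are assumptions made for this example; the scale is believed to match the library default but is not stated in this card.

```python
import math

def cos_sim(a, b):
    """Cosine similarity of two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy where anchor i should score positive i above every other positive in the batch."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        log_sum_exp = math.log(sum(math.exp(z) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over truncated embedding prefixes."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        a = [v[:dim] for v in anchors]
        p = [v[:dim] for v in positives]
        loss += weight * in_batch_ranking_loss(a, p)
    return loss

# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]
positives = [[0.9, 0.1, 0.4, 0.0], [0.1, 0.9, 0.0, 0.6]]

total_loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(total_loss)
```

Summing the loss over every prefix length pushes the leading components to carry most of the ranking signal, which is what makes truncated embeddings usable at inference time.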
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: epoch
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `gradient_accumulation_steps`: 16
+ - `learning_rate`: 2e-05
+ - `num_train_epochs`: 10
+ - `lr_scheduler_type`: cosine
+ - `warmup_ratio`: 0.1
+ - `bf16`: True
+ - `tf32`: True
+ - `load_best_model_at_end`: True
+ - `optim`: adamw_torch_fused
+ - `batch_sampler`: no_duplicates
+
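The batch-size settings above combine into a larger effective batch per optimizer update. The arithmetic below is a small sketch of that relationship and of how the `Epoch` column in the training logs relates to the `Step` column. The single-device assumption (`num_devices = 1`) is not stated in the card; it is chosen because it reproduces the logged epoch fractions.

```python
train_samples = 169_213       # "Size" reported under Training Dataset
per_device_batch = 32         # per_device_train_batch_size
accumulation_steps = 16       # gradient_accumulation_steps
num_devices = 1               # assumption: the card does not state the device count

# Samples consumed per optimizer update.
effective_batch = per_device_batch * accumulation_steps * num_devices
# Optimizer updates needed to see the whole training set once.
steps_per_epoch = train_samples / effective_batch

def epoch_at(step):
    """Fraction of an epoch completed after `step` optimizer updates."""
    return step / steps_per_epoch

print(effective_batch)          # 512
print(round(epoch_at(10), 4))   # 0.0303
```

Under this assumption, step 10 corresponds to epoch 0.0303, which matches the first row of the training logs.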
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: epoch
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 32
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 16
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 2e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 10
+ - `max_steps`: -1
+ - `lr_scheduler_type`: cosine
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: True
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: True
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
743
+ ### Training Logs
744
+ <details><summary>Click to expand</summary>
745
+
746
+ | Epoch | Step | Training Loss | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_512_cosine_map@100 | dim_64_cosine_map@100 | dim_768_cosine_map@100 |
747
+ |:----------:|:--------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|:----------------------:|
748
+ | 0.0303 | 10 | 2.2113 | - | - | - | - | - |
749
+ | 0.0605 | 20 | 2.1051 | - | - | - | - | - |
750
+ | 0.0908 | 30 | 1.9214 | - | - | - | - | - |
751
+ | 0.1210 | 40 | 1.744 | - | - | - | - | - |
752
+ | 0.1513 | 50 | 1.5873 | - | - | - | - | - |
753
+ | 0.1815 | 60 | 1.3988 | - | - | - | - | - |
754
+ | 0.2118 | 70 | 1.263 | - | - | - | - | - |
755
+ | 0.2421 | 80 | 1.1082 | - | - | - | - | - |
756
+ | 0.2723 | 90 | 1.0061 | - | - | - | - | - |
757
+ | 0.3026 | 100 | 1.0127 | - | - | - | - | - |
758
+ | 0.3328 | 110 | 0.8644 | - | - | - | - | - |
759
+ | 0.3631 | 120 | 0.8006 | - | - | - | - | - |
760
+ | 0.3933 | 130 | 0.8067 | - | - | - | - | - |
761
+ | 0.4236 | 140 | 0.7624 | - | - | - | - | - |
762
+ | 0.4539 | 150 | 0.799 | - | - | - | - | - |
763
+ | 0.4841 | 160 | 0.7025 | - | - | - | - | - |
764
+ | 0.5144 | 170 | 0.7467 | - | - | - | - | - |
765
+ | 0.5446 | 180 | 0.7509 | - | - | - | - | - |
766
+ | 0.5749 | 190 | 0.7057 | - | - | - | - | - |
767
+ | 0.6051 | 200 | 0.6929 | - | - | - | - | - |
768
+ | 0.6354 | 210 | 0.6948 | - | - | - | - | - |
769
+ | 0.6657 | 220 | 0.6477 | - | - | - | - | - |
770
+ | 0.6959 | 230 | 0.6562 | - | - | - | - | - |
771
+ | 0.7262 | 240 | 0.6278 | - | - | - | - | - |
772
+ | 0.7564 | 250 | 0.6249 | - | - | - | - | - |
773
+ | 0.7867 | 260 | 0.6057 | - | - | - | - | - |
774
+ | 0.8169 | 270 | 0.6258 | - | - | - | - | - |
775
+ | 0.8472 | 280 | 0.5007 | - | - | - | - | - |
776
+ | 0.8775 | 290 | 0.5998 | - | - | - | - | - |
777
+ | 0.9077 | 300 | 0.5958 | - | - | - | - | - |
778
+ | 0.9380 | 310 | 0.5568 | - | - | - | - | - |
779
+ | 0.9682 | 320 | 0.5236 | - | - | - | - | - |
780
+ | 0.9985 | 330 | 0.6239 | 0.3189 | 0.3389 | 0.3645 | 0.3046 | 0.3700 |
781
+ | 1.0287 | 340 | 0.5106 | - | - | - | - | - |
782
+ | 1.0590 | 350 | 0.6022 | - | - | - | - | - |
783
+ | 1.0893 | 360 | 0.5822 | - | - | - | - | - |
784
+ | 1.1195 | 370 | 0.5094 | - | - | - | - | - |
785
+ | 1.1498 | 380 | 0.5037 | - | - | - | - | - |
786
+ | 1.1800 | 390 | 0.5415 | - | - | - | - | - |
787
+ | 1.2103 | 400 | 0.5011 | - | - | - | - | - |
788
+ | 1.2405 | 410 | 0.4571 | - | - | - | - | - |
789
+ | 1.2708 | 420 | 0.4587 | - | - | - | - | - |
790
+ | 1.3011 | 430 | 0.5065 | - | - | - | - | - |
791
+ | 1.3313 | 440 | 0.4589 | - | - | - | - | - |
792
+ | 1.3616 | 450 | 0.4165 | - | - | - | - | - |
793
+ | 1.3918 | 460 | 0.4215 | - | - | - | - | - |
794
+ | 1.4221 | 470 | 0.4302 | - | - | - | - | - |
795
+ | 1.4523 | 480 | 0.4556 | - | - | - | - | - |
796
+ | 1.4826 | 490 | 0.3793 | - | - | - | - | - |
797
+ | 1.5129 | 500 | 0.4586 | - | - | - | - | - |
798
+ | 1.5431 | 510 | 0.4327 | - | - | - | - | - |
799
+ | 1.5734 | 520 | 0.4207 | - | - | - | - | - |
800
+ | 1.6036 | 530 | 0.4042 | - | - | - | - | - |
801
+ | 1.6339 | 540 | 0.4019 | - | - | - | - | - |
802
+ | 1.6641 | 550 | 0.3804 | - | - | - | - | - |
803
+ | 1.6944 | 560 | 0.3796 | - | - | - | - | - |
804
+ | 1.7247 | 570 | 0.3476 | - | - | - | - | - |
805
+ | 1.7549 | 580 | 0.3871 | - | - | - | - | - |
806
+ | 1.7852 | 590 | 0.3602 | - | - | - | - | - |
807
+ | 1.8154 | 600 | 0.3711 | - | - | - | - | - |
808
+ | 1.8457 | 610 | 0.2879 | - | - | - | - | - |
809
+ | 1.8759 | 620 | 0.3497 | - | - | - | - | - |
810
+ | 1.9062 | 630 | 0.3346 | - | - | - | - | - |
811
+ | 1.9365 | 640 | 0.3426 | - | - | - | - | - |
812
+ | 1.9667 | 650 | 0.2977 | - | - | - | - | - |
813
+ | 1.9970 | 660 | 0.3783 | - | - | - | - | - |
814
+ | 2.0 | 661 | - | 0.3282 | 0.3485 | 0.3749 | 0.2960 | 0.3666 |
815
+ | 2.0272 | 670 | 0.3012 | - | - | - | - | - |
816
+ | 2.0575 | 680 | 0.3491 | - | - | - | - | - |
817
+ | 2.0877 | 690 | 0.3589 | - | - | - | - | - |
818
+ | 2.1180 | 700 | 0.2998 | - | - | - | - | - |
819
+ | 2.1483 | 710 | 0.2925 | - | - | - | - | - |
820
+ | 2.1785 | 720 | 0.3261 | - | - | - | - | - |
821
+ | 2.2088 | 730 | 0.2917 | - | - | - | - | - |
822
+ | 2.2390 | 740 | 0.2685 | - | - | - | - | - |
823
+ | 2.2693 | 750 | 0.2674 | - | - | - | - | - |
824
+ | 2.2995 | 760 | 0.3136 | - | - | - | - | - |
825
+ | 2.3298 | 770 | 0.2631 | - | - | - | - | - |
826
+ | 2.3601 | 780 | 0.2509 | - | - | - | - | - |
827
+ | 2.3903 | 790 | 0.2518 | - | - | - | - | - |
828
+ | 2.4206 | 800 | 0.2603 | - | - | - | - | - |
829
+ | 2.4508 | 810 | 0.2773 | - | - | - | - | - |
830
+ | 2.4811 | 820 | 0.245 | - | - | - | - | - |
831
+ | 2.5113 | 830 | 0.2746 | - | - | - | - | - |
832
+ | 2.5416 | 840 | 0.2747 | - | - | - | - | - |
833
+ | 2.5719 | 850 | 0.2426 | - | - | - | - | - |
834
+ | 2.6021 | 860 | 0.2593 | - | - | - | - | - |
835
+ | 2.6324 | 870 | 0.2482 | - | - | - | - | - |
836
+ | 2.6626 | 880 | 0.2344 | - | - | - | - | - |
837
+ | 2.6929 | 890 | 0.2452 | - | - | - | - | - |
838
+ | 2.7231 | 900 | 0.218 | - | - | - | - | - |
839
+ | 2.7534 | 910 | 0.2319 | - | - | - | - | - |
840
+ | 2.7837 | 920 | 0.2366 | - | - | - | - | - |
841
+ | 2.8139 | 930 | 0.2265 | - | - | - | - | - |
842
+ | 2.8442 | 940 | 0.1753 | - | - | - | - | - |
843
+ | 2.8744 | 950 | 0.2153 | - | - | - | - | - |
844
+ | 2.9047 | 960 | 0.201 | - | - | - | - | - |
845
+ | 2.9349 | 970 | 0.2205 | - | - | - | - | - |
846
+ | 2.9652 | 980 | 0.1933 | - | - | - | - | - |
847
+ | 2.9955 | 990 | 0.2301 | - | - | - | - | - |
848
+ | 2.9985 | 991 | - | 0.3285 | 0.3484 | 0.3636 | 0.2966 | 0.3660 |
849
+ | 3.0257 | 1000 | 0.1946 | - | - | - | - | - |
850
+ | 3.0560 | 1010 | 0.203 | - | - | - | - | - |
851
+ | 3.0862 | 1020 | 0.2385 | - | - | - | - | - |
852
+ | 3.1165 | 1030 | 0.1821 | - | - | - | - | - |
853
+ | 3.1467 | 1040 | 0.1858 | - | - | - | - | - |
854
+ | 3.1770 | 1050 | 0.2057 | - | - | - | - | - |
855
+ | 3.2073 | 1060 | 0.18 | - | - | - | - | - |
856
+ | 3.2375 | 1070 | 0.1751 | - | - | - | - | - |
857
+ | 3.2678 | 1080 | 0.1539 | - | - | - | - | - |
858
+ | 3.2980 | 1090 | 0.2153 | - | - | - | - | - |
859
+ | 3.3283 | 1100 | 0.1739 | - | - | - | - | - |
860
+ | 3.3585 | 1110 | 0.1621 | - | - | - | - | - |
861
+ | 3.3888 | 1120 | 0.1541 | - | - | - | - | - |
862
+ | 3.4191 | 1130 | 0.1642 | - | - | - | - | - |
863
+ | 3.4493 | 1140 | 0.1893 | - | - | - | - | - |
864
+ | 3.4796 | 1150 | 0.16 | - | - | - | - | - |
865
+ | 3.5098 | 1160 | 0.1839 | - | - | - | - | - |
866
+ | 3.5401 | 1170 | 0.1748 | - | - | - | - | - |
867
+ | 3.5703 | 1180 | 0.1499 | - | - | - | - | - |
868
+ | 3.6006 | 1190 | 0.1706 | - | - | - | - | - |
869
+ | 3.6309 | 1200 | 0.1541 | - | - | - | - | - |
870
+ | 3.6611 | 1210 | 0.1592 | - | - | - | - | - |
871
+ | 3.6914 | 1220 | 0.1683 | - | - | - | - | - |
872
+ | 3.7216 | 1230 | 0.1408 | - | - | - | - | - |
873
+ | 3.7519 | 1240 | 0.1595 | - | - | - | - | - |
874
+ | 3.7821 | 1250 | 0.1585 | - | - | - | - | - |
875
+ | 3.8124 | 1260 | 0.1521 | - | - | - | - | - |
876
+ | 3.8427 | 1270 | 0.1167 | - | - | - | - | - |
877
+ | 3.8729 | 1280 | 0.1416 | - | - | - | - | - |
878
+ | 3.9032 | 1290 | 0.1386 | - | - | - | - | - |
879
+ | 3.9334 | 1300 | 0.1513 | - | - | - | - | - |
880
+ | 3.9637 | 1310 | 0.1329 | - | - | - | - | - |
881
+ | 3.9939 | 1320 | 0.1565 | - | - | - | - | - |
882
+ | 4.0 | 1322 | - | 0.3270 | 0.3575 | 0.3636 | 0.3053 | 0.3660 |
883
+ | 4.0242 | 1330 | 0.1253 | - | - | - | - | - |
884
+ | 4.0545 | 1340 | 0.1325 | - | - | - | - | - |
885
+ | 4.0847 | 1350 | 0.1675 | - | - | - | - | - |
886
+ | 4.1150 | 1360 | 0.1291 | - | - | - | - | - |
887
+ | 4.1452 | 1370 | 0.1259 | - | - | - | - | - |
888
+ | 4.1755 | 1380 | 0.1359 | - | - | - | - | - |
889
+ | 4.2057 | 1390 | 0.1344 | - | - | - | - | - |
890
+ | 4.2360 | 1400 | 0.1187 | - | - | - | - | - |
891
+ | 4.2663 | 1410 | 0.1062 | - | - | - | - | - |
892
+ | 4.2965 | 1420 | 0.1653 | - | - | - | - | - |
893
+ | 4.3268 | 1430 | 0.1164 | - | - | - | - | - |
894
+ | 4.3570 | 1440 | 0.103 | - | - | - | - | - |
895
+ | 4.3873 | 1450 | 0.1093 | - | - | - | - | - |
896
+ | 4.4175 | 1460 | 0.1156 | - | - | - | - | - |
897
+ | 4.4478 | 1470 | 0.1195 | - | - | - | - | - |
898
+ | 4.4781 | 1480 | 0.1141 | - | - | - | - | - |
899
+ | 4.5083 | 1490 | 0.1233 | - | - | - | - | - |
900
+ | 4.5386 | 1500 | 0.1169 | - | - | - | - | - |
901
+ | 4.5688 | 1510 | 0.0957 | - | - | - | - | - |
902
+ | 4.5991 | 1520 | 0.1147 | - | - | - | - | - |
903
+ | 4.6293 | 1530 | 0.1134 | - | - | - | - | - |
904
+ | 4.6596 | 1540 | 0.1143 | - | - | - | - | - |
905
+ | 4.6899 | 1550 | 0.1125 | - | - | - | - | - |
906
+ | 4.7201 | 1560 | 0.0988 | - | - | - | - | - |
907
+ | 4.7504 | 1570 | 0.1149 | - | - | - | - | - |
908
+ | 4.7806 | 1580 | 0.1154 | - | - | - | - | - |
909
+ | 4.8109 | 1590 | 0.1043 | - | - | - | - | - |
910
+ | 4.8411 | 1600 | 0.0887 | - | - | - | - | - |
911
+ | 4.8714 | 1610 | 0.0921 | - | - | - | - | - |
912
+ | 4.9017 | 1620 | 0.1023 | - | - | - | - | - |
913
+ | 4.9319 | 1630 | 0.1078 | - | - | - | - | - |
914
+ | 4.9622 | 1640 | 0.1053 | - | - | - | - | - |
915
+ | 4.9924 | 1650 | 0.1135 | - | - | - | - | - |
916
+ | 4.9985 | 1652 | - | 0.3402 | 0.3620 | 0.3781 | 0.3236 | 0.3842 |
917
+ | 5.0227 | 1660 | 0.0908 | - | - | - | - | - |
918
+ | 5.0530 | 1670 | 0.0908 | - | - | - | - | - |
919
+ | 5.0832 | 1680 | 0.1149 | - | - | - | - | - |
920
+ | 5.1135 | 1690 | 0.0991 | - | - | - | - | - |
921
+ | 5.1437 | 1700 | 0.0864 | - | - | - | - | - |
922
+ | 5.1740 | 1710 | 0.0987 | - | - | - | - | - |
923
+ | 5.2042 | 1720 | 0.0949 | - | - | - | - | - |
924
+ | 5.2345 | 1730 | 0.0893 | - | - | - | - | - |
925
+ | 5.2648 | 1740 | 0.0806 | - | - | - | - | - |
926
+ | 5.2950 | 1750 | 0.1187 | - | - | - | - | - |
927
+ | 5.3253 | 1760 | 0.0851 | - | - | - | - | - |
928
+ | 5.3555 | 1770 | 0.0814 | - | - | - | - | - |
929
+ | 5.3858 | 1780 | 0.0803 | - | - | - | - | - |
930
+ | 5.4160 | 1790 | 0.0816 | - | - | - | - | - |
931
+ | 5.4463 | 1800 | 0.0916 | - | - | - | - | - |
932
+ | 5.4766 | 1810 | 0.0892 | - | - | - | - | - |
933
+ | 5.5068 | 1820 | 0.0935 | - | - | - | - | - |
934
+ | 5.5371 | 1830 | 0.0963 | - | - | - | - | - |
935
+ | 5.5673 | 1840 | 0.0759 | - | - | - | - | - |
936
+ | 5.5976 | 1850 | 0.0908 | - | - | - | - | - |
937
+ | 5.6278 | 1860 | 0.0896 | - | - | - | - | - |
938
+ | 5.6581 | 1870 | 0.0855 | - | - | - | - | - |
939
+ | 5.6884 | 1880 | 0.0849 | - | - | - | - | - |
940
+ | 5.7186 | 1890 | 0.0805 | - | - | - | - | - |
941
+ | 5.7489 | 1900 | 0.0872 | - | - | - | - | - |
942
+ | 5.7791 | 1910 | 0.0853 | - | - | - | - | - |
943
+ | 5.8094 | 1920 | 0.0856 | - | - | - | - | - |
944
+ | 5.8396 | 1930 | 0.064 | - | - | - | - | - |
945
+ | 5.8699 | 1940 | 0.0748 | - | - | - | - | - |
946
+ | 5.9002 | 1950 | 0.0769 | - | - | - | - | - |
947
+ | 5.9304 | 1960 | 0.0868 | - | - | - | - | - |
948
+ | 5.9607 | 1970 | 0.0842 | - | - | - | - | - |
949
+ | 5.9909 | 1980 | 0.0825 | - | - | - | - | - |
950
+ | 6.0 | 1983 | - | 0.3412 | 0.3542 | 0.3615 | 0.3171 | 0.3676 |
951
+ | 6.0212 | 1990 | 0.073 | - | - | - | - | - |
952
+ | 6.0514 | 2000 | 0.0708 | - | - | - | - | - |
953
+ | 6.0817 | 2010 | 0.0908 | - | - | - | - | - |
954
+ | 6.1120 | 2020 | 0.0807 | - | - | - | - | - |
955
+ | 6.1422 | 2030 | 0.0665 | - | - | - | - | - |
956
+ | 6.1725 | 2040 | 0.0773 | - | - | - | - | - |
957
+ | 6.2027 | 2050 | 0.0798 | - | - | - | - | - |
958
+ | 6.2330 | 2060 | 0.0743 | - | - | - | - | - |
959
+ | 6.2632 | 2070 | 0.0619 | - | - | - | - | - |
960
+ | 6.2935 | 2080 | 0.0954 | - | - | - | - | - |
961
+ | 6.3238 | 2090 | 0.0682 | - | - | - | - | - |
962
+ | 6.3540 | 2100 | 0.0594 | - | - | - | - | - |
963
+ | 6.3843 | 2110 | 0.0621 | - | - | - | - | - |
964
+ | 6.4145 | 2120 | 0.0674 | - | - | - | - | - |
965
+ | 6.4448 | 2130 | 0.069 | - | - | - | - | - |
966
+ | 6.4750 | 2140 | 0.0741 | - | - | - | - | - |
967
+ | 6.5053 | 2150 | 0.0757 | - | - | - | - | - |
968
+ | 6.5356 | 2160 | 0.0781 | - | - | - | - | - |
969
+ | 6.5658 | 2170 | 0.0632 | - | - | - | - | - |
970
+ | 6.5961 | 2180 | 0.07 | - | - | - | - | - |
971
+ | 6.6263 | 2190 | 0.0767 | - | - | - | - | - |
972
+ | 6.6566 | 2200 | 0.0674 | - | - | - | - | - |
973
+ | 6.6868 | 2210 | 0.0704 | - | - | - | - | - |
974
+ | 6.7171 | 2220 | 0.065 | - | - | - | - | - |
975
+ | 6.7474 | 2230 | 0.066 | - | - | - | - | - |
976
+ | 6.7776 | 2240 | 0.0752 | - | - | - | - | - |
977
+ | 6.8079 | 2250 | 0.07 | - | - | - | - | - |
978
+ | 6.8381 | 2260 | 0.0602 | - | - | - | - | - |
979
+ | 6.8684 | 2270 | 0.0595 | - | - | - | - | - |
980
+ | 6.8986 | 2280 | 0.065 | - | - | - | - | - |
981
+ | 6.9289 | 2290 | 0.0677 | - | - | - | - | - |
982
+ | 6.9592 | 2300 | 0.0708 | - | - | - | - | - |
983
+ | 6.9894 | 2310 | 0.0651 | - | - | - | - | - |
984
+ | **6.9985** | **2313** | **-** | **0.3484** | **0.3671** | **0.3645** | **0.3214** | **0.3773** |
985
+ | 7.0197 | 2320 | 0.0657 | - | - | - | - | - |
986
+ | 7.0499 | 2330 | 0.0588 | - | - | - | - | - |
987
+ | 7.0802 | 2340 | 0.0701 | - | - | - | - | - |
988
+ | 7.1104 | 2350 | 0.0689 | - | - | - | - | - |
989
+ | 7.1407 | 2360 | 0.0586 | - | - | - | - | - |
990
+ | 7.1710 | 2370 | 0.0626 | - | - | - | - | - |
991
+ | 7.2012 | 2380 | 0.0723 | - | - | - | - | - |
992
+ | 7.2315 | 2390 | 0.0602 | - | - | - | - | - |
993
+ | 7.2617 | 2400 | 0.0541 | - | - | - | - | - |
994
+ | 7.2920 | 2410 | 0.0823 | - | - | - | - | - |
995
+ | 7.3222 | 2420 | 0.0592 | - | - | - | - | - |
996
+ | 7.3525 | 2430 | 0.0535 | - | - | - | - | - |
997
+ | 7.3828 | 2440 | 0.0548 | - | - | - | - | - |
998
+ | 7.4130 | 2450 | 0.0598 | - | - | - | - | - |
999
+ | 7.4433 | 2460 | 0.0554 | - | - | - | - | - |
1000
+ | 7.4735 | 2470 | 0.0663 | - | - | - | - | - |
1001
+ | 7.5038 | 2480 | 0.0645 | - | - | - | - | - |
1002
+ | 7.5340 | 2490 | 0.0638 | - | - | - | - | - |
1003
+ | 7.5643 | 2500 | 0.0574 | - | - | - | - | - |
1004
+ | 7.5946 | 2510 | 0.0608 | - | - | - | - | - |
1005
+ | 7.6248 | 2520 | 0.0633 | - | - | - | - | - |
1006
+ | 7.6551 | 2530 | 0.0576 | - | - | - | - | - |
1007
+ | 7.6853 | 2540 | 0.0613 | - | - | - | - | - |
1008
+ | 7.7156 | 2550 | 0.054 | - | - | - | - | - |
1009
+ | 7.7458 | 2560 | 0.0591 | - | - | - | - | - |
1010
+ | 7.7761 | 2570 | 0.0659 | - | - | - | - | - |
1011
+ | 7.8064 | 2580 | 0.0601 | - | - | - | - | - |
1012
+ | 7.8366 | 2590 | 0.053 | - | - | - | - | - |
1013
+ | 7.8669 | 2600 | 0.0536 | - | - | - | - | - |
1014
+ | 7.8971 | 2610 | 0.0581 | - | - | - | - | - |
1015
+ | 7.9274 | 2620 | 0.0603 | - | - | - | - | - |
1016
+ | 7.9576 | 2630 | 0.0661 | - | - | - | - | - |
1017
+ | 7.9879 | 2640 | 0.0588 | - | - | - | - | - |
1018
+ | 8.0 | 2644 | - | 0.3340 | 0.3533 | 0.3541 | 0.3163 | 0.3651 |
1019
+ | 8.0182 | 2650 | 0.0559 | - | - | - | - | - |
1020
+ | 8.0484 | 2660 | 0.0566 | - | - | - | - | - |
1021
+ | 8.0787 | 2670 | 0.0666 | - | - | - | - | - |
1022
+ | 8.1089 | 2680 | 0.0601 | - | - | - | - | - |
1023
+ | 8.1392 | 2690 | 0.0522 | - | - | - | - | - |
1024
+ | 8.1694 | 2700 | 0.0527 | - | - | - | - | - |
1025
+ | 8.1997 | 2710 | 0.0622 | - | - | - | - | - |
1026
+ | 8.2300 | 2720 | 0.0577 | - | - | - | - | - |
1027
+ | 8.2602 | 2730 | 0.0467 | - | - | - | - | - |
1028
+ | 8.2905 | 2740 | 0.0762 | - | - | - | - | - |
1029
+ | 8.3207 | 2750 | 0.0562 | - | - | - | - | - |
1030
+ | 8.3510 | 2760 | 0.0475 | - | - | - | - | - |
1031
+ | 8.3812 | 2770 | 0.0482 | - | - | - | - | - |
1032
+ | 8.4115 | 2780 | 0.0536 | - | - | - | - | - |
1033
+ | 8.4418 | 2790 | 0.0534 | - | - | - | - | - |
1034
+ | 8.4720 | 2800 | 0.0588 | - | - | - | - | - |
1035
+ | 8.5023 | 2810 | 0.0597 | - | - | - | - | - |
1036
+ | 8.5325 | 2820 | 0.0587 | - | - | - | - | - |
1037
+ | 8.5628 | 2830 | 0.0544 | - | - | - | - | - |
1038
+ | 8.5930 | 2840 | 0.0577 | - | - | - | - | - |
1039
+ | 8.6233 | 2850 | 0.0592 | - | - | - | - | - |
1040
+ | 8.6536 | 2860 | 0.0554 | - | - | - | - | - |
1041
+ | 8.6838 | 2870 | 0.0541 | - | - | - | - | - |
1042
+ | 8.7141 | 2880 | 0.0495 | - | - | - | - | - |
1043
+ | 8.7443 | 2890 | 0.0547 | - | - | - | - | - |
1044
+ | 8.7746 | 2900 | 0.0646 | - | - | - | - | - |
1045
+ | 8.8048 | 2910 | 0.0574 | - | - | - | - | - |
1046
+ | 8.8351 | 2920 | 0.0486 | - | - | - | - | - |
1047
+ | 8.8654 | 2930 | 0.0517 | - | - | - | - | - |
1048
+ | 8.8956 | 2940 | 0.0572 | - | - | - | - | - |
1049
+ | 8.9259 | 2950 | 0.0518 | - | - | - | - | - |
1050
+ | 8.9561 | 2960 | 0.0617 | - | - | - | - | - |
1051
+ | 8.9864 | 2970 | 0.0572 | - | - | - | - | - |
1052
+ | 8.9985 | 2974 | - | 0.3434 | 0.3552 | 0.3694 | 0.3253 | 0.3727 |
1053
+ | 9.0166 | 2980 | 0.0549 | - | - | - | - | - |
1054
+ | 9.0469 | 2990 | 0.0471 | - | - | - | - | - |
1055
+ | 9.0772 | 3000 | 0.0629 | - | - | - | - | - |
1056
+ | 9.1074 | 3010 | 0.058 | - | - | - | - | - |
1057
+ | 9.1377 | 3020 | 0.0531 | - | - | - | - | - |
1058
+ | 9.1679 | 3030 | 0.051 | - | - | - | - | - |
1059
+ | 9.1982 | 3040 | 0.0593 | - | - | - | - | - |
1060
+ | 9.2284 | 3050 | 0.056 | - | - | - | - | - |
1061
+ | 9.2587 | 3060 | 0.0452 | - | - | - | - | - |
1062
+ | 9.2890 | 3070 | 0.0672 | - | - | - | - | - |
1063
+ | 9.3192 | 3080 | 0.0547 | - | - | - | - | - |
1064
+ | 9.3495 | 3090 | 0.0477 | - | - | - | - | - |
1065
+ | 9.3797 | 3100 | 0.0453 | - | - | - | - | - |
1066
+ | 9.4100 | 3110 | 0.0542 | - | - | - | - | - |
1067
+ | 9.4402 | 3120 | 0.0538 | - | - | - | - | - |
1068
+ | 9.4705 | 3130 | 0.0552 | - | - | - | - | - |
1069
+ | 9.5008 | 3140 | 0.0586 | - | - | - | - | - |
1070
+ | 9.5310 | 3150 | 0.0567 | - | - | - | - | - |
1071
+ | 9.5613 | 3160 | 0.0499 | - | - | - | - | - |
1072
+ | 9.5915 | 3170 | 0.0598 | - | - | - | - | - |
1073
+ | 9.6218 | 3180 | 0.0546 | - | - | - | - | - |
1074
+ | 9.6520 | 3190 | 0.0513 | - | - | - | - | - |
1075
+ | 9.6823 | 3200 | 0.0549 | - | - | - | - | - |
1076
+ | 9.7126 | 3210 | 0.0513 | - | - | - | - | - |
1077
+ | 9.7428 | 3220 | 0.0536 | - | - | - | - | - |
1078
+ | 9.7731 | 3230 | 0.0588 | - | - | - | - | - |
1079
+ | 9.8033 | 3240 | 0.0531 | - | - | - | - | - |
1080
+ | 9.8336 | 3250 | 0.0472 | - | - | - | - | - |
1081
+ | 9.8638 | 3260 | 0.0486 | - | - | - | - | - |
1082
+ | 9.8941 | 3270 | 0.0576 | - | - | - | - | - |
1083
+ | 9.9244 | 3280 | 0.0526 | - | - | - | - | - |
1084
+ | 9.9546 | 3290 | 0.0568 | - | - | - | - | - |
1085
+ | 9.9849 | 3300 | 0.0617 | 0.3333 | 0.3395 | 0.3504 | 0.3078 | 0.3464 |
1086
+
1087
+ * The bold row denotes the saved checkpoint.
1088
+ </details>
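The evaluation rows in the training log can be compared programmatically. The sketch below copies the ten evaluation rows from the table by hand and reports, for each embedding dimension, the step with the highest cosine MAP@100. The names `EVAL_ROWS` and `best_step_per_dim` are illustrative only, and the metric the trainer actually used to select the saved checkpoint is not stated in this section.

```python
# Evaluation rows copied from the training log above:
# step -> cosine MAP@100 at each embedding dimension.
EVAL_ROWS = {
    330:  {128: 0.3189, 256: 0.3389, 512: 0.3645, 64: 0.3046, 768: 0.3700},
    661:  {128: 0.3282, 256: 0.3485, 512: 0.3749, 64: 0.2960, 768: 0.3666},
    991:  {128: 0.3285, 256: 0.3484, 512: 0.3636, 64: 0.2966, 768: 0.3660},
    1322: {128: 0.3270, 256: 0.3575, 512: 0.3636, 64: 0.3053, 768: 0.3660},
    1652: {128: 0.3402, 256: 0.3620, 512: 0.3781, 64: 0.3236, 768: 0.3842},
    1983: {128: 0.3412, 256: 0.3542, 512: 0.3615, 64: 0.3171, 768: 0.3676},
    2313: {128: 0.3484, 256: 0.3671, 512: 0.3645, 64: 0.3214, 768: 0.3773},
    2644: {128: 0.3340, 256: 0.3533, 512: 0.3541, 64: 0.3163, 768: 0.3651},
    2974: {128: 0.3434, 256: 0.3552, 512: 0.3694, 64: 0.3253, 768: 0.3727},
    3300: {128: 0.3333, 256: 0.3395, 512: 0.3504, 64: 0.3078, 768: 0.3464},
}


def best_step_per_dim(rows):
    """Return {dimension: step with the highest score at that dimension}."""
    dims = sorted(next(iter(rows.values())))
    return {d: max(rows, key=lambda step: rows[step][d]) for d in dims}


if __name__ == "__main__":
    for dim, step in best_step_per_dim(EVAL_ROWS).items():
        print(f"dim {dim}: best at step {step} ({EVAL_ROWS[step][dim]:.4f})")
```

On these numbers, the bold row (step 2313) is the best row at dimensions 128 and 256, while step 1652 scores higher at 512 and 768 and step 2974 scores higher at 64.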
1089
+
1090
+ ### Framework Versions
+ - Python: 3.10.8
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.41.2
+ - PyTorch: 2.1.2+cu121
+ - Accelerate: 0.33.0
+ - Datasets: 2.19.1
+ - Tokenizers: 0.19.1
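To check a local environment against the versions listed above, the standard-library `importlib.metadata` module can be used. This is a minimal sketch: the PyPI distribution names in `EXPECTED` (for example `torch` for PyTorch) are assumptions, and `compare_versions` is a hypothetical helper, not part of any of the listed libraries.

```python
from importlib import metadata

# Versions from the "Framework Versions" list above.
# The distribution names used as keys are assumed, not taken from the source.
EXPECTED = {
    "sentence-transformers": "3.0.1",
    "transformers": "4.41.2",
    "torch": "2.1.2+cu121",
    "accelerate": "0.33.0",
    "datasets": "2.19.1",
    "tokenizers": "0.19.1",
}


def compare_versions(expected):
    """Map each package to (wanted, installed or None, exact match?)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions(EXPECTED).items():
        status = "ok" if ok else "differs"
        print(f"{name}: wanted {wanted}, found {installed} ({status})")
```

An exact string match is deliberately strict; a different patch release may well work, but that is not something this section confirms.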
1098
+
1099
+ ## Citation
1100
+
1101
+ ### BibTeX
1102
+
1103
+ #### Sentence Transformers
1104
+ ```bibtex
1105
+ @inproceedings{reimers-2019-sentence-bert,
1106
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
1107
+ author = "Reimers, Nils and Gurevych, Iryna",
1108
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
1109
+ month = "11",
1110
+ year = "2019",
1111
+ publisher = "Association for Computational Linguistics",
1112
+ url = "https://arxiv.org/abs/1908.10084",
1113
+ }
1114
+ ```
1115
+
1116
+ #### MatryoshkaLoss
1117
+ ```bibtex
1118
+ @misc{kusupati2024matryoshka,
1119
+ title={Matryoshka Representation Learning},
1120
+ author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
1121
+ year={2024},
1122
+ eprint={2205.13147},
1123
+ archivePrefix={arXiv},
1124
+ primaryClass={cs.LG}
1125
+ }
1126
+ ```
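The `dim_64` through `dim_768` columns in the training log score the same model at different embedding sizes, which is the core idea of Matryoshka Representation Learning: a prefix of the full embedding is itself usable as an embedding. A minimal sketch of that truncation step on plain Python lists follows. The helper names and the toy vectors are made up for illustration, and the sketch does not reproduce how the evaluator in this training run was configured.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # nothing to rescale
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy 4-dimensional "embeddings" standing in for 768-dimensional ones.
    query = [0.5, 0.5, 0.5, 0.5]
    doc = [0.6, 0.4, -0.2, 0.66]
    for dim in (2, 4):
        q = truncate_and_normalize(query, dim)
        d = truncate_and_normalize(doc, dim)
        print(dim, round(cosine(q, d), 4))
```

Re-normalizing after truncation matters because a prefix of a unit-length vector generally no longer has unit length.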
1127
+
1128
+ #### MultipleNegativesRankingLoss
1129
+ ```bibtex
1130
+ @misc{henderson2017efficient,
1131
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
1132
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
1133
+ year={2017},
1134
+ eprint={1705.00652},
1135
+ archivePrefix={arXiv},
1136
+ primaryClass={cs.CL}
1137
+ }
1138
+ ```
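The loss cited above treats each (anchor, positive) pair in a batch as the correct match and every other positive in the same batch as a negative. The sketch below shows that idea as softmax cross-entropy over scaled cosine similarities, written in plain Python. The function name `mnr_loss` is illustrative, and the `scale` of 20.0 is an assumption; this section does not state the value used for training.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i].

    All other positives in the batch serve as in-batch negatives.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the row of logits.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


if __name__ == "__main__":
    anchors = [[1.0, 0.0], [0.0, 1.0]]
    print(mnr_loss(anchors, anchors))        # aligned pairs: loss near zero
    print(mnr_loss(anchors, anchors[::-1]))  # swapped pairs: large loss
```

Because the negatives come from the rest of the batch, duplicate texts within a batch would be scored as false negatives, which is consistent with the `no_duplicates` batch sampler listed in the hyperparameters.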
1139
+
1140
+ <!--
1141
+ ## Glossary
1142
+
1143
+ *Clearly define terms in order to be accessible across audiences.*
1144
+ -->
1145
+
1146
+ <!--
1147
+ ## Model Card Authors
1148
+
1149
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
1150
+ -->
1151
+
1152
+ <!--
1153
+ ## Model Card Contact
1154
+
1155
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
1156
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "sentence-transformers/all-mpnet-base-v2",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.41.2",
+     "pytorch": "2.1.2+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e08692ee1c62197840f81476c5bd0c80af364303716d9374d6ec433778de5047
+ size 437967672
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
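`modules.json` chains three stages: a Transformer that emits one vector per token, a Pooling stage that reduces them to one vector per text, and a Normalize stage. The `1_Pooling/config.json` file earlier in this commit enables `pooling_mode_mean_tokens`, so the last two stages can be sketched as masked mean pooling followed by L2 normalization. This is a plain-Python illustration with made-up token vectors; `mean_pool` and `l2_normalize` are hypothetical helpers, not the library's implementation.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1; padding is ignored."""
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            for j, x in enumerate(vec):
                sums[j] += x
    count = max(sum(attention_mask), 1)  # guard against an all-padding input
    return [s / count for s in sums]


def l2_normalize(vec):
    """Rescale a vector to unit L2 norm (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


if __name__ == "__main__":
    # Three token vectors; the last one is padding and must not affect the mean.
    tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
    mask = [1, 1, 0]
    sentence_vec = l2_normalize(mean_pool(tokens, mask))
    print(sentence_vec)
```

With unit-length outputs, the dot product of two embeddings equals their cosine similarity.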
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 384,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,72 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "max_length": 128,
+   "model_max_length": 384,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff