baconnier committed on
Commit 7af9727 · verified · 1 Parent(s): 096408e

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 384,
+   "pooling_mode_cls_token": true,
+   "pooling_mode_mean_tokens": false,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
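Reviewer note: with `pooling_mode_cls_token` set to true and every other mode false, the sentence embedding is the hidden state of the first (`[CLS]`) token. A minimal NumPy sketch of that selection follows, assuming token embeddings shaped `(batch, seq_len, 384)`. The helper name `cls_pool` and the random input are illustrative only and are not the library's implementation.

```python
import numpy as np

def cls_pool(token_embeddings: np.ndarray) -> np.ndarray:
    """Select the first ([CLS]) token vector of each sequence."""
    # token_embeddings: (batch, seq_len, hidden) -> (batch, hidden)
    return token_embeddings[:, 0, :]

# Illustrative input: 2 sequences, 5 tokens each, 384-dimensional hidden states.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 5, 384))
pooled = cls_pool(tokens)
print(pooled.shape)  # (2, 384)
```

Mean or max pooling would instead aggregate over the `seq_len` axis; CLS pooling ignores every token except the first.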
README.md ADDED
@@ -0,0 +1,607 @@
1
+ ---
2
+ language: []
3
+ library_name: sentence-transformers
4
+ tags:
5
+ - sentence-transformers
6
+ - sentence-similarity
7
+ - feature-extraction
8
+ - generated_from_trainer
9
+ - dataset_size:15525
10
+ - loss:MultipleNegativesRankingLoss
11
+ base_model: BAAI/bge-small-en-v1.5
12
+ datasets:
13
+ - baconnier/finance_dataset_small_private
14
+ metrics:
15
+ - cosine_accuracy
16
+ - dot_accuracy
17
+ - manhattan_accuracy
18
+ - euclidean_accuracy
19
+ - max_accuracy
20
+ widget:
21
+ - source_sentence: What is the loan to value ratio (LTV) for Samantha's mortgage,
22
+ and how does it relate to the definition of LTV?
23
+ sentences:
24
+ - 'Loan amount = Home value - Down payment
25
+
26
+ Loan amount = $300,000 - $60,000 = $240,000
27
+
28
+ LTV = Loan amount ÷ Home value
29
+
30
+ LTV = $240,000 ÷ $300,000 = 0.8 or 80%
31
+
32
+ The LTV is the proportion of the property''s value financed by the loan.
33
+
34
+ The LTV for Samantha''s mortgage is 80%, which aligns with the definition of LTV
35
+ as the proportion of the property''s value financed by the loan.'
36
+ - 'LTV = Down payment ÷ Home value
37
+
38
+ LTV = $60,000 ÷ $300,000 = 0.2 or 20%
39
+
40
+ The LTV for Samantha''s mortgage is 20%.'
41
+ - What is a sale in the context of securities trading?
42
+ - source_sentence: What is greenmail, and how does it differ from a typical stock
43
+ acquisition?
44
+ sentences:
45
+ - 'Greenmail is when a company buys a small amount of stock in another company.
46
+ This is different from a normal stock purchase because the amount is small.
47
+
48
+ Greenmail is a small stock purchase, unlike a typical acquisition.'
49
+ - 'Greenmail is a corporate finance tactic where an unfriendly entity acquires a
50
+ large block of a target company''s stock, intending to force the target company
51
+ to buy back the shares at a significant premium to prevent a hostile takeover.
52
+ This differs from a typical stock acquisition, which is usually done for investment
53
+ purposes or to gain a smaller ownership stake, without the explicit intention
54
+ of forcing a buyback or threatening a takeover.
55
+
56
+ Greenmail is a tactic used by an unfriendly entity to force a target company to
57
+ buy back its shares at a premium to prevent a hostile takeover, while a typical
58
+ stock acquisition is done for investment or to gain a smaller ownership stake
59
+ without the intention of forcing a buyback or threatening a takeover.'
60
+ - What is the process of 'circling' in the context of underwriting a new share issue?
61
+ - source_sentence: 'ISOs are not taxed at grant or exercise. If shares are held for
62
+ 2 years from grant and 1 year from exercise, the profit is taxed as long-term
63
+ capital gain. If holding periods are not met, it''s a disqualifying disposition,
64
+ and the profit is taxed as ordinary income.
65
+
66
+ ISOs are tax-free at grant and exercise. Profit is taxed as capital gain or ordinary
67
+ income based on holding periods.'
68
+ sentences:
69
+ - 'Incentive Stock Options have no tax benefits and are taxed as ordinary income
70
+ when exercised.
71
+
72
+ ISOs are taxed as ordinary income when exercised.'
73
+ - What are the key characteristics of Incentive Stock Options (ISOs) in terms of
74
+ taxation?
75
+ - What is a short squeeze, and how does it affect stock prices?
76
+ - source_sentence: What is a sell order, and how does it relate to Maggie's decision
77
+ to sell her XYZ Corporation shares?
78
+ sentences:
79
+ - 'A performance fund is a growth-oriented mutual fund that invests primarily in
80
+ stocks of companies with high growth potential and low dividend payouts. These
81
+ funds are typically associated with higher risk compared to other types of mutual
82
+ funds. For example, balanced funds invest in a mix of stocks and bonds and have
83
+ a more moderate risk profile, while money market funds invest in low-risk, short-term
84
+ securities and offer lower returns. Performance funds aim for higher capital appreciation
85
+ but come with increased volatility.
86
+
87
+ A performance fund is a high-risk, growth-oriented mutual fund that invests in
88
+ stocks with high growth potential and low dividends, aiming for capital appreciation.
89
+ It differs from balanced funds (moderate risk, mix of stocks and bonds) and money
90
+ market funds (low risk, short-term securities, lower returns).'
91
+ - 'A sell order is when you want to buy shares of a stock. Maggie wanted to sell
92
+ her XYZ shares because the price was going up.
93
+
94
+ Maggie placed a sell order to buy XYZ shares since the price was increasing.'
95
+ - 'A sell order is an instruction given by an investor to a broker to sell a specific
96
+ financial asset at a certain price or market condition. Maggie placed a sell order
97
+ for 1,000 shares of XYZ Corporation at a limit price of $50 per share because
98
+ she believed the company''s recent acquisition announcement would negatively impact
99
+ the stock price in the short term.
100
+
101
+ Maggie placed a sell order to sell 1,000 XYZ shares at $50 or higher due to her
102
+ expectation of a short-term price decline following the company''s acquisition
103
+ announcement.'
104
+ - source_sentence: What is industrial production, and how is it measured by the Federal
105
+ Reserve Board?
106
+ sentences:
107
+ - What is triangular arbitrage, and how does it allow traders to profit from price
108
+ discrepancies across three different markets?
109
+ - 'Industrial production is a statistic that measures the output of factories and
110
+ mines in the US. It is released by the Federal Reserve Board every quarter.
111
+
112
+ Industrial production measures factory and mine output, released quarterly by
113
+ the Fed.'
114
+ - 'Industrial production is a statistic determined by the Federal Reserve Board
115
+ that measures the total output of all US factories and mines on a monthly basis.
116
+ The Fed collects data from various government agencies and trade associations
117
+ to calculate the industrial production index, which serves as an important economic
118
+ indicator, providing insight into the health of the manufacturing and mining sectors.
119
+
120
+ Industrial production is a monthly statistic calculated by the Federal Reserve
121
+ Board, measuring the total output of US factories and mines using data from government
122
+ agencies and trade associations, serving as a key economic indicator for the manufacturing
123
+ and mining sectors.'
124
+ pipeline_tag: sentence-similarity
125
+ model-index:
126
+ - name: SentenceTransformer based on BAAI/bge-small-en-v1.5
127
+ results:
128
+ - task:
129
+ type: triplet
130
+ name: Triplet
131
+ dataset:
132
+ name: Finance Embedding Metric
133
+ type: Finance_Embedding_Metric
134
+ metrics:
135
+ - type: cosine_accuracy
136
+ value: 0.9791425260718424
137
+ name: Cosine Accuracy
138
+ - type: dot_accuracy
139
+ value: 0.02085747392815759
140
+ name: Dot Accuracy
141
+ - type: manhattan_accuracy
142
+ value: 0.9779837775202781
143
+ name: Manhattan Accuracy
144
+ - type: euclidean_accuracy
145
+ value: 0.9791425260718424
146
+ name: Euclidean Accuracy
147
+ - type: max_accuracy
148
+ value: 0.9791425260718424
149
+ name: Max Accuracy
150
+ ---
151
+
152
+ # SentenceTransformer based on BAAI/bge-small-en-v1.5
153
+
154
+ This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) on the [baconnier/finance_dataset_small_private](https://huggingface.co/datasets/baconnier/finance_dataset_small_private) dataset. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
155
+
156
+ ## Model Details
157
+
158
+ ### Model Description
159
+ - **Model Type:** Sentence Transformer
160
+ - **Base model:** [BAAI/bge-small-en-v1.5](https://huggingface.co/BAAI/bge-small-en-v1.5) <!-- at revision 5c38ec7c405ec4b44b94cc5a9bb96e735b38267a -->
161
+ - **Maximum Sequence Length:** 512 tokens
162
+ - **Output Dimensionality:** 384 dimensions
163
+ - **Similarity Function:** Cosine Similarity
164
+ - **Training Dataset:**
165
+ - [baconnier/finance_dataset_small_private](https://huggingface.co/datasets/baconnier/finance_dataset_small_private)
166
+ <!-- - **Language:** Unknown -->
167
+ <!-- - **License:** Unknown -->
168
+
169
+ ### Model Sources
170
+
171
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
172
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
173
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
174
+
175
+ ### Full Model Architecture
176
+
177
+ ```
178
+ SentenceTransformer(
179
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
180
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
181
+ (2): Normalize()
182
+ )
183
+ ```
184
+
185
+ ## Usage
186
+
187
+ ### Direct Usage (Sentence Transformers)
188
+
189
+ First install the Sentence Transformers library:
190
+
191
+ ```bash
192
+ pip install -U sentence-transformers
193
+ ```
194
+
195
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("baconnier/Finance2_embedding_small_en-V1.5")
+ # Run inference
+ sentences = [
+     'What is industrial production, and how is it measured by the Federal Reserve Board?',
+     'Industrial production is a statistic determined by the Federal Reserve Board that measures the total output of all US factories and mines on a monthly basis. The Fed collects data from various government agencies and trade associations to calculate the industrial production index, which serves as an important economic indicator, providing insight into the health of the manufacturing and mining sectors.\nIndustrial production is a monthly statistic calculated by the Federal Reserve Board, measuring the total output of US factories and mines using data from government agencies and trade associations, serving as a key economic indicator for the manufacturing and mining sectors.',
+     'Industrial production is a statistic that measures the output of factories and mines in the US. It is released by the Federal Reserve Board every quarter.\nIndustrial production measures factory and mine output, released quarterly by the Fed.',
+ ]
+ # encode() returns a NumPy array by default: one 384-dimensional row per sentence
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # (3, 384)
+
+ # Get the pairwise similarity scores for the embeddings (a torch tensor)
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # torch.Size([3, 3])
+ ```
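Reviewer note: the similarity matrix in the usage snippet is typically used to rank candidate passages against a query. The sketch below shows that ranking step with plain NumPy so it runs without downloading the model. The function `rank_by_cosine` and the three-dimensional toy vectors are hypothetical stand-ins for real 384-dimensional embeddings.

```python
import numpy as np

def rank_by_cosine(query_vec: np.ndarray, doc_vecs: np.ndarray):
    """Return document indices sorted by descending cosine similarity to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarity per document
    order = np.argsort(-scores)         # highest score first
    return order, scores[order]

# Toy stand-ins for embeddings produced by model.encode(...)
query = np.array([1.0, 0.0, 0.0])
docs = np.array([
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # nearly parallel to the query
    [0.5, 0.5, 0.0],   # in between
])
order, scores = rank_by_cosine(query, docs)
print(order)  # [1 2 0]
```

Because the architecture in this card ends with a `Normalize()` module, real embeddings from the model are already unit length, so the explicit normalization here is only needed for the toy vectors.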
216
+
217
+ <!--
218
+ ### Direct Usage (Transformers)
219
+
220
+ <details><summary>Click to see the direct usage in Transformers</summary>
221
+
222
+ </details>
223
+ -->
224
+
225
+ <!--
226
+ ### Downstream Usage (Sentence Transformers)
227
+
228
+ You can finetune this model on your own dataset.
229
+
230
+ <details><summary>Click to expand</summary>
231
+
232
+ </details>
233
+ -->
234
+
235
+ <!--
236
+ ### Out-of-Scope Use
237
+
238
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
239
+ -->
240
+
241
+ ## Evaluation
242
+
243
+ ### Metrics
244
+
245
+ #### Triplet
246
+ * Dataset: `Finance_Embedding_Metric`
247
+ * Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)
248
+
+ | Metric | Value |
+ |:-------------------|:-----------|
+ | cosine_accuracy | 0.9791 |
+ | dot_accuracy | 0.0209 |
+ | manhattan_accuracy | 0.978 |
+ | euclidean_accuracy | 0.9791 |
+ | **max_accuracy** | **0.9791** |
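Reviewer note: triplet accuracy counts a triplet as correct when the anchor is closer to the positive than to the negative. A minimal sketch of the cosine variant follows, assuming the three inputs are row-aligned embedding matrices. The function names and toy vectors are illustrative and do not reproduce the `TripletEvaluator` source.

```python
import numpy as np

def paired_cosine(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Row-wise cosine similarity between two equally shaped matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)

def triplet_cosine_accuracy(anchors, positives, negatives) -> float:
    """Fraction of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    pos_sim = paired_cosine(anchors, positives)
    neg_sim = paired_cosine(anchors, negatives)
    return float(np.mean(pos_sim > neg_sim))

# Two toy triplets: the first is ranked correctly, the second is not.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
positives = np.array([[1.0, 0.1], [1.0, 0.0]])
negatives = np.array([[0.0, 1.0], [0.0, 1.0]])
accuracy = triplet_cosine_accuracy(anchors, positives, negatives)
print(accuracy)  # 0.5
```

The Manhattan and Euclidean variants follow the same pattern with a distance in place of a similarity, so the comparison flips to "positive distance is smaller".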
256
+
257
+ <!--
258
+ ## Bias, Risks and Limitations
259
+
260
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
261
+ -->
262
+
263
+ <!--
264
+ ### Recommendations
265
+
266
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
267
+ -->
268
+
269
+ ## Training Details
270
+
271
+ ### Training Dataset
272
+
273
+ #### baconnier/finance_dataset_small_private
274
+
275
+ * Dataset: [baconnier/finance_dataset_small_private](https://huggingface.co/datasets/baconnier/finance_dataset_small_private) at [d7e6492](https://huggingface.co/datasets/baconnier/finance_dataset_small_private/tree/d7e6492d2b42d28b49bbe5f2c91bf93f04b570cb)
276
+ * Size: 15,525 training samples
277
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
278
+ * Approximate statistics based on the first 1000 samples:
279
+ | | anchor | positive | negative |
280
+ |:--------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|
281
+ | type | string | string | string |
282
+ | details | <ul><li>min: 11 tokens</li><li>mean: 76.86 tokens</li><li>max: 304 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 79.23 tokens</li><li>max: 299 tokens</li></ul> | <ul><li>min: 14 tokens</li><li>mean: 60.36 tokens</li><li>max: 155 tokens</li></ul> |
283
+ * Samples:
284
+ | anchor | positive | negative |
285
+ |:---|:---|:---|
286
+ | <code>What is the key difference between a whole loan and a participation loan in terms of investment ownership?</code> | <code>The context clearly states that a whole loan is a type of investment where an investor purchases the entire mortgage loan from the original lender, becoming the sole owner. This is in contrast to a participation loan, where multiple investors share ownership of a single loan. Therefore, the key difference between a whole loan and a participation loan is that a whole loan is owned entirely by a single investor, while a participation loan involves shared ownership among multiple investors.<br>In a whole loan, a single investor owns the entire mortgage loan, while in a participation loan, multiple investors share ownership of the loan.</code> | <code>A whole loan is where multiple investors share ownership of a loan, while a participation loan is where an investor purchases the entire loan. Since the context states that a whole loan is where an investor purchases the entire mortgage loan and becomes the sole owner, this answer is incorrect.<br>A whole loan involves multiple investors, while a participation loan is owned by a single investor.</code> |
287
+ | <code>The role of an executor is to manage and distribute the assets of a deceased person's estate in accordance with their will. This includes tasks such as settling debts, filing tax returns, and ensuring that the assets are distributed to the beneficiaries as specified in the will. The executor is appointed by the court to carry out these duties. In the given context, Michael Johnson was nominated by John Smith in his will and appointed by the court as the executor of John's estate, which was valued at $5 million. Michael's responsibilities include dividing the estate equally among John's three children, donating $500,000 to the local animal shelter as per John's instructions, settling the $200,000 mortgage and $50,000 credit card debt, and filing John's final income tax return and paying any outstanding taxes.<br>An executor, appointed by the court, manages and distributes a deceased person's assets according to their will, settling debts, filing taxes, and ensuring the will is followed.</code> | <code>What is the role of an executor in managing a deceased person's estate?</code> | <code>An executor is someone who manages a deceased person's estate. They are responsible for distributing the assets according to the will. In this case, John Smith passed away and nominated Michael Johnson as the executor.<br>The executor is responsible for distributing the assets of a deceased person's estate according to their will.</code> |
288
+ | <code>What is a ticker tape, and how does it help investors?</code> | <code>A ticker tape is a computerized device that relays stock symbols, latest prices, and trading volumes to investors worldwide in real-time. It helps investors by providing up-to-the-second information about the stocks they are monitoring or interested in, enabling them to make quick and informed trading decisions based on the most current market data available.<br>A ticker tape is a real-time digital stock data display that empowers investors to make timely, data-driven trading decisions by providing the latest stock symbols, prices, and volumes.</code> | <code>A ticker tape is a device that shows stock information. It helps investors by providing some data about stocks.<br>A ticker tape provides stock data to investors.</code> |
289
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+ ```json
+ {
+   "scale": 20.0,
+   "similarity_fct": "cos_sim"
+ }
+ ```
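Reviewer note: with `scale` 20.0 and `cos_sim`, this loss treats every other candidate in the batch as a negative and applies cross-entropy over scaled cosine similarities, where the correct candidate for anchor `i` is positive `i`. A minimal NumPy sketch under that reading follows. The function `mnrl_loss` is a hypothetical re-implementation for illustration, not the library code, and it assumes explicit negatives are simply appended to the candidate pool.

```python
import numpy as np

def mnrl_loss(anchors, positives, negatives=None, scale=20.0) -> float:
    """Cross-entropy over scaled cosine similarities with in-batch negatives."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    a = normalize(anchors)
    # Candidate pool: all positives, plus explicit negatives when provided.
    pool = positives if negatives is None else np.vstack([positives, negatives])
    c = normalize(pool)

    logits = scale * (a @ c.T)                       # (batch, n_candidates)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    idx = np.arange(len(a))
    return float(-log_probs[idx, idx].mean())        # target for row i is column i

anchors = np.eye(2)
good = mnrl_loss(anchors, positives=np.eye(2))        # positives aligned with anchors
bad = mnrl_loss(anchors, positives=np.eye(2)[::-1])   # positives swapped
print(good < bad)  # True
```

This also suggests why the card lists `batch_sampler: no_duplicates`: a duplicate text in the batch would act as a false negative for its own pair.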
296
+
297
+ ### Evaluation Dataset
298
+
299
+ #### baconnier/finance_dataset_small_private
300
+
301
+ * Dataset: [baconnier/finance_dataset_small_private](https://huggingface.co/datasets/baconnier/finance_dataset_small_private) at [d7e6492](https://huggingface.co/datasets/baconnier/finance_dataset_small_private/tree/d7e6492d2b42d28b49bbe5f2c91bf93f04b570cb)
302
+ * Size: 862 evaluation samples
303
+ * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
304
+ * Approximate statistics based on the first 1000 samples (this split contains only 862):
305
+ | | anchor | positive | negative |
306
+ |:--------|:------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
307
+ | type | string | string | string |
308
+ | details | <ul><li>min: 10 tokens</li><li>mean: 78.51 tokens</li><li>max: 286 tokens</li></ul> | <ul><li>min: 12 tokens</li><li>mean: 76.02 tokens</li><li>max: 304 tokens</li></ul> | <ul><li>min: 20 tokens</li><li>mean: 59.8 tokens</li><li>max: 271 tokens</li></ul> |
309
+ * Samples:
310
+ | anchor | positive | negative |
311
+ |:---|:---|:---|
312
+ | <code>What is the underwriter's discount in the given IPO scenario, and how does it relate to the gross spread?</code> | <code>The underwriter's discount is the difference between the price the underwriter pays for the shares and the price at which they sell them to the public. In this case, the underwriter buys the shares at a 7% discount from the IPO price of $20 per share. The underwriter's discount is also known as the gross spread, as it represents the gross profit earned by the underwriter.<br>The underwriter's discount is 7%, which is equivalent to $1.40 per share. This is also known as the gross spread, representing the underwriter's gross profit.</code> | <code>The underwriter's discount is the difference between the price the underwriter pays for the shares and the price at which they sell them to the public. In this case, the underwriter buys the shares at a 7% discount, but the gross spread is not mentioned.<br>The underwriter's discount is 7%, but the gross spread is unknown.</code> |
313
+ | <code>What is the primary function of the equity market, and how does it relate to the stock market?</code> | <code>The equity market, synonymous with the stock market, serves as a platform for companies to issue ownership shares to raise capital for growth and expansion. Simultaneously, it allows investors to buy these shares, becoming part-owners of the companies and potentially earning returns through stock price appreciation and dividends. The equity market plays a vital role in the financial system by efficiently allocating capital to businesses and providing investment opportunities to individuals and institutions.<br>The equity market, or stock market, primarily functions as a mechanism for companies to raise capital by issuing ownership shares, while providing investors with opportunities to invest in these companies and earn returns, thus facilitating efficient capital allocation in the financial system.</code> | <code>The equity market is where ownership shares of companies are bought and sold. It allows companies to raise money by selling stocks. The stock market is the same as the equity market.<br>The equity market and the stock market are the same thing, where stocks are traded.</code> |
314
+ | <code>A selling syndicate is a group of investment banks that work together to underwrite and distribute a new security issue, such as stocks or bonds, to investors. The syndicate is typically led by one or more lead underwriters, who coordinate the distribution of the securities and set the offering price. In the case of XYZ Corporation, the selling syndicate is led by ABC Investment Bank and consists of 5 investment banks in total. The syndicate has agreed to purchase 10 million new shares from XYZ Corporation at a fixed price of $50 per share, which they will then sell to investors at a higher price of $55 per share. This process allows XYZ Corporation to raise capital by issuing new shares, while the selling syndicate earns a commission on the sale of the shares. The syndicate's role is to facilitate the distribution of the new shares to a wider pool of investors, helping to ensure the success of the offering.<br>A selling syndicate is a group of investment banks that jointly underwrite and distribute a new security issue to investors. In XYZ Corporation's case, the syndicate will purchase shares from the company at a fixed price and resell them to investors at a higher price, earning a commission and facilitating the successful distribution of the new shares.</code> | <code>What is a selling syndicate, and how does it function in the context of XYZ Corporation's new share issue?</code> | <code>A selling syndicate is a group of investment banks that work together to sell new shares of a company. In this case, XYZ Corporation has hired 5 investment banks to sell their new shares. The syndicate buys the shares from XYZ Corporation at a fixed price and then sells them to investors at a higher price.<br>A selling syndicate is a group of investment banks that jointly underwrite and distribute new shares of a company to investors, buying the shares at a fixed price and selling them at a higher price.</code> |
315
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+ ```json
+ {
+   "scale": 20.0,
+   "similarity_fct": "cos_sim"
+ }
+ ```
322
+
323
+ ### Training Hyperparameters
324
+ #### Non-Default Hyperparameters
325
+
326
+ - `eval_strategy`: steps
327
+ - `per_device_train_batch_size`: 16
328
+ - `per_device_eval_batch_size`: 16
329
+ - `num_train_epochs`: 1
330
+ - `warmup_ratio`: 0.1
331
+ - `bf16`: True
332
+ - `batch_sampler`: no_duplicates
333
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: False
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`: 
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `dispatch_batches`: None
+ - `split_batches`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `batch_sampler`: no_duplicates
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss | Validation Loss | Finance_Embedding_Metric_max_accuracy |
+ |:------:|:----:|:-------------:|:---------------:|:-------------------------------------:|
+ | 0.0103 | 10 | 0.9918 | - | - |
+ | 0.0206 | 20 | 0.8866 | - | - |
+ | 0.0309 | 30 | 0.7545 | - | - |
+ | 0.0412 | 40 | 0.6731 | - | - |
+ | 0.0515 | 50 | 0.2897 | - | - |
+ | 0.0618 | 60 | 0.214 | - | - |
+ | 0.0721 | 70 | 0.1677 | - | - |
+ | 0.0824 | 80 | 0.0479 | - | - |
+ | 0.0927 | 90 | 0.191 | - | - |
+ | 0.1030 | 100 | 0.1188 | - | - |
+ | 0.1133 | 110 | 0.1909 | - | - |
+ | 0.1236 | 120 | 0.0486 | - | - |
+ | 0.1339 | 130 | 0.0812 | - | - |
+ | 0.1442 | 140 | 0.1282 | - | - |
+ | 0.1545 | 150 | 0.15 | - | - |
+ | 0.1648 | 160 | 0.0605 | - | - |
+ | 0.1751 | 170 | 0.0431 | - | - |
+ | 0.1854 | 180 | 0.0613 | - | - |
+ | 0.1957 | 190 | 0.0407 | - | - |
+ | 0.2008 | 195 | - | 0.0605 | - |
+ | 0.2060 | 200 | 0.0567 | - | - |
+ | 0.2163 | 210 | 0.0294 | - | - |
+ | 0.2266 | 220 | 0.0284 | - | - |
+ | 0.2369 | 230 | 0.0444 | - | - |
+ | 0.2472 | 240 | 0.0559 | - | - |
+ | 0.2575 | 250 | 0.0301 | - | - |
+ | 0.2678 | 260 | 0.0225 | - | - |
+ | 0.2781 | 270 | 0.0256 | - | - |
+ | 0.2884 | 280 | 0.016 | - | - |
+ | 0.2987 | 290 | 0.0063 | - | - |
+ | 0.3090 | 300 | 0.0442 | - | - |
+ | 0.3193 | 310 | 0.0425 | - | - |
+ | 0.3296 | 320 | 0.0534 | - | - |
+ | 0.3399 | 330 | 0.0264 | - | - |
+ | 0.3502 | 340 | 0.043 | - | - |
+ | 0.3605 | 350 | 0.035 | - | - |
+ | 0.3708 | 360 | 0.0212 | - | - |
+ | 0.3811 | 370 | 0.0171 | - | - |
+ | 0.3913 | 380 | 0.0497 | - | - |
+ | 0.4016 | 390 | 0.0294 | 0.0381 | - |
+ | 0.4119 | 400 | 0.0317 | - | - |
+ | 0.4222 | 410 | 0.0571 | - | - |
+ | 0.4325 | 420 | 0.0251 | - | - |
+ | 0.4428 | 430 | 0.0162 | - | - |
+ | 0.4531 | 440 | 0.0504 | - | - |
+ | 0.4634 | 450 | 0.0257 | - | - |
+ | 0.4737 | 460 | 0.0185 | - | - |
+ | 0.4840 | 470 | 0.0414 | - | - |
+ | 0.4943 | 480 | 0.016 | - | - |
+ | 0.5046 | 490 | 0.0432 | - | - |
+ | 0.5149 | 500 | 0.0369 | - | - |
+ | 0.5252 | 510 | 0.0115 | - | - |
+ | 0.5355 | 520 | 0.034 | - | - |
+ | 0.5458 | 530 | 0.0143 | - | - |
+ | 0.5561 | 540 | 0.0225 | - | - |
+ | 0.5664 | 550 | 0.0185 | - | - |
+ | 0.5767 | 560 | 0.0085 | - | - |
+ | 0.5870 | 570 | 0.0262 | - | - |
+ | 0.5973 | 580 | 0.0465 | - | - |
+ | 0.6025 | 585 | - | 0.0541 | - |
+ | 0.6076 | 590 | 0.0121 | - | - |
+ | 0.6179 | 600 | 0.0256 | - | - |
+ | 0.6282 | 610 | 0.0203 | - | - |
+ | 0.6385 | 620 | 0.0301 | - | - |
+ | 0.6488 | 630 | 0.017 | - | - |
+ | 0.6591 | 640 | 0.0321 | - | - |
+ | 0.6694 | 650 | 0.0087 | - | - |
+ | 0.6797 | 660 | 0.0276 | - | - |
+ | 0.6900 | 670 | 0.0043 | - | - |
+ | 0.7003 | 680 | 0.0063 | - | - |
+ | 0.7106 | 690 | 0.0293 | - | - |
+ | 0.7209 | 700 | 0.01 | - | - |
+ | 0.7312 | 710 | 0.0121 | - | - |
+ | 0.7415 | 720 | 0.0164 | - | - |
+ | 0.7518 | 730 | 0.0052 | - | - |
+ | 0.7621 | 740 | 0.0271 | - | - |
+ | 0.7724 | 750 | 0.0363 | - | - |
+ | 0.7827 | 760 | 0.0523 | - | - |
+ | 0.7930 | 770 | 0.0153 | - | - |
+ | 0.8033 | 780 | 0.015 | 0.0513 | - |
+ | 0.8136 | 790 | 0.0042 | - | - |
+ | 0.8239 | 800 | 0.0088 | - | - |
+ | 0.8342 | 810 | 0.0217 | - | - |
+ | 0.8445 | 820 | 0.0345 | - | - |
+ | 0.8548 | 830 | 0.01 | - | - |
+ | 0.8651 | 840 | 0.0243 | - | - |
+ | 0.8754 | 850 | 0.0074 | - | - |
+ | 0.8857 | 860 | 0.0082 | - | - |
+ | 0.8960 | 870 | 0.0104 | - | - |
+ | 0.9063 | 880 | 0.0078 | - | - |
+ | 0.9166 | 890 | 0.0163 | - | - |
+ | 0.9269 | 900 | 0.0168 | - | - |
+ | 0.9372 | 910 | 0.0088 | - | - |
+ | 0.9475 | 920 | 0.0186 | - | - |
+ | 0.9578 | 930 | 0.0055 | - | - |
+ | 0.9681 | 940 | 0.0142 | - | - |
+ | 0.9784 | 950 | 0.0251 | - | - |
+ | 0.9887 | 960 | 0.0468 | - | - |
+ | 0.9990 | 970 | 0.0031 | - | - |
+ | 1.0 | 971 | - | - | 0.9791 |
+
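The Epoch column in the log is the step count divided by the number of steps in the epoch. A minimal sketch of that relationship follows, assuming the 971 steps of the last log row make up the whole epoch. The helper `epoch_fraction` is a hypothetical name used only for illustration.

```python
# Assumption: 971 optimizer steps make up the single epoch, as in the last log row.
TOTAL_STEPS = 971


def epoch_fraction(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Fraction of the epoch completed after `step` steps, rounded to 4 places."""
    return round(step / total_steps, 4)


# Steps at which the table reports an evaluation loss.
eval_steps = [195, 390, 585, 780]
for step in eval_steps:
    print(step, epoch_fraction(step))
```

The evaluation rows fall 195 steps apart, roughly every fifth of the epoch. The card does not say whether that interval was configured as an absolute step count or as a ratio.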
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.0.1
+ - Transformers: 4.41.2
+ - PyTorch: 2.3.0+cu121
+ - Accelerate: 0.31.0
+ - Datasets: 2.19.2
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### MultipleNegativesRankingLoss
+ ```bibtex
+ @misc{henderson2017efficient,
+     title={Efficient Natural Language Response Suggestion for Smart Reply},
+     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+     year={2017},
+     eprint={1705.00652},
+     archivePrefix={arXiv},
+     primaryClass={cs.CL}
+ }
+ ```
+
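For readers unfamiliar with the loss cited above, the following is a rough pure-Python sketch of its idea: each anchor is scored against every positive in the batch, its own positive is the target, and the other positives act as negatives. It assumes cosine similarity with a scale of 20 and non-zero vectors; the card excerpt does not state which settings were used for this model. It is not the library's implementation.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def multiple_negatives_ranking_loss(
    anchors: List[List[float]],
    positives: List[List[float]],
    scale: float = 20.0,  # assumption: a commonly used default, not confirmed for this run
) -> float:
    """Mean cross-entropy where positive i is the target for anchor i."""
    losses = []
    for i, anchor in enumerate(anchors):
        scores = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over all candidates in the batch.
        max_score = max(scores)
        log_sum_exp = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
        losses.append(log_sum_exp - scores[i])
    return sum(losses) / len(losses)


# Toy batch: matched pairs give a loss near zero, swapped pairs a large one.
anchors = [[1.0, 0.0], [0.0, 1.0]]
loss_matched = multiple_negatives_ranking_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
loss_swapped = multiple_negatives_ranking_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(loss_matched, loss_swapped)
```

Because the other positives in a batch serve as negatives, a batch that contains the same text twice would penalize a correct match, which is presumably why the run uses the `no_duplicates` batch sampler.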
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "_name_or_path": "BAAI/bge-small-en-v1.5",
+   "architectures": [
+     "BertModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "classifier_dropout": null,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 384,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 1536,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.41.2",
+     "pytorch": "2.3.0+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2373644feb8c22b76053c00166da47ab13ab96a06a4bb7c3c4544c334970a701
+ size 133462128
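The file size in this LFS pointer can be cross-checked against the architecture in config.json. The sketch below counts the parameters of a standard BERT encoder with those dimensions and converts them to bytes at `float32`. It assumes the usual BERT layout (word, position and token-type embeddings, twelve identical encoder layers, and a pooler layer stored in the checkpoint); the helper names are illustrative.

```python
# Architecture values copied from config.json in this commit.
VOCAB_SIZE = 30522
HIDDEN_SIZE = 384
INTERMEDIATE_SIZE = 1536
NUM_HIDDEN_LAYERS = 12
MAX_POSITION_EMBEDDINGS = 512
TYPE_VOCAB_SIZE = 2
BYTES_PER_FLOAT32 = 4
SAFETENSORS_FILE_SIZE = 133462128  # from the Git LFS pointer


def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a dense layer."""
    return n_in * n_out + n_out


def layer_norm_params(size: int) -> int:
    """Scale and shift vectors of a LayerNorm."""
    return 2 * size


embeddings = (
    VOCAB_SIZE * HIDDEN_SIZE
    + MAX_POSITION_EMBEDDINGS * HIDDEN_SIZE
    + TYPE_VOCAB_SIZE * HIDDEN_SIZE
    + layer_norm_params(HIDDEN_SIZE)
)

encoder_layer = (
    4 * linear_params(HIDDEN_SIZE, HIDDEN_SIZE)  # query, key, value, attention output
    + layer_norm_params(HIDDEN_SIZE)
    + linear_params(HIDDEN_SIZE, INTERMEDIATE_SIZE)
    + linear_params(INTERMEDIATE_SIZE, HIDDEN_SIZE)
    + layer_norm_params(HIDDEN_SIZE)
)

# Assumption: the checkpoint also stores the BERT pooler (one dense layer).
pooler = linear_params(HIDDEN_SIZE, HIDDEN_SIZE)

total_params = embeddings + NUM_HIDDEN_LAYERS * encoder_layer + pooler
weight_bytes = total_params * BYTES_PER_FLOAT32
print(total_params, weight_bytes, SAFETENSORS_FILE_SIZE - weight_bytes)
```

Under these assumptions the count is 33,360,000 parameters, or 133,440,000 bytes of weights. The remaining 22,128 bytes would be consistent with the file's header and metadata, although the diff itself does not confirm that.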
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
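The three modules run in sequence: the transformer produces one vector per token, the pooling step reduces them to a single vector, and the last step normalizes it. Since 1_Pooling/config.json sets `pooling_mode_cls_token` to true, pooling here means taking the first token's vector. The sketch below mimics the last two steps on toy data in plain Python; it is an illustration with hypothetical helper names and 4-dimensional vectors, not the library's code.

```python
import math
from typing import List


def cls_pool(token_embeddings: List[List[float]]) -> List[float]:
    """Return the embedding of the first token, which is [CLS] for BERT inputs."""
    return list(token_embeddings[0])


def l2_normalize(vector: List[float]) -> List[float]:
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def sentence_embedding(token_embeddings: List[List[float]]) -> List[float]:
    """Pooling followed by Normalize, mirroring modules 1 and 2."""
    return l2_normalize(cls_pool(token_embeddings))


# Toy input: three tokens with 4-dimensional vectors instead of 384.
toy_tokens = [
    [3.0, 0.0, 4.0, 0.0],  # stands in for the [CLS] position
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]
embedding = sentence_embedding(toy_tokens)
print(embedding)
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.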
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]"
+ }
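The special-token ids and the 512-token limit above determine how a single input is framed before it reaches the model. As a rough illustration, the sketch below wraps a list of word-piece ids in `[CLS]` and `[SEP]`, truncates to the limit, and right-pads with `[PAD]`. The function names and the sample ids are hypothetical, and right-side padding is an assumption; the real tokenizer also handles lowercasing and word-piece splitting, which are omitted here.

```python
from typing import List

# Special-token ids from `added_tokens_decoder` in tokenizer_config.json.
PAD_ID = 0
CLS_ID = 101
SEP_ID = 102
MODEL_MAX_LENGTH = 512


def build_single_sequence(token_ids: List[int], pad_to: int = 0) -> List[int]:
    """Frame ids as [CLS] ... [SEP], truncate to the model limit, optionally right-pad."""
    max_body = MODEL_MAX_LENGTH - 2  # leave room for [CLS] and [SEP]
    ids = [CLS_ID] + token_ids[:max_body] + [SEP_ID]
    if pad_to > len(ids):
        ids = ids + [PAD_ID] * (pad_to - len(ids))
    return ids


def attention_mask(input_ids: List[int]) -> List[int]:
    """1 for real tokens, 0 for padding."""
    return [0 if token_id == PAD_ID else 1 for token_id in input_ids]


# Arbitrary placeholder ids standing in for three word pieces.
input_ids = build_single_sequence([2000, 2001, 2002], pad_to=6)
print(input_ids, attention_mask(input_ids))
```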
vocab.txt ADDED
The diff for this file is too large to render. See raw diff