Delete mx-01
- mx-01/1_Pooling/config.json +0 -10
- mx-01/README.md +0 -1415
- mx-01/config.json +0 -26
- mx-01/config_sentence_transformers.json +0 -10
- mx-01/log.txt +0 -912
- mx-01/model.safetensors +0 -3
- mx-01/modules.json +0 -14
- mx-01/mx_eval.csv +0 -2
- mx-01/sentence_bert_config.json +0 -4
- mx-01/special_tokens_map.json +0 -7
- mx-01/tokenizer.json +0 -0
- mx-01/tokenizer_config.json +0 -57
- mx-01/vocab.txt +0 -0
mx-01/1_Pooling/config.json
DELETED
@@ -1,10 +0,0 @@
|
|
1 |
-
{
|
2 |
-
"word_embedding_dimension": 384,
|
3 |
-
"pooling_mode_cls_token": false,
|
4 |
-
"pooling_mode_mean_tokens": true,
|
5 |
-
"pooling_mode_max_tokens": false,
|
6 |
-
"pooling_mode_mean_sqrt_len_tokens": false,
|
7 |
-
"pooling_mode_weightedmean_tokens": false,
|
8 |
-
"pooling_mode_lasttoken": false,
|
9 |
-
"include_prompt": true
|
10 |
-
}
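The deleted pooling config enables only `pooling_mode_mean_tokens` over 384-dimensional word embeddings. The sketch below illustrates what mean pooling computes, assuming token embeddings of shape `(batch, seq_len, 384)` and a 0/1 attention mask. The function name `mean_pool` and the random inputs are hypothetical and are not taken from the deleted files or from the library's internals.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors over non-padding positions (illustrative only)."""
    # Expand the mask to (batch, seq_len, 1) so it broadcasts over the hidden dimension.
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for an all-padding row.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts

# Hypothetical inputs: 2 sentences, 5 token positions, 384 dimensions.
tokens = np.random.default_rng(0).normal(size=(2, 5, 384))
mask = np.array([[1, 1, 1, 0, 0],
                 [1, 1, 1, 1, 1]])
sentence_embeddings = mean_pool(tokens, mask)
print(sentence_embeddings.shape)
```

Padding positions are masked out before averaging, so the first sentence's vector is the mean of its three real tokens only.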
mx-01/README.md
DELETED
@@ -1,1415 +0,0 @@
|
|
1 |
-
---
|
2 |
-
base_model: nreimers/MiniLM-L6-H384-uncased
|
3 |
-
datasets: []
|
4 |
-
language: []
|
5 |
-
library_name: sentence-transformers
|
6 |
-
pipeline_tag: sentence-similarity
|
7 |
-
tags:
|
8 |
-
- sentence-transformers
|
9 |
-
- sentence-similarity
|
10 |
-
- feature-extraction
|
11 |
-
- generated_from_trainer
|
12 |
-
- dataset_size:730454
|
13 |
-
- loss:MultipleNegativesRankingLoss
|
14 |
-
widget:
|
15 |
-
- source_sentence: Continuous finite-time control approach for series elastic actuator
|
16 |
-
sentences:
|
17 |
-
- Distributed coordination is difficult, especially when the system may suffer intrusions
|
18 |
-
that corrupt some component processes. We introduce the abstraction of a failure
|
19 |
-
detector that a process can use to (imperfectly) detect the corruption (Byzantine
|
20 |
-
failure) of another process. In general, our failure detectors can be unreliable,
|
21 |
-
both by reporting a correct process to be faulty or by reporting a faulty process
|
22 |
-
to be correct. However, we show that if these detectors satisfy certain plausible
|
23 |
-
properties, then the well known distributed consensus problem can be solved. We
|
24 |
-
also present a randomized protocol using failure detectors that solves the consensus
|
25 |
-
problem if either the requisite properties of failure detectors hold or if certain
|
26 |
-
highly probable events eventually occur. This work can be viewed as a generalization
|
27 |
-
of benign failure detectors popular in the distributed computing literature.
|
28 |
-
- 'This paper deals with multilevel partial-response class-IV (PRIV) transmission
|
29 |
-
over unshielded twisted-pair (UTP) cables. Specifically, transmission at a rate
|
30 |
-
of 155.52 Mb/s over data-grade UTP cables for local-area networking is considered.
|
31 |
-
As a low-complexity method used to compensate for cable-length dependent signal
|
32 |
-
distortion, adaptive analog equalization with two controlled parameters is proposed:
|
33 |
-
one parameter determines a frequency-independent receiver gain, the other parameter
|
34 |
-
controls the transfer characteristic of a variable analog receive-filter section.
|
35 |
-
For the stepwise design of the transmit and receive filters, a combination of
|
36 |
-
analytic techniques and simulated annealing is employed. First, the variable equalizer
|
37 |
-
section, then the remaining fixed analog receive filter section are developed
|
38 |
-
and finally the analog transmit filter is determined. The paper also describes
|
39 |
-
the adjustment of the equalizer section, and the control of the sampling phase
|
40 |
-
in the receiver front-end. The two equalizer parameters are controlled by an algorithm
|
41 |
-
that operates on the sampled signals and adjusts these parameters to optimum settings
|
42 |
-
independently of the sampling phase. The latter is controlled by a decision-directed
|
43 |
-
phase-locked loop algorithm that becomes effective when equalization has been
|
44 |
-
achieved. The dynamic behaviour and mean-square error in steady-state obtained
|
45 |
-
with these control algorithms are investigated.'
|
46 |
-
- 'In this paper, a practical control approach is suggested for series elastic actuators (SEAs)
|
47 |
-
to generate the desired torque. Firstly, based on the analysis of a nonlinear
|
48 |
-
SEA, the generic dynamics for a class of SEAs is summarized. Then the dynamic
|
49 |
-
equations are transformed into a novel state-space form which is convenient for
|
50 |
-
controller design. Finally, based on the recently developed finite-time control
|
51 |
-
technique, a finite time disturbance observer and a continuous terminal sliding-mode
|
52 |
-
control scheme are introduced to synthesize the control law. The finite-time stability
|
53 |
-
of the proposed controller is theoretically ensured by Lyapunov analysis. Compared
|
54 |
-
with most existing methods, the contribution of the paper is two-fold: (i) The
|
55 |
-
proposed controller is suitable for not only linear, but also a class of nonlinear
|
56 |
-
SEAs, which means that it is a more generic method for SEA torque control; (ii)
|
57 |
-
It achieves faster convergence rate and works well even in the presence of unknown
|
58 |
-
payload parameters and external disturbances. A series of experiments are carried
|
59 |
-
out on the self-built SEA testbed to demonstrate the superior performance of the
|
60 |
-
proposed controller by comparing it with the cascade-PID controller.'
|
61 |
-
- source_sentence: Matrix Methods for Solving Algebraic Systems
|
62 |
-
sentences:
|
63 |
-
- We present our public-domain software for the following tasks in sparse (or toric)
|
64 |
-
elimination theory, given a well-constrained polynomial system. First, C code
|
65 |
-
for computing the mixed volume of the system. Second, Maple code for defining
|
66 |
-
an overconstrained system and constructing a Sylvester-type matrix of its sparse
|
67 |
-
resultant. Third, C code for a Sylvester-type matrix of the sparse resultant and
|
68 |
-
a superset of all common roots of the initial well-constrained system by computing
|
69 |
-
the eigen-decomposition of a square matrix obtained from the resultant matrix.
|
70 |
-
We conclude with experiments in computing molecular conformations.
|
71 |
-
- 'Design trade-offs between estimation performance, processing delay and communication
|
72 |
-
cost for a sensor scheduling problem are discussed. We consider a heterogeneous
|
73 |
-
sensor network with two types of sensors: the first type has low-quality measurements,
|
74 |
-
small processing delay and a light communication cost, while the second type is
|
75 |
-
of high quality, but imposes a large processing delay and a high communication
|
76 |
-
cost. Such a heterogeneous sensor network is common in applications, where for
|
77 |
-
instance in a localization system the poor sensor can be an ultrasound sensor
|
78 |
-
while the more powerful sensor can be a camera. Using a time-periodic Kalman filter,
|
79 |
-
we show how one can find an optimal schedule of the sensor communication. One
|
80 |
-
can significantly improve estimation quality by only using the expensive sensor
|
81 |
-
rarely. We also demonstrate how simple sensor switching rules based on the Riccati
|
82 |
-
equation drive the filter into a stable time-periodic Kalman filter.'
|
83 |
-
- The Multi-stage Genetic Algorithm, MGA, is introduced to solve a class of compositional
|
84 |
-
design problems. The problem with complicated constraints is formulated as a set
|
85 |
-
of local subproblems with simple constraints and a supervising problem. Every
|
86 |
-
subproblem is solved by GA to generate a set of suboptimal solutions. And in the
|
87 |
-
supervising problem, the elements of each set are optimally combined by GA to
|
88 |
-
yield the optimal solution for the original problem. The method is a learning
|
89 |
-
method where the empirical knowledge obtained by solving the problem is effectively
|
90 |
-
utilized to solve similar problems efficiently. Extended knapsack problems are
|
91 |
-
solved to demonstrate the proposed method, and the efficiency of the method is
|
92 |
-
shown. In addition, the method is successfully applied to optimal realization
|
93 |
-
of cooperative robot soccer behaviors.
|
94 |
-
- source_sentence: Low-power partial-parallel Chien search architecture with polynomial
|
95 |
-
degree reduction
|
96 |
-
sentences:
|
97 |
-
- In this paper, we present a novel attentive and immersive user interface based
|
98 |
-
on gaze and hand gestures for interactive large-scale displays. The combination
|
99 |
-
of gaze and hand gestures provides more interesting and immersive ways to manipulate
|
100 |
-
3D information.
|
101 |
-
- There is significant interest in the synthesis of discrete-state random fields,
|
102 |
-
particularly those possessing structure over a wide range of scales. However,
|
103 |
-
given a model on some finest, pixellated scale, it is computationally very difficult
|
104 |
-
to synthesize both large- and small-scale structures, motivating research into
|
105 |
-
hierarchical methods. In this paper, we propose a frozen-state approach to hierarchical
|
106 |
-
modeling, in which simulated annealing is performed on each scale, constrained
|
107 |
-
by the state estimates at the parent scale. This approach leads to significant
|
108 |
-
advantages in both modeling flexibility and computational complexity. In particular,
|
109 |
-
a complex structure can be realized with very simple, local, scale-dependent models,
|
110 |
-
and by constraining the domain to be annealed at finer scales to only the uncertain
|
111 |
-
portions of coarser scales; the approach leads to huge improvements in computational
|
112 |
-
complexity. Results are shown for a synthesis problem in porous media.
|
113 |
-
- The Chien search for the error locator polynomial root computation in BCH and
|
114 |
-
Reed-Solomon decoding accounts for a significant part of the overall decoder power
|
115 |
-
consumption, especially for long codes over finite fields of high order. For serial
|
116 |
-
Chien search, the power consumption is substantially lowered by a polynomial degree
|
117 |
-
reduction (PDR) scheme. Every time a root is found, it is factored out of the
|
118 |
-
error locator polynomial. Only the hardware units associated with the reduced-degree
|
119 |
-
polynomial coefficients are active. However, this PDR scheme can not be directly
|
120 |
-
extended to partial-parallel Chien search, which is needed in any systems to achieve
|
121 |
-
high throughput. By analyzing the formulas of the evaluation values over finite
|
122 |
-
field elements and available intermediate results of the Chien search, this paper
|
123 |
-
proposes a partial-parallel Chien search architecture that reduces the error locator
|
124 |
-
polynomial degree on the fly whenever a root is found without using long division.
|
125 |
-
For a 122-error-correcting BCH code over GF(2^15), an 8-parallel Chien search using
|
126 |
-
the proposed architecture achieves 32% power reduction over existing partial-parallel
|
127 |
-
architectures for a typical case.
|
128 |
-
- source_sentence: An efficient network-switch scheduling for real-time applications
|
129 |
-
sentences:
|
130 |
-
- Bursts consist of a varying number of asynchronous transfer mode cells corresponding
|
131 |
-
to a datagram. Here, we generalized weighted fair queueing to a burst-based algorithm
|
132 |
-
with preemption. The new algorithm enhances the performance of the switch service
|
133 |
-
for real-time applications, and it preserves the quality of service guarantees.
|
134 |
-
We study this algorithm theoretically and via simulations.
|
135 |
-
- Online Social Network (OSN) is one of the hottest innovations in the past years,
|
136 |
-
and the active users are more than a billion. For OSN, users' behavior is one
|
137 |
-
of the important factors to study. This demonstration proposal presents Harbinger,
|
138 |
-
an analyzing and predicting system for OSN users' behavior. In Harbinger, we focus
|
139 |
-
on tweets' timestamps (when users post or share messages), visualize users' post
|
140 |
-
behavior as well as message retweet number and build adjustable models to predict
|
141 |
-
users' behavior. Predictions of users' behavior can be performed with the discovered
|
142 |
-
behavior models and the results can be applied to many applications such as tweet
|
143 |
-
crawler and advertisement.
|
144 |
-
- The computation and memory required for kernel machines with N training samples
|
145 |
-
is at least O(N^2). Such a complexity is significant even for moderate size problems
|
146 |
-
and is prohibitive for large datasets. We present an approximation technique based
|
147 |
-
on the improved fast Gauss transform to reduce the computation to O(N). We also
|
148 |
-
give an error bound for the approximation, and provide experimental results on
|
149 |
-
the UCI datasets.
|
150 |
-
- source_sentence: Summarizing the Evidence on the International Trade in Illegal
|
151 |
-
Wildlife
|
152 |
-
sentences:
|
153 |
-
- This paper proposes a method to represent classifiers or learned regression functions
|
154 |
-
using an OWL ontology. Also proposed are methods for finding an appropriate learned
|
155 |
-
function to answer a simple query. The ontology standardizes variable names and
|
156 |
-
dependence properties, so that feature values can be given by users or found on
|
157 |
-
the semantic web.
|
158 |
-
- The global trade in illegal wildlife is a multi-billion dollar industry that threatens
|
159 |
-
biodiversity and acts as a potential avenue for invasive species and disease spread.
|
160 |
-
Despite the broad-sweeping implications of illegal wildlife sales, scientists
|
161 |
-
have yet to describe the scope and scale of the trade. Here, we provide the most
|
162 |
-
thorough and current description of the illegal wildlife trade using 12 years
|
163 |
-
of seizure records compiled by TRAFFIC, the wildlife trade monitoring network.
|
164 |
-
These records comprise 967 seizures including massive quantities of ivory, tiger
|
165 |
-
skins, live reptiles, and other endangered wildlife and wildlife products. Most
|
166 |
-
seizures originate in Southeast Asia, a recently identified hotspot for future
|
167 |
-
emerging infectious diseases. To date, regulation and enforcement have been insufficient
|
168 |
-
to effectively control the global trade in illegal wildlife at national and international
|
169 |
-
scales. Effective control will require a multi-pronged approach including community-scale
|
170 |
-
education and empowering local people to value wildlife, coordinated international
|
171 |
-
regulation, and a greater allocation of national resources to on-the-ground enforcement.
|
172 |
-
- Griffithsin (GRFT) is a red alga-derived lectin with demonstrated broad spectrum
|
173 |
-
antiviral activity against enveloped viruses, including severe acute respiratory
|
174 |
-
syndrome–Coronavirus (SARS-CoV), Japanese encephalitis virus (JEV), hepatitis
|
175 |
-
C virus (HCV), and herpes simplex virus-2 (HSV-2). However, its pharmacokinetic
|
176 |
-
profile remains largely undefined. Here, Sprague Dawley rats were administered
|
177 |
-
a single dose of GRFT at 10 or 20 mg/kg by intravenous, oral, and subcutaneous
|
178 |
-
routes, respectively, and serum GRFT levels were measured at select time points.
|
179 |
-
In addition, the potential for systemic accumulation after oral dosing was assessed
|
180 |
-
in rats after 10 daily treatments with GRFT (20 or 40 mg/kg). We found that parenterally-administered
|
181 |
-
GRFT in rats displayed a complex elimination profile, which varied according to
|
182 |
-
administration routes. However, GRFT was not orally bioavailable, even after chronic
|
183 |
-
treatment. Nonetheless, active GRFT capable of neutralizing HIV-Env pseudoviruses
|
184 |
-
was detected in rat fecal extracts after chronic oral dosing. These findings support
|
185 |
-
further evaluation of GRFT for pre-exposure prophylaxis against emerging epidemics
|
186 |
-
for which specific therapeutics are not available, including systemic and enteric
|
187 |
-
infections caused by susceptible enveloped viruses. In addition, GRFT should be
|
188 |
-
considered for antiviral therapy and the prevention of rectal transmission of
|
189 |
-
HIV-1 and other susceptible viruses.
|
190 |
-
---
|
191 |
-
|
192 |
-
# SentenceTransformer based on nreimers/MiniLM-L6-H384-uncased
|
193 |
-
|
194 |
-
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nreimers/MiniLM-L6-H384-uncased](https://huggingface.co/nreimers/MiniLM-L6-H384-uncased). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
|
195 |
-
|
196 |
-
## Model Details
|
197 |
-
|
198 |
-
### Model Description
|
199 |
-
- **Model Type:** Sentence Transformer
|
200 |
-
- **Base model:** [nreimers/MiniLM-L6-H384-uncased](https://huggingface.co/nreimers/MiniLM-L6-H384-uncased) <!-- at revision 3276f0fac9d818781d7a1327b3ff818fc4e643c0 -->
|
201 |
-
- **Maximum Sequence Length:** 512 tokens
|
202 |
-
- **Output Dimensionality:** 384 dimensions
|
203 |
-
- **Similarity Function:** Cosine Similarity
|
204 |
-
<!-- - **Training Dataset:** Unknown -->
|
205 |
-
<!-- - **Language:** Unknown -->
|
206 |
-
<!-- - **License:** Unknown -->
|
207 |
-
|
208 |
-
### Model Sources
|
209 |
-
|
210 |
-
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
|
211 |
-
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
|
212 |
-
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
|
213 |
-
|
214 |
-
### Full Model Architecture
|
215 |
-
|
216 |
-
```
|
217 |
-
SentenceTransformer(
|
218 |
-
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
|
219 |
-
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
|
220 |
-
)
|
221 |
-
```
|
222 |
-
|
223 |
-
## Usage
|
224 |
-
|
225 |
-
### Direct Usage (Sentence Transformers)
|
226 |
-
|
227 |
-
First install the Sentence Transformers library:
|
228 |
-
|
229 |
-
```bash
|
230 |
-
pip install -U sentence-transformers
|
231 |
-
```
|
232 |
-
|
233 |
-
Then you can load this model and run inference.
|
234 |
-
```python
|
235 |
-
from sentence_transformers import SentenceTransformer
|
236 |
-
|
237 |
-
# Download from the 🤗 Hub
|
238 |
-
model = SentenceTransformer("sentence_transformers_model_id")
|
239 |
-
# Run inference
|
240 |
-
sentences = [
|
241 |
-
'Summarizing the Evidence on the International Trade in Illegal Wildlife',
|
242 |
-
'The global trade in illegal wildlife is a multi-billion dollar industry that threatens biodiversity and acts as a potential avenue for invasive species and disease spread. Despite the broad-sweeping implications of illegal wildlife sales, scientists have yet to describe the scope and scale of the trade. Here, we provide the most thorough and current description of the illegal wildlife trade using 12 years of seizure records compiled by TRAFFIC, the wildlife trade monitoring network. These records comprise 967 seizures including massive quantities of ivory, tiger skins, live reptiles, and other endangered wildlife and wildlife products. Most seizures originate in Southeast Asia, a recently identified hotspot for future emerging infectious diseases. To date, regulation and enforcement have been insufficient to effectively control the global trade in illegal wildlife at national and international scales. Effective control will require a multi-pronged approach including community-scale education and empowering local people to value wildlife, coordinated international regulation, and a greater allocation of national resources to on-the-ground enforcement.',
|
243 |
-
'This paper proposes a method to represent classifiers or learned regression functions using an OWL ontology. Also proposed are methods for finding an appropriate learned function to answer a simple query. The ontology standardizes variable names and dependence properties, so that feature values can be given by users or found on the semantic web.',
|
244 |
-
]
|
245 |
-
embeddings = model.encode(sentences)
|
246 |
-
print(embeddings.shape)
|
247 |
-
# [3, 384]
|
248 |
-
|
249 |
-
# Get the similarity scores for the embeddings
|
250 |
-
similarities = model.similarity(embeddings, embeddings)
|
251 |
-
print(similarities.shape)
|
252 |
-
# [3, 3]
|
253 |
-
```
|
254 |
-
|
255 |
-
<!--
|
256 |
-
### Direct Usage (Transformers)
|
257 |
-
|
258 |
-
<details><summary>Click to see the direct usage in Transformers</summary>
|
259 |
-
|
260 |
-
</details>
|
261 |
-
-->
|
262 |
-
|
263 |
-
<!--
|
264 |
-
### Downstream Usage (Sentence Transformers)
|
265 |
-
|
266 |
-
You can finetune this model on your own dataset.
|
267 |
-
|
268 |
-
<details><summary>Click to expand</summary>
|
269 |
-
|
270 |
-
</details>
|
271 |
-
-->
|
272 |
-
|
273 |
-
<!--
|
274 |
-
### Out-of-Scope Use
|
275 |
-
|
276 |
-
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
|
277 |
-
-->
|
278 |
-
|
279 |
-
<!--
|
280 |
-
## Bias, Risks and Limitations
|
281 |
-
|
282 |
-
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
|
283 |
-
-->
|
284 |
-
|
285 |
-
<!--
|
286 |
-
### Recommendations
|
287 |
-
|
288 |
-
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
|
289 |
-
-->
|
290 |
-
|
291 |
-
## Training Details
|
292 |
-
|
293 |
-
### Training Dataset
|
294 |
-
|
295 |
-
#### Unnamed Dataset
|
296 |
-
|
297 |
-
|
298 |
-
* Size: 730,454 training samples
|
299 |
-
* Columns: <code>sentence_0</code> and <code>sentence_1</code>
|
300 |
-
* Approximate statistics based on the first 1000 samples:
|
301 |
-
| | sentence_0 | sentence_1 |
|
302 |
-
|:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
|
303 |
-
| type | string | string |
|
304 |
-
| details | <ul><li>min: 5 tokens</li><li>mean: 15.55 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>min: 21 tokens</li><li>mean: 195.91 tokens</li><li>max: 512 tokens</li></ul> |
|
305 |
-
* Samples:
|
306 |
-
| sentence_0 | sentence_1 |
|
307 |
-
|:-----------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
308 |
-
| <code>A parallel algorithm for constructing independent spanning trees in twisted cubes</code> | <code>A long-standing conjecture mentions that a k-connected graph G admits k independent spanning trees (ISTs for short) rooted at an arbitrary node of G. An n-dimensional twisted cube, denoted by TQ_n, is a variation of hypercube with connectivity n and has many features superior to those of hypercube. Yang (2010) first proposed an algorithm to construct n edge-disjoint spanning trees in TQ_n for any odd integer n ⩾ 3 and showed that half of them are ISTs. At a later stage, Wang et al. (2012) inferred that the above conjecture is affirmative for TQ_n by providing an O(N log N) time algorithm to construct n ISTs, where N = 2^n is the number of nodes in TQ_n. However, this algorithm is executed in a recursive fashion and thus is hard to be parallelized. In this paper, we revisit the problem of constructing ISTs in twisted cubes and present a non-recursive algorithm. Our approach can be fully parallelized to make the use of all nodes of TQ_n as processors for computation in such a way that each node can determine its parent in all spanning trees directly by referring its address and tree indices in O(log N) time.</code> |
|
309 |
-
| <code>A Novel Method for Separating and Locating Multiple Partial Discharge Sources in a Substation</code> | <code>To separate and locate multi-partial discharge (PD) sources in a substation, the use of spectrum differences of ultra-high frequency signals radiated from various sources as characteristic parameters has been previously reported. However, the separation success rate was poor when signal-to-noise ratio was low, and the localization result was a coordinate on two-dimensional plane. In this paper, a novel method is proposed to improve the separation rate and the localization accuracy. A directional measuring platform is built using two directional antennas. The time delay (TD) of the signals captured by the antennas is calculated, and TD sequences are obtained by rotating the platform at different angles. The sequences are separated with the TD distribution feature, and the directions of the multi-PD sources are calculated. The PD sources are located by directions using the error probability method. To verify the method, a simulated model with three PD sources was established by XFdtd. Simulation results show that the separation rate is increased from 71% to 95% compared with the previous method, and an accurate three-dimensional localization result was obtained. A field test with two PD sources was carried out, and the sources were separated and located accurately by the proposed method.</code> |
|
310 |
-
| <code>Every ternary permutation constraint satisfaction problem parameterized above average has a kernel with a quadratic number of variables</code> | <code>A ternary Permutation-CSP is specified by a subset Π of the symmetric group S_3. An instance of such a problem consists of a set of variables V and a multiset of constraints, which are ordered triples of distinct variables of V. The objective is to find a linear ordering α of V that maximizes the number of triples whose rearrangement (under α) follows a permutation in Π. We prove that every ternary Permutation-CSP parameterized above average has a kernel with a quadratic number of variables.</code> |
|
311 |
-
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
|
312 |
-
```json
|
313 |
-
{
|
314 |
-
"scale": 20.0,
|
315 |
-
"similarity_fct": "cos_sim"
|
316 |
-
}
|
317 |
-
```
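The loss parameters above (`scale` of 20.0 with `cos_sim`) can be illustrated with a small NumPy sketch of an in-batch-negatives ranking loss. Each `sentence_0` is scored against every `sentence_1` in the batch by scaled cosine similarity, and cross-entropy pushes the matching pair to rank first. This is a simplified illustration and not the library's implementation. The function name `mnr_loss` and the random vectors are hypothetical.

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch-negatives ranking loss with scaled cosine similarity (sketch)."""
    # L2-normalise so the dot product equals cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)  # (batch, batch); the diagonal holds the true pairs
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy with target index i for row i.
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 384))  # stand-ins for sentence_0 embeddings
positives = anchors + 0.01 * rng.normal(size=(8, 384))  # stand-ins for sentence_1 embeddings

loss_matched = mnr_loss(anchors, positives)
loss_shuffled = mnr_loss(anchors, np.roll(positives, 1, axis=0))
print(loss_matched < loss_shuffled)
```

With the batch size of 8 listed under the hyperparameters, each pair would see seven in-batch negatives per step under this formulation.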
|
318 |
-
|
319 |
-
### Training Hyperparameters
|
320 |
-
#### Non-Default Hyperparameters
|
321 |
-
|
322 |
-
- `num_train_epochs`: 5
|
323 |
-
- `multi_dataset_batch_sampler`: round_robin
|
324 |
-
|
325 |
-
#### All Hyperparameters
|
326 |
-
<details><summary>Click to expand</summary>
|
327 |
-
|
328 |
-
- `overwrite_output_dir`: False
|
329 |
-
- `do_predict`: False
|
330 |
-
- `eval_strategy`: no
|
331 |
-
- `prediction_loss_only`: True
|
332 |
-
- `per_device_train_batch_size`: 8
|
333 |
-
- `per_device_eval_batch_size`: 8
|
334 |
-
- `per_gpu_train_batch_size`: None
|
335 |
-
- `per_gpu_eval_batch_size`: None
|
336 |
-
- `gradient_accumulation_steps`: 1
|
337 |
-
- `eval_accumulation_steps`: None
|
338 |
-
- `learning_rate`: 5e-05
|
339 |
-
- `weight_decay`: 0.0
|
340 |
-
- `adam_beta1`: 0.9
|
341 |
-
- `adam_beta2`: 0.999
|
342 |
-
- `adam_epsilon`: 1e-08
|
343 |
-
- `max_grad_norm`: 1
|
344 |
-
- `num_train_epochs`: 5
|
345 |
-
- `max_steps`: -1
|
346 |
-
- `lr_scheduler_type`: linear
|
347 |
-
- `lr_scheduler_kwargs`: {}
|
348 |
-
- `warmup_ratio`: 0.0
|
349 |
-
- `warmup_steps`: 0
|
350 |
-
- `log_level`: passive
|
351 |
-
- `log_level_replica`: warning
|
352 |
-
- `log_on_each_node`: True
|
353 |
-
- `logging_nan_inf_filter`: True
|
354 |
-
- `save_safetensors`: True
|
355 |
-
- `save_on_each_node`: False
|
356 |
-
- `save_only_model`: False
|
357 |
-
- `restore_callback_states_from_checkpoint`: False
|
358 |
-
- `no_cuda`: False
|
359 |
-
- `use_cpu`: False
|
360 |
-
- `use_mps_device`: False
|
361 |
-
- `seed`: 42
|
362 |
-
- `data_seed`: None
|
363 |
-
- `jit_mode_eval`: False
|
364 |
-
- `use_ipex`: False
|
365 |
-
- `bf16`: False
|
366 |
-
- `fp16`: False
|
367 |
-
- `fp16_opt_level`: O1
|
368 |
-
- `half_precision_backend`: auto
|
369 |
-
- `bf16_full_eval`: False
|
370 |
-
- `fp16_full_eval`: False
|
371 |
-
- `tf32`: None
|
372 |
-
- `local_rank`: 0
|
373 |
-
- `ddp_backend`: None
|
374 |
-
- `tpu_num_cores`: None
|
375 |
-
- `tpu_metrics_debug`: False
|
376 |
-
- `debug`: []
|
377 |
-
- `dataloader_drop_last`: False
|
378 |
-
- `dataloader_num_workers`: 0
|
379 |
-
- `dataloader_prefetch_factor`: None
|
380 |
-
- `past_index`: -1
|
381 |
-
- `disable_tqdm`: False
|
382 |
-
- `remove_unused_columns`: True
|
383 |
-
- `label_names`: None
|
384 |
-
- `load_best_model_at_end`: False
|
385 |
-
- `ignore_data_skip`: False
|
386 |
-
- `fsdp`: []
|
387 |
-
- `fsdp_min_num_params`: 0
|
388 |
-
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
|
389 |
-
- `fsdp_transformer_layer_cls_to_wrap`: None
|
390 |
-
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
|
391 |
-
- `deepspeed`: None
|
392 |
-
- `label_smoothing_factor`: 0.0
|
393 |
-
- `optim`: adamw_torch
|
394 |
-
- `optim_args`: None
|
395 |
-
- `adafactor`: False
|
396 |
-
- `group_by_length`: False
|
397 |
-
- `length_column_name`: length
|
398 |
-
- `ddp_find_unused_parameters`: None
|
399 |
-
- `ddp_bucket_cap_mb`: None
|
400 |
-
- `ddp_broadcast_buffers`: False
|
401 |
-
- `dataloader_pin_memory`: True
|
402 |
-
- `dataloader_persistent_workers`: False
|
403 |
-
- `skip_memory_metrics`: True
|
404 |
-
- `use_legacy_prediction_loop`: False
|
405 |
-
- `push_to_hub`: False
|
406 |
-
- `resume_from_checkpoint`: None
|
407 |
-
- `hub_model_id`: None
|
408 |
-
- `hub_strategy`: every_save
|
409 |
-
- `hub_private_repo`: False
|
410 |
-
- `hub_always_push`: False
|
411 |
-
- `gradient_checkpointing`: False
|
412 |
-
- `gradient_checkpointing_kwargs`: None
|
413 |
-
- `include_inputs_for_metrics`: False
|
414 |
-
- `eval_do_concat_batches`: True
|
415 |
-
- `fp16_backend`: auto
|
416 |
-
- `push_to_hub_model_id`: None
|
417 |
-
- `push_to_hub_organization`: None
|
418 |
-
- `mp_parameters`:
|
419 |
-
- `auto_find_batch_size`: False
|
420 |
-
- `full_determinism`: False
|
421 |
-
- `torchdynamo`: None
|
422 |
-
- `ray_scope`: last
|
423 |
-
- `ddp_timeout`: 1800
|
424 |
-
- `torch_compile`: False
|
425 |
-
- `torch_compile_backend`: None
|
426 |
-
- `torch_compile_mode`: None
|
427 |
-
- `dispatch_batches`: None
|
428 |
-
- `split_batches`: None
|
429 |
-
- `include_tokens_per_second`: False
|
430 |
-
- `include_num_input_tokens_seen`: False
|
431 |
-
- `neftune_noise_alpha`: None
|
432 |
-
- `optim_target_modules`: None
|
433 |
-
- `batch_eval_metrics`: False
|
434 |
-
- `eval_on_start`: False
|
435 |
-
- `batch_sampler`: batch_sampler
|
436 |
-
- `multi_dataset_batch_sampler`: round_robin
|
437 |
-
|
438 |
-
</details>
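The Training Logs table that follows pairs each logged step with a fractional epoch. As a rough sketch of how the two columns relate, the snippet below estimates the number of steps per epoch from a few (epoch, step) pairs copied from that table. The helper names `estimate_steps_per_epoch` and `epoch_at` are hypothetical and are not part of the trainer or of this model card.

```python
# (epoch, step) pairs copied from the Training Logs table.
LOGGED = [(0.0055, 500), (1.0021, 91500), (2.0042, 183000), (4.0030, 365500)]


def estimate_steps_per_epoch(pairs):
    """Least-squares slope of step = k * epoch, fitted through the origin."""
    numerator = sum(epoch * step for epoch, step in pairs)
    denominator = sum(epoch * epoch for epoch, _ in pairs)
    return numerator / denominator


def epoch_at(step, steps_per_epoch):
    """Fractional epoch reached after `step` logged steps."""
    return step / steps_per_epoch


steps_per_epoch = estimate_steps_per_epoch(LOGGED)
print(f"estimated steps per epoch: {steps_per_epoch:.0f}")
print(f"epoch at step 1000: {epoch_at(1000, steps_per_epoch):.4f}")
```

The estimate comes out near 91,300 steps per epoch. Together with the `dataset_size:730454` tag in the card metadata, that would be consistent with roughly 8 samples per step, but the batch size is not shown in this section, so treat that as an inference rather than a documented setting.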
|
439 |
-
|
440 |
-
### Training Logs
|
441 |
-
<details><summary>Click to expand</summary>
|
442 |
-
|
443 |
-
| Epoch | Step | Training Loss |
|
444 |
-
|:------:|:------:|:-------------:|
|
445 |
-
| 0.0055 | 500 | 1.6701 |
|
446 |
-
| 0.0110 | 1000 | 0.8225 |
|
447 |
-
| 0.0164 | 1500 | 0.3883 |
|
448 |
-
| 0.0219 | 2000 | 0.2685 |
|
449 |
-
| 0.0274 | 2500 | 0.2349 |
|
450 |
-
| 0.0329 | 3000 | 0.1685 |
|
451 |
-
| 0.0383 | 3500 | 0.1409 |
|
452 |
-
| 0.0438 | 4000 | 0.1262 |
|
453 |
-
| 0.0493 | 4500 | 0.1195 |
|
454 |
-
| 0.0548 | 5000 | 0.1044 |
|
455 |
-
| 0.0602 | 5500 | 0.0989 |
|
456 |
-
| 0.0657 | 6000 | 0.0787 |
|
457 |
-
| 0.0712 | 6500 | 0.0895 |
|
458 |
-
| 0.0767 | 7000 | 0.0708 |
|
459 |
-
| 0.0821 | 7500 | 0.0834 |
|
460 |
-
| 0.0876 | 8000 | 0.0634 |
|
461 |
-
| 0.0931 | 8500 | 0.0643 |
|
462 |
-
| 0.0986 | 9000 | 0.0567 |
|
463 |
-
| 0.1040 | 9500 | 0.0646 |
|
464 |
-
| 0.1095 | 10000 | 0.0607 |
|
465 |
-
| 0.1150 | 10500 | 0.0564 |
|
466 |
-
| 0.1205 | 11000 | 0.068 |
|
467 |
-
| 0.1259 | 11500 | 0.0536 |
|
468 |
-
| 0.1314 | 12000 | 0.0594 |
|
469 |
-
| 0.1369 | 12500 | 0.057 |
|
470 |
-
| 0.1424 | 13000 | 0.0555 |
|
471 |
-
| 0.1479 | 13500 | 0.0485 |
|
472 |
-
| 0.1533 | 14000 | 0.0528 |
|
473 |
-
| 0.1588 | 14500 | 0.0478 |
|
474 |
-
| 0.1643 | 15000 | 0.0586 |
|
475 |
-
| 0.1698 | 15500 | 0.0539 |
|
476 |
-
| 0.1752 | 16000 | 0.0432 |
|
477 |
-
| 0.1807 | 16500 | 0.0542 |
|
478 |
-
| 0.1862 | 17000 | 0.0536 |
|
479 |
-
| 0.1917 | 17500 | 0.0492 |
|
480 |
-
| 0.1971 | 18000 | 0.0427 |
|
481 |
-
| 0.2026 | 18500 | 0.0489 |
|
482 |
-
| 0.2081 | 19000 | 0.0502 |
|
483 |
-
| 0.2136 | 19500 | 0.0432 |
|
484 |
-
| 0.2190 | 20000 | 0.0459 |
|
485 |
-
| 0.2245 | 20500 | 0.0376 |
|
486 |
-
| 0.2300 | 21000 | 0.0489 |
|
487 |
-
| 0.2355 | 21500 | 0.0515 |
|
488 |
-
| 0.2409 | 22000 | 0.0429 |
|
489 |
-
| 0.2464 | 22500 | 0.0417 |
|
490 |
-
| 0.2519 | 23000 | 0.0478 |
|
491 |
-
| 0.2574 | 23500 | 0.0359 |
|
492 |
-
| 0.2628 | 24000 | 0.0452 |
|
493 |
-
| 0.2683 | 24500 | 0.0443 |
|
494 |
-
| 0.2738 | 25000 | 0.0409 |
|
495 |
-
| 0.2793 | 25500 | 0.0421 |
|
496 |
-
| 0.2848 | 26000 | 0.0393 |
|
497 |
-
| 0.2902 | 26500 | 0.0409 |
|
498 |
-
| 0.2957 | 27000 | 0.032 |
|
499 |
-
| 0.3012 | 27500 | 0.0468 |
|
500 |
-
| 0.3067 | 28000 | 0.0285 |
|
501 |
-
| 0.3121 | 28500 | 0.0311 |
|
502 |
-
| 0.3176 | 29000 | 0.0304 |
|
503 |
-
| 0.3231 | 29500 | 0.0349 |
|
504 |
-
| 0.3286 | 30000 | 0.0352 |
|
505 |
-
| 0.3340 | 30500 | 0.0367 |
|
506 |
-
| 0.3395 | 31000 | 0.0385 |
|
507 |
-
| 0.3450 | 31500 | 0.0325 |
|
508 |
-
| 0.3505 | 32000 | 0.0302 |
|
509 |
-
| 0.3559 | 32500 | 0.0393 |
|
510 |
-
| 0.3614 | 33000 | 0.032 |
|
511 |
-
| 0.3669 | 33500 | 0.0263 |
|
512 |
-
| 0.3724 | 34000 | 0.0343 |
|
513 |
-
| 0.3778 | 34500 | 0.0349 |
|
514 |
-
| 0.3833 | 35000 | 0.0282 |
|
515 |
-
| 0.3888 | 35500 | 0.034 |
|
516 |
-
| 0.3943 | 36000 | 0.0376 |
|
517 |
-
| 0.3998 | 36500 | 0.0265 |
|
518 |
-
| 0.4052 | 37000 | 0.0267 |
|
519 |
-
| 0.4107 | 37500 | 0.0241 |
|
520 |
-
| 0.4162 | 38000 | 0.033 |
|
521 |
-
| 0.4217 | 38500 | 0.0323 |
|
522 |
-
| 0.4271 | 39000 | 0.0278 |
|
523 |
-
| 0.4326 | 39500 | 0.025 |
|
524 |
-
| 0.4381 | 40000 | 0.0363 |
|
525 |
-
| 0.4436 | 40500 | 0.0312 |
|
526 |
-
| 0.4490 | 41000 | 0.0307 |
|
527 |
-
| 0.4545 | 41500 | 0.0305 |
|
528 |
-
| 0.4600 | 42000 | 0.028 |
|
529 |
-
| 0.4655 | 42500 | 0.0279 |
|
530 |
-
| 0.4709 | 43000 | 0.0265 |
|
531 |
-
| 0.4764 | 43500 | 0.0262 |
|
532 |
-
| 0.4819 | 44000 | 0.0308 |
|
533 |
-
| 0.4874 | 44500 | 0.0282 |
|
534 |
-
| 0.4928 | 45000 | 0.0243 |
|
535 |
-
| 0.4983 | 45500 | 0.0236 |
|
536 |
-
| 0.5038 | 46000 | 0.02 |
|
537 |
-
| 0.5093 | 46500 | 0.0254 |
|
538 |
-
| 0.5147 | 47000 | 0.0275 |
|
539 |
-
| 0.5202 | 47500 | 0.0309 |
|
540 |
-
| 0.5257 | 48000 | 0.031 |
|
541 |
-
| 0.5312 | 48500 | 0.0271 |
|
542 |
-
| 0.5367 | 49000 | 0.0218 |
|
543 |
-
| 0.5421 | 49500 | 0.0249 |
|
544 |
-
| 0.5476 | 50000 | 0.0285 |
|
545 |
-
| 0.5531 | 50500 | 0.03 |
|
546 |
-
| 0.5586 | 51000 | 0.0284 |
|
547 |
-
| 0.5640 | 51500 | 0.0258 |
|
548 |
-
| 0.5695 | 52000 | 0.0228 |
|
549 |
-
| 0.5750 | 52500 | 0.0305 |
|
550 |
-
| 0.5805 | 53000 | 0.0234 |
|
551 |
-
| 0.5859 | 53500 | 0.0209 |
|
552 |
-
| 0.5914 | 54000 | 0.0341 |
|
553 |
-
| 0.5969 | 54500 | 0.0269 |
|
554 |
-
| 0.6024 | 55000 | 0.0267 |
|
555 |
-
| 0.6078 | 55500 | 0.0245 |
|
556 |
-
| 0.6133 | 56000 | 0.0263 |
|
557 |
-
| 0.6188 | 56500 | 0.0195 |
|
558 |
-
| 0.6243 | 57000 | 0.0209 |
|
559 |
-
| 0.6297 | 57500 | 0.0313 |
|
560 |
-
| 0.6352 | 58000 | 0.0247 |
|
561 |
-
| 0.6407 | 58500 | 0.0285 |
|
562 |
-
| 0.6462 | 59000 | 0.0301 |
|
563 |
-
| 0.6516 | 59500 | 0.0227 |
|
564 |
-
| 0.6571 | 60000 | 0.0235 |
|
565 |
-
| 0.6626 | 60500 | 0.0272 |
|
566 |
-
| 0.6681 | 61000 | 0.025 |
|
567 |
-
| 0.6736 | 61500 | 0.0276 |
|
568 |
-
| 0.6790 | 62000 | 0.0289 |
|
569 |
-
| 0.6845 | 62500 | 0.0232 |
|
570 |
-
| 0.6900 | 63000 | 0.0258 |
|
571 |
-
| 0.6955 | 63500 | 0.0254 |
|
572 |
-
| 0.7009 | 64000 | 0.0205 |
|
573 |
-
| 0.7064 | 64500 | 0.0216 |
|
574 |
-
| 0.7119 | 65000 | 0.0304 |
|
575 |
-
| 0.7174 | 65500 | 0.0234 |
|
576 |
-
| 0.7228 | 66000 | 0.0233 |
|
577 |
-
| 0.7283 | 66500 | 0.0239 |
|
578 |
-
| 0.7338 | 67000 | 0.0166 |
|
579 |
-
| 0.7393 | 67500 | 0.0211 |
|
580 |
-
| 0.7447 | 68000 | 0.0212 |
|
581 |
-
| 0.7502 | 68500 | 0.0247 |
|
582 |
-
| 0.7557 | 69000 | 0.023 |
|
583 |
-
| 0.7612 | 69500 | 0.0261 |
|
584 |
-
| 0.7666 | 70000 | 0.0204 |
|
585 |
-
| 0.7721 | 70500 | 0.026 |
|
586 |
-
| 0.7776 | 71000 | 0.0299 |
|
587 |
-
| 0.7831 | 71500 | 0.0183 |
|
588 |
-
| 0.7885 | 72000 | 0.0228 |
|
589 |
-
| 0.7940 | 72500 | 0.0181 |
|
590 |
-
| 0.7995 | 73000 | 0.0237 |
|
591 |
-
| 0.8050 | 73500 | 0.0237 |
|
592 |
-
| 0.8105 | 74000 | 0.0158 |
|
593 |
-
| 0.8159 | 74500 | 0.0222 |
|
594 |
-
| 0.8214 | 75000 | 0.0196 |
|
595 |
-
| 0.8269 | 75500 | 0.0242 |
|
596 |
-
| 0.8324 | 76000 | 0.0218 |
|
597 |
-
| 0.8378 | 76500 | 0.0201 |
|
598 |
-
| 0.8433 | 77000 | 0.026 |
|
599 |
-
| 0.8488 | 77500 | 0.0232 |
|
600 |
-
| 0.8543 | 78000 | 0.0254 |
|
601 |
-
| 0.8597 | 78500 | 0.0218 |
|
602 |
-
| 0.8652 | 79000 | 0.0219 |
|
603 |
-
| 0.8707 | 79500 | 0.0255 |
|
604 |
-
| 0.8762 | 80000 | 0.0201 |
|
605 |
-
| 0.8816 | 80500 | 0.0301 |
|
606 |
-
| 0.8871 | 81000 | 0.0275 |
|
607 |
-
| 0.8926 | 81500 | 0.018 |
|
608 |
-
| 0.8981 | 82000 | 0.028 |
|
609 |
-
| 0.9035 | 82500 | 0.0223 |
|
610 |
-
| 0.9090 | 83000 | 0.0201 |
|
611 |
-
| 0.9145 | 83500 | 0.0299 |
|
612 |
-
| 0.9200 | 84000 | 0.0251 |
|
613 |
-
| 0.9254 | 84500 | 0.0203 |
|
614 |
-
| 0.9309 | 85000 | 0.0209 |
|
615 |
-
| 0.9364 | 85500 | 0.0236 |
|
616 |
-
| 0.9419 | 86000 | 0.0191 |
|
617 |
-
| 0.9474 | 86500 | 0.0168 |
|
618 |
-
| 0.9528 | 87000 | 0.017 |
|
619 |
-
| 0.9583 | 87500 | 0.0201 |
|
620 |
-
| 0.9638 | 88000 | 0.0171 |
|
621 |
-
| 0.9693 | 88500 | 0.0217 |
|
622 |
-
| 0.9747 | 89000 | 0.0208 |
|
623 |
-
| 0.9802 | 89500 | 0.0157 |
|
624 |
-
| 0.9857 | 90000 | 0.0218 |
|
625 |
-
| 0.9912 | 90500 | 0.021 |
|
626 |
-
| 0.9966 | 91000 | 0.0159 |
|
627 |
-
| 1.0021 | 91500 | 0.0189 |
|
628 |
-
| 1.0076 | 92000 | 0.0182 |
|
629 |
-
| 1.0131 | 92500 | 0.0206 |
|
630 |
-
| 1.0185 | 93000 | 0.0179 |
|
631 |
-
| 1.0240 | 93500 | 0.0168 |
|
632 |
-
| 1.0295 | 94000 | 0.019 |
|
633 |
-
| 1.0350 | 94500 | 0.0173 |
|
634 |
-
| 1.0404 | 95000 | 0.0172 |
|
635 |
-
| 1.0459 | 95500 | 0.0187 |
|
636 |
-
| 1.0514 | 96000 | 0.0199 |
|
637 |
-
| 1.0569 | 96500 | 0.0202 |
|
638 |
-
| 1.0624 | 97000 | 0.0198 |
|
639 |
-
| 1.0678 | 97500 | 0.0157 |
|
640 |
-
| 1.0733 | 98000 | 0.0178 |
|
641 |
-
| 1.0788 | 98500 | 0.0147 |
|
642 |
-
| 1.0843 | 99000 | 0.0152 |
|
643 |
-
| 1.0897 | 99500 | 0.0152 |
|
644 |
-
| 1.0952 | 100000 | 0.0126 |
|
645 |
-
| 1.1007 | 100500 | 0.0115 |
|
646 |
-
| 1.1062 | 101000 | 0.0122 |
|
647 |
-
| 1.1116 | 101500 | 0.0097 |
|
648 |
-
| 1.1171 | 102000 | 0.0149 |
|
649 |
-
| 1.1226 | 102500 | 0.0151 |
|
650 |
-
| 1.1281 | 103000 | 0.0134 |
|
651 |
-
| 1.1335 | 103500 | 0.0157 |
|
652 |
-
| 1.1390 | 104000 | 0.0141 |
|
653 |
-
| 1.1445 | 104500 | 0.0139 |
|
654 |
-
| 1.1500 | 105000 | 0.0149 |
|
655 |
-
| 1.1554 | 105500 | 0.0103 |
|
656 |
-
| 1.1609 | 106000 | 0.0138 |
|
657 |
-
| 1.1664 | 106500 | 0.0116 |
|
658 |
-
| 1.1719 | 107000 | 0.0146 |
|
659 |
-
| 1.1773 | 107500 | 0.0168 |
|
660 |
-
| 1.1828 | 108000 | 0.0166 |
|
661 |
-
| 1.1883 | 108500 | 0.0136 |
|
662 |
-
| 1.1938 | 109000 | 0.0103 |
|
663 |
-
| 1.1993 | 109500 | 0.0128 |
|
664 |
-
| 1.2047 | 110000 | 0.0112 |
|
665 |
-
| 1.2102 | 110500 | 0.0103 |
|
666 |
-
| 1.2157 | 111000 | 0.0133 |
|
667 |
-
| 1.2212 | 111500 | 0.0118 |
|
668 |
-
| 1.2266 | 112000 | 0.009 |
|
669 |
-
| 1.2321 | 112500 | 0.0151 |
|
670 |
-
| 1.2376 | 113000 | 0.0146 |
|
671 |
-
| 1.2431 | 113500 | 0.0143 |
|
672 |
-
| 1.2485 | 114000 | 0.01 |
|
673 |
-
| 1.2540 | 114500 | 0.0147 |
|
674 |
-
| 1.2595 | 115000 | 0.011 |
|
675 |
-
| 1.2650 | 115500 | 0.0121 |
|
676 |
-
| 1.2704 | 116000 | 0.0117 |
|
677 |
-
| 1.2759 | 116500 | 0.0151 |
|
678 |
-
| 1.2814 | 117000 | 0.0143 |
|
679 |
-
| 1.2869 | 117500 | 0.0163 |
|
680 |
-
| 1.2923 | 118000 | 0.0135 |
|
681 |
-
| 1.2978 | 118500 | 0.0118 |
|
682 |
-
| 1.3033 | 119000 | 0.0129 |
|
683 |
-
| 1.3088 | 119500 | 0.0062 |
|
684 |
-
| 1.3142 | 120000 | 0.0127 |
|
685 |
-
| 1.3197 | 120500 | 0.014 |
|
686 |
-
| 1.3252 | 121000 | 0.0131 |
|
687 |
-
| 1.3307 | 121500 | 0.0162 |
|
688 |
-
| 1.3362 | 122000 | 0.0107 |
|
689 |
-
| 1.3416 | 122500 | 0.0125 |
|
690 |
-
| 1.3471 | 123000 | 0.0136 |
|
691 |
-
| 1.3526 | 123500 | 0.0112 |
|
692 |
-
| 1.3581 | 124000 | 0.0126 |
|
693 |
-
| 1.3635 | 124500 | 0.0079 |
|
694 |
-
| 1.3690 | 125000 | 0.0104 |
|
695 |
-
| 1.3745 | 125500 | 0.0137 |
|
696 |
-
| 1.3800 | 126000 | 0.0075 |
|
697 |
-
| 1.3854 | 126500 | 0.0108 |
|
698 |
-
| 1.3909 | 127000 | 0.0087 |
|
699 |
-
| 1.3964 | 127500 | 0.0138 |
|
700 |
-
| 1.4019 | 128000 | 0.0056 |
|
701 |
-
| 1.4073 | 128500 | 0.0067 |
|
702 |
-
| 1.4128 | 129000 | 0.0103 |
|
703 |
-
| 1.4183 | 129500 | 0.0102 |
|
704 |
-
| 1.4238 | 130000 | 0.0119 |
|
705 |
-
| 1.4292 | 130500 | 0.0094 |
|
706 |
-
| 1.4347 | 131000 | 0.0075 |
|
707 |
-
| 1.4402 | 131500 | 0.0146 |
|
708 |
-
| 1.4457 | 132000 | 0.0103 |
|
709 |
-
| 1.4511 | 132500 | 0.0123 |
|
710 |
-
| 1.4566 | 133000 | 0.0107 |
|
711 |
-
| 1.4621 | 133500 | 0.0071 |
|
712 |
-
| 1.4676 | 134000 | 0.0087 |
|
713 |
-
| 1.4731 | 134500 | 0.0072 |
|
714 |
-
| 1.4785 | 135000 | 0.0094 |
|
715 |
-
| 1.4840 | 135500 | 0.0083 |
|
716 |
-
| 1.4895 | 136000 | 0.0104 |
|
717 |
-
| 1.4950 | 136500 | 0.0076 |
|
718 |
-
| 1.5004 | 137000 | 0.006 |
|
719 |
-
| 1.5059 | 137500 | 0.0085 |
|
720 |
-
| 1.5114 | 138000 | 0.0061 |
|
721 |
-
| 1.5169 | 138500 | 0.0106 |
|
722 |
-
| 1.5223 | 139000 | 0.0088 |
|
723 |
-
| 1.5278 | 139500 | 0.0111 |
|
724 |
-
| 1.5333 | 140000 | 0.0094 |
|
725 |
-
| 1.5388 | 140500 | 0.0079 |
|
726 |
-
| 1.5442 | 141000 | 0.0095 |
|
727 |
-
| 1.5497 | 141500 | 0.0098 |
|
728 |
-
| 1.5552 | 142000 | 0.0139 |
|
729 |
-
| 1.5607 | 142500 | 0.0085 |
|
730 |
-
| 1.5661 | 143000 | 0.0094 |
|
731 |
-
| 1.5716 | 143500 | 0.0088 |
|
732 |
-
| 1.5771 | 144000 | 0.0092 |
|
733 |
-
| 1.5826 | 144500 | 0.0071 |
|
734 |
-
| 1.5880 | 145000 | 0.0101 |
|
735 |
-
| 1.5935 | 145500 | 0.011 |
|
736 |
-
| 1.5990 | 146000 | 0.0097 |
|
737 |
-
| 1.6045 | 146500 | 0.0071 |
|
738 |
-
| 1.6100 | 147000 | 0.0114 |
|
739 |
-
| 1.6154 | 147500 | 0.0087 |
|
740 |
-
| 1.6209 | 148000 | 0.0075 |
|
741 |
-
| 1.6264 | 148500 | 0.0039 |
|
742 |
-
| 1.6319 | 149000 | 0.0091 |
|
743 |
-
| 1.6373 | 149500 | 0.0117 |
|
744 |
-
| 1.6428 | 150000 | 0.01 |
|
745 |
-
| 1.6483 | 150500 | 0.0099 |
|
746 |
-
| 1.6538 | 151000 | 0.0069 |
|
747 |
-
| 1.6592 | 151500 | 0.0084 |
|
748 |
-
| 1.6647 | 152000 | 0.0118 |
|
749 |
-
| 1.6702 | 152500 | 0.0078 |
|
750 |
-
| 1.6757 | 153000 | 0.0067 |
|
751 |
-
| 1.6811 | 153500 | 0.0133 |
|
752 |
-
| 1.6866 | 154000 | 0.0079 |
|
753 |
-
| 1.6921 | 154500 | 0.0092 |
|
754 |
-
| 1.6976 | 155000 | 0.0069 |
|
755 |
-
| 1.7030 | 155500 | 0.008 |
|
756 |
-
| 1.7085 | 156000 | 0.0124 |
|
757 |
-
| 1.7140 | 156500 | 0.0112 |
|
758 |
-
| 1.7195 | 157000 | 0.0074 |
|
759 |
-
| 1.7249 | 157500 | 0.0091 |
|
760 |
-
| 1.7304 | 158000 | 0.0088 |
|
761 |
-
| 1.7359 | 158500 | 0.0061 |
|
762 |
-
| 1.7414 | 159000 | 0.0089 |
|
763 |
-
| 1.7469 | 159500 | 0.0082 |
|
764 |
-
| 1.7523 | 160000 | 0.0103 |
|
765 |
-
| 1.7578 | 160500 | 0.0094 |
|
766 |
-
| 1.7633 | 161000 | 0.0073 |
|
767 |
-
| 1.7688 | 161500 | 0.0116 |
|
768 |
-
| 1.7742 | 162000 | 0.0112 |
|
769 |
-
| 1.7797 | 162500 | 0.0057 |
|
770 |
-
| 1.7852 | 163000 | 0.0075 |
|
771 |
-
| 1.7907 | 163500 | 0.0062 |
|
772 |
-
| 1.7961 | 164000 | 0.0046 |
|
773 |
-
| 1.8016 | 164500 | 0.0091 |
|
774 |
-
| 1.8071 | 165000 | 0.0066 |
|
775 |
-
| 1.8126 | 165500 | 0.0051 |
|
776 |
-
| 1.8180 | 166000 | 0.0066 |
|
777 |
-
| 1.8235 | 166500 | 0.0093 |
|
778 |
-
| 1.8290 | 167000 | 0.0079 |
|
779 |
-
| 1.8345 | 167500 | 0.0067 |
|
780 |
-
| 1.8399 | 168000 | 0.007 |
|
781 |
-
| 1.8454 | 168500 | 0.0133 |
|
782 |
-
| 1.8509 | 169000 | 0.0071 |
|
783 |
-
| 1.8564 | 169500 | 0.0091 |
|
784 |
-
| 1.8619 | 170000 | 0.0067 |
|
785 |
-
| 1.8673 | 170500 | 0.0091 |
|
786 |
-
| 1.8728 | 171000 | 0.0103 |
|
787 |
-
| 1.8783 | 171500 | 0.0058 |
|
788 |
-
| 1.8838 | 172000 | 0.0116 |
|
789 |
-
| 1.8892 | 172500 | 0.0089 |
|
790 |
-
| 1.8947 | 173000 | 0.0137 |
|
791 |
-
| 1.9002 | 173500 | 0.0065 |
|
792 |
-
| 1.9057 | 174000 | 0.0098 |
|
793 |
-
| 1.9111 | 174500 | 0.0083 |
|
794 |
-
| 1.9166 | 175000 | 0.0115 |
|
795 |
-
| 1.9221 | 175500 | 0.0083 |
|
796 |
-
| 1.9276 | 176000 | 0.0084 |
|
797 |
-
| 1.9330 | 176500 | 0.0091 |
|
798 |
-
| 1.9385 | 177000 | 0.0092 |
|
799 |
-
| 1.9440 | 177500 | 0.0054 |
|
800 |
-
| 1.9495 | 178000 | 0.0049 |
|
801 |
-
| 1.9549 | 178500 | 0.0072 |
|
802 |
-
| 1.9604 | 179000 | 0.0052 |
|
803 |
-
| 1.9659 | 179500 | 0.0063 |
|
804 |
-
| 1.9714 | 180000 | 0.0107 |
|
805 |
-
| 1.9768 | 180500 | 0.0061 |
|
806 |
-
| 1.9823 | 181000 | 0.0059 |
|
807 |
-
| 1.9878 | 181500 | 0.0067 |
|
808 |
-
| 1.9933 | 182000 | 0.0078 |
|
809 |
-
| 1.9988 | 182500 | 0.007 |
|
810 |
-
| 2.0042 | 183000 | 0.0065 |
|
811 |
-
| 2.0097 | 183500 | 0.0073 |
|
812 |
-
| 2.0152 | 184000 | 0.01 |
|
813 |
-
| 2.0207 | 184500 | 0.0072 |
|
814 |
-
| 2.0261 | 185000 | 0.0055 |
|
815 |
-
| 2.0316 | 185500 | 0.0087 |
|
816 |
-
| 2.0371 | 186000 | 0.0077 |
|
817 |
-
| 2.0426 | 186500 | 0.0067 |
|
818 |
-
| 2.0480 | 187000 | 0.008 |
|
819 |
-
| 2.0535 | 187500 | 0.0074 |
|
820 |
-
| 2.0590 | 188000 | 0.0072 |
|
821 |
-
| 2.0645 | 188500 | 0.0045 |
|
822 |
-
| 2.0699 | 189000 | 0.0082 |
|
823 |
-
| 2.0754 | 189500 | 0.0042 |
|
824 |
-
| 2.0809 | 190000 | 0.0076 |
|
825 |
-
| 2.0864 | 190500 | 0.0058 |
|
826 |
-
| 2.0918 | 191000 | 0.005 |
|
827 |
-
| 2.0973 | 191500 | 0.0047 |
|
828 |
-
| 2.1028 | 192000 | 0.0045 |
|
829 |
-
| 2.1083 | 192500 | 0.0043 |
|
830 |
-
| 2.1137 | 193000 | 0.0049 |
|
831 |
-
| 2.1192 | 193500 | 0.0058 |
|
832 |
-
| 2.1247 | 194000 | 0.0081 |
|
833 |
-
| 2.1302 | 194500 | 0.0057 |
|
834 |
-
| 2.1357 | 195000 | 0.0047 |
|
835 |
-
| 2.1411 | 195500 | 0.0073 |
|
836 |
-
| 2.1466 | 196000 | 0.0056 |
|
837 |
-
| 2.1521 | 196500 | 0.006 |
|
838 |
-
| 2.1576 | 197000 | 0.0061 |
|
839 |
-
| 2.1630 | 197500 | 0.0042 |
|
840 |
-
| 2.1685 | 198000 | 0.0057 |
|
841 |
-
| 2.1740 | 198500 | 0.0055 |
|
842 |
-
| 2.1795 | 199000 | 0.0053 |
|
843 |
-
| 2.1849 | 199500 | 0.0085 |
|
844 |
-
| 2.1904 | 200000 | 0.005 |
|
845 |
-
| 2.1959 | 200500 | 0.0055 |
|
846 |
-
| 2.2014 | 201000 | 0.0032 |
|
847 |
-
| 2.2068 | 201500 | 0.0054 |
|
848 |
-
| 2.2123 | 202000 | 0.0037 |
|
849 |
-
| 2.2178 | 202500 | 0.0046 |
|
850 |
-
| 2.2233 | 203000 | 0.0029 |
|
851 |
-
| 2.2287 | 203500 | 0.0043 |
|
852 |
-
| 2.2342 | 204000 | 0.0063 |
|
853 |
-
| 2.2397 | 204500 | 0.0064 |
|
854 |
-
| 2.2452 | 205000 | 0.0046 |
|
855 |
-
| 2.2506 | 205500 | 0.0061 |
|
856 |
-
| 2.2561 | 206000 | 0.0034 |
|
857 |
-
| 2.2616 | 206500 | 0.0046 |
|
858 |
-
| 2.2671 | 207000 | 0.0059 |
|
859 |
-
| 2.2726 | 207500 | 0.0044 |
|
860 |
-
| 2.2780 | 208000 | 0.0054 |
|
861 |
-
| 2.2835 | 208500 | 0.0049 |
|
862 |
-
| 2.2890 | 209000 | 0.0096 |
|
863 |
-
| 2.2945 | 209500 | 0.0045 |
|
864 |
-
| 2.2999 | 210000 | 0.0057 |
|
865 |
-
| 2.3054 | 210500 | 0.0032 |
|
866 |
-
| 2.3109 | 211000 | 0.0031 |
|
867 |
-
| 2.3164 | 211500 | 0.0043 |
|
868 |
-
| 2.3218 | 212000 | 0.0068 |
|
869 |
-
| 2.3273 | 212500 | 0.0048 |
|
870 |
-
| 2.3328 | 213000 | 0.0042 |
|
871 |
-
| 2.3383 | 213500 | 0.0068 |
|
872 |
-
| 2.3437 | 214000 | 0.0041 |
|
873 |
-
| 2.3492 | 214500 | 0.0042 |
|
874 |
-
| 2.3547 | 215000 | 0.0051 |
|
875 |
-
| 2.3602 | 215500 | 0.0049 |
|
876 |
-
| 2.3656 | 216000 | 0.0019 |
|
877 |
-
| 2.3711 | 216500 | 0.0039 |
|
878 |
-
| 2.3766 | 217000 | 0.0068 |
|
879 |
-
| 2.3821 | 217500 | 0.0033 |
|
880 |
-
| 2.3875 | 218000 | 0.0048 |
|
881 |
-
| 2.3930 | 218500 | 0.0052 |
|
882 |
-
| 2.3985 | 219000 | 0.0063 |
|
883 |
-
| 2.4040 | 219500 | 0.003 |
|
884 |
-
| 2.4095 | 220000 | 0.0036 |
|
885 |
-
| 2.4149 | 220500 | 0.004 |
|
886 |
-
| 2.4204 | 221000 | 0.006 |
|
887 |
-
| 2.4259 | 221500 | 0.0048 |
|
888 |
-
| 2.4314 | 222000 | 0.0037 |
|
889 |
-
| 2.4368 | 222500 | 0.0034 |
|
890 |
-
| 2.4423 | 223000 | 0.0049 |
|
891 |
-
| 2.4478 | 223500 | 0.0036 |
|
892 |
-
| 2.4533 | 224000 | 0.0046 |
|
893 |
-
| 2.4587 | 224500 | 0.0039 |
|
894 |
-
| 2.4642 | 225000 | 0.0021 |
|
895 |
-
| 2.4697 | 225500 | 0.0035 |
|
896 |
-
| 2.4752 | 226000 | 0.0034 |
|
897 |
-
| 2.4806 | 226500 | 0.003 |
|
898 |
-
| 2.4861 | 227000 | 0.0032 |
|
899 |
-
| 2.4916 | 227500 | 0.005 |
|
900 |
-
| 2.4971 | 228000 | 0.0025 |
|
901 |
-
| 2.5025 | 228500 | 0.0036 |
|
902 |
-
| 2.5080 | 229000 | 0.0021 |
|
903 |
-
| 2.5135 | 229500 | 0.0025 |
|
904 |
-
| 2.5190 | 230000 | 0.0036 |
|
905 |
-
| 2.5245 | 230500 | 0.0033 |
|
906 |
-
| 2.5299 | 231000 | 0.0049 |
|
907 |
-
| 2.5354 | 231500 | 0.0044 |
|
908 |
-
| 2.5409 | 232000 | 0.0029 |
|
909 |
-
| 2.5464 | 232500 | 0.0028 |
|
910 |
-
| 2.5518 | 233000 | 0.0091 |
|
911 |
-
| 2.5573 | 233500 | 0.004 |
|
912 |
-
| 2.5628 | 234000 | 0.0036 |
|
913 |
-
| 2.5683 | 234500 | 0.0029 |
|
914 |
-
| 2.5737 | 235000 | 0.0035 |
|
915 |
-
| 2.5792 | 235500 | 0.0038 |
|
916 |
-
| 2.5847 | 236000 | 0.0028 |
|
917 |
-
| 2.5902 | 236500 | 0.0041 |
|
918 |
-
| 2.5956 | 237000 | 0.0037 |
|
919 |
-
| 2.6011 | 237500 | 0.0031 |
|
920 |
-
| 2.6066 | 238000 | 0.0036 |
|
921 |
-
| 2.6121 | 238500 | 0.0052 |
|
922 |
-
| 2.6175 | 239000 | 0.0031 |
|
923 |
-
| 2.6230 | 239500 | 0.0023 |
|
924 |
-
| 2.6285 | 240000 | 0.0043 |
|
925 |
-
| 2.6340 | 240500 | 0.0027 |
|
926 |
-
| 2.6394 | 241000 | 0.0048 |
|
927 |
-
| 2.6449 | 241500 | 0.0046 |
|
928 |
-
| 2.6504 | 242000 | 0.0038 |
|
929 |
-
| 2.6559 | 242500 | 0.0033 |
|
930 |
-
| 2.6614 | 243000 | 0.003 |
|
931 |
-
| 2.6668 | 243500 | 0.0057 |
|
932 |
-
| 2.6723 | 244000 | 0.0044 |
|
933 |
-
| 2.6778 | 244500 | 0.0058 |
|
934 |
-
| 2.6833 | 245000 | 0.003 |
|
935 |
-
| 2.6887 | 245500 | 0.0042 |
|
936 |
-
| 2.6942 | 246000 | 0.0045 |
|
937 |
-
| 2.6997 | 246500 | 0.0031 |
|
938 |
-
| 2.7052 | 247000 | 0.0021 |
|
939 |
-
| 2.7106 | 247500 | 0.0043 |
|
940 |
-
| 2.7161 | 248000 | 0.0058 |
|
941 |
-
| 2.7216 | 248500 | 0.0041 |
|
942 |
-
| 2.7271 | 249000 | 0.0038 |
|
943 |
-
| 2.7325 | 249500 | 0.0019 |
|
944 |
-
| 2.7380 | 250000 | 0.0029 |
|
945 |
-
| 2.7435 | 250500 | 0.003 |
|
946 |
-
| 2.7490 | 251000 | 0.0038 |
|
947 |
-
| 2.7544 | 251500 | 0.004 |
|
948 |
-
| 2.7599 | 252000 | 0.0049 |
|
949 |
-
| 2.7654 | 252500 | 0.0039 |
|
950 |
-
| 2.7709 | 253000 | 0.005 |
|
951 |
-
| 2.7763 | 253500 | 0.0046 |
|
952 |
-
| 2.7818 | 254000 | 0.0025 |
|
953 |
-
| 2.7873 | 254500 | 0.0044 |
|
954 |
-
| 2.7928 | 255000 | 0.0023 |
|
955 |
-
| 2.7983 | 255500 | 0.0038 |
|
956 |
-
| 2.8037 | 256000 | 0.0032 |
|
957 |
-
| 2.8092 | 256500 | 0.0021 |
|
958 |
-
| 2.8147 | 257000 | 0.0023 |
|
959 |
-
| 2.8202 | 257500 | 0.0042 |
|
960 |
-
| 2.8256 | 258000 | 0.0042 |
|
961 |
-
| 2.8311 | 258500 | 0.0053 |
|
962 |
-
| 2.8366 | 259000 | 0.0021 |
|
963 |
-
| 2.8421 | 259500 | 0.0033 |
|
964 |
-
| 2.8475 | 260000 | 0.0047 |
|
965 |
-
| 2.8530 | 260500 | 0.0048 |
|
966 |
-
| 2.8585 | 261000 | 0.0022 |
|
967 |
-
| 2.8640 | 261500 | 0.0036 |
|
968 |
-
| 2.8694 | 262000 | 0.0034 |
|
969 |
-
| 2.8749 | 262500 | 0.0029 |
|
970 |
-
| 2.8804 | 263000 | 0.0038 |
|
971 |
-
| 2.8859 | 263500 | 0.0067 |
|
972 |
-
| 2.8913 | 264000 | 0.003 |
|
973 |
-
| 2.8968 | 264500 | 0.0049 |
|
974 |
-
| 2.9023 | 265000 | 0.0027 |
|
975 |
-
| 2.9078 | 265500 | 0.004 |
|
976 |
-
| 2.9132 | 266000 | 0.0042 |
|
977 |
-
| 2.9187 | 266500 | 0.0042 |
|
978 |
-
| 2.9242 | 267000 | 0.0038 |
|
979 |
-
| 2.9297 | 267500 | 0.0029 |
|
980 |
-
| 2.9352 | 268000 | 0.0039 |
|
981 |
-
| 2.9406 | 268500 | 0.0039 |
|
982 |
-
| 2.9461 | 269000 | 0.002 |
|
983 |
-
| 2.9516 | 269500 | 0.0022 |
|
984 |
-
| 2.9571 | 270000 | 0.002 |
|
985 |
-
| 2.9625 | 270500 | 0.003 |
|
986 |
-
| 2.9680 | 271000 | 0.0019 |
|
987 |
-
| 2.9735 | 271500 | 0.0044 |
|
988 |
-
| 2.9790 | 272000 | 0.0028 |
|
989 |
-
| 2.9844 | 272500 | 0.0031 |
|
990 |
-
| 2.9899 | 273000 | 0.0025 |
|
991 |
-
| 2.9954 | 273500 | 0.0021 |
|
992 |
-
| 3.0009 | 274000 | 0.0025 |
|
993 |
-
| 3.0063 | 274500 | 0.0038 |
|
994 |
-
| 3.0118 | 275000 | 0.0045 |
|
995 |
-
| 3.0173 | 275500 | 0.002 |
|
996 |
-
| 3.0228 | 276000 | 0.0035 |
|
997 |
-
| 3.0282 | 276500 | 0.0046 |
|
998 |
-
| 3.0337 | 277000 | 0.0033 |
|
999 |
-
| 3.0392 | 277500 | 0.002 |
|
1000 |
-
| 3.0447 | 278000 | 0.0036 |
|
1001 |
-
| 3.0501 | 278500 | 0.0025 |
|
1002 |
-
| 3.0556 | 279000 | 0.0039 |
|
1003 |
-
| 3.0611 | 279500 | 0.0029 |
|
1004 |
-
| 3.0666 | 280000 | 0.004 |
|
1005 |
-
| 3.0721 | 280500 | 0.0023 |
|
1006 |
-
| 3.0775 | 281000 | 0.0019 |
|
1007 |
-
| 3.0830 | 281500 | 0.0019 |
|
1008 |
-
| 3.0885 | 282000 | 0.0027 |
|
1009 |
-
| 3.0940 | 282500 | 0.0014 |
|
1010 |
-
| 3.0994 | 283000 | 0.0019 |
|
1011 |
-
| 3.1049 | 283500 | 0.0018 |
|
1012 |
-
| 3.1104 | 284000 | 0.0016 |
|
1013 |
-
| 3.1159 | 284500 | 0.0017 |
|
1014 |
-
| 3.1213 | 285000 | 0.0049 |
|
1015 |
-
| 3.1268 | 285500 | 0.0022 |
|
1016 |
-
| 3.1323 | 286000 | 0.0023 |
|
1017 |
-
| 3.1378 | 286500 | 0.0016 |
|
1018 |
-
| 3.1432 | 287000 | 0.002 |
|
1019 |
-
| 3.1487 | 287500 | 0.0025 |
|
1020 |
-
| 3.1542 | 288000 | 0.0012 |
|
1021 |
-
| 3.1597 | 288500 | 0.0021 |
|
1022 |
-
| 3.1651 | 289000 | 0.0017 |
|
1023 |
-
| 3.1706 | 289500 | 0.0019 |
|
1024 |
-
| 3.1761 | 290000 | 0.0019 |
|
1025 |
-
| 3.1816 | 290500 | 0.0042 |
|
1026 |
-
| 3.1871 | 291000 | 0.0027 |
|
1027 |
-
| 3.1925 | 291500 | 0.0011 |
|
1028 |
-
| 3.1980 | 292000 | 0.002 |
|
1029 |
-
| 3.2035 | 292500 | 0.0021 |
|
1030 |
-
| 3.2090 | 293000 | 0.0015 |
|
1031 |
-
| 3.2144 | 293500 | 0.0017 |
|
1032 |
-
| 3.2199 | 294000 | 0.002 |
|
1033 |
-
| 3.2254 | 294500 | 0.0012 |
|
1034 |
-
| 3.2309 | 295000 | 0.0017 |
|
1035 |
-
| 3.2363 | 295500 | 0.0029 |
|
1036 |
-
| 3.2418 | 296000 | 0.0019 |
|
1037 |
-
| 3.2473 | 296500 | 0.0017 |
|
1038 |
-
| 3.2528 | 297000 | 0.0019 |
|
1039 |
-
| 3.2582 | 297500 | 0.0012 |
|
1040 |
-
| 3.2637 | 298000 | 0.0024 |
|
1041 |
-
| 3.2692 | 298500 | 0.0017 |
|
1042 |
-
| 3.2747 | 299000 | 0.0022 |
|
1043 |
-
| 3.2801 | 299500 | 0.002 |
|
1044 |
-
| 3.2856 | 300000 | 0.0028 |
|
1045 |
-
| 3.2911 | 300500 | 0.0036 |
|
1046 |
-
| 3.2966 | 301000 | 0.0015 |
|
1047 |
-
| 3.3020 | 301500 | 0.0024 |
|
1048 |
-
| 3.3075 | 302000 | 0.0015 |
|
1049 |
-
| 3.3130 | 302500 | 0.0012 |
|
1050 |
-
| 3.3185 | 303000 | 0.0022 |
|
1051 |
-
| 3.3240 | 303500 | 0.0015 |
|
1052 |
-
| 3.3294 | 304000 | 0.0023 |
|
1053 |
-
| 3.3349 | 304500 | 0.0017 |
|
1054 |
-
| 3.3404 | 305000 | 0.0021 |
|
1055 |
-
| 3.3459 | 305500 | 0.0017 |
|
1056 |
-
| 3.3513 | 306000 | 0.0015 |
|
1057 |
-
| 3.3568 | 306500 | 0.0023 |
|
1058 |
-
| 3.3623 | 307000 | 0.0014 |
|
1059 |
-
| 3.3678 | 307500 | 0.0019 |
|
1060 |
-
| 3.3732 | 308000 | 0.0017 |
|
1061 |
-
| 3.3787 | 308500 | 0.0027 |
|
1062 |
-
| 3.3842 | 309000 | 0.0016 |
|
1063 |
-
| 3.3897 | 309500 | 0.0019 |
|
1064 |
-
| 3.3951 | 310000 | 0.0037 |
|
1065 |
-
| 3.4006 | 310500 | 0.0016 |
|
1066 |
-
| 3.4061 | 311000 | 0.0012 |
|
1067 |
-
| 3.4116 | 311500 | 0.0024 |
|
1068 |
-
| 3.4170 | 312000 | 0.0016 |
|
1069 |
-
| 3.4225 | 312500 | 0.0022 |
|
1070 |
-
| 3.4280 | 313000 | 0.0015 |
|
1071 |
-
| 3.4335 | 313500 | 0.0017 |
|
1072 |
-
| 3.4389 | 314000 | 0.0015 |
|
1073 |
-
| 3.4444 | 314500 | 0.0018 |
|
1074 |
-
| 3.4499 | 315000 | 0.0015 |
|
1075 |
-
| 3.4554 | 315500 | 0.0019 |
|
1076 |
-
| 3.4609 | 316000 | 0.0009 |
|
1077 |
-
| 3.4663 | 316500 | 0.001 |
|
1078 |
-
| 3.4718 | 317000 | 0.001 |
|
1079 |
-
| 3.4773 | 317500 | 0.0023 |
|
1080 |
-
| 3.4828 | 318000 | 0.0012 |
|
1081 |
-
| 3.4882 | 318500 | 0.0012 |
|
1082 |
-
| 3.4937 | 319000 | 0.0011 |
|
1083 |
-
| 3.4992 | 319500 | 0.0008 |
|
1084 |
-
| 3.5047 | 320000 | 0.0018 |
|
1085 |
-
| 3.5101 | 320500 | 0.0009 |
|
1086 |
-
| 3.5156 | 321000 | 0.0016 |
|
1087 |
-
| 3.5211 | 321500 | 0.0012 |
|
1088 |
-
| 3.5266 | 322000 | 0.0015 |
|
1089 |
-
| 3.5320 | 322500 | 0.0024 |
|
1090 |
-
| 3.5375 | 323000 | 0.0016 |
|
1091 |
-
| 3.5430 | 323500 | 0.0014 |
|
1092 |
-
| 3.5485 | 324000 | 0.0014 |
|
1093 |
-
| 3.5539 | 324500 | 0.0047 |
|
1094 |
-
| 3.5594 | 325000 | 0.0013 |
|
1095 |
-
| 3.5649 | 325500 | 0.0012 |
|
1096 |
-
| 3.5704 | 326000 | 0.0013 |
|
1097 |
-
| 3.5758 | 326500 | 0.0011 |
|
1098 |
-
| 3.5813 | 327000 | 0.0011 |
|
1099 |
-
| 3.5868 | 327500 | 0.0016 |
|
1100 |
-
| 3.5923 | 328000 | 0.0022 |
|
1101 |
-
| 3.5978 | 328500 | 0.0017 |
|
1102 |
-
| 3.6032 | 329000 | 0.0012 |
|
1103 |
-
| 3.6087 | 329500 | 0.002 |
|
1104 |
-
| 3.6142 | 330000 | 0.0016 |
|
1105 |
-
| 3.6197 | 330500 | 0.0009 |
|
1106 |
-
| 3.6251 | 331000 | 0.0011 |
|
1107 |
-
| 3.6306 | 331500 | 0.0019 |
|
1108 |
-
| 3.6361 | 332000 | 0.0011 |
|
1109 |
-
| 3.6416 | 332500 | 0.0021 |
|
1110 |
-
| 3.6470 | 333000 | 0.0029 |
|
1111 |
-
| 3.6525 | 333500 | 0.001 |
|
1112 |
-
| 3.6580 | 334000 | 0.0016 |
|
1113 |
-
| 3.6635 | 334500 | 0.0016 |
|
1114 |
-
| 3.6689 | 335000 | 0.0036 |
|
1115 |
-
| 3.6744 | 335500 | 0.0012 |
|
1116 |
-
| 3.6799 | 336000 | 0.003 |
|
1117 |
-
| 3.6854 | 336500 | 0.0014 |
|
1118 |
-
| 3.6908 | 337000 | 0.0018 |
|
1119 |
-
| 3.6963 | 337500 | 0.001 |
|
1120 |
-
| 3.7018 | 338000 | 0.001 |
|
1121 |
-
| 3.7073 | 338500 | 0.0016 |
|
1122 |
-
| 3.7127 | 339000 | 0.0025 |
|
1123 |
-
| 3.7182 | 339500 | 0.001 |
|
1124 |
-
| 3.7237 | 340000 | 0.0018 |
|
1125 |
-
| 3.7292 | 340500 | 0.0015 |
|
1126 |
-
| 3.7347 | 341000 | 0.001 |
|
1127 |
-
| 3.7401 | 341500 | 0.0009 |
|
1128 |
-
| 3.7456 | 342000 | 0.0013 |
|
1129 |
-
| 3.7511 | 342500 | 0.0014 |
|
1130 |
-
| 3.7566 | 343000 | 0.0013 |
|
1131 |
-
| 3.7620 | 343500 | 0.0011 |
|
1132 |
-
| 3.7675 | 344000 | 0.0026 |
|
1133 |
-
| 3.7730 | 344500 | 0.0014 |
|
1134 |
-
| 3.7785 | 345000 | 0.0021 |
|
1135 |
-
| 3.7839 | 345500 | 0.0015 |
|
1136 |
-
| 3.7894 | 346000 | 0.0013 |
|
1137 |
-
| 3.7949 | 346500 | 0.0013 |
|
1138 |
-
| 3.8004 | 347000 | 0.0019 |
|
1139 |
-
| 3.8058 | 347500 | 0.0009 |
|
1140 |
-
| 3.8113 | 348000 | 0.0009 |
|
1141 |
-
| 3.8168 | 348500 | 0.0014 |
|
1142 |
-
| 3.8223 | 349000 | 0.0012 |
|
1143 |
-
| 3.8277 | 349500 | 0.0032 |
|
1144 |
-
| 3.8332 | 350000 | 0.0015 |
|
1145 |
-
| 3.8387 | 350500 | 0.0011 |
|
1146 |
-
| 3.8442 | 351000 | 0.002 |
|
1147 |
-
| 3.8497 | 351500 | 0.0012 |
|
1148 |
-
| 3.8551 | 352000 | 0.0026 |
|
1149 |
-
| 3.8606 | 352500 | 0.001 |
|
1150 |
-
| 3.8661 | 353000 | 0.0018 |
|
1151 |
-
| 3.8716 | 353500 | 0.0014 |
|
1152 |
-
| 3.8770 | 354000 | 0.001 |
|
1153 |
-
| 3.8825 | 354500 | 0.0018 |
|
1154 |
-
| 3.8880 | 355000 | 0.0027 |
|
1155 |
-
| 3.8935 | 355500 | 0.0027 |
|
1156 |
-
| 3.8989 | 356000 | 0.0011 |
|
1157 |
-
| 3.9044 | 356500 | 0.0024 |
|
1158 |
-
| 3.9099 | 357000 | 0.0012 |
|
1159 |
-
| 3.9154 | 357500 | 0.0018 |
|
1160 |
-
| 3.9208 | 358000 | 0.0012 |
|
1161 |
-
| 3.9263 | 358500 | 0.0015 |
|
1162 |
-
| 3.9318 | 359000 | 0.0015 |
|
1163 |
-
| 3.9373 | 359500 | 0.0018 |
|
1164 |
-
| 3.9427 | 360000 | 0.0017 |
|
1165 |
-
| 3.9482 | 360500 | 0.0009 |
|
1166 |
-
| 3.9537 | 361000 | 0.001 |
|
1167 |
-
| 3.9592 | 361500 | 0.0013 |
|
1168 |
-
| 3.9646 | 362000 | 0.0008 |
|
1169 |
-
| 3.9701 | 362500 | 0.0018 |
|
1170 |
-
| 3.9756 | 363000 | 0.0027 |
|
1171 |
-
| 3.9811 | 363500 | 0.0009 |
|
1172 |
-
| 3.9866 | 364000 | 0.0008 |
|
1173 |
-
| 3.9920 | 364500 | 0.001 |
|
1174 |
-
| 3.9975 | 365000 | 0.0009 |
|
1175 |
-
| 4.0030 | 365500 | 0.0012 |
|
1176 |
-
| 4.0085 | 366000 | 0.0011 |
|
1177 |
-
| 4.0139 | 366500 | 0.0023 |
|
1178 |
-
| 4.0194 | 367000 | 0.0023 |
|
1179 |
-
| 4.0249 | 367500 | 0.0012 |
|
1180 |
-
| 4.0304 | 368000 | 0.0018 |
|
1181 |
-
| 4.0358 | 368500 | 0.0013 |
|
1182 |
-
| 4.0413 | 369000 | 0.0009 |
|
1183 |
-
| 4.0468 | 369500 | 0.0016 |
|
1184 |
-
| 4.0523 | 370000 | 0.0011 |
|
1185 |
-
| 4.0577 | 370500 | 0.0011 |
|
1186 |
-
| 4.0632 | 371000 | 0.0009 |
|
1187 |
-
| 4.0687 | 371500 | 0.0012 |
|
1188 |
-
| 4.0742 | 372000 | 0.0011 |
|
1189 |
-
| 4.0796 | 372500 | 0.0008 |
|
1190 |
-
| 4.0851 | 373000 | 0.001 |
|
1191 |
-
| 4.0906 | 373500 | 0.0008 |
|
1192 |
-
| 4.0961 | 374000 | 0.0009 |
|
1193 |
-
| 4.1015 | 374500 | 0.0008 |
|
1194 |
-
| 4.1070 | 375000 | 0.0008 |
|
1195 |
-
| 4.1125 | 375500 | 0.0008 |
|
1196 |
-
| 4.1180 | 376000 | 0.0009 |
|
1197 |
-
| 4.1235 | 376500 | 0.0021 |
|
1198 |
-
| 4.1289 | 377000 | 0.0007 |
|
1199 |
-
| 4.1344 | 377500 | 0.0014 |
|
1200 |
-
| 4.1399 | 378000 | 0.0008 |
|
1201 |
-
| 4.1454 | 378500 | 0.0015 |
|
1202 |
-
| 4.1508 | 379000 | 0.0008 |
|
1203 |
-
| 4.1563 | 379500 | 0.0008 |
|
1204 |
-
| 4.1618 | 380000 | 0.0015 |
|
1205 |
-
| 4.1673 | 380500 | 0.0008 |
|
1206 |
-
| 4.1727 | 381000 | 0.0009 |
|
1207 |
-
| 4.1782 | 381500 | 0.0018 |
|
1208 |
-
| 4.1837 | 382000 | 0.0013 |
|
1209 |
-
| 4.1892 | 382500 | 0.0012 |
|
1210 |
-
| 4.1946 | 383000 | 0.0008 |
|
1211 |
-
| 4.2001 | 383500 | 0.0008 |
|
1212 |
-
| 4.2056 | 384000 | 0.0008 |
|
1213 |
-
| 4.2111 | 384500 | 0.0008 |
|
1214 |
-
| 4.2165 | 385000 | 0.001 |
|
1215 |
-
| 4.2220 | 385500 | 0.0008 |
|
1216 |
-
| 4.2275 | 386000 | 0.0008 |
|
1217 |
-
| 4.2330 | 386500 | 0.0009 |
|
1218 |
-
| 4.2384 | 387000 | 0.0008 |
|
1219 |
-
| 4.2439 | 387500 | 0.0008 |
|
1220 |
-
| 4.2494 | 388000 | 0.0011 |
|
1221 |
-
| 4.2549 | 388500 | 0.0009 |
|
1222 |
-
| 4.2604 | 389000 | 0.0007 |
|
1223 |
-
| 4.2658 | 389500 | 0.001 |
|
1224 |
-
| 4.2713 | 390000 | 0.0007 |
|
1225 |
-
| 4.2768 | 390500 | 0.0011 |
|
1226 |
-
| 4.2823 | 391000 | 0.0007 |
|
1227 |
-
| 4.2877 | 391500 | 0.0019 |
|
1228 |
-
| 4.2932 | 392000 | 0.0009 |
|
1229 |
-
| 4.2987 | 392500 | 0.0011 |
|
1230 |
-
| 4.3042 | 393000 | 0.0008 |
|
1231 |
-
| 4.3096 | 393500 | 0.0006 |
|
1232 |
-
| 4.3151 | 394000 | 0.0009 |
|
1233 |
-
| 4.3206 | 394500 | 0.001 |
|
1234 |
-
| 4.3261 | 395000 | 0.0007 |
|
1235 |
-
| 4.3315 | 395500 | 0.0011 |
|
1236 |
-
| 4.3370 | 396000 | 0.0008 |
|
1237 |
-
| 4.3425 | 396500 | 0.0007 |
|
1238 |
-
| 4.3480 | 397000 | 0.0007 |
|
1239 |
-
| 4.3534 | 397500 | 0.0007 |
|
1240 |
-
| 4.3589 | 398000 | 0.001 |
|
1241 |
-
| 4.3644 | 398500 | 0.0008 |
|
1242 |
-
| 4.3699 | 399000 | 0.001 |
|
1243 |
-
| 4.3753 | 399500 | 0.0014 |
|
1244 |
-
| 4.3808 | 400000 | 0.0006 |
|
1245 |
-
| 4.3863 | 400500 | 0.0006 |
|
1246 |
-
| 4.3918 | 401000 | 0.001 |
|
1247 |
-
| 4.3973 | 401500 | 0.002 |
|
1248 |
-
| 4.4027 | 402000 | 0.0006 |
|
1249 |
-
| 4.4082 | 402500 | 0.0007 |
|
1250 |
-
| 4.4137 | 403000 | 0.001 |
|
1251 |
-
| 4.4192 | 403500 | 0.0008 |
|
1252 |
-
| 4.4246 | 404000 | 0.0008 |
|
1253 |
-
| 4.4301 | 404500 | 0.0009 |
|
1254 |
-
| 4.4356 | 405000 | 0.0005 |
|
1255 |
-
| 4.4411 | 405500 | 0.0008 |
|
1256 |
-
| 4.4465 | 406000 | 0.0008 |
|
1257 |
-
| 4.4520 | 406500 | 0.0007 |
|
1258 |
-
| 4.4575 | 407000 | 0.0006 |
|
1259 |
-
| 4.4630 | 407500 | 0.0006 |
|
1260 |
-
| 4.4684 | 408000 | 0.0006 |
|
1261 |
-
| 4.4739 | 408500 | 0.0006 |
|
1262 |
-
| 4.4794 | 409000 | 0.0009 |
|
1263 |
-
| 4.4849 | 409500 | 0.0007 |
|
1264 |
-
| 4.4903 | 410000 | 0.0009 |
|
1265 |
-
| 4.4958 | 410500 | 0.0006 |
|
1266 |
-
| 4.5013 | 411000 | 0.0007 |
|
1267 |
-
| 4.5068 | 411500 | 0.0006 |
|
1268 |
-
| 4.5122 | 412000 | 0.0007 |
|
1269 |
-
| 4.5177 | 412500 | 0.0006 |
|
1270 |
-
| 4.5232 | 413000 | 0.0008 |
|
1271 |
-
| 4.5287 | 413500 | 0.0007 |
|
1272 |
-
| 4.5342 | 414000 | 0.0013 |
|
1273 |
-
| 4.5396 | 414500 | 0.0006 |
|
1274 |
-
| 4.5451 | 415000 | 0.0009 |
|
1275 |
-
| 4.5506 | 415500 | 0.0015 |
|
1276 |
-
| 4.5561 | 416000 | 0.0014 |
|
1277 |
-
| 4.5615 | 416500 | 0.0007 |
|
1278 |
-
| 4.5670 | 417000 | 0.0007 |
|
1279 |
-
| 4.5725 | 417500 | 0.0008 |
|
1280 |
-
| 4.5780 | 418000 | 0.0008 |
|
1281 |
-
| 4.5834 | 418500 | 0.0007 |
|
1282 |
-
| 4.5889 | 419000 | 0.0006 |
|
1283 |
-
| 4.5944 | 419500 | 0.0008 |
|
1284 |
-
| 4.5999 | 420000 | 0.0008 |
|
1285 |
-
| 4.6053 | 420500 | 0.0006 |
|
1286 |
-
| 4.6108 | 421000 | 0.001 |
|
1287 |
-
| 4.6163 | 421500 | 0.0005 |
|
1288 |
-
| 4.6218 | 422000 | 0.0007 |
|
1289 |
-
| 4.6272 | 422500 | 0.0006 |
|
1290 |
-
| 4.6327 | 423000 | 0.0007 |
|
1291 |
-
| 4.6382 | 423500 | 0.0009 |
|
1292 |
-
| 4.6437 | 424000 | 0.0014 |
|
1293 |
-
| 4.6492 | 424500 | 0.0008 |
|
1294 |
-
| 4.6546 | 425000 | 0.0006 |
|
1295 |
-
| 4.6601 | 425500 | 0.0006 |
|
1296 |
-
| 4.6656 | 426000 | 0.0016 |
|
1297 |
-
| 4.6711 | 426500 | 0.0006 |
|
1298 |
-
| 4.6765 | 427000 | 0.0006 |
|
1299 |
-
| 4.6820 | 427500 | 0.0012 |
|
1300 |
-
| 4.6875 | 428000 | 0.0007 |
|
1301 |
-
| 4.6930 | 428500 | 0.0009 |
|
1302 |
-
| 4.6984 | 429000 | 0.0006 |
|
1303 |
-
| 4.7039 | 429500 | 0.0005 |
|
1304 |
-
| 4.7094 | 430000 | 0.0007 |
|
1305 |
-
| 4.7149 | 430500 | 0.0007 |
|
1306 |
-
| 4.7203 | 431000 | 0.0006 |
|
1307 |
-
| 4.7258 | 431500 | 0.0006 |
|
1308 |
-
| 4.7313 | 432000 | 0.0006 |
|
1309 |
-
| 4.7368 | 432500 | 0.0006 |
|
1310 |
-
| 4.7422 | 433000 | 0.0006 |
|
1311 |
-
| 4.7477 | 433500 | 0.0006 |
|
1312 |
-
| 4.7532 | 434000 | 0.0006 |
|
1313 |
-
| 4.7587 | 434500 | 0.0006 |
|
1314 |
-
| 4.7641 | 435000 | 0.0006 |
|
1315 |
-
| 4.7696 | 435500 | 0.0018 |
|
1316 |
-
| 4.7751 | 436000 | 0.0009 |
|
1317 |
-
| 4.7806 | 436500 | 0.0007 |
|
1318 |
-
| 4.7861 | 437000 | 0.0007 |
|
1319 |
-
| 4.7915 | 437500 | 0.0005 |
|
1320 |
-
| 4.7970 | 438000 | 0.0009 |
|
1321 |
-
| 4.8025 | 438500 | 0.0013 |
|
1322 |
-
| 4.8080 | 439000 | 0.0007 |
|
1323 |
-
| 4.8134 | 439500 | 0.0006 |
|
1324 |
-
| 4.8189 | 440000 | 0.0007 |
|
1325 |
-
| 4.8244 | 440500 | 0.001 |
|
1326 |
-
| 4.8299 | 441000 | 0.0019 |
|
1327 |
-
| 4.8353 | 441500 | 0.0006 |
|
1328 |
-
| 4.8408 | 442000 | 0.0006 |
|
1329 |
-
| 4.8463 | 442500 | 0.0009 |
|
1330 |
-
| 4.8518 | 443000 | 0.0006 |
|
1331 |
-
| 4.8572 | 443500 | 0.001 |
|
1332 |
-
| 4.8627 | 444000 | 0.0011 |
|
1333 |
-
| 4.8682 | 444500 | 0.0007 |
|
1334 |
-
| 4.8737 | 445000 | 0.0007 |
|
1335 |
-
| 4.8791 | 445500 | 0.0007 |
|
1336 |
-
| 4.8846 | 446000 | 0.0018 |
|
1337 |
-
| 4.8901 | 446500 | 0.0007 |
|
1338 |
-
| 4.8956 | 447000 | 0.0012 |
|
1339 |
-
| 4.9010 | 447500 | 0.0007 |
|
1340 |
-
| 4.9065 | 448000 | 0.0009 |
|
1341 |
-
| 4.9120 | 448500 | 0.0007 |
|
1342 |
-
| 4.9175 | 449000 | 0.001 |
|
1343 |
-
| 4.9230 | 449500 | 0.0007 |
|
1344 |
-
| 4.9284 | 450000 | 0.0007 |
|
1345 |
-
| 4.9339 | 450500 | 0.0007 |
|
1346 |
-
| 4.9394 | 451000 | 0.0011 |
|
1347 |
-
| 4.9449 | 451500 | 0.0005 |
|
1348 |
-
| 4.9503 | 452000 | 0.0007 |
|
1349 |
-
| 4.9558 | 452500 | 0.0006 |
|
1350 |
-
| 4.9613 | 453000 | 0.0009 |
|
1351 |
-
| 4.9668 | 453500 | 0.0008 |
|
1352 |
-
| 4.9722 | 454000 | 0.0015 |
|
1353 |
-
| 4.9777 | 454500 | 0.0008 |
|
1354 |
-
| 4.9832 | 455000 | 0.0006 |
|
1355 |
-
| 4.9887 | 455500 | 0.0006 |
|
1356 |
-
| 4.9941 | 456000 | 0.0007 |
|
1357 |
-
| 4.9996 | 456500 | 0.0006 |
|
1358 |
-
|
1359 |
-
</details>
|
1360 |
-
|
1361 |
-
### Framework Versions
|
1362 |
-
- Python: 3.12.2
|
1363 |
-
- Sentence Transformers: 3.0.1
|
1364 |
-
- Transformers: 4.42.3
|
1365 |
-
- PyTorch: 2.3.1+cu121
|
1366 |
-
- Accelerate: 0.32.1
|
1367 |
-
- Datasets: 2.20.0
|
1368 |
-
- Tokenizers: 0.19.1
|
1369 |
-
|
1370 |
-
## Citation
|
1371 |
-
|
1372 |
-
### BibTeX
|
1373 |
-
|
1374 |
-
#### Sentence Transformers
|
1375 |
-
```bibtex
|
1376 |
-
@inproceedings{reimers-2019-sentence-bert,
|
1377 |
-
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
|
1378 |
-
author = "Reimers, Nils and Gurevych, Iryna",
|
1379 |
-
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
|
1380 |
-
month = "11",
|
1381 |
-
year = "2019",
|
1382 |
-
publisher = "Association for Computational Linguistics",
|
1383 |
-
url = "https://arxiv.org/abs/1908.10084",
|
1384 |
-
}
|
1385 |
-
```
|
1386 |
-
|
1387 |
-
#### MultipleNegativesRankingLoss
|
1388 |
-
```bibtex
|
1389 |
-
@misc{henderson2017efficient,
|
1390 |
-
title={Efficient Natural Language Response Suggestion for Smart Reply},
|
1391 |
-
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
|
1392 |
-
year={2017},
|
1393 |
-
eprint={1705.00652},
|
1394 |
-
archivePrefix={arXiv},
|
1395 |
-
primaryClass={cs.CL}
|
1396 |
-
}
|
1397 |
-
```
|
1398 |
-
|
1399 |
-
<!--
|
1400 |
-
## Glossary
|
1401 |
-
|
1402 |
-
*Clearly define terms in order to be accessible across audiences.*
|
1403 |
-
-->
|
1404 |
-
|
1405 |
-
<!--
|
1406 |
-
## Model Card Authors
|
1407 |
-
|
1408 |
-
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
|
1409 |
-
-->
|
1410 |
-
|
1411 |
-
<!--
|
1412 |
-
## Model Card Contact
|
1413 |
-
|
1414 |
-
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
|
1415 |
-
-->
mx-01/config.json
DELETED
@@ -1,26 +0,0 @@
|
|
1 |
-
{
|
2 |
-
"_name_or_path": "nreimers/MiniLM-L6-H384-uncased",
|
3 |
-
"architectures": [
|
4 |
-
"BertModel"
|
5 |
-
],
|
6 |
-
"attention_probs_dropout_prob": 0.1,
|
7 |
-
"classifier_dropout": null,
|
8 |
-
"gradient_checkpointing": false,
|
9 |
-
"hidden_act": "gelu",
|
10 |
-
"hidden_dropout_prob": 0.1,
|
11 |
-
"hidden_size": 384,
|
12 |
-
"initializer_range": 0.02,
|
13 |
-
"intermediate_size": 1536,
|
14 |
-
"layer_norm_eps": 1e-12,
|
15 |
-
"max_position_embeddings": 512,
|
16 |
-
"model_type": "bert",
|
17 |
-
"num_attention_heads": 12,
|
18 |
-
"num_hidden_layers": 6,
|
19 |
-
"pad_token_id": 0,
|
20 |
-
"position_embedding_type": "absolute",
|
21 |
-
"torch_dtype": "float32",
|
22 |
-
"transformers_version": "4.42.3",
|
23 |
-
"type_vocab_size": 2,
|
24 |
-
"use_cache": true,
|
25 |
-
"vocab_size": 30522
|
26 |
-
}
mx-01/config_sentence_transformers.json
DELETED
@@ -1,10 +0,0 @@
|
|
1 |
-
{
|
2 |
-
"__version__": {
|
3 |
-
"sentence_transformers": "3.0.1",
|
4 |
-
"transformers": "4.42.3",
|
5 |
-
"pytorch": "2.3.1+cu121"
|
6 |
-
},
|
7 |
-
"prompts": {},
|
8 |
-
"default_prompt_name": null,
|
9 |
-
"similarity_fn_name": null
|
10 |
-
}
mx-01/log.txt
DELETED
@@ -1,912 +0,0 @@
|
|
1 |
-
{'loss': 0.3883, 'grad_norm': 21.324216842651367, 'learning_rate': 3e-06, 'epoch': 0.02}
|
2 |
-
{'loss': 0.2685, 'grad_norm': 7.115039825439453, 'learning_rate': 4.000000000000001e-06, 'epoch': 0.02}
|
3 |
-
{'loss': 0.2349, 'grad_norm': 22.00225067138672, 'learning_rate': 5e-06, 'epoch': 0.03}
|
4 |
-
{'loss': 0.1685, 'grad_norm': 8.233646392822266, 'learning_rate': 6e-06, 'epoch': 0.03}
|
5 |
-
{'loss': 0.1409, 'grad_norm': 1.1980726718902588, 'learning_rate': 7e-06, 'epoch': 0.04}
|
6 |
-
{'loss': 0.1262, 'grad_norm': 2.665900707244873, 'learning_rate': 8.000000000000001e-06, 'epoch': 0.04}
|
7 |
-
{'loss': 0.1195, 'grad_norm': 12.662271499633789, 'learning_rate': 9e-06, 'epoch': 0.05}
|
8 |
-
{'loss': 0.1044, 'grad_norm': 20.355819702148438, 'learning_rate': 1e-05, 'epoch': 0.05}
|
9 |
-
{'loss': 0.0989, 'grad_norm': 9.962722778320312, 'learning_rate': 1.1000000000000001e-05, 'epoch': 0.06}
|
10 |
-
{'loss': 0.0787, 'grad_norm': 2.361100912094116, 'learning_rate': 1.2e-05, 'epoch': 0.07}
|
11 |
-
{'loss': 0.0895, 'grad_norm': 0.5440080165863037, 'learning_rate': 1.3000000000000001e-05, 'epoch': 0.07}
|
12 |
-
{'loss': 0.0708, 'grad_norm': 22.654308319091797, 'learning_rate': 1.4e-05, 'epoch': 0.08}
|
13 |
-
{'loss': 0.0834, 'grad_norm': 1.5862770080566406, 'learning_rate': 1.5000000000000002e-05, 'epoch': 0.08}
|
14 |
-
{'loss': 0.0634, 'grad_norm': 2.121326446533203, 'learning_rate': 1.6000000000000003e-05, 'epoch': 0.09}
|
15 |
-
{'loss': 0.0643, 'grad_norm': 1.7471628189086914, 'learning_rate': 1.7e-05, 'epoch': 0.09}
|
16 |
-
{'loss': 0.0567, 'grad_norm': 1.2325271368026733, 'learning_rate': 1.8e-05, 'epoch': 0.1}
|
17 |
-
{'loss': 0.0646, 'grad_norm': 16.829893112182617, 'learning_rate': 1.9e-05, 'epoch': 0.1}
|
18 |
-
{'loss': 0.0607, 'grad_norm': 1.4897233247756958, 'learning_rate': 2e-05, 'epoch': 0.11}
|
19 |
-
{'loss': 0.0564, 'grad_norm': 1.4202508926391602, 'learning_rate': 1.997760508812398e-05, 'epoch': 0.11}
|
20 |
-
{'loss': 0.068, 'grad_norm': 0.0861266702413559, 'learning_rate': 1.995521017624796e-05, 'epoch': 0.12}
|
21 |
-
{'loss': 0.0536, 'grad_norm': 0.28206154704093933, 'learning_rate': 1.9932815264371937e-05, 'epoch': 0.13}
|
22 |
-
{'loss': 0.0594, 'grad_norm': 8.399012565612793, 'learning_rate': 1.9910420352495915e-05, 'epoch': 0.13}
|
23 |
-
{'loss': 0.057, 'grad_norm': 14.871772766113281, 'learning_rate': 1.9888025440619893e-05, 'epoch': 0.14}
|
24 |
-
{'loss': 0.0555, 'grad_norm': 0.14331811666488647, 'learning_rate': 1.9865630528743872e-05, 'epoch': 0.14}
|
25 |
-
{'loss': 0.0485, 'grad_norm': 18.870864868164062, 'learning_rate': 1.984323561686785e-05, 'epoch': 0.15}
|
26 |
-
{'loss': 0.0528, 'grad_norm': 1.9486477375030518, 'learning_rate': 1.982084070499183e-05, 'epoch': 0.15}
|
27 |
-
{'loss': 0.0478, 'grad_norm': 4.394039630889893, 'learning_rate': 1.9798445793115807e-05, 'epoch': 0.16}
|
28 |
-
{'loss': 0.0586, 'grad_norm': 0.8561568856239319, 'learning_rate': 1.9776050881239782e-05, 'epoch': 0.16}
|
29 |
-
{'loss': 0.0539, 'grad_norm': 1.0208560228347778, 'learning_rate': 1.9753655969363764e-05, 'epoch': 0.17}
|
30 |
-
{'loss': 0.0432, 'grad_norm': 4.224825859069824, 'learning_rate': 1.9731261057487742e-05, 'epoch': 0.18}
|
31 |
-
{'loss': 0.0542, 'grad_norm': 1.7423515319824219, 'learning_rate': 1.9708866145611717e-05, 'epoch': 0.18}
|
32 |
-
{'loss': 0.0536, 'grad_norm': 0.3042142689228058, 'learning_rate': 1.96864712337357e-05, 'epoch': 0.19}
|
33 |
-
{'loss': 0.0492, 'grad_norm': 2.4977452754974365, 'learning_rate': 1.9664076321859677e-05, 'epoch': 0.19}
|
34 |
-
{'loss': 0.0427, 'grad_norm': 3.3112635612487793, 'learning_rate': 1.9641681409983652e-05, 'epoch': 0.2}
|
35 |
-
{'loss': 0.0489, 'grad_norm': 0.42069825530052185, 'learning_rate': 1.9619286498107634e-05, 'epoch': 0.2}
|
36 |
-
{'loss': 0.0502, 'grad_norm': 0.1265694946050644, 'learning_rate': 1.959689158623161e-05, 'epoch': 0.21}
|
37 |
-
{'loss': 0.0432, 'grad_norm': 0.2148071676492691, 'learning_rate': 1.9574496674355587e-05, 'epoch': 0.21}
|
38 |
-
{'loss': 0.0459, 'grad_norm': 3.6947178840637207, 'learning_rate': 1.955210176247957e-05, 'epoch': 0.22}
|
39 |
-
{'loss': 0.0376, 'grad_norm': 4.6488776206970215, 'learning_rate': 1.9529706850603544e-05, 'epoch': 0.22}
|
40 |
-
{'loss': 0.0489, 'grad_norm': 1.4403197765350342, 'learning_rate': 1.9507311938727522e-05, 'epoch': 0.23}
|
41 |
-
{'loss': 0.0515, 'grad_norm': 4.117632865905762, 'learning_rate': 1.94849170268515e-05, 'epoch': 0.24}
|
42 |
-
{'loss': 0.0429, 'grad_norm': 0.196114644408226, 'learning_rate': 1.946252211497548e-05, 'epoch': 0.24}
|
43 |
-
{'loss': 0.0417, 'grad_norm': 0.06215653941035271, 'learning_rate': 1.9440127203099457e-05, 'epoch': 0.25}
|
44 |
-
{'loss': 0.0478, 'grad_norm': 0.19152769446372986, 'learning_rate': 1.9417732291223435e-05, 'epoch': 0.25}
|
45 |
-
{'loss': 0.0359, 'grad_norm': 0.24261559545993805, 'learning_rate': 1.9395337379347414e-05, 'epoch': 0.26}
|
46 |
-
{'loss': 0.0452, 'grad_norm': 0.7501357793807983, 'learning_rate': 1.9372942467471392e-05, 'epoch': 0.26}
|
47 |
-
{'loss': 0.0443, 'grad_norm': 2.0564398765563965, 'learning_rate': 1.935054755559537e-05, 'epoch': 0.27}
|
48 |
-
{'loss': 0.0409, 'grad_norm': 0.031283892691135406, 'learning_rate': 1.932815264371935e-05, 'epoch': 0.27}
|
49 |
-
{'loss': 0.0421, 'grad_norm': 0.15692433714866638, 'learning_rate': 1.9305757731843327e-05, 'epoch': 0.28}
|
50 |
-
{'loss': 0.0393, 'grad_norm': 14.9638090133667, 'learning_rate': 1.9283362819967305e-05, 'epoch': 0.28}
|
51 |
-
{'loss': 0.0409, 'grad_norm': 0.08281882852315903, 'learning_rate': 1.9260967908091284e-05, 'epoch': 0.29}
|
52 |
-
{'loss': 0.032, 'grad_norm': 0.3644435405731201, 'learning_rate': 1.9238572996215262e-05, 'epoch': 0.3}
|
53 |
-
{'loss': 0.0468, 'grad_norm': 0.13462503254413605, 'learning_rate': 1.9216178084339237e-05, 'epoch': 0.3}
|
54 |
-
{'loss': 0.0285, 'grad_norm': 0.3230873942375183, 'learning_rate': 1.919378317246322e-05, 'epoch': 0.31}
|
55 |
-
{'loss': 0.0311, 'grad_norm': 20.536762237548828, 'learning_rate': 1.9171388260587197e-05, 'epoch': 0.31}
|
56 |
-
{'loss': 0.0304, 'grad_norm': 1.3668054342269897, 'learning_rate': 1.9148993348711172e-05, 'epoch': 0.32}
|
57 |
-
{'loss': 0.0349, 'grad_norm': 15.044346809387207, 'learning_rate': 1.9126598436835154e-05, 'epoch': 0.32}
|
58 |
-
{'loss': 0.0352, 'grad_norm': 21.638084411621094, 'learning_rate': 1.9104203524959132e-05, 'epoch': 0.33}
|
59 |
-
{'loss': 0.0367, 'grad_norm': 0.060597069561481476, 'learning_rate': 1.9081808613083107e-05, 'epoch': 0.33}
|
60 |
-
{'loss': 0.0385, 'grad_norm': 1.7127445936203003, 'learning_rate': 1.905941370120709e-05, 'epoch': 0.34}
|
61 |
-
{'loss': 0.0325, 'grad_norm': 0.011513516306877136, 'learning_rate': 1.9037018789331064e-05, 'epoch': 0.34}
|
62 |
-
{'loss': 0.0302, 'grad_norm': 0.3838317394256592, 'learning_rate': 1.9014623877455042e-05, 'epoch': 0.35}
|
63 |
-
{'loss': 0.0393, 'grad_norm': 0.766505777835846, 'learning_rate': 1.8992228965579024e-05, 'epoch': 0.36}
|
64 |
-
{'loss': 0.032, 'grad_norm': 0.01746511273086071, 'learning_rate': 1.8969834053703e-05, 'epoch': 0.36}
|
65 |
-
{'loss': 0.0263, 'grad_norm': 5.542301654815674, 'learning_rate': 1.8947439141826977e-05, 'epoch': 0.37}
|
66 |
-
{'loss': 0.0343, 'grad_norm': 0.06212176755070686, 'learning_rate': 1.8925044229950956e-05, 'epoch': 0.37}
|
67 |
-
{'loss': 0.0349, 'grad_norm': 7.203415870666504, 'learning_rate': 1.8902649318074934e-05, 'epoch': 0.38}
|
68 |
-
{'loss': 0.0282, 'grad_norm': 0.04690209776163101, 'learning_rate': 1.8880254406198912e-05, 'epoch': 0.38}
|
69 |
-
{'loss': 0.034, 'grad_norm': 0.10681267082691193, 'learning_rate': 1.885785949432289e-05, 'epoch': 0.39}
|
70 |
-
{'loss': 0.0376, 'grad_norm': 0.5834813117980957, 'learning_rate': 1.883546458244687e-05, 'epoch': 0.39}
|
71 |
-
{'loss': 0.0265, 'grad_norm': 1.8356763124465942, 'learning_rate': 1.8813069670570847e-05, 'epoch': 0.4}
|
72 |
-
{'loss': 0.0267, 'grad_norm': 1.5589691400527954, 'learning_rate': 1.8790674758694826e-05, 'epoch': 0.41}
|
73 |
-
{'loss': 0.0241, 'grad_norm': 0.20994938910007477, 'learning_rate': 1.8768279846818804e-05, 'epoch': 0.41}
|
74 |
-
{'loss': 0.033, 'grad_norm': 1.8584221601486206, 'learning_rate': 1.8745884934942783e-05, 'epoch': 0.42}
|
75 |
-
{'loss': 0.0323, 'grad_norm': 1.917885661125183, 'learning_rate': 1.872349002306676e-05, 'epoch': 0.42}
|
76 |
-
{'loss': 0.0278, 'grad_norm': 6.673130989074707, 'learning_rate': 1.870109511119074e-05, 'epoch': 0.43}
|
77 |
-
{'loss': 0.025, 'grad_norm': 0.09119334816932678, 'learning_rate': 1.8678700199314718e-05, 'epoch': 0.43}
|
78 |
-
{'loss': 0.0363, 'grad_norm': 0.1400763988494873, 'learning_rate': 1.8656305287438696e-05, 'epoch': 0.44}
|
79 |
-
{'loss': 0.0312, 'grad_norm': 0.1186104416847229, 'learning_rate': 1.8633910375562674e-05, 'epoch': 0.44}
|
80 |
-
{'loss': 0.0307, 'grad_norm': 0.7495352625846863, 'learning_rate': 1.8611515463686653e-05, 'epoch': 0.45}
|
81 |
-
{'loss': 0.0305, 'grad_norm': 1.1061906814575195, 'learning_rate': 1.858912055181063e-05, 'epoch': 0.45}
|
82 |
-
{'loss': 0.028, 'grad_norm': 1.001441240310669, 'learning_rate': 1.856672563993461e-05, 'epoch': 0.46}
|
83 |
-
{'loss': 0.0279, 'grad_norm': 6.4315948486328125, 'learning_rate': 1.8544330728058588e-05, 'epoch': 0.47}
|
84 |
-
{'loss': 0.0265, 'grad_norm': 0.16143333911895752, 'learning_rate': 1.8521935816182566e-05, 'epoch': 0.47}
|
85 |
-
{'loss': 0.0262, 'grad_norm': 0.020146619528532028, 'learning_rate': 1.8499540904306544e-05, 'epoch': 0.48}
|
86 |
-
{'loss': 0.0308, 'grad_norm': 1.1892863512039185, 'learning_rate': 1.847714599243052e-05, 'epoch': 0.48}
|
87 |
-
{'loss': 0.0282, 'grad_norm': 0.5584899187088013, 'learning_rate': 1.84547510805545e-05, 'epoch': 0.49}
|
88 |
-
{'loss': 0.0243, 'grad_norm': 0.2598753869533539, 'learning_rate': 1.843235616867848e-05, 'epoch': 0.49}
|
89 |
-
{'loss': 0.0236, 'grad_norm': 2.231210947036743, 'learning_rate': 1.8409961256802454e-05, 'epoch': 0.5}
|
90 |
-
{'loss': 0.02, 'grad_norm': 0.025564778596162796, 'learning_rate': 1.8387566344926436e-05, 'epoch': 0.5}
|
91 |
-
{'loss': 0.0254, 'grad_norm': 12.388060569763184, 'learning_rate': 1.836517143305041e-05, 'epoch': 0.51}
|
92 |
-
{'loss': 0.0275, 'grad_norm': 0.014444425702095032, 'learning_rate': 1.834277652117439e-05, 'epoch': 0.51}
|
93 |
-
{'loss': 0.0309, 'grad_norm': 16.885656356811523, 'learning_rate': 1.832038160929837e-05, 'epoch': 0.52}
|
94 |
-
{'loss': 0.031, 'grad_norm': 13.603434562683105, 'learning_rate': 1.8297986697422346e-05, 'epoch': 0.53}
|
95 |
-
{'loss': 0.0271, 'grad_norm': 0.5440975427627563, 'learning_rate': 1.8275591785546325e-05, 'epoch': 0.53}
|
96 |
-
{'loss': 0.0218, 'grad_norm': 0.480385959148407, 'learning_rate': 1.8253196873670306e-05, 'epoch': 0.54}
|
97 |
-
{'loss': 0.0249, 'grad_norm': 0.11402398347854614, 'learning_rate': 1.823080196179428e-05, 'epoch': 0.54}
|
98 |
-
{'loss': 0.0285, 'grad_norm': 0.020077334716916084, 'learning_rate': 1.820840704991826e-05, 'epoch': 0.55}
|
99 |
-
{'loss': 0.03, 'grad_norm': 8.4814453125, 'learning_rate': 1.8186012138042238e-05, 'epoch': 0.55}
|
100 |
-
{'loss': 0.0284, 'grad_norm': 0.8492074012756348, 'learning_rate': 1.8163617226166216e-05, 'epoch': 0.56}
|
101 |
-
{'loss': 0.0258, 'grad_norm': 0.10311955213546753, 'learning_rate': 1.8141222314290195e-05, 'epoch': 0.56}
|
102 |
-
{'loss': 0.0228, 'grad_norm': 0.04972610995173454, 'learning_rate': 1.8118827402414173e-05, 'epoch': 0.57}
|
103 |
-
{'loss': 0.0305, 'grad_norm': 2.9211368560791016, 'learning_rate': 1.809643249053815e-05, 'epoch': 0.57}
|
104 |
-
{'loss': 0.0234, 'grad_norm': 0.11432399600744247, 'learning_rate': 1.807403757866213e-05, 'epoch': 0.58}
|
105 |
-
{'loss': 0.0209, 'grad_norm': 0.15198394656181335, 'learning_rate': 1.8051642666786108e-05, 'epoch': 0.59}
|
106 |
-
{'loss': 0.0341, 'grad_norm': 4.720141410827637, 'learning_rate': 1.8029247754910086e-05, 'epoch': 0.59}
|
107 |
-
{'loss': 0.0269, 'grad_norm': 0.02421008236706257, 'learning_rate': 1.8006852843034065e-05, 'epoch': 0.6}
|
108 |
-
{'loss': 0.0267, 'grad_norm': 0.49286821484565735, 'learning_rate': 1.798445793115804e-05, 'epoch': 0.6}
|
109 |
-
{'loss': 0.0245, 'grad_norm': 0.0072012050077319145, 'learning_rate': 1.796206301928202e-05, 'epoch': 0.61}
|
110 |
-
{'loss': 0.0263, 'grad_norm': 0.17817166447639465, 'learning_rate': 1.7939668107406e-05, 'epoch': 0.61}
|
111 |
-
{'loss': 0.0195, 'grad_norm': 0.024026205763220787, 'learning_rate': 1.7917273195529975e-05, 'epoch': 0.62}
|
112 |
-
{'loss': 0.0209, 'grad_norm': 0.21048341691493988, 'learning_rate': 1.7894878283653957e-05, 'epoch': 0.62}
|
113 |
-
{'loss': 0.0313, 'grad_norm': 0.32045114040374756, 'learning_rate': 1.7872483371777935e-05, 'epoch': 0.63}
|
114 |
-
{'loss': 0.0247, 'grad_norm': 0.17071415483951569, 'learning_rate': 1.785008845990191e-05, 'epoch': 0.64}
|
115 |
-
{'loss': 0.0285, 'grad_norm': 0.03091379813849926, 'learning_rate': 1.782769354802589e-05, 'epoch': 0.64}
|
116 |
-
{'loss': 0.0301, 'grad_norm': 0.007939248345792294, 'learning_rate': 1.7805298636149867e-05, 'epoch': 0.65}
|
117 |
-
{'loss': 0.0227, 'grad_norm': 0.4531534016132355, 'learning_rate': 1.7782903724273845e-05, 'epoch': 0.65}
|
118 |
-
{'loss': 0.0235, 'grad_norm': 0.06630811095237732, 'learning_rate': 1.7760508812397827e-05, 'epoch': 0.66}
|
119 |
-
{'loss': 0.0272, 'grad_norm': 0.40065327286720276, 'learning_rate': 1.77381139005218e-05, 'epoch': 0.66}
|
120 |
-
{'loss': 0.025, 'grad_norm': 2.7814598083496094, 'learning_rate': 1.771571898864578e-05, 'epoch': 0.67}
|
121 |
-
{'loss': 0.0276, 'grad_norm': 2.402163028717041, 'learning_rate': 1.769332407676976e-05, 'epoch': 0.67}
|
122 |
-
{'loss': 0.0289, 'grad_norm': 0.03239729255437851, 'learning_rate': 1.7670929164893737e-05, 'epoch': 0.68}
|
123 |
-
{'loss': 0.0232, 'grad_norm': 0.6327053904533386, 'learning_rate': 1.7648534253017715e-05, 'epoch': 0.68}
|
124 |
-
{'loss': 0.0258, 'grad_norm': 0.08029168099164963, 'learning_rate': 1.7626139341141693e-05, 'epoch': 0.69}
|
125 |
-
{'loss': 0.0254, 'grad_norm': 0.1738702803850174, 'learning_rate': 1.7603744429265672e-05, 'epoch': 0.7}
|
126 |
-
{'loss': 0.0205, 'grad_norm': 0.06688190996646881, 'learning_rate': 1.758134951738965e-05, 'epoch': 0.7}
|
127 |
-
{'loss': 0.0216, 'grad_norm': 0.21624340116977692, 'learning_rate': 1.755895460551363e-05, 'epoch': 0.71}
|
128 |
-
{'loss': 0.0304, 'grad_norm': 0.15297529101371765, 'learning_rate': 1.7536559693637607e-05, 'epoch': 0.71}
|
129 |
-
{'loss': 0.0234, 'grad_norm': 0.019339820370078087, 'learning_rate': 1.7514164781761585e-05, 'epoch': 0.72}
|
130 |
-
{'loss': 0.0233, 'grad_norm': 0.25130143761634827, 'learning_rate': 1.7491769869885563e-05, 'epoch': 0.72}
|
131 |
-
{'loss': 0.0239, 'grad_norm': 0.06551504135131836, 'learning_rate': 1.7469374958009542e-05, 'epoch': 0.73}
|
132 |
-
{'loss': 0.0166, 'grad_norm': 6.077746391296387, 'learning_rate': 1.744698004613352e-05, 'epoch': 0.73}
|
133 |
-
{'loss': 0.0211, 'grad_norm': 0.019542552530765533, 'learning_rate': 1.74245851342575e-05, 'epoch': 0.74}
|
134 |
-
{'loss': 0.0212, 'grad_norm': 0.01987328752875328, 'learning_rate': 1.7402190222381477e-05, 'epoch': 0.74}
|
135 |
-
{'loss': 0.0247, 'grad_norm': 0.9539387226104736, 'learning_rate': 1.7379795310505455e-05, 'epoch': 0.75}
|
136 |
-
{'loss': 0.023, 'grad_norm': 1.40300452709198, 'learning_rate': 1.7357400398629434e-05, 'epoch': 0.76}
|
137 |
-
{'loss': 0.0261, 'grad_norm': 0.1152738556265831, 'learning_rate': 1.7335005486753412e-05, 'epoch': 0.76}
|
138 |
-
{'loss': 0.0204, 'grad_norm': 0.1181008443236351, 'learning_rate': 1.731261057487739e-05, 'epoch': 0.77}
|
139 |
-
{'loss': 0.026, 'grad_norm': 0.36036548018455505, 'learning_rate': 1.729021566300137e-05, 'epoch': 0.77}
|
140 |
-
{'loss': 0.0299, 'grad_norm': 0.021303873509168625, 'learning_rate': 1.7267820751125347e-05, 'epoch': 0.78}
|
141 |
-
{'loss': 0.0183, 'grad_norm': 1.2206119298934937, 'learning_rate': 1.7245425839249322e-05, 'epoch': 0.78}
|
142 |
-
{'loss': 0.0228, 'grad_norm': 1.1102793216705322, 'learning_rate': 1.7223030927373304e-05, 'epoch': 0.79}
|
143 |
-
{'loss': 0.0181, 'grad_norm': 1.511096477508545, 'learning_rate': 1.7200636015497282e-05, 'epoch': 0.79}
|
144 |
-
{'loss': 0.0237, 'grad_norm': 0.43338674306869507, 'learning_rate': 1.7178241103621257e-05, 'epoch': 0.8}
|
145 |
-
{'loss': 0.0237, 'grad_norm': 0.20501121878623962, 'learning_rate': 1.715584619174524e-05, 'epoch': 0.8}
|
146 |
-
{'loss': 0.0158, 'grad_norm': 0.09320724755525589, 'learning_rate': 1.7133451279869217e-05, 'epoch': 0.81}
|
147 |
-
{'loss': 0.0222, 'grad_norm': 0.43307095766067505, 'learning_rate': 1.7111056367993192e-05, 'epoch': 0.82}
|
148 |
-
{'loss': 0.0196, 'grad_norm': 0.9130146503448486, 'learning_rate': 1.7088661456117174e-05, 'epoch': 0.82}
|
149 |
-
{'loss': 0.0242, 'grad_norm': 4.501875400543213, 'learning_rate': 1.706626654424115e-05, 'epoch': 0.83}
|
150 |
-
{'loss': 0.0218, 'grad_norm': 0.041434165090322495, 'learning_rate': 1.7043871632365127e-05, 'epoch': 0.83}
|
151 |
-
{'loss': 0.0201, 'grad_norm': 18.74702262878418, 'learning_rate': 1.702147672048911e-05, 'epoch': 0.84}
|
152 |
-
{'loss': 0.026, 'grad_norm': 0.03762982413172722, 'learning_rate': 1.6999081808613084e-05, 'epoch': 0.84}
|
153 |
-
{'loss': 0.0232, 'grad_norm': 0.12932895123958588, 'learning_rate': 1.6976686896737062e-05, 'epoch': 0.85}
|
154 |
-
{'loss': 0.0254, 'grad_norm': 0.8021348118782043, 'learning_rate': 1.695429198486104e-05, 'epoch': 0.85}
|
155 |
-
{'loss': 0.0218, 'grad_norm': 0.1913072168827057, 'learning_rate': 1.693189707298502e-05, 'epoch': 0.86}
|
156 |
-
{'loss': 0.0219, 'grad_norm': 0.00855530146509409, 'learning_rate': 1.6909502161108997e-05, 'epoch': 0.87}
|
157 |
-
{'loss': 0.0255, 'grad_norm': 0.9369354844093323, 'learning_rate': 1.6887107249232976e-05, 'epoch': 0.87}
|
158 |
-
{'loss': 0.0201, 'grad_norm': 0.0015777107328176498, 'learning_rate': 1.6864712337356954e-05, 'epoch': 0.88}
|
159 |
-
{'loss': 0.0301, 'grad_norm': 3.1490933895111084, 'learning_rate': 1.6842317425480932e-05, 'epoch': 0.88}
|
160 |
-
{'loss': 0.0275, 'grad_norm': 0.02967211790382862, 'learning_rate': 1.681992251360491e-05, 'epoch': 0.89}
|
161 |
-
{'loss': 0.018, 'grad_norm': 0.013429056853055954, 'learning_rate': 1.679752760172889e-05, 'epoch': 0.89}
|
162 |
-
{'loss': 0.028, 'grad_norm': 0.269550621509552, 'learning_rate': 1.6775132689852867e-05, 'epoch': 0.9}
|
163 |
-
{'loss': 0.0223, 'grad_norm': 0.30404672026634216, 'learning_rate': 1.6752737777976846e-05, 'epoch': 0.9}
|
164 |
-
{'loss': 0.0201, 'grad_norm': 0.013556144200265408, 'learning_rate': 1.6730342866100824e-05, 'epoch': 0.91}
|
165 |
-
{'loss': 0.0299, 'grad_norm': 0.046162448823451996, 'learning_rate': 1.6707947954224802e-05, 'epoch': 0.91}
|
166 |
-
{'loss': 0.0251, 'grad_norm': 0.020988399162888527, 'learning_rate': 1.6685553042348777e-05, 'epoch': 0.92}
|
167 |
-
{'loss': 0.0203, 'grad_norm': 0.07533540576696396, 'learning_rate': 1.666315813047276e-05, 'epoch': 0.93}
|
168 |
-
{'loss': 0.0209, 'grad_norm': 0.7397226691246033, 'learning_rate': 1.6640763218596737e-05, 'epoch': 0.93}
|
169 |
-
{'loss': 0.0236, 'grad_norm': 0.01838051900267601, 'learning_rate': 1.6618368306720712e-05, 'epoch': 0.94}
|
170 |
-
{'loss': 0.0191, 'grad_norm': 0.03700494021177292, 'learning_rate': 1.6595973394844694e-05, 'epoch': 0.94}
|
171 |
-
{'loss': 0.0168, 'grad_norm': 0.07713836431503296, 'learning_rate': 1.657357848296867e-05, 'epoch': 0.95}
|
172 |
-
{'loss': 0.017, 'grad_norm': 3.437300682067871, 'learning_rate': 1.6551183571092647e-05, 'epoch': 0.95}
|
173 |
-
{'loss': 0.0201, 'grad_norm': 0.06772757321596146, 'learning_rate': 1.652878865921663e-05, 'epoch': 0.96}
|
174 |
-
{'loss': 0.0171, 'grad_norm': 0.8847533464431763, 'learning_rate': 1.6506393747340604e-05, 'epoch': 0.96}
|
175 |
-
{'loss': 0.0217, 'grad_norm': 0.0963427796959877, 'learning_rate': 1.6483998835464583e-05, 'epoch': 0.97}
|
176 |
-
{'loss': 0.0208, 'grad_norm': 0.45162704586982727, 'learning_rate': 1.6461603923588564e-05, 'epoch': 0.97}
|
177 |
-
{'loss': 0.0157, 'grad_norm': 0.1655539721250534, 'learning_rate': 1.643920901171254e-05, 'epoch': 0.98}
|
178 |
-
{'loss': 0.0218, 'grad_norm': 0.010928811505436897, 'learning_rate': 1.6416814099836518e-05, 'epoch': 0.99}
|
179 |
-
{'loss': 0.021, 'grad_norm': 6.908138751983643, 'learning_rate': 1.6394419187960496e-05, 'epoch': 0.99}
|
180 |
-
{'loss': 0.0159, 'grad_norm': 2.0493404865264893, 'learning_rate': 1.6372024276084474e-05, 'epoch': 1.0}
|
181 |
-
{'loss': 0.0189, 'grad_norm': 0.10856210440397263, 'learning_rate': 1.6349629364208453e-05, 'epoch': 1.0}
|
182 |
-
{'loss': 0.0182, 'grad_norm': 2.7368407249450684, 'learning_rate': 1.632723445233243e-05, 'epoch': 1.01}
|
183 |
-
{'loss': 0.0206, 'grad_norm': 0.09092140942811966, 'learning_rate': 1.630483954045641e-05, 'epoch': 1.01}
|
184 |
-
{'loss': 0.0179, 'grad_norm': 0.01954420655965805, 'learning_rate': 1.6282444628580388e-05, 'epoch': 1.02}
|
185 |
-
{'loss': 0.0168, 'grad_norm': 0.043369174003601074, 'learning_rate': 1.6260049716704366e-05, 'epoch': 1.02}
|
186 |
-
{'loss': 0.019, 'grad_norm': 0.01492550503462553, 'learning_rate': 1.6237654804828344e-05, 'epoch': 1.03}
|
187 |
-
{'loss': 0.0173, 'grad_norm': 0.0102744922041893, 'learning_rate': 1.6215259892952323e-05, 'epoch': 1.03}
|
188 |
-
{'loss': 0.0172, 'grad_norm': 0.7786262631416321, 'learning_rate': 1.61928649810763e-05, 'epoch': 1.04}
|
189 |
-
{'loss': 0.0187, 'grad_norm': 0.033732105046510696, 'learning_rate': 1.617047006920028e-05, 'epoch': 1.05}
|
190 |
-
{'loss': 0.0199, 'grad_norm': 0.01999427191913128, 'learning_rate': 1.6148075157324258e-05, 'epoch': 1.05}
|
191 |
-
{'loss': 0.0202, 'grad_norm': 0.0032311684917658567, 'learning_rate': 1.6125680245448236e-05, 'epoch': 1.06}
|
192 |
-
{'loss': 0.0198, 'grad_norm': 0.09005136042833328, 'learning_rate': 1.6103285333572214e-05, 'epoch': 1.06}
|
193 |
-
{'loss': 0.0157, 'grad_norm': 0.48562514781951904, 'learning_rate': 1.6080890421696193e-05, 'epoch': 1.07}
|
194 |
-
{'loss': 0.0178, 'grad_norm': 0.021473940461874008, 'learning_rate': 1.605849550982017e-05, 'epoch': 1.07}
|
195 |
-
{'loss': 0.0147, 'grad_norm': 7.715484142303467, 'learning_rate': 1.603610059794415e-05, 'epoch': 1.08}
|
196 |
-
{'loss': 0.0152, 'grad_norm': 0.1648186594247818, 'learning_rate': 1.6013705686068124e-05, 'epoch': 1.08}
|
197 |
-
{'loss': 0.0152, 'grad_norm': 0.0015509655931964517, 'learning_rate': 1.5991310774192106e-05, 'epoch': 1.09}
|
198 |
-
{'loss': 0.0126, 'grad_norm': 0.23705679178237915, 'learning_rate': 1.5968915862316085e-05, 'epoch': 1.1}
|
199 |
-
{'loss': 0.0115, 'grad_norm': 0.019256843253970146, 'learning_rate': 1.594652095044006e-05, 'epoch': 1.1}
|
200 |
-
{'loss': 0.0122, 'grad_norm': 0.5769898295402527, 'learning_rate': 1.592412603856404e-05, 'epoch': 1.11}
|
201 |
-
{'loss': 0.0097, 'grad_norm': 0.03456917405128479, 'learning_rate': 1.590173112668802e-05, 'epoch': 1.11}
|
202 |
-
{'loss': 0.0149, 'grad_norm': 0.671821117401123, 'learning_rate': 1.5879336214811995e-05, 'epoch': 1.12}
|
203 |
-
{'loss': 0.0151, 'grad_norm': 0.004286791197955608, 'learning_rate': 1.5856941302935976e-05, 'epoch': 1.12}
|
204 |
-
{'loss': 0.0134, 'grad_norm': 0.13815534114837646, 'learning_rate': 1.583454639105995e-05, 'epoch': 1.13}
|
205 |
-
{'loss': 0.0157, 'grad_norm': 0.042440492659807205, 'learning_rate': 1.581215147918393e-05, 'epoch': 1.13}
|
206 |
-
{'loss': 0.0141, 'grad_norm': 0.003109186887741089, 'learning_rate': 1.5789756567307908e-05, 'epoch': 1.14}
|
207 |
-
{'loss': 0.0139, 'grad_norm': 4.196854591369629, 'learning_rate': 1.5767361655431886e-05, 'epoch': 1.14}
|
208 |
-
{'loss': 0.0149, 'grad_norm': 0.36187997460365295, 'learning_rate': 1.5744966743555865e-05, 'epoch': 1.15}
|
209 |
-
{'loss': 0.0103, 'grad_norm': 0.08171387016773224, 'learning_rate': 1.5722571831679843e-05, 'epoch': 1.16}
|
210 |
-
{'loss': 0.0138, 'grad_norm': 0.18907544016838074, 'learning_rate': 1.570017691980382e-05, 'epoch': 1.16}
|
211 |
-
{'loss': 0.0116, 'grad_norm': 0.01975160650908947, 'learning_rate': 1.56777820079278e-05, 'epoch': 1.17}
|
212 |
-
{'loss': 0.0146, 'grad_norm': 0.16162721812725067, 'learning_rate': 1.5655387096051778e-05, 'epoch': 1.17}
|
213 |
-
{'loss': 0.0168, 'grad_norm': 14.283798217773438, 'learning_rate': 1.5632992184175756e-05, 'epoch': 1.18}
|
214 |
-
{'loss': 0.0166, 'grad_norm': 0.984356164932251, 'learning_rate': 1.5610597272299735e-05, 'epoch': 1.18}
|
215 |
-
{'loss': 0.0136, 'grad_norm': 0.4644564688205719, 'learning_rate': 1.5588202360423713e-05, 'epoch': 1.19}
|
216 |
-
{'loss': 0.0103, 'grad_norm': 0.1780320703983307, 'learning_rate': 1.556580744854769e-05, 'epoch': 1.19}
|
217 |
-
{'loss': 0.0128, 'grad_norm': 0.12859980762004852, 'learning_rate': 1.554341253667167e-05, 'epoch': 1.2}
|
218 |
-
{'loss': 0.0112, 'grad_norm': 0.45004957914352417, 'learning_rate': 1.5521017624795648e-05, 'epoch': 1.2}
|
219 |
-
{'loss': 0.0103, 'grad_norm': 0.6201251745223999, 'learning_rate': 1.5498622712919627e-05, 'epoch': 1.21}
|
220 |
-
{'loss': 0.0133, 'grad_norm': 1.4645249843597412, 'learning_rate': 1.5476227801043605e-05, 'epoch': 1.22}
|
221 |
-
{'loss': 0.0118, 'grad_norm': 1.33707594871521, 'learning_rate': 1.545383288916758e-05, 'epoch': 1.22}
|
222 |
-
{'loss': 0.009, 'grad_norm': 0.0075813643634319305, 'learning_rate': 1.543143797729156e-05, 'epoch': 1.23}
|
223 |
-
{'loss': 0.0151, 'grad_norm': 0.06382226198911667, 'learning_rate': 1.540904306541554e-05, 'epoch': 1.23}
|
224 |
-
{'loss': 0.0146, 'grad_norm': 0.06127556413412094, 'learning_rate': 1.5386648153539515e-05, 'epoch': 1.24}
|
225 |
-
{'loss': 0.0143, 'grad_norm': 0.05294317007064819, 'learning_rate': 1.5364253241663497e-05, 'epoch': 1.24}
|
226 |
-
{'loss': 0.01, 'grad_norm': 0.031760863959789276, 'learning_rate': 1.5341858329787475e-05, 'epoch': 1.25}
|
227 |
-
{'loss': 0.0147, 'grad_norm': 0.22437328100204468, 'learning_rate': 1.531946341791145e-05, 'epoch': 1.25}
|
228 |
-
{'loss': 0.011, 'grad_norm': 1.463218092918396, 'learning_rate': 1.5297068506035432e-05, 'epoch': 1.26}
|
229 |
-
{'loss': 0.0121, 'grad_norm': 0.025357956066727638, 'learning_rate': 1.5274673594159407e-05, 'epoch': 1.26}
|
230 |
-
{'loss': 0.0117, 'grad_norm': 0.004972013644874096, 'learning_rate': 1.5252278682283385e-05, 'epoch': 1.27}
|
231 |
-
{'loss': 0.0151, 'grad_norm': 0.14370237290859222, 'learning_rate': 1.5229883770407365e-05, 'epoch': 1.28}
|
232 |
-
{'loss': 0.0143, 'grad_norm': 1.6805877685546875, 'learning_rate': 1.5207488858531343e-05, 'epoch': 1.28}
|
233 |
-
{'loss': 0.0163, 'grad_norm': 0.11494574695825577, 'learning_rate': 1.518509394665532e-05, 'epoch': 1.29}
|
234 |
-
{'loss': 0.0135, 'grad_norm': 0.03341173008084297, 'learning_rate': 1.51626990347793e-05, 'epoch': 1.29}
|
235 |
-
{'loss': 0.0118, 'grad_norm': 0.026476634666323662, 'learning_rate': 1.5140304122903279e-05, 'epoch': 1.3}
|
236 |
-
{'loss': 0.0129, 'grad_norm': 0.020570116117596626, 'learning_rate': 1.5117909211027255e-05, 'epoch': 1.3}
|
237 |
-
{'loss': 0.0062, 'grad_norm': 0.36162054538726807, 'learning_rate': 1.5095514299151235e-05, 'epoch': 1.31}
|
238 |
-
{'loss': 0.0127, 'grad_norm': 0.013678218238055706, 'learning_rate': 1.5073119387275212e-05, 'epoch': 1.31}
|
239 |
-
{'loss': 0.014, 'grad_norm': 2.711167335510254, 'learning_rate': 1.505072447539919e-05, 'epoch': 1.32}
|
240 |
-
{'loss': 0.0131, 'grad_norm': 0.3496088981628418, 'learning_rate': 1.502832956352317e-05, 'epoch': 1.33}
|
241 |
-
{'loss': 0.0162, 'grad_norm': 0.018671611323952675, 'learning_rate': 1.5005934651647147e-05, 'epoch': 1.33}
|
242 |
-
{'loss': 0.0107, 'grad_norm': 0.001666502095758915, 'learning_rate': 1.4983539739771125e-05, 'epoch': 1.34}
|
243 |
-
{'loss': 0.0125, 'grad_norm': 4.59019136428833, 'learning_rate': 1.4961144827895104e-05, 'epoch': 1.34}
|
244 |
-
{'loss': 0.0136, 'grad_norm': 0.2412848323583603, 'learning_rate': 1.4938749916019082e-05, 'epoch': 1.35}
|
245 |
-
{'loss': 0.0112, 'grad_norm': 0.02962004393339157, 'learning_rate': 1.4916355004143059e-05, 'epoch': 1.35}
|
246 |
-
{'loss': 0.0126, 'grad_norm': 0.36205366253852844, 'learning_rate': 1.4893960092267039e-05, 'epoch': 1.36}
|
247 |
-
{'loss': 0.0079, 'grad_norm': 0.031118787825107574, 'learning_rate': 1.4871565180391017e-05, 'epoch': 1.36}
|
248 |
-
{'loss': 0.0104, 'grad_norm': 0.16951577365398407, 'learning_rate': 1.4849170268514994e-05, 'epoch': 1.37}
|
249 |
-
{'loss': 0.0137, 'grad_norm': 0.26067009568214417, 'learning_rate': 1.4826775356638974e-05, 'epoch': 1.37}
|
250 |
-
{'loss': 0.0075, 'grad_norm': 0.04934118688106537, 'learning_rate': 1.4804380444762952e-05, 'epoch': 1.38}
|
251 |
-
{'loss': 0.0108, 'grad_norm': 0.009564828127622604, 'learning_rate': 1.4781985532886929e-05, 'epoch': 1.39}
|
252 |
-
{'loss': 0.0087, 'grad_norm': 0.10579288750886917, 'learning_rate': 1.4759590621010909e-05, 'epoch': 1.39}
|
253 |
-
{'loss': 0.0138, 'grad_norm': 0.31085848808288574, 'learning_rate': 1.4737195709134885e-05, 'epoch': 1.4}
|
254 |
-
{'loss': 0.0056, 'grad_norm': 0.10199588537216187, 'learning_rate': 1.4714800797258864e-05, 'epoch': 1.4}
|
255 |
-
{'loss': 0.0067, 'grad_norm': 0.2795519232749939, 'learning_rate': 1.4692405885382844e-05, 'epoch': 1.41}
|
256 |
-
{'loss': 0.0103, 'grad_norm': 0.023119742050766945, 'learning_rate': 1.467001097350682e-05, 'epoch': 1.41}
|
257 |
-
{'loss': 0.0102, 'grad_norm': 0.08798133581876755, 'learning_rate': 1.4647616061630799e-05, 'epoch': 1.42}
|
258 |
-
{'loss': 0.0119, 'grad_norm': 0.0012799632968381047, 'learning_rate': 1.4625221149754776e-05, 'epoch': 1.42}
|
259 |
-
{'loss': 0.0094, 'grad_norm': 0.01972614973783493, 'learning_rate': 1.4602826237878756e-05, 'epoch': 1.43}
|
260 |
-
{'loss': 0.0075, 'grad_norm': 0.005090704187750816, 'learning_rate': 1.4580431326002732e-05, 'epoch': 1.43}
|
261 |
-
{'loss': 0.0146, 'grad_norm': 0.2999782860279083, 'learning_rate': 1.455803641412671e-05, 'epoch': 1.44}
|
262 |
-
{'loss': 0.0103, 'grad_norm': 0.00834854319691658, 'learning_rate': 1.453564150225069e-05, 'epoch': 1.45}
|
263 |
-
{'loss': 0.0123, 'grad_norm': 0.007013251073658466, 'learning_rate': 1.4513246590374667e-05, 'epoch': 1.45}
|
264 |
-
{'loss': 0.0107, 'grad_norm': 0.11994576454162598, 'learning_rate': 1.4490851678498646e-05, 'epoch': 1.46}
|
265 |
-
{'loss': 0.0071, 'grad_norm': 0.012767240405082703, 'learning_rate': 1.4468456766622626e-05, 'epoch': 1.46}
|
266 |
-
{'loss': 0.0087, 'grad_norm': 0.4222990870475769, 'learning_rate': 1.4446061854746602e-05, 'epoch': 1.47}
|
267 |
-
{'loss': 0.0072, 'grad_norm': 0.6226161122322083, 'learning_rate': 1.442366694287058e-05, 'epoch': 1.47}
|
268 |
-
{'loss': 0.0094, 'grad_norm': 0.23288856446743011, 'learning_rate': 1.4401272030994559e-05, 'epoch': 1.48}
|
269 |
-
{'loss': 0.0083, 'grad_norm': 0.026115866377949715, 'learning_rate': 1.4378877119118537e-05, 'epoch': 1.48}
|
270 |
-
{'loss': 0.0104, 'grad_norm': 0.0783042460680008, 'learning_rate': 1.4356482207242514e-05, 'epoch': 1.49}
|
271 |
-
{'loss': 0.0076, 'grad_norm': 0.5086662769317627, 'learning_rate': 1.4334087295366494e-05, 'epoch': 1.49}
|
272 |
-
{'loss': 0.006, 'grad_norm': 0.63682621717453, 'learning_rate': 1.4311692383490472e-05, 'epoch': 1.5}
|
273 |
-
{'loss': 0.0085, 'grad_norm': 0.05100702494382858, 'learning_rate': 1.4289297471614449e-05, 'epoch': 1.51}
|
274 |
-
{'loss': 0.0061, 'grad_norm': 0.741942286491394, 'learning_rate': 1.4266902559738429e-05, 'epoch': 1.51}
|
275 |
-
{'loss': 0.0106, 'grad_norm': 0.002769783604890108, 'learning_rate': 1.4244507647862407e-05, 'epoch': 1.52}
|
276 |
-
{'loss': 0.0088, 'grad_norm': 0.028197582811117172, 'learning_rate': 1.4222112735986384e-05, 'epoch': 1.52}
|
277 |
-
{'loss': 0.0111, 'grad_norm': 0.02364266850054264, 'learning_rate': 1.4199717824110364e-05, 'epoch': 1.53}
|
278 |
-
{'loss': 0.0094, 'grad_norm': 0.16233578324317932, 'learning_rate': 1.4177322912234341e-05, 'epoch': 1.53}
|
279 |
-
{'loss': 0.0079, 'grad_norm': 0.05021541193127632, 'learning_rate': 1.415492800035832e-05, 'epoch': 1.54}
|
280 |
-
{'loss': 0.0095, 'grad_norm': 0.00331246224232018, 'learning_rate': 1.41325330884823e-05, 'epoch': 1.54}
|
281 |
-
{'loss': 0.0098, 'grad_norm': 0.07592795044183731, 'learning_rate': 1.4110138176606276e-05, 'epoch': 1.55}
|
282 |
-
{'loss': 0.0139, 'grad_norm': 0.09074956923723221, 'learning_rate': 1.4087743264730254e-05, 'epoch': 1.56}
|
283 |
-
{'loss': 0.0085, 'grad_norm': 0.10638327151536942, 'learning_rate': 1.4065348352854233e-05, 'epoch': 1.56}
|
284 |
-
{'loss': 0.0094, 'grad_norm': 0.02220524474978447, 'learning_rate': 1.4042953440978211e-05, 'epoch': 1.57}
|
285 |
-
{'loss': 0.0088, 'grad_norm': 0.045808251947164536, 'learning_rate': 1.4020558529102188e-05, 'epoch': 1.57}
|
286 |
-
{'loss': 0.0092, 'grad_norm': 0.15339775383472443, 'learning_rate': 1.3998163617226168e-05, 'epoch': 1.58}
|
287 |
-
{'loss': 0.0071, 'grad_norm': 0.009142019785940647, 'learning_rate': 1.3975768705350146e-05, 'epoch': 1.58}
|
288 |
-
{'loss': 0.0101, 'grad_norm': 0.15891438722610474, 'learning_rate': 1.3953373793474123e-05, 'epoch': 1.59}
|
289 |
-
{'loss': 0.011, 'grad_norm': 0.010278990492224693, 'learning_rate': 1.3930978881598103e-05, 'epoch': 1.59}
|
290 |
-
{'loss': 0.0097, 'grad_norm': 0.18057747185230255, 'learning_rate': 1.3908583969722081e-05, 'epoch': 1.6}
|
291 |
-
{'loss': 0.0071, 'grad_norm': 0.7384783625602722, 'learning_rate': 1.3886189057846058e-05, 'epoch': 1.6}
|
292 |
-
{'loss': 0.0114, 'grad_norm': 0.0011770115233957767, 'learning_rate': 1.3863794145970038e-05, 'epoch': 1.61}
|
293 |
-
{'loss': 0.0087, 'grad_norm': 0.033389296382665634, 'learning_rate': 1.3841399234094014e-05, 'epoch': 1.62}
|
294 |
-
{'loss': 0.0075, 'grad_norm': 0.06636679172515869, 'learning_rate': 1.3819004322217993e-05, 'epoch': 1.62}
|
295 |
-
{'loss': 0.0039, 'grad_norm': 0.018852446228265762, 'learning_rate': 1.3796609410341973e-05, 'epoch': 1.63}
|
296 |
-
{'loss': 0.0091, 'grad_norm': 0.3978893458843231, 'learning_rate': 1.377421449846595e-05, 'epoch': 1.63}
|
297 |
-
{'loss': 0.0117, 'grad_norm': 0.044826582074165344, 'learning_rate': 1.3751819586589928e-05, 'epoch': 1.64}
|
298 |
-
{'loss': 0.01, 'grad_norm': 0.0314922071993351, 'learning_rate': 1.3729424674713908e-05, 'epoch': 1.64}
|
299 |
-
{'loss': 0.0099, 'grad_norm': 0.008845321834087372, 'learning_rate': 1.3707029762837885e-05, 'epoch': 1.65}
|
300 |
-
{'loss': 0.0069, 'grad_norm': 0.012995203956961632, 'learning_rate': 1.3684634850961863e-05, 'epoch': 1.65}
|
301 |
-
{'loss': 0.0084, 'grad_norm': 0.2505897283554077, 'learning_rate': 1.3662239939085841e-05, 'epoch': 1.66}
|
302 |
-
{'loss': 0.0118, 'grad_norm': 0.576478898525238, 'learning_rate': 1.363984502720982e-05, 'epoch': 1.66}
|
303 |
-
{'loss': 0.0078, 'grad_norm': 0.03707367181777954, 'learning_rate': 1.3617450115333796e-05, 'epoch': 1.67}
|
304 |
-
{'loss': 0.0067, 'grad_norm': 0.013740709982812405, 'learning_rate': 1.3595055203457776e-05, 'epoch': 1.68}
|
305 |
-
{'loss': 0.0133, 'grad_norm': 0.032045792788267136, 'learning_rate': 1.3572660291581755e-05, 'epoch': 1.68}
|
306 |
-
{'loss': 0.0079, 'grad_norm': 0.011074294336140156, 'learning_rate': 1.3550265379705731e-05, 'epoch': 1.69}
|
307 |
-
{'loss': 0.0092, 'grad_norm': 15.06441879272461, 'learning_rate': 1.3527870467829711e-05, 'epoch': 1.69}
|
308 |
-
{'loss': 0.0069, 'grad_norm': 0.02328888699412346, 'learning_rate': 1.3505475555953688e-05, 'epoch': 1.7}
|
309 |
-
{'loss': 0.008, 'grad_norm': 0.09563597291707993, 'learning_rate': 1.3483080644077666e-05, 'epoch': 1.7}
|
310 |
-
{'loss': 0.0124, 'grad_norm': 0.18408024311065674, 'learning_rate': 1.3460685732201646e-05, 'epoch': 1.71}
|
311 |
-
{'loss': 0.0112, 'grad_norm': 5.72014856338501, 'learning_rate': 1.3438290820325623e-05, 'epoch': 1.71}
|
312 |
-
{'loss': 0.0074, 'grad_norm': 0.38538190722465515, 'learning_rate': 1.3415895908449601e-05, 'epoch': 1.72}
|
313 |
-
{'loss': 0.0091, 'grad_norm': 0.000642662460450083, 'learning_rate': 1.3393500996573578e-05, 'epoch': 1.72}
|
314 |
-
{'loss': 0.0088, 'grad_norm': 0.038307420909404755, 'learning_rate': 1.3371106084697558e-05, 'epoch': 1.73}
|
315 |
-
{'loss': 0.0061, 'grad_norm': 0.043493278324604034, 'learning_rate': 1.3348711172821536e-05, 'epoch': 1.74}
|
316 |
-
{'loss': 0.0089, 'grad_norm': 0.17391881346702576, 'learning_rate': 1.3326316260945513e-05, 'epoch': 1.74}
|
317 |
-
{'loss': 0.0082, 'grad_norm': 0.056078795343637466, 'learning_rate': 1.3303921349069493e-05, 'epoch': 1.75}
|
318 |
-
{'loss': 0.0103, 'grad_norm': 13.198871612548828, 'learning_rate': 1.328152643719347e-05, 'epoch': 1.75}
|
319 |
-
{'loss': 0.0094, 'grad_norm': 0.026877380907535553, 'learning_rate': 1.3259131525317448e-05, 'epoch': 1.76}
|
320 |
-
{'loss': 0.0073, 'grad_norm': 0.08020322024822235, 'learning_rate': 1.3236736613441428e-05, 'epoch': 1.76}
|
321 |
-
{'loss': 0.0116, 'grad_norm': 0.13889168202877045, 'learning_rate': 1.3214341701565405e-05, 'epoch': 1.77}
|
322 |
-
{'loss': 0.0112, 'grad_norm': 0.10164140164852142, 'learning_rate': 1.3191946789689383e-05, 'epoch': 1.77}
|
323 |
-
{'loss': 0.0057, 'grad_norm': 0.04719488322734833, 'learning_rate': 1.3169551877813362e-05, 'epoch': 1.78}
|
324 |
-
{'loss': 0.0075, 'grad_norm': 0.004808616824448109, 'learning_rate': 1.314715696593734e-05, 'epoch': 1.79}
|
325 |
-
{'loss': 0.0062, 'grad_norm': 4.645363807678223, 'learning_rate': 1.3124762054061317e-05, 'epoch': 1.79}
|
326 |
-
{'loss': 0.0046, 'grad_norm': 0.0703832358121872, 'learning_rate': 1.3102367142185297e-05, 'epoch': 1.8}
|
327 |
-
{'loss': 0.0091, 'grad_norm': 0.7329868078231812, 'learning_rate': 1.3079972230309275e-05, 'epoch': 1.8}
|
328 |
-
{'loss': 0.0066, 'grad_norm': 0.050435516983270645, 'learning_rate': 1.3057577318433252e-05, 'epoch': 1.81}
|
329 |
-
{'loss': 0.0051, 'grad_norm': 0.06814319640398026, 'learning_rate': 1.3035182406557232e-05, 'epoch': 1.81}
|
330 |
-
{'loss': 0.0066, 'grad_norm': 0.1799684762954712, 'learning_rate': 1.301278749468121e-05, 'epoch': 1.82}
|
331 |
-
{'loss': 0.0093, 'grad_norm': 0.26140815019607544, 'learning_rate': 1.2990392582805187e-05, 'epoch': 1.82}
|
332 |
-
{'loss': 0.0079, 'grad_norm': 0.015023380517959595, 'learning_rate': 1.2967997670929167e-05, 'epoch': 1.83}
|
333 |
-
{'loss': 0.0067, 'grad_norm': 0.018291285261511803, 'learning_rate': 1.2945602759053143e-05, 'epoch': 1.83}
|
334 |
-
{'loss': 0.007, 'grad_norm': 0.07480958849191666, 'learning_rate': 1.2923207847177122e-05, 'epoch': 1.84}
|
335 |
-
{'loss': 0.0133, 'grad_norm': 0.08360631763935089, 'learning_rate': 1.2900812935301102e-05, 'epoch': 1.85}
|
336 |
-
{'loss': 0.0071, 'grad_norm': 0.7749391198158264, 'learning_rate': 1.2878418023425078e-05, 'epoch': 1.85}
|
337 |
-
{'loss': 0.0091, 'grad_norm': 0.2316342443227768, 'learning_rate': 1.2856023111549057e-05, 'epoch': 1.86}
|
338 |
-
{'loss': 0.0067, 'grad_norm': 1.2949588298797607, 'learning_rate': 1.2833628199673037e-05, 'epoch': 1.86}
|
339 |
-
{'loss': 0.0091, 'grad_norm': 0.02135908231139183, 'learning_rate': 1.2811233287797014e-05, 'epoch': 1.87}
|
340 |
-
{'loss': 0.0103, 'grad_norm': 3.2552008628845215, 'learning_rate': 1.2788838375920992e-05, 'epoch': 1.87}
|
341 |
-
{'loss': 0.0058, 'grad_norm': 0.002404365921393037, 'learning_rate': 1.276644346404497e-05, 'epoch': 1.88}
|
342 |
-
{'loss': 0.0116, 'grad_norm': 0.021590234711766243, 'learning_rate': 1.2744048552168949e-05, 'epoch': 1.88}
|
343 |
-
{'loss': 0.0089, 'grad_norm': 0.0267606470733881, 'learning_rate': 1.2721653640292925e-05, 'epoch': 1.89}
|
344 |
-
{'loss': 0.0137, 'grad_norm': 0.08189389854669571, 'learning_rate': 1.2699258728416905e-05, 'epoch': 1.89}
|
345 |
-
{'loss': 0.0065, 'grad_norm': 0.009326275438070297, 'learning_rate': 1.2676863816540884e-05, 'epoch': 1.9}
|
346 |
-
{'loss': 0.0098, 'grad_norm': 0.0413086861371994, 'learning_rate': 1.265446890466486e-05, 'epoch': 1.91}
|
347 |
-
{'loss': 0.0083, 'grad_norm': 0.04068596288561821, 'learning_rate': 1.263207399278884e-05, 'epoch': 1.91}
|
348 |
-
{'loss': 0.0115, 'grad_norm': 0.011158055625855923, 'learning_rate': 1.2609679080912817e-05, 'epoch': 1.92}
|
349 |
-
{'loss': 0.0083, 'grad_norm': 3.782308578491211, 'learning_rate': 1.2587284169036795e-05, 'epoch': 1.92}
|
350 |
-
{'loss': 0.0084, 'grad_norm': 0.10942483693361282, 'learning_rate': 1.2564889257160775e-05, 'epoch': 1.93}
|
351 |
-
{'loss': 0.0091, 'grad_norm': 0.7013315558433533, 'learning_rate': 1.2542494345284752e-05, 'epoch': 1.93}
|
352 |
-
{'loss': 0.0092, 'grad_norm': 0.018937768414616585, 'learning_rate': 1.252009943340873e-05, 'epoch': 1.94}
|
353 |
-
{'loss': 0.0054, 'grad_norm': 1.5005667209625244, 'learning_rate': 1.249770452153271e-05, 'epoch': 1.94}
|
354 |
-
{'loss': 0.0049, 'grad_norm': 0.23089168965816498, 'learning_rate': 1.2475309609656687e-05, 'epoch': 1.95}
|
355 |
-
{'loss': 0.0072, 'grad_norm': 0.008295576088130474, 'learning_rate': 1.2452914697780665e-05, 'epoch': 1.95}
|
356 |
-
{'loss': 0.0052, 'grad_norm': 0.010741750709712505, 'learning_rate': 1.2430519785904644e-05, 'epoch': 1.96}
|
357 |
-
{'loss': 0.0063, 'grad_norm': 0.22365953028202057, 'learning_rate': 1.2408124874028622e-05, 'epoch': 1.97}
|
358 |
-
{'loss': 0.0107, 'grad_norm': 0.034852419048547745, 'learning_rate': 1.2385729962152599e-05, 'epoch': 1.97}
|
359 |
-
{'loss': 0.0061, 'grad_norm': 0.06765995174646378, 'learning_rate': 1.2363335050276579e-05, 'epoch': 1.98}
|
360 |
-
{'loss': 0.0059, 'grad_norm': 0.016805749386548996, 'learning_rate': 1.2340940138400557e-05, 'epoch': 1.98}
|
361 |
-
{'loss': 0.0067, 'grad_norm': 0.5831074118614197, 'learning_rate': 1.2318545226524534e-05, 'epoch': 1.99}
|
362 |
-
{'loss': 0.0078, 'grad_norm': 0.030119124799966812, 'learning_rate': 1.2296150314648514e-05, 'epoch': 1.99}
|
363 |
-
{'loss': 0.007, 'grad_norm': 0.20938828587532043, 'learning_rate': 1.2273755402772492e-05, 'epoch': 2.0}
|
364 |
-
{'loss': 0.0065, 'grad_norm': 0.009562190622091293, 'learning_rate': 1.2251360490896469e-05, 'epoch': 2.0}
|
365 |
-
{'loss': 0.0073, 'grad_norm': 0.094178207218647, 'learning_rate': 1.2228965579020447e-05, 'epoch': 2.01}
|
366 |
-
{'loss': 0.01, 'grad_norm': 0.009488407522439957, 'learning_rate': 1.2206570667144426e-05, 'epoch': 2.02}
|
367 |
-
{'loss': 0.0072, 'grad_norm': 0.012404072098433971, 'learning_rate': 1.2184175755268404e-05, 'epoch': 2.02}
|
368 |
-
{'loss': 0.0055, 'grad_norm': 0.03794926032423973, 'learning_rate': 1.216178084339238e-05, 'epoch': 2.03}
|
369 |
-
{'loss': 0.0087, 'grad_norm': 0.11889325082302094, 'learning_rate': 1.213938593151636e-05, 'epoch': 2.03}
|
370 |
-
{'loss': 0.0077, 'grad_norm': 0.17840054631233215, 'learning_rate': 1.2116991019640339e-05, 'epoch': 2.04}
|
371 |
-
{'loss': 0.0067, 'grad_norm': 0.007003217935562134, 'learning_rate': 1.2094596107764316e-05, 'epoch': 2.04}
|
372 |
-
{'loss': 0.008, 'grad_norm': 0.015604425221681595, 'learning_rate': 1.2072201195888296e-05, 'epoch': 2.05}
|
373 |
-
{'loss': 0.0074, 'grad_norm': 0.027836063876748085, 'learning_rate': 1.2049806284012272e-05, 'epoch': 2.05}
|
374 |
-
{'loss': 0.0072, 'grad_norm': 11.219870567321777, 'learning_rate': 1.202741137213625e-05, 'epoch': 2.06}
|
375 |
-
{'loss': 0.0045, 'grad_norm': 0.5155676603317261, 'learning_rate': 1.200501646026023e-05, 'epoch': 2.06}
|
376 |
-
{'loss': 0.0082, 'grad_norm': 0.0011188465869054198, 'learning_rate': 1.1982621548384207e-05, 'epoch': 2.07}
|
377 |
-
{'loss': 0.0042, 'grad_norm': 1.8003283739089966, 'learning_rate': 1.1960226636508186e-05, 'epoch': 2.08}
|
378 |
-
{'loss': 0.0076, 'grad_norm': 3.2485342025756836, 'learning_rate': 1.1937831724632166e-05, 'epoch': 2.08}
|
379 |
-
{'loss': 0.0058, 'grad_norm': 0.016679977998137474, 'learning_rate': 1.1915436812756143e-05, 'epoch': 2.09}
|
380 |
-
{'loss': 0.005, 'grad_norm': 0.0760900229215622, 'learning_rate': 1.1893041900880121e-05, 'epoch': 2.09}
|
381 |
-
{'loss': 0.0047, 'grad_norm': 0.1045154482126236, 'learning_rate': 1.18706469890041e-05, 'epoch': 2.1}
|
382 |
-
{'loss': 0.0045, 'grad_norm': 0.13673138618469238, 'learning_rate': 1.1848252077128078e-05, 'epoch': 2.1}
|
383 |
-
{'loss': 0.0043, 'grad_norm': 0.003820559475570917, 'learning_rate': 1.1825857165252054e-05, 'epoch': 2.11}
|
384 |
-
{'loss': 0.0049, 'grad_norm': 0.01683652587234974, 'learning_rate': 1.1803462253376034e-05, 'epoch': 2.11}
|
385 |
-
{'loss': 0.0058, 'grad_norm': 0.01756940223276615, 'learning_rate': 1.1781067341500013e-05, 'epoch': 2.12}
|
386 |
-
{'loss': 0.0081, 'grad_norm': 0.2571689188480377, 'learning_rate': 1.175867242962399e-05, 'epoch': 2.12}
|
387 |
-
{'loss': 0.0057, 'grad_norm': 0.00470293452963233, 'learning_rate': 1.173627751774797e-05, 'epoch': 2.13}
|
388 |
-
{'loss': 0.0047, 'grad_norm': 0.15966971218585968, 'learning_rate': 1.1713882605871946e-05, 'epoch': 2.14}
|
389 |
-
{'loss': 0.0073, 'grad_norm': 10.850295066833496, 'learning_rate': 1.1691487693995924e-05, 'epoch': 2.14}
|
390 |
-
{'loss': 0.0056, 'grad_norm': 0.02545018680393696, 'learning_rate': 1.1669092782119904e-05, 'epoch': 2.15}
|
391 |
-
{'loss': 0.006, 'grad_norm': 0.4927498400211334, 'learning_rate': 1.1646697870243881e-05, 'epoch': 2.15}
|
392 |
-
{'loss': 0.0061, 'grad_norm': 0.044341787695884705, 'learning_rate': 1.162430295836786e-05, 'epoch': 2.16}
|
393 |
-
{'loss': 0.0042, 'grad_norm': 0.005484799854457378, 'learning_rate': 1.160190804649184e-05, 'epoch': 2.16}
|
394 |
-
{'loss': 0.0057, 'grad_norm': 0.011644992977380753, 'learning_rate': 1.1579513134615816e-05, 'epoch': 2.17}
|
395 |
-
{'loss': 0.0055, 'grad_norm': 0.478604257106781, 'learning_rate': 1.1557118222739794e-05, 'epoch': 2.17}
|
396 |
-
{'loss': 0.0053, 'grad_norm': 0.013355757109820843, 'learning_rate': 1.1534723310863773e-05, 'epoch': 2.18}
|
397 |
-
{'loss': 0.0085, 'grad_norm': 0.021052315831184387, 'learning_rate': 1.1512328398987751e-05, 'epoch': 2.18}
|
398 |
-
{'loss': 0.005, 'grad_norm': 1.1859304904937744, 'learning_rate': 1.1489933487111728e-05, 'epoch': 2.19}
|
399 |
-
{'loss': 0.0055, 'grad_norm': 0.007172802928835154, 'learning_rate': 1.1467538575235708e-05, 'epoch': 2.2}
|
400 |
-
{'loss': 0.0032, 'grad_norm': 0.0038706334307789803, 'learning_rate': 1.1445143663359686e-05, 'epoch': 2.2}
|
401 |
-
{'loss': 0.0054, 'grad_norm': 1.872606635093689, 'learning_rate': 1.1422748751483663e-05, 'epoch': 2.21}
|
402 |
-
{'loss': 0.0037, 'grad_norm': 1.8617807626724243, 'learning_rate': 1.1400353839607643e-05, 'epoch': 2.21}
|
403 |
-
{'loss': 0.0046, 'grad_norm': 0.05619359761476517, 'learning_rate': 1.1377958927731621e-05, 'epoch': 2.22}
|
404 |
-
{'loss': 0.0029, 'grad_norm': 0.15391820669174194, 'learning_rate': 1.1355564015855598e-05, 'epoch': 2.22}
|
405 |
-
{'loss': 0.0043, 'grad_norm': 0.010528339073061943, 'learning_rate': 1.1333169103979578e-05, 'epoch': 2.23}
|
406 |
-
{'loss': 0.0063, 'grad_norm': 0.4744260907173157, 'learning_rate': 1.1310774192103555e-05, 'epoch': 2.23}
|
407 |
-
{'loss': 0.0064, 'grad_norm': 0.00845412164926529, 'learning_rate': 1.1288379280227533e-05, 'epoch': 2.24}
|
408 |
-
{'loss': 0.0046, 'grad_norm': 0.06398043781518936, 'learning_rate': 1.1265984368351513e-05, 'epoch': 2.25}
|
409 |
-
{'loss': 0.0061, 'grad_norm': 0.3351975381374359, 'learning_rate': 1.124358945647549e-05, 'epoch': 2.25}
|
410 |
-
{'loss': 0.0034, 'grad_norm': 0.025763623416423798, 'learning_rate': 1.1221194544599468e-05, 'epoch': 2.26}
|
411 |
-
{'loss': 0.0046, 'grad_norm': 0.0274388175457716, 'learning_rate': 1.1198799632723446e-05, 'epoch': 2.26}
|
412 |
-
{'loss': 0.0059, 'grad_norm': 0.033901914954185486, 'learning_rate': 1.1176404720847425e-05, 'epoch': 2.27}
|
413 |
-
{'loss': 0.0044, 'grad_norm': 0.03207828849554062, 'learning_rate': 1.1154009808971401e-05, 'epoch': 2.27}
|
414 |
-
{'loss': 0.0054, 'grad_norm': 0.13523073494434357, 'learning_rate': 1.1131614897095381e-05, 'epoch': 2.28}
|
415 |
-
{'loss': 0.0049, 'grad_norm': 0.05645907297730446, 'learning_rate': 1.110921998521936e-05, 'epoch': 2.28}
|
416 |
-
{'loss': 0.0096, 'grad_norm': 0.726065456867218, 'learning_rate': 1.1086825073343336e-05, 'epoch': 2.29}
|
417 |
-
{'loss': 0.0045, 'grad_norm': 0.026955202221870422, 'learning_rate': 1.1064430161467316e-05, 'epoch': 2.29}
|
418 |
-
{'loss': 0.0057, 'grad_norm': 0.09468597918748856, 'learning_rate': 1.1042035249591295e-05, 'epoch': 2.3}
|
419 |
-
{'loss': 0.0032, 'grad_norm': 0.4908299744129181, 'learning_rate': 1.1019640337715271e-05, 'epoch': 2.31}
|
420 |
-
{'loss': 0.0031, 'grad_norm': 0.010838231071829796, 'learning_rate': 1.099724542583925e-05, 'epoch': 2.31}
|
421 |
-
{'loss': 0.0043, 'grad_norm': 0.15813188254833221, 'learning_rate': 1.0974850513963228e-05, 'epoch': 2.32}
|
422 |
-
{'loss': 0.0068, 'grad_norm': 0.04824952781200409, 'learning_rate': 1.0952455602087207e-05, 'epoch': 2.32}
|
423 |
-
{'loss': 0.0048, 'grad_norm': 0.12718328833580017, 'learning_rate': 1.0930060690211183e-05, 'epoch': 2.33}
|
424 |
-
{'loss': 0.0042, 'grad_norm': 0.006453562993556261, 'learning_rate': 1.0907665778335163e-05, 'epoch': 2.33}
|
425 |
-
{'loss': 0.0068, 'grad_norm': 0.034881096333265305, 'learning_rate': 1.0885270866459142e-05, 'epoch': 2.34}
|
426 |
-
{'loss': 0.0041, 'grad_norm': 0.026440760120749474, 'learning_rate': 1.0862875954583118e-05, 'epoch': 2.34}
|
427 |
-
{'loss': 0.0042, 'grad_norm': 0.10058464854955673, 'learning_rate': 1.0840481042707098e-05, 'epoch': 2.35}
|
428 |
-
{'loss': 0.0051, 'grad_norm': 0.3769572377204895, 'learning_rate': 1.0818086130831077e-05, 'epoch': 2.35}
|
429 |
-
{'loss': 0.0049, 'grad_norm': 0.10529354214668274, 'learning_rate': 1.0795691218955053e-05, 'epoch': 2.36}
|
430 |
-
{'loss': 0.0019, 'grad_norm': 0.2557080388069153, 'learning_rate': 1.0773296307079033e-05, 'epoch': 2.37}
|
431 |
-
{'loss': 0.0039, 'grad_norm': 0.19080431759357452, 'learning_rate': 1.075090139520301e-05, 'epoch': 2.37}
|
432 |
-
{'loss': 0.0068, 'grad_norm': 0.009274226613342762, 'learning_rate': 1.0728506483326988e-05, 'epoch': 2.38}
|
433 |
-
{'loss': 0.0033, 'grad_norm': 0.15061549842357635, 'learning_rate': 1.0706111571450968e-05, 'epoch': 2.38}
|
434 |
-
{'loss': 0.0048, 'grad_norm': 0.32283729314804077, 'learning_rate': 1.0683716659574945e-05, 'epoch': 2.39}
|
435 |
-
{'loss': 0.0052, 'grad_norm': 0.008730829693377018, 'learning_rate': 1.0661321747698923e-05, 'epoch': 2.39}
|
436 |
-
{'loss': 0.0063, 'grad_norm': 0.09698653966188431, 'learning_rate': 1.0638926835822902e-05, 'epoch': 2.4}
|
437 |
-
{'loss': 0.003, 'grad_norm': 0.09314418584108353, 'learning_rate': 1.061653192394688e-05, 'epoch': 2.4}
|
438 |
-
{'loss': 0.0036, 'grad_norm': 0.0379941388964653, 'learning_rate': 1.0594137012070857e-05, 'epoch': 2.41}
|
439 |
-
{'loss': 0.004, 'grad_norm': 0.03921454772353172, 'learning_rate': 1.0571742100194837e-05, 'epoch': 2.41}
|
440 |
-
{'loss': 0.006, 'grad_norm': 0.0010623994749039412, 'learning_rate': 1.0549347188318815e-05, 'epoch': 2.42}
|
441 |
-
{'loss': 0.0048, 'grad_norm': 0.005394686013460159, 'learning_rate': 1.0526952276442792e-05, 'epoch': 2.43}
|
442 |
-
{'loss': 0.0037, 'grad_norm': 0.03824278712272644, 'learning_rate': 1.0504557364566772e-05, 'epoch': 2.43}
|
443 |
-
{'loss': 0.0034, 'grad_norm': 0.09540271013975143, 'learning_rate': 1.048216245269075e-05, 'epoch': 2.44}
|
444 |
-
{'loss': 0.0049, 'grad_norm': 0.014622088521718979, 'learning_rate': 1.0459767540814727e-05, 'epoch': 2.44}
|
445 |
-
{'loss': 0.0036, 'grad_norm': 0.02506762184202671, 'learning_rate': 1.0437372628938707e-05, 'epoch': 2.45}
|
446 |
-
{'loss': 0.0046, 'grad_norm': 12.032191276550293, 'learning_rate': 1.0414977717062684e-05, 'epoch': 2.45}
|
447 |
-
{'loss': 0.0039, 'grad_norm': 2.5377721786499023, 'learning_rate': 1.0392582805186662e-05, 'epoch': 2.46}
|
448 |
-
{'loss': 0.0021, 'grad_norm': 0.07715722173452377, 'learning_rate': 1.0370187893310642e-05, 'epoch': 2.46}
|
449 |
-
{'loss': 0.0035, 'grad_norm': 0.41538187861442566, 'learning_rate': 1.0347792981434619e-05, 'epoch': 2.47}
|
450 |
-
{'loss': 0.0034, 'grad_norm': 0.5246536135673523, 'learning_rate': 1.0325398069558597e-05, 'epoch': 2.48}
|
451 |
-
{'loss': 0.003, 'grad_norm': 0.0022522832732647657, 'learning_rate': 1.0303003157682577e-05, 'epoch': 2.48}
|
452 |
-
{'loss': 0.0032, 'grad_norm': 0.011422159150242805, 'learning_rate': 1.0280608245806554e-05, 'epoch': 2.49}
|
453 |
-
{'loss': 0.005, 'grad_norm': 0.011885729618370533, 'learning_rate': 1.025821333393053e-05, 'epoch': 2.49}
|
454 |
-
{'loss': 0.0025, 'grad_norm': 0.06374814361333847, 'learning_rate': 1.023581842205451e-05, 'epoch': 2.5}
|
455 |
-
{'loss': 0.0036, 'grad_norm': 0.0674147829413414, 'learning_rate': 1.0213423510178489e-05, 'epoch': 2.5}
|
456 |
-
{'loss': 0.0021, 'grad_norm': 0.5586390495300293, 'learning_rate': 1.0191028598302465e-05, 'epoch': 2.51}
|
457 |
-
{'loss': 0.0025, 'grad_norm': 1.1011099815368652, 'learning_rate': 1.0168633686426445e-05, 'epoch': 2.51}
|
458 |
-
{'loss': 0.0036, 'grad_norm': 0.12449350208044052, 'learning_rate': 1.0146238774550424e-05, 'epoch': 2.52}
|
459 |
-
{'loss': 0.0033, 'grad_norm': 0.001172301941551268, 'learning_rate': 1.01238438626744e-05, 'epoch': 2.52}
|
460 |
-
{'loss': 0.0049, 'grad_norm': 0.029910240322351456, 'learning_rate': 1.010144895079838e-05, 'epoch': 2.53}
|
461 |
-
{'loss': 0.0044, 'grad_norm': 0.08513263612985611, 'learning_rate': 1.0079054038922357e-05, 'epoch': 2.54}
|
462 |
-
{'loss': 0.0029, 'grad_norm': 0.024861743673682213, 'learning_rate': 1.0056659127046336e-05, 'epoch': 2.54}
|
463 |
-
{'loss': 0.0028, 'grad_norm': 0.08090971410274506, 'learning_rate': 1.0034264215170316e-05, 'epoch': 2.55}
|
464 |
-
{'loss': 0.0091, 'grad_norm': 0.6751871109008789, 'learning_rate': 1.0011869303294292e-05, 'epoch': 2.55}
|
465 |
-
{'loss': 0.004, 'grad_norm': 0.01412627287209034, 'learning_rate': 9.98947439141827e-06, 'epoch': 2.56}
|
466 |
-
{'loss': 0.0036, 'grad_norm': 0.06726730614900589, 'learning_rate': 9.967079479542249e-06, 'epoch': 2.56}
|
467 |
-
{'loss': 0.0029, 'grad_norm': 0.5515012145042419, 'learning_rate': 9.944684567666227e-06, 'epoch': 2.57}
|
468 |
-
{'loss': 0.0035, 'grad_norm': 0.0035773934796452522, 'learning_rate': 9.922289655790206e-06, 'epoch': 2.57}
|
469 |
-
{'loss': 0.0038, 'grad_norm': 0.05018525943160057, 'learning_rate': 9.899894743914184e-06, 'epoch': 2.58}
|
470 |
-
{'loss': 0.0028, 'grad_norm': 0.007242262363433838, 'learning_rate': 9.877499832038162e-06, 'epoch': 2.58}
|
471 |
-
{'loss': 0.0041, 'grad_norm': 0.09467479586601257, 'learning_rate': 9.855104920162139e-06, 'epoch': 2.59}
|
472 |
-
{'loss': 0.0037, 'grad_norm': 0.05528566986322403, 'learning_rate': 9.832710008286119e-06, 'epoch': 2.6}
|
473 |
-
{'loss': 0.0031, 'grad_norm': 0.0195195060223341, 'learning_rate': 9.810315096410097e-06, 'epoch': 2.6}
|
474 |
-
{'loss': 0.0036, 'grad_norm': 0.020678259432315826, 'learning_rate': 9.787920184534074e-06, 'epoch': 2.61}
|
475 |
-
{'loss': 0.0052, 'grad_norm': 0.20698687434196472, 'learning_rate': 9.765525272658052e-06, 'epoch': 2.61}
|
476 |
-
{'loss': 0.0031, 'grad_norm': 0.06637762486934662, 'learning_rate': 9.74313036078203e-06, 'epoch': 2.62}
|
477 |
-
{'loss': 0.0023, 'grad_norm': 0.036431193351745605, 'learning_rate': 9.720735448906009e-06, 'epoch': 2.62}
|
478 |
-
{'loss': 0.0043, 'grad_norm': 0.07816935330629349, 'learning_rate': 9.698340537029987e-06, 'epoch': 2.63}
|
479 |
-
{'loss': 0.0027, 'grad_norm': 0.0019028312526643276, 'learning_rate': 9.675945625153966e-06, 'epoch': 2.63}
|
480 |
-
{'loss': 0.0048, 'grad_norm': 0.011531976982951164, 'learning_rate': 9.653550713277944e-06, 'epoch': 2.64}
|
481 |
-
{'loss': 0.0046, 'grad_norm': 0.07196860760450363, 'learning_rate': 9.631155801401923e-06, 'epoch': 2.64}
|
482 |
-
{'loss': 0.0038, 'grad_norm': 0.08175013959407806, 'learning_rate': 9.608760889525901e-06, 'epoch': 2.65}
|
483 |
-
{'loss': 0.0033, 'grad_norm': 0.001142855384387076, 'learning_rate': 9.58636597764988e-06, 'epoch': 2.66}
|
484 |
-
{'loss': 0.003, 'grad_norm': 0.06008300185203552, 'learning_rate': 9.563971065773858e-06, 'epoch': 2.66}
|
485 |
-
{'loss': 0.0057, 'grad_norm': 0.08218628168106079, 'learning_rate': 9.541576153897834e-06, 'epoch': 2.67}
|
486 |
-
{'loss': 0.0044, 'grad_norm': 0.059504490345716476, 'learning_rate': 9.519181242021813e-06, 'epoch': 2.67}
|
487 |
-
{'loss': 0.0058, 'grad_norm': 0.06249801069498062, 'learning_rate': 9.496786330145793e-06, 'epoch': 2.68}
|
488 |
-
{'loss': 0.003, 'grad_norm': 0.03842584043741226, 'learning_rate': 9.47439141826977e-06, 'epoch': 2.68}
|
489 |
-
{'loss': 0.0042, 'grad_norm': 0.05032949522137642, 'learning_rate': 9.451996506393748e-06, 'epoch': 2.69}
|
490 |
-
{'loss': 0.0045, 'grad_norm': 0.051786765456199646, 'learning_rate': 9.429601594517726e-06, 'epoch': 2.69}
|
491 |
-
{'loss': 0.0031, 'grad_norm': 0.11977092176675797, 'learning_rate': 9.407206682641704e-06, 'epoch': 2.7}
|
492 |
-
{'loss': 0.0021, 'grad_norm': 0.004711544141173363, 'learning_rate': 9.384811770765683e-06, 'epoch': 2.71}
|
493 |
-
{'loss': 0.0043, 'grad_norm': 0.4886787235736847, 'learning_rate': 9.362416858889661e-06, 'epoch': 2.71}
|
494 |
-
{'loss': 0.0058, 'grad_norm': 0.018584702163934708, 'learning_rate': 9.34002194701364e-06, 'epoch': 2.72}
|
495 |
-
{'loss': 0.0041, 'grad_norm': 0.03693871572613716, 'learning_rate': 9.317627035137618e-06, 'epoch': 2.72}
|
496 |
-
{'loss': 0.0038, 'grad_norm': 0.004750245716422796, 'learning_rate': 9.295232123261596e-06, 'epoch': 2.73}
|
497 |
-
{'loss': 0.0019, 'grad_norm': 1.913931131362915, 'learning_rate': 9.272837211385573e-06, 'epoch': 2.73}
|
498 |
-
{'loss': 0.0029, 'grad_norm': 0.017329825088381767, 'learning_rate': 9.250442299509553e-06, 'epoch': 2.74}
|
499 |
-
{'loss': 0.003, 'grad_norm': 0.02129119075834751, 'learning_rate': 9.228047387633531e-06, 'epoch': 2.74}
|
500 |
-
{'loss': 0.0038, 'grad_norm': 0.028891241177916527, 'learning_rate': 9.205652475757508e-06, 'epoch': 2.75}
|
501 |
-
{'loss': 0.004, 'grad_norm': 0.009224362671375275, 'learning_rate': 9.183257563881486e-06, 'epoch': 2.75}
|
502 |
-
{'loss': 0.0049, 'grad_norm': 0.03435930609703064, 'learning_rate': 9.160862652005466e-06, 'epoch': 2.76}
|
503 |
-
{'loss': 0.0039, 'grad_norm': 0.01626667007803917, 'learning_rate': 9.138467740129443e-06, 'epoch': 2.77}
|
504 |
-
{'loss': 0.005, 'grad_norm': 1.1218552589416504, 'learning_rate': 9.116072828253421e-06, 'epoch': 2.77}
|
505 |
-
{'loss': 0.0046, 'grad_norm': 0.030987482517957687, 'learning_rate': 9.0936779163774e-06, 'epoch': 2.78}
|
506 |
-
{'loss': 0.0025, 'grad_norm': 0.09153684228658676, 'learning_rate': 9.071283004501378e-06, 'epoch': 2.78}
|
507 |
-
{'loss': 0.0044, 'grad_norm': 0.0023125149309635162, 'learning_rate': 9.048888092625356e-06, 'epoch': 2.79}
|
508 |
-
{'loss': 0.0023, 'grad_norm': 0.004464196972548962, 'learning_rate': 9.026493180749335e-06, 'epoch': 2.79}
|
509 |
-
{'loss': 0.0038, 'grad_norm': 0.033567965030670166, 'learning_rate': 9.004098268873313e-06, 'epoch': 2.8}
|
510 |
-
{'loss': 0.0032, 'grad_norm': 0.05314967781305313, 'learning_rate': 8.981703356997291e-06, 'epoch': 2.8}
|
511 |
-
{'loss': 0.0021, 'grad_norm': 0.019064532592892647, 'learning_rate': 8.959308445121268e-06, 'epoch': 2.81}
|
512 |
-
{'loss': 0.0023, 'grad_norm': 0.006131445057690144, 'learning_rate': 8.936913533245248e-06, 'epoch': 2.81}
|
513 |
-
{'loss': 0.0042, 'grad_norm': 0.20496051013469696, 'learning_rate': 8.914518621369226e-06, 'epoch': 2.82}
|
514 |
-
{'loss': 0.0042, 'grad_norm': 0.03717898949980736, 'learning_rate': 8.892123709493203e-06, 'epoch': 2.83}
|
515 |
-
{'loss': 0.0053, 'grad_norm': 0.04788793995976448, 'learning_rate': 8.869728797617181e-06, 'epoch': 2.83}
|
516 |
-
{'loss': 0.0021, 'grad_norm': 4.119758605957031, 'learning_rate': 8.847333885741161e-06, 'epoch': 2.84}
|
517 |
-
{'loss': 0.0033, 'grad_norm': 0.24966038763523102, 'learning_rate': 8.824938973865138e-06, 'epoch': 2.84}
|
518 |
-
{'loss': 0.0047, 'grad_norm': 11.138167381286621, 'learning_rate': 8.802544061989116e-06, 'epoch': 2.85}
|
519 |
-
{'loss': 0.0048, 'grad_norm': 0.02488502860069275, 'learning_rate': 8.780149150113095e-06, 'epoch': 2.85}
|
520 |
-
{'loss': 0.0022, 'grad_norm': 0.0015538616571575403, 'learning_rate': 8.757754238237073e-06, 'epoch': 2.86}
|
521 |
-
{'loss': 0.0036, 'grad_norm': 0.011559401638805866, 'learning_rate': 8.735359326361051e-06, 'epoch': 2.86}
|
522 |
-
{'loss': 0.0034, 'grad_norm': 0.41917547583580017, 'learning_rate': 8.71296441448503e-06, 'epoch': 2.87}
|
523 |
-
{'loss': 0.0029, 'grad_norm': 0.09700381010770798, 'learning_rate': 8.690569502609008e-06, 'epoch': 2.87}
|
524 |
-
{'loss': 0.0038, 'grad_norm': 0.10457664728164673, 'learning_rate': 8.668174590732987e-06, 'epoch': 2.88}
|
525 |
-
{'loss': 0.0067, 'grad_norm': 0.009366615675389767, 'learning_rate': 8.645779678856965e-06, 'epoch': 2.89}
|
526 |
-
{'loss': 0.003, 'grad_norm': 0.0037414308171719313, 'learning_rate': 8.623384766980942e-06, 'epoch': 2.89}
|
527 |
-
{'loss': 0.0049, 'grad_norm': 0.09502315521240234, 'learning_rate': 8.600989855104922e-06, 'epoch': 2.9}
|
528 |
-
{'loss': 0.0027, 'grad_norm': 0.390895813703537, 'learning_rate': 8.5785949432289e-06, 'epoch': 2.9}
|
529 |
-
{'loss': 0.004, 'grad_norm': 0.06665816903114319, 'learning_rate': 8.556200031352877e-06, 'epoch': 2.91}
|
530 |
-
{'loss': 0.0042, 'grad_norm': 0.012638445943593979, 'learning_rate': 8.533805119476855e-06, 'epoch': 2.91}
|
531 |
-
{'loss': 0.0042, 'grad_norm': 0.26541146636009216, 'learning_rate': 8.511410207600835e-06, 'epoch': 2.92}
|
532 |
-
{'loss': 0.0038, 'grad_norm': 0.0727955549955368, 'learning_rate': 8.489015295724812e-06, 'epoch': 2.92}
|
533 |
-
{'loss': 0.0029, 'grad_norm': 0.10278739035129547, 'learning_rate': 8.46662038384879e-06, 'epoch': 2.93}
|
534 |
-
{'loss': 0.0039, 'grad_norm': 0.02014051005244255, 'learning_rate': 8.444225471972768e-06, 'epoch': 2.94}
|
535 |
-
{'loss': 0.0039, 'grad_norm': 0.03868388757109642, 'learning_rate': 8.421830560096747e-06, 'epoch': 2.94}
|
536 |
-
{'loss': 0.002, 'grad_norm': 0.007972314022481441, 'learning_rate': 8.399435648220725e-06, 'epoch': 2.95}
|
537 |
-
{'loss': 0.0022, 'grad_norm': 0.004831973928958178, 'learning_rate': 8.377040736344703e-06, 'epoch': 2.95}
|
538 |
-
{'loss': 0.002, 'grad_norm': 0.04136960953474045, 'learning_rate': 8.354645824468682e-06, 'epoch': 2.96}
|
539 |
-
{'loss': 0.003, 'grad_norm': 0.05827214568853378, 'learning_rate': 8.33225091259266e-06, 'epoch': 2.96}
|
540 |
-
{'loss': 0.0019, 'grad_norm': 0.42127054929733276, 'learning_rate': 8.309856000716637e-06, 'epoch': 2.97}
|
541 |
-
{'loss': 0.0044, 'grad_norm': 0.3449774384498596, 'learning_rate': 8.287461088840615e-06, 'epoch': 2.97}
|
542 |
-
{'loss': 0.0028, 'grad_norm': 0.07684598118066788, 'learning_rate': 8.265066176964595e-06, 'epoch': 2.98}
|
543 |
-
{'loss': 0.0031, 'grad_norm': 0.010578151792287827, 'learning_rate': 8.242671265088572e-06, 'epoch': 2.98}
|
544 |
-
{'loss': 0.0025, 'grad_norm': 0.14775557816028595, 'learning_rate': 8.22027635321255e-06, 'epoch': 2.99}
|
545 |
-
{'loss': 0.0021, 'grad_norm': 0.16460050642490387, 'learning_rate': 8.197881441336529e-06, 'epoch': 3.0}
|
546 |
-
{'loss': 0.0025, 'grad_norm': 0.0014739581383764744, 'learning_rate': 8.175486529460507e-06, 'epoch': 3.0}
|
547 |
-
{'loss': 0.0038, 'grad_norm': 0.01254010945558548, 'learning_rate': 8.153091617584485e-06, 'epoch': 3.01}
|
548 |
-
{'loss': 0.0045, 'grad_norm': 0.09135819971561432, 'learning_rate': 8.130696705708464e-06, 'epoch': 3.01}
|
549 |
-
{'loss': 0.002, 'grad_norm': 0.002453828463330865, 'learning_rate': 8.108301793832442e-06, 'epoch': 3.02}
|
550 |
-
{'loss': 0.0035, 'grad_norm': 0.007750564254820347, 'learning_rate': 8.08590688195642e-06, 'epoch': 3.02}
|
551 |
-
{'loss': 0.0046, 'grad_norm': 1.2276639938354492, 'learning_rate': 8.063511970080399e-06, 'epoch': 3.03}
|
552 |
-
{'loss': 0.0033, 'grad_norm': 0.0030335187911987305, 'learning_rate': 8.041117058204377e-06, 'epoch': 3.03}
|
553 |
-
{'loss': 0.002, 'grad_norm': 0.38589444756507874, 'learning_rate': 8.018722146328355e-06, 'epoch': 3.04}
|
554 |
-
{'loss': 0.0036, 'grad_norm': 0.022893747314810753, 'learning_rate': 7.996327234452334e-06, 'epoch': 3.04}
|
555 |
-
{'loss': 0.0025, 'grad_norm': 0.003235406940802932, 'learning_rate': 7.97393232257631e-06, 'epoch': 3.05}
|
556 |
-
{'loss': 0.0039, 'grad_norm': 0.05985206738114357, 'learning_rate': 7.95153741070029e-06, 'epoch': 3.06}
|
557 |
-
{'loss': 0.0029, 'grad_norm': 0.14925901591777802, 'learning_rate': 7.929142498824269e-06, 'epoch': 3.06}
|
558 |
-
{'loss': 0.004, 'grad_norm': 0.026889082044363022, 'learning_rate': 7.906747586948245e-06, 'epoch': 3.07}
|
559 |
-
{'loss': 0.0023, 'grad_norm': 0.057593248784542084, 'learning_rate': 7.884352675072224e-06, 'epoch': 3.07}
|
560 |
-
{'loss': 0.0019, 'grad_norm': 0.20720885694026947, 'learning_rate': 7.861957763196204e-06, 'epoch': 3.08}
|
561 |
-
{'loss': 0.0019, 'grad_norm': 0.002138580894097686, 'learning_rate': 7.83956285132018e-06, 'epoch': 3.08}
|
562 |
-
{'loss': 0.0027, 'grad_norm': 0.013360394164919853, 'learning_rate': 7.817167939444159e-06, 'epoch': 3.09}
|
563 |
-
{'loss': 0.0014, 'grad_norm': 0.12524360418319702, 'learning_rate': 7.794773027568137e-06, 'epoch': 3.09}
|
564 |
-
{'loss': 0.0019, 'grad_norm': 0.04898557439446449, 'learning_rate': 7.772378115692116e-06, 'epoch': 3.1}
|
565 |
-
{'loss': 0.0018, 'grad_norm': 0.007251457776874304, 'learning_rate': 7.749983203816094e-06, 'epoch': 3.1}
|
566 |
-
{'loss': 0.0016, 'grad_norm': 0.005014033988118172, 'learning_rate': 7.72758829194007e-06, 'epoch': 3.11}
|
567 |
-
{'loss': 0.0017, 'grad_norm': 0.008448738604784012, 'learning_rate': 7.70519338006405e-06, 'epoch': 3.12}
|
568 |
-
{'loss': 0.0049, 'grad_norm': 0.03684404864907265, 'learning_rate': 7.682798468188029e-06, 'epoch': 3.12}
|
569 |
-
{'loss': 0.0022, 'grad_norm': 0.0004430219705682248, 'learning_rate': 7.660403556312006e-06, 'epoch': 3.13}
|
570 |
-
{'loss': 0.0023, 'grad_norm': 0.01342267170548439, 'learning_rate': 7.638008644435984e-06, 'epoch': 3.13}
|
571 |
-
{'loss': 0.0016, 'grad_norm': 0.27969247102737427, 'learning_rate': 7.615613732559963e-06, 'epoch': 3.14}
|
572 |
-
{'loss': 0.002, 'grad_norm': 0.37727439403533936, 'learning_rate': 7.593218820683941e-06, 'epoch': 3.14}
|
573 |
-
{'loss': 0.0025, 'grad_norm': 0.2768697142601013, 'learning_rate': 7.570823908807919e-06, 'epoch': 3.15}
|
574 |
-
{'loss': 0.0012, 'grad_norm': 0.12135498970746994, 'learning_rate': 7.548428996931898e-06, 'epoch': 3.15}
|
575 |
-
{'loss': 0.0021, 'grad_norm': 0.05090919882059097, 'learning_rate': 7.526034085055876e-06, 'epoch': 3.16}
|
576 |
-
{'loss': 0.0017, 'grad_norm': 0.14085857570171356, 'learning_rate': 7.503639173179854e-06, 'epoch': 3.17}
|
577 |
-
{'loss': 0.0019, 'grad_norm': 0.3364329934120178, 'learning_rate': 7.481244261303832e-06, 'epoch': 3.17}
|
578 |
-
{'loss': 0.0019, 'grad_norm': 0.024304231628775597, 'learning_rate': 7.45884934942781e-06, 'epoch': 3.18}
|
579 |
-
{'loss': 0.0042, 'grad_norm': 0.12154655903577805, 'learning_rate': 7.436454437551789e-06, 'epoch': 3.18}
|
580 |
-
{'loss': 0.0027, 'grad_norm': 0.020685842260718346, 'learning_rate': 7.4140595256757675e-06, 'epoch': 3.19}
|
581 |
-
{'loss': 0.0011, 'grad_norm': 0.024026449769735336, 'learning_rate': 7.391664613799745e-06, 'epoch': 3.19}
|
582 |
-
{'loss': 0.002, 'grad_norm': 0.07294344902038574, 'learning_rate': 7.369269701923723e-06, 'epoch': 3.2}
|
583 |
-
{'loss': 0.0021, 'grad_norm': 0.0950188934803009, 'learning_rate': 7.3468747900477025e-06, 'epoch': 3.2}
|
584 |
-
{'loss': 0.0015, 'grad_norm': 0.004987840075045824, 'learning_rate': 7.32447987817168e-06, 'epoch': 3.21}
|
585 |
-
{'loss': 0.0017, 'grad_norm': 0.0009321196121163666, 'learning_rate': 7.302084966295658e-06, 'epoch': 3.21}
|
586 |
-
{'loss': 0.002, 'grad_norm': 0.8103981614112854, 'learning_rate': 7.279690054419637e-06, 'epoch': 3.22}
|
587 |
-
{'loss': 0.0012, 'grad_norm': 0.08477653563022614, 'learning_rate': 7.257295142543614e-06, 'epoch': 3.23}
|
588 |
-
{'loss': 0.0017, 'grad_norm': 0.14473630487918854, 'learning_rate': 7.234900230667593e-06, 'epoch': 3.23}
|
589 |
-
{'loss': 0.0029, 'grad_norm': 0.1038050651550293, 'learning_rate': 7.212505318791572e-06, 'epoch': 3.24}
|
590 |
-
{'loss': 0.0019, 'grad_norm': 0.004471446853131056, 'learning_rate': 7.190110406915549e-06, 'epoch': 3.24}
|
591 |
-
{'loss': 0.0017, 'grad_norm': 0.08369725197553635, 'learning_rate': 7.167715495039528e-06, 'epoch': 3.25}
|
592 |
-
{'loss': 0.0019, 'grad_norm': 0.07285201549530029, 'learning_rate': 7.145320583163505e-06, 'epoch': 3.25}
|
593 |
-
{'loss': 0.0012, 'grad_norm': 0.007139412686228752, 'learning_rate': 7.122925671287484e-06, 'epoch': 3.26}
|
594 |
-
{'loss': 0.0024, 'grad_norm': 0.026335667818784714, 'learning_rate': 7.100530759411463e-06, 'epoch': 3.26}
|
595 |
-
{'loss': 0.0017, 'grad_norm': 0.4776710569858551, 'learning_rate': 7.07813584753544e-06, 'epoch': 3.27}
|
596 |
-
{'loss': 0.0022, 'grad_norm': 0.025377823039889336, 'learning_rate': 7.0557409356594185e-06, 'epoch': 3.27}
|
597 |
-
{'loss': 0.002, 'grad_norm': 0.15673436224460602, 'learning_rate': 7.033346023783397e-06, 'epoch': 3.28}
|
598 |
-
{'loss': 0.0028, 'grad_norm': 0.10128195583820343, 'learning_rate': 7.010951111907374e-06, 'epoch': 3.29}
|
599 |
-
{'loss': 0.0036, 'grad_norm': 0.007779085077345371, 'learning_rate': 6.988556200031354e-06, 'epoch': 3.29}
|
600 |
-
{'loss': 0.0015, 'grad_norm': 0.07349961996078491, 'learning_rate': 6.966161288155332e-06, 'epoch': 3.3}
|
601 |
-
{'loss': 0.0024, 'grad_norm': 0.002100712852552533, 'learning_rate': 6.9437663762793094e-06, 'epoch': 3.3}
|
602 |
-
{'loss': 0.0015, 'grad_norm': 0.2998717725276947, 'learning_rate': 6.921371464403288e-06, 'epoch': 3.31}
|
603 |
-
{'loss': 0.0012, 'grad_norm': 0.13683967292308807, 'learning_rate': 6.898976552527267e-06, 'epoch': 3.31}
|
604 |
-
{'loss': 0.0022, 'grad_norm': 0.003665775526314974, 'learning_rate': 6.8765816406512445e-06, 'epoch': 3.32}
|
605 |
-
{'loss': 0.0015, 'grad_norm': 0.000874933844897896, 'learning_rate': 6.854186728775223e-06, 'epoch': 3.32}
|
606 |
-
{'loss': 0.0023, 'grad_norm': 0.15409617125988007, 'learning_rate': 6.831791816899201e-06, 'epoch': 3.33}
|
607 |
-
{'loss': 0.0017, 'grad_norm': 0.03335576876997948, 'learning_rate': 6.809396905023179e-06, 'epoch': 3.33}
|
608 |
-
{'loss': 0.0021, 'grad_norm': 0.013786455616354942, 'learning_rate': 6.787001993147158e-06, 'epoch': 3.34}
|
609 |
-
{'loss': 0.0017, 'grad_norm': 0.0036290634889155626, 'learning_rate': 6.764607081271136e-06, 'epoch': 3.35}
|
610 |
-
{'loss': 0.0015, 'grad_norm': 0.1783692091703415, 'learning_rate': 6.742212169395114e-06, 'epoch': 3.35}
|
611 |
-
{'loss': 0.0023, 'grad_norm': 0.012478599324822426, 'learning_rate': 6.719817257519092e-06, 'epoch': 3.36}
|
612 |
-
{'loss': 0.0014, 'grad_norm': 0.016466649249196053, 'learning_rate': 6.697422345643071e-06, 'epoch': 3.36}
|
613 |
-
{'loss': 0.0019, 'grad_norm': 0.006102473475039005, 'learning_rate': 6.675027433767049e-06, 'epoch': 3.37}
|
614 |
-
{'loss': 0.0017, 'grad_norm': 0.009547678753733635, 'learning_rate': 6.652632521891027e-06, 'epoch': 3.37}
|
615 |
-
{'loss': 0.0027, 'grad_norm': 0.1453057825565338, 'learning_rate': 6.6302376100150055e-06, 'epoch': 3.38}
|
616 |
-
{'loss': 0.0016, 'grad_norm': 0.20028233528137207, 'learning_rate': 6.607842698138983e-06, 'epoch': 3.38}
|
617 |
-
{'loss': 0.0019, 'grad_norm': 0.003452139673754573, 'learning_rate': 6.585447786262961e-06, 'epoch': 3.39}
|
618 |
-
{'loss': 0.0037, 'grad_norm': 0.004863356240093708, 'learning_rate': 6.563052874386939e-06, 'epoch': 3.4}
|
619 |
-
{'loss': 0.0016, 'grad_norm': 0.08551418036222458, 'learning_rate': 6.540657962510918e-06, 'epoch': 3.4}
|
620 |
-
{'loss': 0.0012, 'grad_norm': 0.07263021171092987, 'learning_rate': 6.5182630506348964e-06, 'epoch': 3.41}
|
621 |
-
{'loss': 0.0024, 'grad_norm': 0.02901959978044033, 'learning_rate': 6.495868138758874e-06, 'epoch': 3.41}
|
622 |
-
{'loss': 0.0016, 'grad_norm': 0.010008633136749268, 'learning_rate': 6.473473226882852e-06, 'epoch': 3.42}
|
623 |
-
{'loss': 0.0022, 'grad_norm': 0.02640015073120594, 'learning_rate': 6.4510783150068315e-06, 'epoch': 3.42}
|
624 |
-
{'loss': 0.0015, 'grad_norm': 0.1104823499917984, 'learning_rate': 6.428683403130809e-06, 'epoch': 3.43}
|
625 |
-
{'loss': 0.0017, 'grad_norm': 0.16136541962623596, 'learning_rate': 6.406288491254787e-06, 'epoch': 3.43}
|
626 |
-
{'loss': 0.0015, 'grad_norm': 0.0034606726840138435, 'learning_rate': 6.383893579378766e-06, 'epoch': 3.44}
|
627 |
-
{'loss': 0.0018, 'grad_norm': 0.03437316045165062, 'learning_rate': 6.361498667502743e-06, 'epoch': 3.44}
|
628 |
-
{'loss': 0.0015, 'grad_norm': 0.0036164058838039637, 'learning_rate': 6.339103755626722e-06, 'epoch': 3.45}
|
629 |
-
{'loss': 0.0019, 'grad_norm': 0.01371910609304905, 'learning_rate': 6.316708843750701e-06, 'epoch': 3.46}
|
630 |
-
{'loss': 0.0009, 'grad_norm': 0.12439179420471191, 'learning_rate': 6.294313931874678e-06, 'epoch': 3.46}
|
631 |
-
{'loss': 0.001, 'grad_norm': 0.0557035356760025, 'learning_rate': 6.271919019998657e-06, 'epoch': 3.47}
|
632 |
-
{'loss': 0.001, 'grad_norm': 0.020946547389030457, 'learning_rate': 6.249524108122636e-06, 'epoch': 3.47}
|
633 |
-
{'loss': 0.0023, 'grad_norm': 0.07646912336349487, 'learning_rate': 6.227129196246613e-06, 'epoch': 3.48}
|
634 |
-
{'loss': 0.0012, 'grad_norm': 0.07221906632184982, 'learning_rate': 6.204734284370592e-06, 'epoch': 3.48}
|
635 |
-
{'loss': 0.0012, 'grad_norm': 0.024785563349723816, 'learning_rate': 6.18233937249457e-06, 'epoch': 3.49}
|
636 |
-
{'loss': 0.0011, 'grad_norm': 0.0033317049965262413, 'learning_rate': 6.1599444606185475e-06, 'epoch': 3.49}
|
637 |
-
{'loss': 0.0008, 'grad_norm': 0.051508672535419464, 'learning_rate': 6.137549548742526e-06, 'epoch': 3.5}
|
638 |
-
{'loss': 0.0018, 'grad_norm': 0.29059651494026184, 'learning_rate': 6.115154636866505e-06, 'epoch': 3.5}
|
639 |
-
{'loss': 0.0009, 'grad_norm': 0.0021790487226098776, 'learning_rate': 6.0927597249904826e-06, 'epoch': 3.51}
|
640 |
-
{'loss': 0.0016, 'grad_norm': 2.6697778701782227, 'learning_rate': 6.070364813114461e-06, 'epoch': 3.52}
|
641 |
-
{'loss': 0.0012, 'grad_norm': 1.0531569719314575, 'learning_rate': 6.047969901238439e-06, 'epoch': 3.52}
|
642 |
-
{'loss': 0.0015, 'grad_norm': 0.4647313952445984, 'learning_rate': 6.025574989362417e-06, 'epoch': 3.53}
|
643 |
-
{'loss': 0.0024, 'grad_norm': 0.05964852496981621, 'learning_rate': 6.003180077486396e-06, 'epoch': 3.53}
|
644 |
-
{'loss': 0.0016, 'grad_norm': 0.06724616885185242, 'learning_rate': 5.980785165610374e-06, 'epoch': 3.54}
|
645 |
-
{'loss': 0.0014, 'grad_norm': 0.2721405625343323, 'learning_rate': 5.958390253734352e-06, 'epoch': 3.54}
|
646 |
-
{'loss': 0.0014, 'grad_norm': 0.0075986250303685665, 'learning_rate': 5.93599534185833e-06, 'epoch': 3.55}
|
647 |
-
{'loss': 0.0047, 'grad_norm': 0.07740730047225952, 'learning_rate': 5.913600429982308e-06, 'epoch': 3.55}
|
648 |
-
{'loss': 0.0013, 'grad_norm': 0.07295466959476471, 'learning_rate': 5.891205518106287e-06, 'epoch': 3.56}
|
649 |
-
{'loss': 0.0012, 'grad_norm': 0.015490477904677391, 'learning_rate': 5.868810606230265e-06, 'epoch': 3.56}
|
650 |
-
{'loss': 0.0013, 'grad_norm': 0.10027164220809937, 'learning_rate': 5.846415694354243e-06, 'epoch': 3.57}
|
651 |
-
{'loss': 0.0011, 'grad_norm': 0.0034784649033099413, 'learning_rate': 5.824020782478221e-06, 'epoch': 3.58}
|
652 |
-
{'loss': 0.0011, 'grad_norm': 0.11903531104326248, 'learning_rate': 5.8016258706022e-06, 'epoch': 3.58}
|
653 |
-
{'loss': 0.0016, 'grad_norm': 0.01810777373611927, 'learning_rate': 5.779230958726178e-06, 'epoch': 3.59}
|
654 |
-
{'loss': 0.0022, 'grad_norm': 0.0328763872385025, 'learning_rate': 5.756836046850156e-06, 'epoch': 3.59}
|
655 |
-
{'loss': 0.0017, 'grad_norm': 0.004651115275919437, 'learning_rate': 5.7344411349741345e-06, 'epoch': 3.6}
|
656 |
-
{'loss': 0.0012, 'grad_norm': 0.008656460791826248, 'learning_rate': 5.712046223098112e-06, 'epoch': 3.6}
|
657 |
-
{'loss': 0.002, 'grad_norm': 0.015148227103054523, 'learning_rate': 5.689651311222091e-06, 'epoch': 3.61}
|
658 |
-
{'loss': 0.0016, 'grad_norm': 0.04307083040475845, 'learning_rate': 5.6672563993460696e-06, 'epoch': 3.61}
|
659 |
-
{'loss': 0.0009, 'grad_norm': 0.10523002594709396, 'learning_rate': 5.644861487470047e-06, 'epoch': 3.62}
|
660 |
-
{'loss': 0.0011, 'grad_norm': 0.004528895020484924, 'learning_rate': 5.622466575594025e-06, 'epoch': 3.63}
|
661 |
-
{'loss': 0.0019, 'grad_norm': 0.1139262244105339, 'learning_rate': 5.600071663718004e-06, 'epoch': 3.63}
|
662 |
-
{'loss': 0.0011, 'grad_norm': 0.02509382739663124, 'learning_rate': 5.577676751841981e-06, 'epoch': 3.64}
|
663 |
-
{'loss': 0.0021, 'grad_norm': 0.3114255368709564, 'learning_rate': 5.5552818399659605e-06, 'epoch': 3.64}
|
664 |
-
{'loss': 0.0029, 'grad_norm': 0.14174672961235046, 'learning_rate': 5.532886928089939e-06, 'epoch': 3.65}
|
665 |
-
{'loss': 0.001, 'grad_norm': 0.015288415364921093, 'learning_rate': 5.510492016213916e-06, 'epoch': 3.65}
|
666 |
-
{'loss': 0.0016, 'grad_norm': 0.0520060658454895, 'learning_rate': 5.488097104337895e-06, 'epoch': 3.66}
|
667 |
-
{'loss': 0.0016, 'grad_norm': 0.0958566963672638, 'learning_rate': 5.465702192461874e-06, 'epoch': 3.66}
|
668 |
-
{'loss': 0.0036, 'grad_norm': 0.000693493289873004, 'learning_rate': 5.443307280585851e-06, 'epoch': 3.67}
|
669 |
-
{'loss': 0.0012, 'grad_norm': 0.037046968936920166, 'learning_rate': 5.42091236870983e-06, 'epoch': 3.67}
|
670 |
-
{'loss': 0.003, 'grad_norm': 0.031214630231261253, 'learning_rate': 5.398517456833808e-06, 'epoch': 3.68}
|
671 |
-
{'loss': 0.0014, 'grad_norm': 0.393162339925766, 'learning_rate': 5.376122544957786e-06, 'epoch': 3.69}
|
672 |
-
{'loss': 0.0018, 'grad_norm': 0.16350078582763672, 'learning_rate': 5.353727633081765e-06, 'epoch': 3.69}
|
673 |
-
{'loss': 0.001, 'grad_norm': 0.020479297265410423, 'learning_rate': 5.331332721205742e-06, 'epoch': 3.7}
|
674 |
-
{'loss': 0.001, 'grad_norm': 0.06839997321367264, 'learning_rate': 5.308937809329721e-06, 'epoch': 3.7}
|
675 |
-
{'loss': 0.0016, 'grad_norm': 0.47072646021842957, 'learning_rate': 5.286542897453699e-06, 'epoch': 3.71}
|
676 |
-
{'loss': 0.0025, 'grad_norm': 0.015468220226466656, 'learning_rate': 5.2641479855776765e-06, 'epoch': 3.71}
|
677 |
-
{'loss': 0.001, 'grad_norm': 0.06005273386836052, 'learning_rate': 5.241753073701656e-06, 'epoch': 3.72}
|
678 |
-
{'loss': 0.0018, 'grad_norm': 0.016474580392241478, 'learning_rate': 5.219358161825634e-06, 'epoch': 3.72}
|
679 |
-
{'loss': 0.0015, 'grad_norm': 0.0036705026868730783, 'learning_rate': 5.1969632499496116e-06, 'epoch': 3.73}
|
680 |
-
{'loss': 0.001, 'grad_norm': 0.5551484823226929, 'learning_rate': 5.17456833807359e-06, 'epoch': 3.73}
|
681 |
-
{'loss': 0.0009, 'grad_norm': 0.006879040505737066, 'learning_rate': 5.152173426197568e-06, 'epoch': 3.74}
|
682 |
-
{'loss': 0.0013, 'grad_norm': 0.0026731377001851797, 'learning_rate': 5.129778514321546e-06, 'epoch': 3.75}
|
683 |
-
{'loss': 0.0014, 'grad_norm': 0.10522931814193726, 'learning_rate': 5.107383602445525e-06, 'epoch': 3.75}
|
684 |
-
{'loss': 0.0013, 'grad_norm': 0.07733763009309769, 'learning_rate': 5.084988690569503e-06, 'epoch': 3.76}
|
685 |
-
{'loss': 0.0011, 'grad_norm': 0.08409392833709717, 'learning_rate': 5.062593778693481e-06, 'epoch': 3.76}
|
686 |
-
{'loss': 0.0026, 'grad_norm': 0.03305979073047638, 'learning_rate': 5.040198866817459e-06, 'epoch': 3.77}
|
687 |
-
{'loss': 0.0014, 'grad_norm': 0.006016087252646685, 'learning_rate': 5.017803954941438e-06, 'epoch': 3.77}
|
688 |
-
{'loss': 0.0021, 'grad_norm': 0.02351684682071209, 'learning_rate': 4.995409043065416e-06, 'epoch': 3.78}
|
689 |
-
{'loss': 0.0015, 'grad_norm': 0.009738347493112087, 'learning_rate': 4.973014131189394e-06, 'epoch': 3.78}
|
690 |
-
{'loss': 0.0013, 'grad_norm': 0.02382291853427887, 'learning_rate': 4.9506192193133726e-06, 'epoch': 3.79}
|
691 |
-
{'loss': 0.0013, 'grad_norm': 0.028588024899363518, 'learning_rate': 4.92822430743735e-06, 'epoch': 3.79}
|
692 |
-
{'loss': 0.0019, 'grad_norm': 0.06715335696935654, 'learning_rate': 4.905829395561329e-06, 'epoch': 3.8}
|
693 |
-
{'loss': 0.0009, 'grad_norm': 0.009042341262102127, 'learning_rate': 4.883434483685307e-06, 'epoch': 3.81}
|
694 |
-
{'loss': 0.0009, 'grad_norm': 0.03919120132923126, 'learning_rate': 4.861039571809285e-06, 'epoch': 3.81}
|
695 |
-
{'loss': 0.0014, 'grad_norm': 0.04066384211182594, 'learning_rate': 4.8386446599332635e-06, 'epoch': 3.82}
|
696 |
-
{'loss': 0.0012, 'grad_norm': 0.05810333788394928, 'learning_rate': 4.816249748057242e-06, 'epoch': 3.82}
|
697 |
-
{'loss': 0.0032, 'grad_norm': 0.020592456683516502, 'learning_rate': 4.79385483618122e-06, 'epoch': 3.83}
|
698 |
-
{'loss': 0.0015, 'grad_norm': 0.1887601613998413, 'learning_rate': 4.7714599243051985e-06, 'epoch': 3.83}
|
699 |
-
{'loss': 0.0011, 'grad_norm': 0.020269712433218956, 'learning_rate': 4.749065012429177e-06, 'epoch': 3.84}
|
700 |
-
{'loss': 0.002, 'grad_norm': 0.15431857109069824, 'learning_rate': 4.726670100553154e-06, 'epoch': 3.84}
|
701 |
-
{'loss': 0.0012, 'grad_norm': 0.009703408926725388, 'learning_rate': 4.704275188677134e-06, 'epoch': 3.85}
|
702 |
-
{'loss': 0.0026, 'grad_norm': 0.03211360052227974, 'learning_rate': 4.681880276801111e-06, 'epoch': 3.86}
|
703 |
-
{'loss': 0.001, 'grad_norm': 0.08050722628831863, 'learning_rate': 4.6594853649250894e-06, 'epoch': 3.86}
|
704 |
-
{'loss': 0.0018, 'grad_norm': 0.01742105558514595, 'learning_rate': 4.637090453049068e-06, 'epoch': 3.87}
|
705 |
-
{'loss': 0.0014, 'grad_norm': 0.13857877254486084, 'learning_rate': 4.614695541173046e-06, 'epoch': 3.87}
|
706 |
-
{'loss': 0.001, 'grad_norm': 0.10377497225999832, 'learning_rate': 4.592300629297024e-06, 'epoch': 3.88}
|
707 |
-
{'loss': 0.0018, 'grad_norm': 0.019631896167993546, 'learning_rate': 4.569905717421002e-06, 'epoch': 3.88}
|
708 |
-
{'loss': 0.0027, 'grad_norm': 0.010785219259560108, 'learning_rate': 4.54751080554498e-06, 'epoch': 3.89}
|
709 |
-
{'loss': 0.0027, 'grad_norm': 0.0029205495957285166, 'learning_rate': 4.525115893668959e-06, 'epoch': 3.89}
|
710 |
-
{'loss': 0.0011, 'grad_norm': 0.026202471926808357, 'learning_rate': 4.502720981792937e-06, 'epoch': 3.9}
|
711 |
-
{'loss': 0.0024, 'grad_norm': 0.005275311879813671, 'learning_rate': 4.480326069916915e-06, 'epoch': 3.9}
|
712 |
-
{'loss': 0.0012, 'grad_norm': 0.009359275922179222, 'learning_rate': 4.457931158040894e-06, 'epoch': 3.91}
|
713 |
-
{'loss': 0.0018, 'grad_norm': 0.06890468299388885, 'learning_rate': 4.435536246164871e-06, 'epoch': 3.92}
|
714 |
-
{'loss': 0.0012, 'grad_norm': 0.004848203156143427, 'learning_rate': 4.4131413342888505e-06, 'epoch': 3.92}
|
715 |
-
{'loss': 0.0015, 'grad_norm': 1.0583692789077759, 'learning_rate': 4.390746422412828e-06, 'epoch': 3.93}
|
716 |
-
{'loss': 0.0015, 'grad_norm': 0.08103686571121216, 'learning_rate': 4.368351510536806e-06, 'epoch': 3.93}
|
717 |
-
{'loss': 0.0018, 'grad_norm': 0.006814138498157263, 'learning_rate': 4.345956598660785e-06, 'epoch': 3.94}
|
718 |
-
{'loss': 0.0017, 'grad_norm': 0.28501853346824646, 'learning_rate': 4.323561686784763e-06, 'epoch': 3.94}
|
719 |
-
{'loss': 0.0009, 'grad_norm': 0.005639960058033466, 'learning_rate': 4.301166774908741e-06, 'epoch': 3.95}
|
720 |
-
{'loss': 0.001, 'grad_norm': 0.0073370854370296, 'learning_rate': 4.278771863032719e-06, 'epoch': 3.95}
|
721 |
-
{'loss': 0.0013, 'grad_norm': 0.014588725753128529, 'learning_rate': 4.256376951156698e-06, 'epoch': 3.96}
|
722 |
-
{'loss': 0.0008, 'grad_norm': 0.010143490508198738, 'learning_rate': 4.233982039280676e-06, 'epoch': 3.96}
|
723 |
-
{'loss': 0.0018, 'grad_norm': 0.00340424757450819, 'learning_rate': 4.211587127404654e-06, 'epoch': 3.97}
|
724 |
-
{'loss': 0.0027, 'grad_norm': 0.012705673463642597, 'learning_rate': 4.189192215528632e-06, 'epoch': 3.98}
|
725 |
-
{'loss': 0.0009, 'grad_norm': 0.038429852575063705, 'learning_rate': 4.166797303652611e-06, 'epoch': 3.98}
|
726 |
-
{'loss': 0.0008, 'grad_norm': 0.4789028763771057, 'learning_rate': 4.144402391776588e-06, 'epoch': 3.99}
|
727 |
-
{'loss': 0.001, 'grad_norm': 0.006754585541784763, 'learning_rate': 4.122007479900567e-06, 'epoch': 3.99}
|
728 |
-
{'loss': 0.0009, 'grad_norm': 0.23940064013004303, 'learning_rate': 4.099612568024545e-06, 'epoch': 4.0}
|
729 |
-
{'loss': 0.0012, 'grad_norm': 0.0665973499417305, 'learning_rate': 4.077217656148523e-06, 'epoch': 4.0}
|
730 |
-
{'loss': 0.0011, 'grad_norm': 0.0013757392298430204, 'learning_rate': 4.0548227442725016e-06, 'epoch': 4.01}
|
731 |
-
{'loss': 0.0023, 'grad_norm': 0.8854921460151672, 'learning_rate': 4.03242783239648e-06, 'epoch': 4.01}
|
732 |
-
{'loss': 0.0023, 'grad_norm': 0.06492713838815689, 'learning_rate': 4.010032920520458e-06, 'epoch': 4.02}
|
733 |
-
{'loss': 0.0012, 'grad_norm': 0.003994062077254057, 'learning_rate': 3.987638008644436e-06, 'epoch': 4.02}
|
734 |
-
{'loss': 0.0018, 'grad_norm': 0.024876805022358894, 'learning_rate': 3.965243096768415e-06, 'epoch': 4.03}
|
735 |
-
{'loss': 0.0013, 'grad_norm': 0.21828804910182953, 'learning_rate': 3.9428481848923925e-06, 'epoch': 4.04}
|
736 |
-
{'loss': 0.0009, 'grad_norm': 0.02883763425052166, 'learning_rate': 3.920453273016371e-06, 'epoch': 4.04}
|
737 |
-
{'loss': 0.0016, 'grad_norm': 0.1658484935760498, 'learning_rate': 3.898058361140349e-06, 'epoch': 4.05}
|
738 |
-
{'loss': 0.0011, 'grad_norm': 0.023233819752931595, 'learning_rate': 3.8756634492643275e-06, 'epoch': 4.05}
|
739 |
-
{'loss': 0.0011, 'grad_norm': 0.016315072774887085, 'learning_rate': 3.853268537388306e-06, 'epoch': 4.06}
|
740 |
-
{'loss': 0.0009, 'grad_norm': 0.027211636304855347, 'learning_rate': 3.830873625512284e-06, 'epoch': 4.06}
|
741 |
-
{'loss': 0.0012, 'grad_norm': 0.006255852058529854, 'learning_rate': 3.808478713636262e-06, 'epoch': 4.07}
|
742 |
-
{'loss': 0.0011, 'grad_norm': 0.005831268150359392, 'learning_rate': 3.78608380176024e-06, 'epoch': 4.07}
|
743 |
-
{'loss': 0.0008, 'grad_norm': 0.012144763953983784, 'learning_rate': 3.763688889884219e-06, 'epoch': 4.08}
|
744 |
-
{'loss': 0.001, 'grad_norm': 0.01724362187087536, 'learning_rate': 3.7412939780081968e-06, 'epoch': 4.09}
|
745 |
-
{'loss': 0.0008, 'grad_norm': 0.04438236728310585, 'learning_rate': 3.718899066132175e-06, 'epoch': 4.09}
|
746 |
-
{'loss': 0.0009, 'grad_norm': 0.00658840499818325, 'learning_rate': 3.696504154256153e-06, 'epoch': 4.1}
|
747 |
-
{'loss': 0.0008, 'grad_norm': 0.05471208319067955, 'learning_rate': 3.674109242380132e-06, 'epoch': 4.1}
|
748 |
-
{'loss': 0.0008, 'grad_norm': 0.007816795259714127, 'learning_rate': 3.6517143305041098e-06, 'epoch': 4.11}
|
749 |
-
{'loss': 0.0008, 'grad_norm': 0.02814406529068947, 'learning_rate': 3.6293194186280877e-06, 'epoch': 4.11}
|
750 |
-
{'loss': 0.0009, 'grad_norm': 0.0004428045067470521, 'learning_rate': 3.6069245067520665e-06, 'epoch': 4.12}
|
751 |
-
{'loss': 0.0021, 'grad_norm': 0.001689333003014326, 'learning_rate': 3.5845295948760444e-06, 'epoch': 4.12}
|
752 |
-
{'loss': 0.0007, 'grad_norm': 0.10142877697944641, 'learning_rate': 3.5621346830000223e-06, 'epoch': 4.13}
|
753 |
-
{'loss': 0.0014, 'grad_norm': 0.03971700370311737, 'learning_rate': 3.539739771124001e-06, 'epoch': 4.13}
|
754 |
-
{'loss': 0.0008, 'grad_norm': 0.12946633994579315, 'learning_rate': 3.517344859247979e-06, 'epoch': 4.14}
|
755 |
-
{'loss': 0.0015, 'grad_norm': 0.01494985818862915, 'learning_rate': 3.4949499473719574e-06, 'epoch': 4.15}
|
756 |
-
{'loss': 0.0008, 'grad_norm': 0.0013534559402614832, 'learning_rate': 3.4725550354959357e-06, 'epoch': 4.15}
|
757 |
-
{'loss': 0.0008, 'grad_norm': 0.011890546418726444, 'learning_rate': 3.450160123619914e-06, 'epoch': 4.16}
|
758 |
-
{'loss': 0.0015, 'grad_norm': 0.013109634630382061, 'learning_rate': 3.427765211743892e-06, 'epoch': 4.16}
|
759 |
-
{'loss': 0.0008, 'grad_norm': 0.0019232493359595537, 'learning_rate': 3.40537029986787e-06, 'epoch': 4.17}
|
760 |
-
{'loss': 0.0009, 'grad_norm': 0.02066531963646412, 'learning_rate': 3.3829753879918487e-06, 'epoch': 4.17}
|
761 |
-
{'loss': 0.0018, 'grad_norm': 0.00364371994510293, 'learning_rate': 3.3605804761158266e-06, 'epoch': 4.18}
|
762 |
-
{'loss': 0.0013, 'grad_norm': 0.0214854683727026, 'learning_rate': 3.3381855642398046e-06, 'epoch': 4.18}
|
763 |
-
{'loss': 0.0012, 'grad_norm': 0.014650222845375538, 'learning_rate': 3.3157906523637833e-06, 'epoch': 4.19}
|
764 |
-
{'loss': 0.0008, 'grad_norm': 0.12458858639001846, 'learning_rate': 3.2933957404877613e-06, 'epoch': 4.19}
|
765 |
-
{'loss': 0.0008, 'grad_norm': 0.05412464588880539, 'learning_rate': 3.2710008286117396e-06, 'epoch': 4.2}
|
766 |
-
{'loss': 0.0008, 'grad_norm': 0.05582907423377037, 'learning_rate': 3.248605916735718e-06, 'epoch': 4.21}
|
767 |
-
{'loss': 0.0008, 'grad_norm': 0.006058037281036377, 'learning_rate': 3.2262110048596963e-06, 'epoch': 4.21}
|
768 |
-
{'loss': 0.001, 'grad_norm': 0.07414203137159348, 'learning_rate': 3.2038160929836743e-06, 'epoch': 4.22}
|
769 |
-
{'loss': 0.0008, 'grad_norm': 0.07749581336975098, 'learning_rate': 3.181421181107653e-06, 'epoch': 4.22}
|
770 |
-
{'loss': 0.0008, 'grad_norm': 0.08997820317745209, 'learning_rate': 3.159026269231631e-06, 'epoch': 4.23}
|
771 |
-
{'loss': 0.0009, 'grad_norm': 0.0007085053948685527, 'learning_rate': 3.136631357355609e-06, 'epoch': 4.23}
|
772 |
-
{'loss': 0.0008, 'grad_norm': 0.278054803609848, 'learning_rate': 3.1142364454795872e-06, 'epoch': 4.24}
|
773 |
-
{'loss': 0.0008, 'grad_norm': 0.025398461148142815, 'learning_rate': 3.0918415336035656e-06, 'epoch': 4.24}
|
774 |
-
{'loss': 0.0011, 'grad_norm': 0.0181169044226408, 'learning_rate': 3.0694466217275435e-06, 'epoch': 4.25}
|
775 |
-
{'loss': 0.0009, 'grad_norm': 0.03886833414435387, 'learning_rate': 3.047051709851522e-06, 'epoch': 4.25}
|
776 |
-
{'loss': 0.0007, 'grad_norm': 0.014894254505634308, 'learning_rate': 3.0246567979755002e-06, 'epoch': 4.26}
|
777 |
-
{'loss': 0.001, 'grad_norm': 0.3343604505062103, 'learning_rate': 3.0022618860994786e-06, 'epoch': 4.27}
|
778 |
-
{'loss': 0.0007, 'grad_norm': 0.2918633818626404, 'learning_rate': 2.9798669742234565e-06, 'epoch': 4.27}
|
779 |
-
{'loss': 0.0011, 'grad_norm': 0.011875933967530727, 'learning_rate': 2.9574720623474353e-06, 'epoch': 4.28}
|
780 |
-
{'loss': 0.0007, 'grad_norm': 0.01958482153713703, 'learning_rate': 2.935077150471413e-06, 'epoch': 4.28}
|
781 |
-
{'loss': 0.0019, 'grad_norm': 0.018138963729143143, 'learning_rate': 2.912682238595391e-06, 'epoch': 4.29}
|
782 |
-
{'loss': 0.0009, 'grad_norm': 0.010394470766186714, 'learning_rate': 2.89028732671937e-06, 'epoch': 4.29}
|
783 |
-
{'loss': 0.0011, 'grad_norm': 0.0032428407575935125, 'learning_rate': 2.867892414843348e-06, 'epoch': 4.3}
|
784 |
-
{'loss': 0.0008, 'grad_norm': 0.011067216284573078, 'learning_rate': 2.8454975029673258e-06, 'epoch': 4.3}
|
785 |
-
{'loss': 0.0006, 'grad_norm': 0.022999059408903122, 'learning_rate': 2.823102591091304e-06, 'epoch': 4.31}
|
786 |
-
{'loss': 0.0009, 'grad_norm': 0.001819304539822042, 'learning_rate': 2.8007076792152825e-06, 'epoch': 4.32}
|
787 |
-
{'loss': 0.001, 'grad_norm': 0.0037013550754636526, 'learning_rate': 2.778312767339261e-06, 'epoch': 4.32}
|
788 |
-
{'loss': 0.0007, 'grad_norm': 0.08672203868627548, 'learning_rate': 2.7559178554632387e-06, 'epoch': 4.33}
|
789 |
-
{'loss': 0.0011, 'grad_norm': 0.005167264491319656, 'learning_rate': 2.7335229435872175e-06, 'epoch': 4.33}
|
790 |
-
{'loss': 0.0008, 'grad_norm': 0.0014038735534995794, 'learning_rate': 2.7111280317111954e-06, 'epoch': 4.34}
|
791 |
-
{'loss': 0.0007, 'grad_norm': 0.010056782513856888, 'learning_rate': 2.6887331198351734e-06, 'epoch': 4.34}
|
792 |
-
{'loss': 0.0007, 'grad_norm': 0.00827051978558302, 'learning_rate': 2.666338207959152e-06, 'epoch': 4.35}
|
793 |
-
{'loss': 0.0007, 'grad_norm': 0.1306377500295639, 'learning_rate': 2.64394329608313e-06, 'epoch': 4.35}
|
794 |
-
{'loss': 0.001, 'grad_norm': 0.002261078916490078, 'learning_rate': 2.6215483842071084e-06, 'epoch': 4.36}
|
795 |
-
{'loss': 0.0008, 'grad_norm': 0.05072946101427078, 'learning_rate': 2.5991534723310868e-06, 'epoch': 4.36}
|
796 |
-
{'loss': 0.001, 'grad_norm': 0.04886786639690399, 'learning_rate': 2.5767585604550647e-06, 'epoch': 4.37}
|
797 |
-
{'loss': 0.0014, 'grad_norm': 0.06680363416671753, 'learning_rate': 2.554363648579043e-06, 'epoch': 4.38}
|
798 |
-
{'loss': 0.0006, 'grad_norm': 0.08678417652845383, 'learning_rate': 2.531968736703021e-06, 'epoch': 4.38}
|
799 |
-
{'loss': 0.0006, 'grad_norm': 0.17906591296195984, 'learning_rate': 2.5095738248269998e-06, 'epoch': 4.39}
|
800 |
-
{'loss': 0.001, 'grad_norm': 0.048420462757349014, 'learning_rate': 2.4871789129509777e-06, 'epoch': 4.39}
|
801 |
-
{'loss': 0.002, 'grad_norm': 0.22092890739440918, 'learning_rate': 2.464784001074956e-06, 'epoch': 4.4}
|
802 |
-
{'loss': 0.0006, 'grad_norm': 0.02592875249683857, 'learning_rate': 2.442389089198934e-06, 'epoch': 4.4}
|
803 |
-
{'loss': 0.0007, 'grad_norm': 0.04083279147744179, 'learning_rate': 2.4199941773229123e-06, 'epoch': 4.41}
|
804 |
-
{'loss': 0.001, 'grad_norm': 0.00027076838887296617, 'learning_rate': 2.3975992654468907e-06, 'epoch': 4.41}
|
805 |
-
{'loss': 0.0008, 'grad_norm': 0.002070697722956538, 'learning_rate': 2.375204353570869e-06, 'epoch': 4.42}
|
806 |
-
{'loss': 0.0008, 'grad_norm': 0.022934041917324066, 'learning_rate': 2.352809441694847e-06, 'epoch': 4.42}
|
807 |
-
{'loss': 0.0009, 'grad_norm': 0.025117984041571617, 'learning_rate': 2.3304145298188253e-06, 'epoch': 4.43}
|
808 |
-
{'loss': 0.0005, 'grad_norm': 0.0018961215391755104, 'learning_rate': 2.3080196179428037e-06, 'epoch': 4.44}
|
809 |
-
{'loss': 0.0008, 'grad_norm': 0.016121145337820053, 'learning_rate': 2.285624706066782e-06, 'epoch': 4.44}
|
810 |
-
{'loss': 0.0008, 'grad_norm': 0.15548691153526306, 'learning_rate': 2.26322979419076e-06, 'epoch': 4.45}
|
811 |
-
{'loss': 0.0007, 'grad_norm': 0.007404050324112177, 'learning_rate': 2.2408348823147383e-06, 'epoch': 4.45}
|
812 |
-
{'loss': 0.0006, 'grad_norm': 0.0019669681787490845, 'learning_rate': 2.2184399704387166e-06, 'epoch': 4.46}
|
813 |
-
{'loss': 0.0006, 'grad_norm': 0.04935136437416077, 'learning_rate': 2.1960450585626946e-06, 'epoch': 4.46}
|
814 |
-
{'loss': 0.0006, 'grad_norm': 0.007673050742596388, 'learning_rate': 2.173650146686673e-06, 'epoch': 4.47}
|
815 |
-
{'loss': 0.0006, 'grad_norm': 0.002124810591340065, 'learning_rate': 2.1512552348106513e-06, 'epoch': 4.47}
|
816 |
-
{'loss': 0.0009, 'grad_norm': 0.011607947759330273, 'learning_rate': 2.128860322934629e-06, 'epoch': 4.48}
|
817 |
-
{'loss': 0.0007, 'grad_norm': 0.015516514889895916, 'learning_rate': 2.1064654110586076e-06, 'epoch': 4.48}
|
818 |
-
{'loss': 0.0009, 'grad_norm': 0.013184698298573494, 'learning_rate': 2.084070499182586e-06, 'epoch': 4.49}
|
819 |
-
{'loss': 0.0006, 'grad_norm': 0.019689731299877167, 'learning_rate': 2.0616755873065643e-06, 'epoch': 4.5}
|
820 |
-
{'loss': 0.0007, 'grad_norm': 0.22405573725700378, 'learning_rate': 2.0392806754305426e-06, 'epoch': 4.5}
|
821 |
-
{'loss': 0.0006, 'grad_norm': 0.002072765724733472, 'learning_rate': 2.0168857635545205e-06, 'epoch': 4.51}
|
822 |
-
{'loss': 0.0007, 'grad_norm': 0.0035121950786560774, 'learning_rate': 1.994490851678499e-06, 'epoch': 4.51}
|
823 |
-
{'loss': 0.0006, 'grad_norm': 0.0017859174404293299, 'learning_rate': 1.972095939802477e-06, 'epoch': 4.52}
|
824 |
-
{'loss': 0.0008, 'grad_norm': 0.8883686661720276, 'learning_rate': 1.949701027926455e-06, 'epoch': 4.52}
|
825 |
-
{'loss': 0.0007, 'grad_norm': 0.3410530984401703, 'learning_rate': 1.9273061160504335e-06, 'epoch': 4.53}
|
826 |
-
{'loss': 0.0013, 'grad_norm': 0.005357651971280575, 'learning_rate': 1.9049112041744117e-06, 'epoch': 4.53}
|
827 |
-
{'loss': 0.0006, 'grad_norm': 0.009125343523919582, 'learning_rate': 1.88251629229839e-06, 'epoch': 4.54}
|
828 |
-
{'loss': 0.0009, 'grad_norm': 0.014439265243709087, 'learning_rate': 1.8601213804223681e-06, 'epoch': 4.55}
|
829 |
-
{'loss': 0.0015, 'grad_norm': 0.0037733712233603, 'learning_rate': 1.8377264685463465e-06, 'epoch': 4.55}
|
830 |
-
{'loss': 0.0014, 'grad_norm': 0.07933066040277481, 'learning_rate': 1.8153315566703246e-06, 'epoch': 4.56}
|
831 |
-
{'loss': 0.0007, 'grad_norm': 0.16726621985435486, 'learning_rate': 1.7929366447943028e-06, 'epoch': 4.56}
|
832 |
-
{'loss': 0.0007, 'grad_norm': 0.08296032249927521, 'learning_rate': 1.7705417329182811e-06, 'epoch': 4.57}
|
833 |
-
{'loss': 0.0008, 'grad_norm': 0.0007671950734220445, 'learning_rate': 1.7481468210422595e-06, 'epoch': 4.57}
|
834 |
-
{'loss': 0.0008, 'grad_norm': 0.07791215181350708, 'learning_rate': 1.7257519091662376e-06, 'epoch': 4.58}
|
835 |
-
{'loss': 0.0007, 'grad_norm': 0.03872445225715637, 'learning_rate': 1.7033569972902158e-06, 'epoch': 4.58}
|
836 |
-
{'loss': 0.0006, 'grad_norm': 0.09817063063383102, 'learning_rate': 1.680962085414194e-06, 'epoch': 4.59}
|
837 |
-
{'loss': 0.0008, 'grad_norm': 0.024218514561653137, 'learning_rate': 1.6585671735381723e-06, 'epoch': 4.59}
|
838 |
-
{'loss': 0.0008, 'grad_norm': 0.010985558852553368, 'learning_rate': 1.6361722616621506e-06, 'epoch': 4.6}
|
839 |
-
{'loss': 0.0006, 'grad_norm': 0.0027476183604449034, 'learning_rate': 1.6137773497861287e-06, 'epoch': 4.61}
|
840 |
-
{'loss': 0.001, 'grad_norm': 0.003122262191027403, 'learning_rate': 1.591382437910107e-06, 'epoch': 4.61}
|
841 |
-
{'loss': 0.0005, 'grad_norm': 0.0728781521320343, 'learning_rate': 1.568987526034085e-06, 'epoch': 4.62}
|
842 |
-
{'loss': 0.0007, 'grad_norm': 0.019124431535601616, 'learning_rate': 1.5465926141580634e-06, 'epoch': 4.62}
|
843 |
-
{'loss': 0.0006, 'grad_norm': 0.004708414431661367, 'learning_rate': 1.5241977022820417e-06, 'epoch': 4.63}
|
844 |
-
{'loss': 0.0007, 'grad_norm': 0.12547777593135834, 'learning_rate': 1.5018027904060199e-06, 'epoch': 4.63}
|
845 |
-
{'loss': 0.0009, 'grad_norm': 0.32263386249542236, 'learning_rate': 1.4794078785299982e-06, 'epoch': 4.64}
|
846 |
-
{'loss': 0.0014, 'grad_norm': 0.01729527674615383, 'learning_rate': 1.4570129666539764e-06, 'epoch': 4.64}
|
847 |
-
{'loss': 0.0008, 'grad_norm': 0.007950437255203724, 'learning_rate': 1.4346180547779545e-06, 'epoch': 4.65}
|
848 |
-
{'loss': 0.0006, 'grad_norm': 0.011319808661937714, 'learning_rate': 1.4122231429019328e-06, 'epoch': 4.65}
|
849 |
-
{'loss': 0.0006, 'grad_norm': 0.0025837954599410295, 'learning_rate': 1.389828231025911e-06, 'epoch': 4.66}
|
850 |
-
{'loss': 0.0016, 'grad_norm': 0.0021279077045619488, 'learning_rate': 1.3674333191498893e-06, 'epoch': 4.67}
|
851 |
-
{'loss': 0.0006, 'grad_norm': 0.0539991520345211, 'learning_rate': 1.3450384072738675e-06, 'epoch': 4.67}
|
852 |
-
{'loss': 0.0006, 'grad_norm': 0.0006465984624810517, 'learning_rate': 1.3226434953978456e-06, 'epoch': 4.68}
|
853 |
-
{'loss': 0.0012, 'grad_norm': 0.027662355452775955, 'learning_rate': 1.300248583521824e-06, 'epoch': 4.68}
|
854 |
-
{'loss': 0.0007, 'grad_norm': 0.004381787031888962, 'learning_rate': 1.2778536716458021e-06, 'epoch': 4.69}
|
855 |
-
{'loss': 0.0009, 'grad_norm': 0.004225610289722681, 'learning_rate': 1.2554587597697805e-06, 'epoch': 4.69}
|
856 |
-
{'loss': 0.0006, 'grad_norm': 0.0009983439231291413, 'learning_rate': 1.2330638478937586e-06, 'epoch': 4.7}
|
857 |
-
{'loss': 0.0005, 'grad_norm': 0.024487098678946495, 'learning_rate': 1.210668936017737e-06, 'epoch': 4.7}
|
858 |
-
{'loss': 0.0007, 'grad_norm': 0.3406839966773987, 'learning_rate': 1.188274024141715e-06, 'epoch': 4.71}
|
859 |
-
{'loss': 0.0007, 'grad_norm': 0.022679802030324936, 'learning_rate': 1.1658791122656932e-06, 'epoch': 4.71}
|
860 |
-
{'loss': 0.0006, 'grad_norm': 0.0023362182546406984, 'learning_rate': 1.1434842003896716e-06, 'epoch': 4.72}
|
861 |
-
{'loss': 0.0006, 'grad_norm': 0.006971537135541439, 'learning_rate': 1.12108928851365e-06, 'epoch': 4.73}
|
862 |
-
{'loss': 0.0006, 'grad_norm': 0.06807754933834076, 'learning_rate': 1.098694376637628e-06, 'epoch': 4.73}
|
863 |
-
{'loss': 0.0006, 'grad_norm': 0.007362959440797567, 'learning_rate': 1.0762994647616062e-06, 'epoch': 4.74}
|
864 |
-
{'loss': 0.0006, 'grad_norm': 0.08116839826107025, 'learning_rate': 1.0539045528855844e-06, 'epoch': 4.74}
|
865 |
-
{'loss': 0.0006, 'grad_norm': 0.01928202621638775, 'learning_rate': 1.0315096410095627e-06, 'epoch': 4.75}
|
866 |
-
{'loss': 0.0006, 'grad_norm': 0.13101086020469666, 'learning_rate': 1.009114729133541e-06, 'epoch': 4.75}
|
867 |
-
{'loss': 0.0006, 'grad_norm': 0.004853931255638599, 'learning_rate': 9.867198172575192e-07, 'epoch': 4.76}
|
868 |
-
{'loss': 0.0006, 'grad_norm': 0.02783609926700592, 'learning_rate': 9.643249053814973e-07, 'epoch': 4.76}
|
869 |
-
{'loss': 0.0018, 'grad_norm': 0.003236155491322279, 'learning_rate': 9.419299935054756e-07, 'epoch': 4.77}
|
870 |
-
{'loss': 0.0009, 'grad_norm': 0.023846732452511787, 'learning_rate': 9.195350816294539e-07, 'epoch': 4.78}
|
871 |
-
{'loss': 0.0007, 'grad_norm': 0.01901441439986229, 'learning_rate': 8.971401697534321e-07, 'epoch': 4.78}
|
872 |
-
{'loss': 0.0007, 'grad_norm': 0.00501618767157197, 'learning_rate': 8.747452578774103e-07, 'epoch': 4.79}
|
873 |
-
{'loss': 0.0005, 'grad_norm': 0.007777991704642773, 'learning_rate': 8.523503460013885e-07, 'epoch': 4.79}
|
874 |
-
{'loss': 0.0009, 'grad_norm': 0.6491960883140564, 'learning_rate': 8.299554341253668e-07, 'epoch': 4.8}
|
875 |
-
{'loss': 0.0013, 'grad_norm': 0.0740993320941925, 'learning_rate': 8.075605222493451e-07, 'epoch': 4.8}
|
876 |
-
{'loss': 0.0007, 'grad_norm': 0.02660405822098255, 'learning_rate': 7.851656103733232e-07, 'epoch': 4.81}
|
877 |
-
{'loss': 0.0006, 'grad_norm': 0.048786524683237076, 'learning_rate': 7.627706984973014e-07, 'epoch': 4.81}
|
878 |
-
{'loss': 0.0007, 'grad_norm': 0.005497151054441929, 'learning_rate': 7.403757866212798e-07, 'epoch': 4.82}
|
879 |
-
{'loss': 0.001, 'grad_norm': 0.003488279180601239, 'learning_rate': 7.179808747452579e-07, 'epoch': 4.82}
|
880 |
-
{'loss': 0.0019, 'grad_norm': 0.02504000999033451, 'learning_rate': 6.955859628692362e-07, 'epoch': 4.83}
|
881 |
-
{'loss': 0.0006, 'grad_norm': 0.009829353541135788, 'learning_rate': 6.731910509932143e-07, 'epoch': 4.84}
|
882 |
-
{'loss': 0.0006, 'grad_norm': 0.01532562542706728, 'learning_rate': 6.507961391171927e-07, 'epoch': 4.84}
|
883 |
-
{'loss': 0.0009, 'grad_norm': 0.00034189983853138983, 'learning_rate': 6.284012272411709e-07, 'epoch': 4.85}
|
884 |
-
{'loss': 0.0006, 'grad_norm': 0.019531667232513428, 'learning_rate': 6.060063153651491e-07, 'epoch': 4.85}
|
885 |
-
{'loss': 0.001, 'grad_norm': 0.0004679520789068192, 'learning_rate': 5.836114034891273e-07, 'epoch': 4.86}
|
886 |
-
{'loss': 0.0011, 'grad_norm': 0.00026422596420161426, 'learning_rate': 5.612164916131056e-07, 'epoch': 4.86}
|
887 |
-
{'loss': 0.0007, 'grad_norm': 0.0357745960354805, 'learning_rate': 5.388215797370838e-07, 'epoch': 4.87}
|
888 |
-
{'loss': 0.0007, 'grad_norm': 0.008043075911700726, 'learning_rate': 5.16426667861062e-07, 'epoch': 4.87}
|
889 |
-
{'loss': 0.0007, 'grad_norm': 0.01412264909595251, 'learning_rate': 4.940317559850402e-07, 'epoch': 4.88}
|
890 |
-
{'loss': 0.0018, 'grad_norm': 0.027081595733761787, 'learning_rate': 4.716368441090185e-07, 'epoch': 4.88}
|
891 |
-
{'loss': 0.0007, 'grad_norm': 0.02130473032593727, 'learning_rate': 4.4924193223299667e-07, 'epoch': 4.89}
|
892 |
-
{'loss': 0.0012, 'grad_norm': 0.0006097204168327153, 'learning_rate': 4.2684702035697497e-07, 'epoch': 4.9}
|
893 |
-
{'loss': 0.0007, 'grad_norm': 0.007859633304178715, 'learning_rate': 4.0445210848095316e-07, 'epoch': 4.9}
|
894 |
-
{'loss': 0.0009, 'grad_norm': 0.025279998779296875, 'learning_rate': 3.820571966049314e-07, 'epoch': 4.91}
|
895 |
-
{'loss': 0.0007, 'grad_norm': 0.010460122488439083, 'learning_rate': 3.596622847289096e-07, 'epoch': 4.91}
|
896 |
-
{'loss': 0.001, 'grad_norm': 0.5298627018928528, 'learning_rate': 3.372673728528879e-07, 'epoch': 4.92}
|
897 |
-
{'loss': 0.0007, 'grad_norm': 0.0009814887307584286, 'learning_rate': 3.148724609768661e-07, 'epoch': 4.92}
|
898 |
-
{'loss': 0.0007, 'grad_norm': 0.09579623490571976, 'learning_rate': 2.924775491008443e-07, 'epoch': 4.93}
|
899 |
-
{'loss': 0.0007, 'grad_norm': 0.006857629399746656, 'learning_rate': 2.7008263722482253e-07, 'epoch': 4.93}
|
900 |
-
{'loss': 0.0011, 'grad_norm': 0.004843506496399641, 'learning_rate': 2.476877253488008e-07, 'epoch': 4.94}
|
901 |
-
{'loss': 0.0005, 'grad_norm': 0.1149492859840393, 'learning_rate': 2.25292813472779e-07, 'epoch': 4.94}
|
902 |
-
{'loss': 0.0007, 'grad_norm': 0.09972663223743439, 'learning_rate': 2.0289790159675724e-07, 'epoch': 4.95}
|
903 |
-
{'loss': 0.0006, 'grad_norm': 0.036814313381910324, 'learning_rate': 1.8050298972073546e-07, 'epoch': 4.96}
|
904 |
-
{'loss': 0.0009, 'grad_norm': 0.016577519476413727, 'learning_rate': 1.581080778447137e-07, 'epoch': 4.96}
|
905 |
-
{'loss': 0.0008, 'grad_norm': 1.288059949874878, 'learning_rate': 1.3571316596869193e-07, 'epoch': 4.97}
|
906 |
-
{'loss': 0.0015, 'grad_norm': 0.060177162289619446, 'learning_rate': 1.1331825409267016e-07, 'epoch': 4.97}
|
907 |
-
{'loss': 0.0008, 'grad_norm': 0.03802037984132767, 'learning_rate': 9.092334221664839e-08, 'epoch': 4.98}
|
908 |
-
{'loss': 0.0006, 'grad_norm': 0.025011925026774406, 'learning_rate': 6.852843034062661e-08, 'epoch': 4.98}
|
909 |
-
{'loss': 0.0006, 'grad_norm': 0.0070183370262384415, 'learning_rate': 4.6133518464604844e-08, 'epoch': 4.99}
|
910 |
-
{'loss': 0.0007, 'grad_norm': 0.0013526534894481301, 'learning_rate': 2.3738606588583077e-08, 'epoch': 4.99}
|
911 |
-
{'loss': 0.0006, 'grad_norm': 0.0005251829861663282, 'learning_rate': 1.343694712561306e-09, 'epoch': 5.0}
|
912 |
-
{'train_runtime': 93010.3988, 'train_samples_per_second': 39.267, 'train_steps_per_second': 4.908, 'train_loss': 0.014228966551943086, 'epoch': 5.0}
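The log entries above are Python dict literals, so they can be read back without a custom parser. The following is a minimal sketch, not part of the deleted repository: the helper name `parse_log_lines` is hypothetical, and the sample holds only the last three step records and the summary line copied from the log. It suggests that the learning rate falls by a constant amount between consecutive entries (consistent with a linear decay schedule) and that the reported throughput works out to roughly 8 samples per optimizer step; both are readings of the logged numbers, not confirmed settings.

```python
import ast


def parse_log_lines(lines):
    """Split trainer log lines into per-step records and the final summary."""
    steps, summary = [], None
    for line in lines:
        line = line.strip()
        if not (line.startswith("{") and line.endswith("}")):
            continue  # skip diff gutter residue such as "|" or "-"
        record = ast.literal_eval(line)  # safe: evaluates literals only
        if "train_runtime" in record:
            summary = record
        else:
            steps.append(record)
    return steps, summary


# Sample copied from the tail of the log above.
sample = [
    "{'loss': 0.0006, 'grad_norm': 0.0070183370262384415, 'learning_rate': 4.6133518464604844e-08, 'epoch': 4.99}",
    "{'loss': 0.0007, 'grad_norm': 0.0013526534894481301, 'learning_rate': 2.3738606588583077e-08, 'epoch': 4.99}",
    "{'loss': 0.0006, 'grad_norm': 0.0005251829861663282, 'learning_rate': 1.343694712561306e-09, 'epoch': 5.0}",
    "{'train_runtime': 93010.3988, 'train_samples_per_second': 39.267, 'train_steps_per_second': 4.908, 'train_loss': 0.014228966551943086, 'epoch': 5.0}",
]

steps, summary = parse_log_lines(sample)

# Difference in learning rate between consecutive logged entries.
lr_drops = [a["learning_rate"] - b["learning_rate"] for a, b in zip(steps, steps[1:])]

# Derived quantities from the summary line.
runtime_hours = summary["train_runtime"] / 3600
samples_per_step = summary["train_samples_per_second"] / summary["train_steps_per_second"]

print(lr_drops, runtime_hours, samples_per_step)
```

Applied to the full log, the same helper would also let the loss be averaged per epoch, which is easier to read than the individual noisy entries.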
mx-01/model.safetensors
DELETED
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:f2e06a55924f0f993358ea48a4ee35966109cd59623a761909e0a4fdad0d4587
-size 90864192
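The three lines above are a Git LFS pointer, not the weights themselves: each line is a key, a space, and a value. As a minimal sketch (the function name `parse_lfs_pointer` is an assumption made for illustration), the pointer can be split into its fields to recover the hash algorithm, the digest, and the size of the real file:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a small dict of typed fields."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(" ")
        if key:
            fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:f2e06a55924f0f993358ea48a4ee35966109cd59623a761909e0a4fdad0d4587\n"
    "size 90864192\n"
)

size_mib = pointer["size_bytes"] / (1024 * 1024)
print(pointer["hash_algo"], round(size_mib, 2))
```

The size field puts the deleted `model.safetensors` at a little under 87 MiB.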
mx-01/modules.json
DELETED
@@ -1,14 +0,0 @@
-[
-  {
-    "idx": 0,
-    "name": "0",
-    "path": "",
-    "type": "sentence_transformers.models.Transformer"
-  },
-  {
-    "idx": 1,
-    "name": "1",
-    "path": "1_Pooling",
-    "type": "sentence_transformers.models.Pooling"
-  }
-]
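`modules.json` lists the two stages applied in order: a Transformer that produces one vector per token, then a Pooling module that reduces those vectors to a single sentence embedding. The pooling config in this commit enables `pooling_mode_mean_tokens`, so the second stage is a masked mean. The sketch below illustrates that operation in plain Python; `mean_pool` is a hypothetical helper, not the library's implementation, and the toy vectors are made up for the example.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1; padding is ignored."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [t / count for t in total]


# Two real tokens followed by one padding position (toy 2-dimensional vectors).
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])
print(pooled)
```

In the real model each token vector has 384 dimensions, per `word_embedding_dimension` in the pooling config.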
mx-01/mx_eval.csv
DELETED
@@ -1,2 +0,0 @@
-epoch,steps,cosine-Accuracy@1,cosine-Accuracy@3,cosine-Accuracy@5,cosine-Accuracy@10,cosine-Precision@1,cosine-Recall@1,cosine-Precision@3,cosine-Recall@3,cosine-Precision@5,cosine-Recall@5,cosine-Precision@10,cosine-Recall@10,cosine-MRR@10,cosine-NDCG@10,cosine-MAP@100,dot-Accuracy@1,dot-Accuracy@3,dot-Accuracy@5,dot-Accuracy@10,dot-Precision@1,dot-Recall@1,dot-Precision@3,dot-Recall@3,dot-Precision@5,dot-Recall@5,dot-Precision@10,dot-Recall@10,dot-MRR@10,dot-NDCG@10,dot-MAP@100
--1,-1,0.6832646087627935,0.7984590363227152,0.8344641400119378,0.8748336646350479,0.6832646087627935,0.6832646087627935,0.2661530121075717,0.7984590363227152,0.16689282800238753,0.8344641400119378,0.0874833664635048,0.8748336646350479,0.7486561788790493,0.7792742631517174,0.7522211420770829,0.47666924041552355,0.6420791509914409,0.7047636258097726,0.775700525154288,0.47666924041552355,0.47666924041552355,0.21402638366381363,0.6420791509914409,0.14095272516195448,0.7047636258097726,0.0775700525154288,0.775700525154288,0.5740915735669964,0.6226885460603906,0.5807913053435532
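A quick consistency check helps when reading this row. For every cutoff k, the cosine Precision@k multiplied by k equals Recall@k, and Recall@k equals Accuracy@k. That pattern is what one would expect if each query has exactly one relevant document, although the evaluation setup is not shown here, so treat that as an inference. The sketch below uses only a subset of the columns, copied from the row above:

```python
import csv
import io

# Subset of the cosine columns from mx_eval.csv (values copied verbatim).
csv_text = (
    "epoch,steps,cosine-Accuracy@1,cosine-Precision@1,cosine-Recall@1,"
    "cosine-Precision@3,cosine-Recall@3,cosine-Precision@5,cosine-Recall@5,"
    "cosine-Precision@10,cosine-Recall@10\n"
    "-1,-1,0.6832646087627935,0.6832646087627935,0.6832646087627935,"
    "0.2661530121075717,0.7984590363227152,0.16689282800238753,0.8344641400119378,"
    "0.0874833664635048,0.8748336646350479\n"
)

row = next(csv.DictReader(io.StringIO(csv_text)))

# |k * Precision@k - Recall@k| for each cutoff; near zero means the identity holds.
gaps = {}
for k in (1, 3, 5, 10):
    precision = float(row[f"cosine-Precision@{k}"])
    recall = float(row[f"cosine-Recall@{k}"])
    gaps[k] = abs(k * precision - recall)

print(gaps)
```

The same identity holds for the `dot-` columns, where every score is noticeably lower than its cosine counterpart.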
mx-01/sentence_bert_config.json
DELETED
@@ -1,4 +0,0 @@
-{
-  "max_seq_length": 512,
-  "do_lower_case": false
-}
mx-01/special_tokens_map.json
DELETED
@@ -1,7 +0,0 @@
-{
-  "cls_token": "[CLS]",
-  "mask_token": "[MASK]",
-  "pad_token": "[PAD]",
-  "sep_token": "[SEP]",
-  "unk_token": "[UNK]"
-}
mx-01/tokenizer.json
DELETED
The diff for this file is too large to render.
See raw diff
mx-01/tokenizer_config.json
DELETED
@@ -1,57 +0,0 @@
-{
-  "added_tokens_decoder": {
-    "0": {
-      "content": "[PAD]",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "100": {
-      "content": "[UNK]",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "101": {
-      "content": "[CLS]",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "102": {
-      "content": "[SEP]",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    },
-    "103": {
-      "content": "[MASK]",
-      "lstrip": false,
-      "normalized": false,
-      "rstrip": false,
-      "single_word": false,
-      "special": true
-    }
-  },
-  "clean_up_tokenization_spaces": true,
-  "cls_token": "[CLS]",
-  "do_basic_tokenize": true,
-  "do_lower_case": true,
-  "mask_token": "[MASK]",
-  "model_max_length": 512,
-  "never_split": null,
-  "pad_token": "[PAD]",
-  "sep_token": "[SEP]",
-  "strip_accents": null,
-  "tokenizer_class": "BertTokenizer",
-  "unk_token": "[UNK]"
-}
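The `added_tokens_decoder` block maps token ids (as string keys) to the special tokens, while the top-level `*_token` keys assign each of those tokens a role. A small sketch, using an abridged copy of the config above (the per-token flags other than `special` are omitted for brevity, and the variable names are illustrative), shows how the two views line up:

```python
import json

# Abridged copy of tokenizer_config.json; only the fields needed here are kept.
config = json.loads("""
{
  "added_tokens_decoder": {
    "0":   {"content": "[PAD]",  "special": true},
    "100": {"content": "[UNK]",  "special": true},
    "101": {"content": "[CLS]",  "special": true},
    "102": {"content": "[SEP]",  "special": true},
    "103": {"content": "[MASK]", "special": true}
  },
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
""")

# JSON object keys are strings, so convert them to integer ids.
id_to_token = {int(k): v["content"] for k, v in config["added_tokens_decoder"].items()}
token_to_id = {token: idx for idx, token in id_to_token.items()}

# Resolve each role (cls_token, pad_token, ...) to its integer id.
role_ids = {
    role: token_to_id[token]
    for role, token in config.items()
    if role.endswith("_token")
}

print(role_ids)
```

These ids match the conventional layout of an uncased BERT WordPiece vocabulary, which fits the `BertTokenizer` class named in the config.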
mx-01/vocab.txt
DELETED
The diff for this file is too large to render.
See raw diff