Upload folder using huggingface_hub
- best-model.pt +3 -0
- dev.tsv +0 -0
- loss.tsv +11 -0
- runs/events.out.tfevents.1697749444.46dc0c540dd0.4731.18 +3 -0
- test.tsv +0 -0
- training.log +246 -0
best-model.pt
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4c80f533761eb06b74037e30a727ef29c5558c6463ab593e5b1145a84022e0d7
+size 19048098
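best-model.pt is stored through Git LFS, so the diff above contains only the three-line pointer file, not the ~19 MB checkpoint itself. A minimal sketch of how such a pointer splits into its version/oid/size fields (the pointer text is copied verbatim from the diff):

```python
# Parse a git-lfs pointer file into a field dict.
# Minimal sketch for illustration; real tooling is `git lfs` itself.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4c80f533761eb06b74037e30a727ef29c5558c6463ab593e5b1145a84022e0d7
size 19048098
"""

# Each pointer line is "key value"; oid is "algorithm:hex-digest".
fields = dict(line.split(" ", 1) for line in POINTER.splitlines())
algo, digest = fields["oid"].split(":", 1)

print(fields["version"])     # the LFS spec URL
print(algo, len(digest))     # sha256 with a 64-char hex digest
print(int(fields["size"]))   # checkpoint size in bytes
```

After downloading the actual file, hashing it with `hashlib.sha256` and comparing against `digest` verifies the transfer.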
dev.tsv
ADDED
The diff for this file is too large to render.
See raw diff
loss.tsv
ADDED
@@ -0,0 +1,11 @@
+EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+1 21:04:28 0.0000 1.5206 0.3604 0.0286 0.0014 0.0026 0.0013
+2 21:04:54 0.0000 0.4711 0.2766 0.2574 0.2612 0.2593 0.1600
+3 21:05:19 0.0000 0.3922 0.2515 0.3101 0.3578 0.3323 0.2165
+4 21:05:44 0.0000 0.3521 0.2403 0.3755 0.4245 0.3985 0.2697
+5 21:06:09 0.0000 0.3243 0.2278 0.4181 0.4585 0.4374 0.3009
+6 21:06:35 0.0000 0.3076 0.2194 0.4466 0.4667 0.4564 0.3144
+7 21:07:00 0.0000 0.2932 0.2180 0.4529 0.4707 0.4616 0.3186
+8 21:07:26 0.0000 0.2843 0.2130 0.4341 0.4925 0.4614 0.3195
+9 21:07:51 0.0000 0.2741 0.2121 0.4333 0.4816 0.4562 0.3147
+10 21:08:17 0.0000 0.2730 0.2101 0.4376 0.4912 0.4628 0.3217
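loss.tsv above logs one row per epoch. A self-contained sketch of picking the epoch with the best dev F1 (the table is embedded as a whitespace-separated string here so the example runs on its own; in practice you would read the uploaded loss.tsv):

```python
# Select the best epoch by DEV_F1 from the loss table above.
ROWS = """\
EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
1 21:04:28 0.0000 1.5206 0.3604 0.0286 0.0014 0.0026 0.0013
2 21:04:54 0.0000 0.4711 0.2766 0.2574 0.2612 0.2593 0.1600
3 21:05:19 0.0000 0.3922 0.2515 0.3101 0.3578 0.3323 0.2165
4 21:05:44 0.0000 0.3521 0.2403 0.3755 0.4245 0.3985 0.2697
5 21:06:09 0.0000 0.3243 0.2278 0.4181 0.4585 0.4374 0.3009
6 21:06:35 0.0000 0.3076 0.2194 0.4466 0.4667 0.4564 0.3144
7 21:07:00 0.0000 0.2932 0.2180 0.4529 0.4707 0.4616 0.3186
8 21:07:26 0.0000 0.2843 0.2130 0.4341 0.4925 0.4614 0.3195
9 21:07:51 0.0000 0.2741 0.2121 0.4333 0.4816 0.4562 0.3147
10 21:08:17 0.0000 0.2730 0.2101 0.4376 0.4912 0.4628 0.3217
""".splitlines()

header = ROWS[0].split()
records = [dict(zip(header, line.split())) for line in ROWS[1:]]
best = max(records, key=lambda r: float(r["DEV_F1"]))
print(best["EPOCH"], best["DEV_F1"])  # epoch 10, dev F1 0.4628
```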
runs/events.out.tfevents.1697749444.46dc0c540dd0.4731.18
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fb1ea733be2438ab5cdd31e49712d7964b84a8e042867ec31f4e088cedda57f7
+size 502461
test.tsv
ADDED
The diff for this file is too large to render.
See raw diff
training.log
ADDED
@@ -0,0 +1,246 @@
+2023-10-19 21:04:04,067 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:04,067 Model: "SequenceTagger(
+  (embeddings): TransformerWordEmbeddings(
+    (model): BertModel(
+      (embeddings): BertEmbeddings(
+        (word_embeddings): Embedding(32001, 128)
+        (position_embeddings): Embedding(512, 128)
+        (token_type_embeddings): Embedding(2, 128)
+        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+        (dropout): Dropout(p=0.1, inplace=False)
+      )
+      (encoder): BertEncoder(
+        (layer): ModuleList(
+          (0-1): 2 x BertLayer(
+            (attention): BertAttention(
+              (self): BertSelfAttention(
+                (query): Linear(in_features=128, out_features=128, bias=True)
+                (key): Linear(in_features=128, out_features=128, bias=True)
+                (value): Linear(in_features=128, out_features=128, bias=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+              (output): BertSelfOutput(
+                (dense): Linear(in_features=128, out_features=128, bias=True)
+                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+                (dropout): Dropout(p=0.1, inplace=False)
+              )
+            )
+            (intermediate): BertIntermediate(
+              (dense): Linear(in_features=128, out_features=512, bias=True)
+              (intermediate_act_fn): GELUActivation()
+            )
+            (output): BertOutput(
+              (dense): Linear(in_features=512, out_features=128, bias=True)
+              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+              (dropout): Dropout(p=0.1, inplace=False)
+            )
+          )
+        )
+      )
+      (pooler): BertPooler(
+        (dense): Linear(in_features=128, out_features=128, bias=True)
+        (activation): Tanh()
+      )
+    )
+  )
+  (locked_dropout): LockedDropout(p=0.5)
+  (linear): Linear(in_features=128, out_features=17, bias=True)
+  (loss_function): CrossEntropyLoss()
+)"
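A rough parameter count can be read straight off the printed module shapes. The sketch below sums weights and biases; at float32 the result lands in the same ballpark as the 19,048,098-byte best-model.pt (the saved state dict also carries key names and other metadata, so the sizes won't match exactly):

```python
# Rough parameter count from the module shapes printed above.
# Every Linear is weight + bias; every LayerNorm has weight and bias of size D.
D, FF, V, POS, TYPES, LAYERS, TAGS = 128, 512, 32001, 512, 2, 2, 17

embeddings = V * D + POS * D + TYPES * D + 2 * D  # word/pos/type + LayerNorm
per_layer = (
    4 * (D * D + D)     # query, key, value, attention-output dense
    + 2 * D             # attention LayerNorm
    + (D * FF + FF)     # intermediate dense
    + (FF * D + D)      # output dense
    + 2 * D             # output LayerNorm
)
pooler = D * D + D
head = D * TAGS + TAGS  # final linear over the 17 tags

total = embeddings + LAYERS * per_layer + pooler + head
print(total)  # ~4.58M parameters, i.e. ~18.3 MB of float32 weights
```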
+2023-10-19 21:04:04,067 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:04,067 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
+ - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
+2023-10-19 21:04:04,067 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:04,067 Train: 7142 sentences
+2023-10-19 21:04:04,068 (train_with_dev=False, train_with_test=False)
+2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:04,068 Training Params:
+2023-10-19 21:04:04,068 - learning_rate: "3e-05"
+2023-10-19 21:04:04,068 - mini_batch_size: "8"
+2023-10-19 21:04:04,068 - max_epochs: "10"
+2023-10-19 21:04:04,068 - shuffle: "True"
+2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:04,068 Plugins:
+2023-10-19 21:04:04,068 - TensorboardLogger
+2023-10-19 21:04:04,068 - LinearScheduler | warmup_fraction: '0.1'
+2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:04,068 Final evaluation on model from best epoch (best-model.pt)
+2023-10-19 21:04:04,068 - metric: "('micro avg', 'f1-score')"
+2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:04,068 Computation:
+2023-10-19 21:04:04,068 - compute on device: cuda:0
+2023-10-19 21:04:04,068 - embedding storage: none
+2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:04,068 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
+2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:04,068 Logging anything other than scalars to TensorBoard is currently not supported.
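The per-iteration lr values logged below are consistent with a linear warmup/decay schedule: warmup_fraction 0.1 over 10 epochs × 893 iterations = 8,930 steps means the rate climbs to the 3e-05 peak during epoch 1 and then decays linearly to zero. A pure-Python sketch inferred from the logged values (an illustration, not Flair's actual LinearScheduler code):

```python
# Linear warmup then linear decay, matching the lr column in the log below.
PEAK_LR = 3e-05
TOTAL_STEPS = 10 * 893            # 10 epochs x 893 iterations
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # warmup_fraction 0.1 -> 893 steps

def lr_at(step):
    if step < WARMUP_STEPS:       # ramp up during epoch 1
        return PEAK_LR * step / WARMUP_STEPS
    # then decay linearly to 0 at the final step
    return PEAK_LR * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

print(f"{lr_at(89):.6f}")    # matches 'lr: 0.000003' at epoch 1, iter 89
print(f"{lr_at(893):.6f}")   # peak 0.000030 at the end of epoch 1
print(f"{lr_at(8930):.6f}")  # 0.000000 at the last step
```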
+2023-10-19 21:04:06,450 epoch 1 - iter 89/893 - loss 3.46516027 - time (sec): 2.38 - samples/sec: 10355.47 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 21:04:08,890 epoch 1 - iter 178/893 - loss 3.31633199 - time (sec): 4.82 - samples/sec: 10596.78 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 21:04:11,278 epoch 1 - iter 267/893 - loss 3.01321278 - time (sec): 7.21 - samples/sec: 10667.08 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 21:04:13,675 epoch 1 - iter 356/893 - loss 2.63035363 - time (sec): 9.61 - samples/sec: 10662.06 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 21:04:16,000 epoch 1 - iter 445/893 - loss 2.31455106 - time (sec): 11.93 - samples/sec: 10567.39 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 21:04:18,356 epoch 1 - iter 534/893 - loss 2.07618924 - time (sec): 14.29 - samples/sec: 10509.93 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 21:04:21,075 epoch 1 - iter 623/893 - loss 1.88468649 - time (sec): 17.01 - samples/sec: 10330.37 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 21:04:23,229 epoch 1 - iter 712/893 - loss 1.73722430 - time (sec): 19.16 - samples/sec: 10479.92 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 21:04:25,515 epoch 1 - iter 801/893 - loss 1.61864786 - time (sec): 21.45 - samples/sec: 10471.92 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 21:04:27,744 epoch 1 - iter 890/893 - loss 1.52299965 - time (sec): 23.67 - samples/sec: 10476.93 - lr: 0.000030 - momentum: 0.000000
+2023-10-19 21:04:27,807 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:27,808 EPOCH 1 done: loss 1.5206 - lr: 0.000030
+2023-10-19 21:04:28,776 DEV : loss 0.36043813824653625 - f1-score (micro avg) 0.0026
+2023-10-19 21:04:28,791 saving best model
+2023-10-19 21:04:28,825 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:30,944 epoch 2 - iter 89/893 - loss 0.54958316 - time (sec): 2.12 - samples/sec: 11145.85 - lr: 0.000030 - momentum: 0.000000
+2023-10-19 21:04:33,239 epoch 2 - iter 178/893 - loss 0.53358732 - time (sec): 4.41 - samples/sec: 11059.90 - lr: 0.000029 - momentum: 0.000000
+2023-10-19 21:04:35,553 epoch 2 - iter 267/893 - loss 0.52508342 - time (sec): 6.73 - samples/sec: 10890.14 - lr: 0.000029 - momentum: 0.000000
+2023-10-19 21:04:37,785 epoch 2 - iter 356/893 - loss 0.51537618 - time (sec): 8.96 - samples/sec: 10867.78 - lr: 0.000029 - momentum: 0.000000
+2023-10-19 21:04:40,048 epoch 2 - iter 445/893 - loss 0.50151060 - time (sec): 11.22 - samples/sec: 10908.00 - lr: 0.000028 - momentum: 0.000000
+2023-10-19 21:04:42,322 epoch 2 - iter 534/893 - loss 0.49081103 - time (sec): 13.50 - samples/sec: 10977.85 - lr: 0.000028 - momentum: 0.000000
+2023-10-19 21:04:44,594 epoch 2 - iter 623/893 - loss 0.48334925 - time (sec): 15.77 - samples/sec: 11038.71 - lr: 0.000028 - momentum: 0.000000
+2023-10-19 21:04:46,841 epoch 2 - iter 712/893 - loss 0.48240244 - time (sec): 18.01 - samples/sec: 10944.63 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 21:04:49,143 epoch 2 - iter 801/893 - loss 0.47466459 - time (sec): 20.32 - samples/sec: 10920.37 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 21:04:51,445 epoch 2 - iter 890/893 - loss 0.47084872 - time (sec): 22.62 - samples/sec: 10964.45 - lr: 0.000027 - momentum: 0.000000
+2023-10-19 21:04:51,519 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:51,519 EPOCH 2 done: loss 0.4711 - lr: 0.000027
+2023-10-19 21:04:54,349 DEV : loss 0.2766191065311432 - f1-score (micro avg) 0.2593
+2023-10-19 21:04:54,364 saving best model
+2023-10-19 21:04:54,398 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:04:56,666 epoch 3 - iter 89/893 - loss 0.37256391 - time (sec): 2.27 - samples/sec: 11380.92 - lr: 0.000026 - momentum: 0.000000
+2023-10-19 21:04:58,952 epoch 3 - iter 178/893 - loss 0.39452300 - time (sec): 4.55 - samples/sec: 10884.87 - lr: 0.000026 - momentum: 0.000000
+2023-10-19 21:05:01,194 epoch 3 - iter 267/893 - loss 0.39982878 - time (sec): 6.80 - samples/sec: 10778.53 - lr: 0.000026 - momentum: 0.000000
+2023-10-19 21:05:03,441 epoch 3 - iter 356/893 - loss 0.39305945 - time (sec): 9.04 - samples/sec: 10883.19 - lr: 0.000025 - momentum: 0.000000
+2023-10-19 21:05:05,683 epoch 3 - iter 445/893 - loss 0.39651155 - time (sec): 11.28 - samples/sec: 10974.19 - lr: 0.000025 - momentum: 0.000000
+2023-10-19 21:05:07,927 epoch 3 - iter 534/893 - loss 0.39642494 - time (sec): 13.53 - samples/sec: 11049.80 - lr: 0.000025 - momentum: 0.000000
+2023-10-19 21:05:10,208 epoch 3 - iter 623/893 - loss 0.39408089 - time (sec): 15.81 - samples/sec: 11044.80 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 21:05:12,406 epoch 3 - iter 712/893 - loss 0.39397319 - time (sec): 18.01 - samples/sec: 11043.21 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 21:05:14,604 epoch 3 - iter 801/893 - loss 0.39353180 - time (sec): 20.21 - samples/sec: 11015.76 - lr: 0.000024 - momentum: 0.000000
+2023-10-19 21:05:16,849 epoch 3 - iter 890/893 - loss 0.39134018 - time (sec): 22.45 - samples/sec: 11048.56 - lr: 0.000023 - momentum: 0.000000
+2023-10-19 21:05:16,926 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:05:16,926 EPOCH 3 done: loss 0.3922 - lr: 0.000023
+2023-10-19 21:05:19,751 DEV : loss 0.2515345811843872 - f1-score (micro avg) 0.3323
+2023-10-19 21:05:19,765 saving best model
+2023-10-19 21:05:19,799 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:05:22,171 epoch 4 - iter 89/893 - loss 0.36067185 - time (sec): 2.37 - samples/sec: 10806.53 - lr: 0.000023 - momentum: 0.000000
+2023-10-19 21:05:24,441 epoch 4 - iter 178/893 - loss 0.35369799 - time (sec): 4.64 - samples/sec: 10634.18 - lr: 0.000023 - momentum: 0.000000
+2023-10-19 21:05:26,669 epoch 4 - iter 267/893 - loss 0.35728690 - time (sec): 6.87 - samples/sec: 10774.66 - lr: 0.000022 - momentum: 0.000000
+2023-10-19 21:05:28,712 epoch 4 - iter 356/893 - loss 0.36526060 - time (sec): 8.91 - samples/sec: 10963.32 - lr: 0.000022 - momentum: 0.000000
+2023-10-19 21:05:30,959 epoch 4 - iter 445/893 - loss 0.36088528 - time (sec): 11.16 - samples/sec: 10981.58 - lr: 0.000022 - momentum: 0.000000
+2023-10-19 21:05:33,190 epoch 4 - iter 534/893 - loss 0.36222174 - time (sec): 13.39 - samples/sec: 10925.92 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 21:05:35,432 epoch 4 - iter 623/893 - loss 0.35964595 - time (sec): 15.63 - samples/sec: 10947.56 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 21:05:37,503 epoch 4 - iter 712/893 - loss 0.35640730 - time (sec): 17.70 - samples/sec: 11213.08 - lr: 0.000021 - momentum: 0.000000
+2023-10-19 21:05:39,612 epoch 4 - iter 801/893 - loss 0.35414262 - time (sec): 19.81 - samples/sec: 11249.80 - lr: 0.000020 - momentum: 0.000000
+2023-10-19 21:05:41,879 epoch 4 - iter 890/893 - loss 0.35252586 - time (sec): 22.08 - samples/sec: 11214.76 - lr: 0.000020 - momentum: 0.000000
+2023-10-19 21:05:41,953 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:05:41,953 EPOCH 4 done: loss 0.3521 - lr: 0.000020
+2023-10-19 21:05:44,308 DEV : loss 0.24030107259750366 - f1-score (micro avg) 0.3985
+2023-10-19 21:05:44,322 saving best model
+2023-10-19 21:05:44,355 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:05:46,615 epoch 5 - iter 89/893 - loss 0.33201364 - time (sec): 2.26 - samples/sec: 10449.33 - lr: 0.000020 - momentum: 0.000000
+2023-10-19 21:05:48,898 epoch 5 - iter 178/893 - loss 0.32799586 - time (sec): 4.54 - samples/sec: 10848.84 - lr: 0.000019 - momentum: 0.000000
+2023-10-19 21:05:51,137 epoch 5 - iter 267/893 - loss 0.33434089 - time (sec): 6.78 - samples/sec: 10828.88 - lr: 0.000019 - momentum: 0.000000
+2023-10-19 21:05:53,351 epoch 5 - iter 356/893 - loss 0.33073391 - time (sec): 9.00 - samples/sec: 10953.80 - lr: 0.000019 - momentum: 0.000000
+2023-10-19 21:05:55,663 epoch 5 - iter 445/893 - loss 0.33199483 - time (sec): 11.31 - samples/sec: 10906.21 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 21:05:57,878 epoch 5 - iter 534/893 - loss 0.32808703 - time (sec): 13.52 - samples/sec: 10862.79 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 21:06:00,156 epoch 5 - iter 623/893 - loss 0.32729446 - time (sec): 15.80 - samples/sec: 11023.39 - lr: 0.000018 - momentum: 0.000000
+2023-10-19 21:06:02,408 epoch 5 - iter 712/893 - loss 0.32383033 - time (sec): 18.05 - samples/sec: 10996.11 - lr: 0.000017 - momentum: 0.000000
+2023-10-19 21:06:04,750 epoch 5 - iter 801/893 - loss 0.32640604 - time (sec): 20.39 - samples/sec: 10960.58 - lr: 0.000017 - momentum: 0.000000
+2023-10-19 21:06:07,022 epoch 5 - iter 890/893 - loss 0.32481877 - time (sec): 22.67 - samples/sec: 10929.86 - lr: 0.000017 - momentum: 0.000000
+2023-10-19 21:06:07,109 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:06:07,109 EPOCH 5 done: loss 0.3243 - lr: 0.000017
+2023-10-19 21:06:09,945 DEV : loss 0.22777576744556427 - f1-score (micro avg) 0.4374
+2023-10-19 21:06:09,958 saving best model
+2023-10-19 21:06:09,993 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:06:12,246 epoch 6 - iter 89/893 - loss 0.30901463 - time (sec): 2.25 - samples/sec: 10491.78 - lr: 0.000016 - momentum: 0.000000
+2023-10-19 21:06:14,534 epoch 6 - iter 178/893 - loss 0.30776053 - time (sec): 4.54 - samples/sec: 10896.61 - lr: 0.000016 - momentum: 0.000000
+2023-10-19 21:06:16,805 epoch 6 - iter 267/893 - loss 0.31279529 - time (sec): 6.81 - samples/sec: 10972.40 - lr: 0.000016 - momentum: 0.000000
+2023-10-19 21:06:19,052 epoch 6 - iter 356/893 - loss 0.31110898 - time (sec): 9.06 - samples/sec: 11069.80 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 21:06:21,259 epoch 6 - iter 445/893 - loss 0.31454604 - time (sec): 11.27 - samples/sec: 10919.77 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 21:06:23,421 epoch 6 - iter 534/893 - loss 0.31388124 - time (sec): 13.43 - samples/sec: 10919.47 - lr: 0.000015 - momentum: 0.000000
+2023-10-19 21:06:25,685 epoch 6 - iter 623/893 - loss 0.31175351 - time (sec): 15.69 - samples/sec: 10969.54 - lr: 0.000014 - momentum: 0.000000
+2023-10-19 21:06:28,029 epoch 6 - iter 712/893 - loss 0.30862470 - time (sec): 18.04 - samples/sec: 10966.02 - lr: 0.000014 - momentum: 0.000000
+2023-10-19 21:06:30,350 epoch 6 - iter 801/893 - loss 0.30704706 - time (sec): 20.36 - samples/sec: 10969.15 - lr: 0.000014 - momentum: 0.000000
+2023-10-19 21:06:32,616 epoch 6 - iter 890/893 - loss 0.30734708 - time (sec): 22.62 - samples/sec: 10971.21 - lr: 0.000013 - momentum: 0.000000
+2023-10-19 21:06:32,682 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:06:32,682 EPOCH 6 done: loss 0.3076 - lr: 0.000013
+2023-10-19 21:06:35,530 DEV : loss 0.21941223740577698 - f1-score (micro avg) 0.4564
+2023-10-19 21:06:35,544 saving best model
+2023-10-19 21:06:35,582 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:06:37,807 epoch 7 - iter 89/893 - loss 0.30509027 - time (sec): 2.22 - samples/sec: 10655.57 - lr: 0.000013 - momentum: 0.000000
+2023-10-19 21:06:40,129 epoch 7 - iter 178/893 - loss 0.28654601 - time (sec): 4.55 - samples/sec: 10694.39 - lr: 0.000013 - momentum: 0.000000
+2023-10-19 21:06:42,497 epoch 7 - iter 267/893 - loss 0.28403086 - time (sec): 6.91 - samples/sec: 10509.95 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 21:06:44,843 epoch 7 - iter 356/893 - loss 0.28827472 - time (sec): 9.26 - samples/sec: 10509.54 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 21:06:47,165 epoch 7 - iter 445/893 - loss 0.29045580 - time (sec): 11.58 - samples/sec: 10583.77 - lr: 0.000012 - momentum: 0.000000
+2023-10-19 21:06:49,400 epoch 7 - iter 534/893 - loss 0.30162964 - time (sec): 13.82 - samples/sec: 10697.48 - lr: 0.000011 - momentum: 0.000000
+2023-10-19 21:06:51,632 epoch 7 - iter 623/893 - loss 0.30089497 - time (sec): 16.05 - samples/sec: 10683.90 - lr: 0.000011 - momentum: 0.000000
+2023-10-19 21:06:53,942 epoch 7 - iter 712/893 - loss 0.29776092 - time (sec): 18.36 - samples/sec: 10762.55 - lr: 0.000011 - momentum: 0.000000
+2023-10-19 21:06:56,209 epoch 7 - iter 801/893 - loss 0.29561950 - time (sec): 20.63 - samples/sec: 10809.02 - lr: 0.000010 - momentum: 0.000000
+2023-10-19 21:06:58,427 epoch 7 - iter 890/893 - loss 0.29349579 - time (sec): 22.84 - samples/sec: 10841.48 - lr: 0.000010 - momentum: 0.000000
+2023-10-19 21:06:58,506 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:06:58,506 EPOCH 7 done: loss 0.2932 - lr: 0.000010
+2023-10-19 21:07:00,835 DEV : loss 0.2180010825395584 - f1-score (micro avg) 0.4616
+2023-10-19 21:07:00,849 saving best model
+2023-10-19 21:07:00,884 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:07:03,134 epoch 8 - iter 89/893 - loss 0.28270141 - time (sec): 2.25 - samples/sec: 11490.73 - lr: 0.000010 - momentum: 0.000000
+2023-10-19 21:07:05,412 epoch 8 - iter 178/893 - loss 0.28238330 - time (sec): 4.53 - samples/sec: 11628.05 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 21:07:07,718 epoch 8 - iter 267/893 - loss 0.27604606 - time (sec): 6.83 - samples/sec: 11561.75 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 21:07:10,009 epoch 8 - iter 356/893 - loss 0.27685070 - time (sec): 9.12 - samples/sec: 11354.16 - lr: 0.000009 - momentum: 0.000000
+2023-10-19 21:07:12,333 epoch 8 - iter 445/893 - loss 0.28502945 - time (sec): 11.45 - samples/sec: 11152.36 - lr: 0.000008 - momentum: 0.000000
+2023-10-19 21:07:14,695 epoch 8 - iter 534/893 - loss 0.28106511 - time (sec): 13.81 - samples/sec: 11140.31 - lr: 0.000008 - momentum: 0.000000
+2023-10-19 21:07:16,982 epoch 8 - iter 623/893 - loss 0.28187735 - time (sec): 16.10 - samples/sec: 10994.00 - lr: 0.000008 - momentum: 0.000000
+2023-10-19 21:07:19,237 epoch 8 - iter 712/893 - loss 0.27973353 - time (sec): 18.35 - samples/sec: 10944.00 - lr: 0.000007 - momentum: 0.000000
+2023-10-19 21:07:21,454 epoch 8 - iter 801/893 - loss 0.28007865 - time (sec): 20.57 - samples/sec: 10940.20 - lr: 0.000007 - momentum: 0.000000
+2023-10-19 21:07:23,659 epoch 8 - iter 890/893 - loss 0.28450041 - time (sec): 22.77 - samples/sec: 10890.11 - lr: 0.000007 - momentum: 0.000000
+2023-10-19 21:07:23,729 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:07:23,729 EPOCH 8 done: loss 0.2843 - lr: 0.000007
+2023-10-19 21:07:26,548 DEV : loss 0.21302838623523712 - f1-score (micro avg) 0.4614
+2023-10-19 21:07:26,562 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:07:28,823 epoch 9 - iter 89/893 - loss 0.27434130 - time (sec): 2.26 - samples/sec: 11242.35 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 21:07:31,078 epoch 9 - iter 178/893 - loss 0.27958386 - time (sec): 4.52 - samples/sec: 11145.59 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 21:07:33,324 epoch 9 - iter 267/893 - loss 0.28201334 - time (sec): 6.76 - samples/sec: 10968.95 - lr: 0.000006 - momentum: 0.000000
+2023-10-19 21:07:35,560 epoch 9 - iter 356/893 - loss 0.28336896 - time (sec): 9.00 - samples/sec: 10917.79 - lr: 0.000005 - momentum: 0.000000
+2023-10-19 21:07:37,762 epoch 9 - iter 445/893 - loss 0.28043243 - time (sec): 11.20 - samples/sec: 10900.38 - lr: 0.000005 - momentum: 0.000000
+2023-10-19 21:07:40,039 epoch 9 - iter 534/893 - loss 0.27712063 - time (sec): 13.48 - samples/sec: 11038.22 - lr: 0.000005 - momentum: 0.000000
+2023-10-19 21:07:42,253 epoch 9 - iter 623/893 - loss 0.27542232 - time (sec): 15.69 - samples/sec: 11006.18 - lr: 0.000004 - momentum: 0.000000
+2023-10-19 21:07:44,549 epoch 9 - iter 712/893 - loss 0.27658911 - time (sec): 17.99 - samples/sec: 11007.29 - lr: 0.000004 - momentum: 0.000000
+2023-10-19 21:07:46,811 epoch 9 - iter 801/893 - loss 0.27538194 - time (sec): 20.25 - samples/sec: 10997.20 - lr: 0.000004 - momentum: 0.000000
+2023-10-19 21:07:49,039 epoch 9 - iter 890/893 - loss 0.27416186 - time (sec): 22.48 - samples/sec: 11034.55 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 21:07:49,117 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:07:49,117 EPOCH 9 done: loss 0.2741 - lr: 0.000003
+2023-10-19 21:07:51,965 DEV : loss 0.21206232905387878 - f1-score (micro avg) 0.4562
+2023-10-19 21:07:51,980 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:07:54,185 epoch 10 - iter 89/893 - loss 0.25869512 - time (sec): 2.20 - samples/sec: 10712.67 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 21:07:56,391 epoch 10 - iter 178/893 - loss 0.26163812 - time (sec): 4.41 - samples/sec: 10800.80 - lr: 0.000003 - momentum: 0.000000
+2023-10-19 21:07:58,723 epoch 10 - iter 267/893 - loss 0.26555406 - time (sec): 6.74 - samples/sec: 11025.94 - lr: 0.000002 - momentum: 0.000000
+2023-10-19 21:08:00,990 epoch 10 - iter 356/893 - loss 0.26148161 - time (sec): 9.01 - samples/sec: 10868.20 - lr: 0.000002 - momentum: 0.000000
+2023-10-19 21:08:03,290 epoch 10 - iter 445/893 - loss 0.26190440 - time (sec): 11.31 - samples/sec: 10794.51 - lr: 0.000002 - momentum: 0.000000
+2023-10-19 21:08:05,593 epoch 10 - iter 534/893 - loss 0.26271915 - time (sec): 13.61 - samples/sec: 10856.21 - lr: 0.000001 - momentum: 0.000000
+2023-10-19 21:08:07,843 epoch 10 - iter 623/893 - loss 0.26537037 - time (sec): 15.86 - samples/sec: 10830.62 - lr: 0.000001 - momentum: 0.000000
+2023-10-19 21:08:10,148 epoch 10 - iter 712/893 - loss 0.27004143 - time (sec): 18.17 - samples/sec: 10879.72 - lr: 0.000001 - momentum: 0.000000
+2023-10-19 21:08:12,367 epoch 10 - iter 801/893 - loss 0.27091574 - time (sec): 20.39 - samples/sec: 10981.73 - lr: 0.000000 - momentum: 0.000000
+2023-10-19 21:08:14,609 epoch 10 - iter 890/893 - loss 0.27304663 - time (sec): 22.63 - samples/sec: 10963.71 - lr: 0.000000 - momentum: 0.000000
+2023-10-19 21:08:14,679 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:08:14,679 EPOCH 10 done: loss 0.2730 - lr: 0.000000
+2023-10-19 21:08:17,043 DEV : loss 0.21014845371246338 - f1-score (micro avg) 0.4628
+2023-10-19 21:08:17,057 saving best model
+2023-10-19 21:08:17,117 ----------------------------------------------------------------------------------------------------
+2023-10-19 21:08:17,118 Loading model from best epoch ...
+2023-10-19 21:08:17,199 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
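The 17-tag dictionary above follows the BIOES scheme: S- marks a single-token entity, while B-/I-/E- mark the begin, inside, and end of a multi-token one, over the four entity types PER, LOC, ORG, and HumanProd. A minimal sketch of decoding such a tag sequence into entity spans (simplified; a production decoder would also check that B-/I-/E- labels agree before emitting a span):

```python
# Decode a BIOES tag sequence into (start, end, label) spans.
def bioes_to_spans(tags):
    spans, start = [], None
    for i, tag in enumerate(tags):
        if tag.startswith("S-"):                 # single-token entity
            spans.append((i, i, tag[2:]))
            start = None
        elif tag.startswith("B-"):               # entity begins here
            start = i
        elif tag.startswith("E-") and start is not None:
            spans.append((start, i, tag[2:]))    # entity ends here
            start = None
        elif tag == "O":                         # outside any entity
            start = None
    return spans

print(bioes_to_spans(["B-PER", "E-PER", "O", "S-LOC"]))
# [(0, 1, 'PER'), (3, 3, 'LOC')]
```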
+2023-10-19 21:08:21,777
+Results:
+- F-score (micro) 0.3645
+- F-score (macro) 0.2036
+- Accuracy 0.2332
+
+By class:
+              precision    recall  f1-score   support
+
+         LOC     0.3682    0.4849    0.4186      1095
+         PER     0.3427    0.4219    0.3782      1012
+         ORG     0.0426    0.0112    0.0177       357
+   HumanProd     0.0000    0.0000    0.0000        33
+
+   micro avg     0.3458    0.3853    0.3645      2497
+   macro avg     0.1884    0.2295    0.2036      2497
+weighted avg     0.3065    0.3853    0.3394      2497
+
+2023-10-19 21:08:21,778 ----------------------------------------------------------------------------------------------------
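The summary rows of the report can be re-derived from the per-class numbers: the macro average is the unweighted mean over the four classes, while the weighted average weights each class by its support. A small sanity-check sketch using the values above:

```python
# Per-class (precision, recall, f1, support) copied from the report above.
per_class = {
    "LOC":       (0.3682, 0.4849, 0.4186, 1095),
    "PER":       (0.3427, 0.4219, 0.3782, 1012),
    "ORG":       (0.0426, 0.0112, 0.0177,  357),
    "HumanProd": (0.0000, 0.0000, 0.0000,   33),
}

total_support = sum(s for _, _, _, s in per_class.values())
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / len(per_class)
weighted_f1 = sum(f * s for _, _, f, s in per_class.values()) / total_support

# Both agree (up to rounding) with the report's 'macro avg' 0.2036 and
# 'weighted avg' 0.3394 rows; supports also sum to the 2497 total.
print(total_support, macro_f1, weighted_f1)
```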