stefan-it committed
Commit c25d14b · 1 Parent(s): 4863b1d

Upload folder using huggingface_hub
best-model.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4c80f533761eb06b74037e30a727ef29c5558c6463ab593e5b1145a84022e0d7
+ size 19048098
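`best-model.pt` is stored via Git LFS, so the diff above shows only the three-line pointer file; the actual ~19 MB checkpoint lives in LFS storage under the listed sha256 digest. A minimal sketch of reading such a pointer (the helper name is hypothetical):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4c80f533761eb06b74037e30a727ef29c5558c6463ab593e5b1145a84022e0d7
size 19048098
"""
info = parse_lfs_pointer(pointer)
algo, _, digest = info["oid"].partition(":")  # e.g. ("sha256", ":", "<64 hex chars>")
```

Verifying a downloaded file against the pointer amounts to hashing it with the named algorithm and comparing the hex digest and byte size.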
dev.tsv ADDED
The diff for this file is too large to render. See raw diff
 
loss.tsv ADDED
@@ -0,0 +1,11 @@
+ EPOCH TIMESTAMP LEARNING_RATE TRAIN_LOSS DEV_LOSS DEV_PRECISION DEV_RECALL DEV_F1 DEV_ACCURACY
+ 1 21:04:28 0.0000 1.5206 0.3604 0.0286 0.0014 0.0026 0.0013
+ 2 21:04:54 0.0000 0.4711 0.2766 0.2574 0.2612 0.2593 0.1600
+ 3 21:05:19 0.0000 0.3922 0.2515 0.3101 0.3578 0.3323 0.2165
+ 4 21:05:44 0.0000 0.3521 0.2403 0.3755 0.4245 0.3985 0.2697
+ 5 21:06:09 0.0000 0.3243 0.2278 0.4181 0.4585 0.4374 0.3009
+ 6 21:06:35 0.0000 0.3076 0.2194 0.4466 0.4667 0.4564 0.3144
+ 7 21:07:00 0.0000 0.2932 0.2180 0.4529 0.4707 0.4616 0.3186
+ 8 21:07:26 0.0000 0.2843 0.2130 0.4341 0.4925 0.4614 0.3195
+ 9 21:07:51 0.0000 0.2741 0.2121 0.4333 0.4816 0.4562 0.3147
+ 10 21:08:17 0.0000 0.2730 0.2101 0.4376 0.4912 0.4628 0.3217
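Flair writes `loss.tsv` as a tab-separated per-epoch record; the best checkpoint corresponds to the epoch with the highest `DEV_F1`. A quick sketch of that selection, with values transcribed from the rows above:

```python
# Dev micro-F1 per epoch, transcribed from the DEV_F1 column of loss.tsv.
dev_f1 = {1: 0.0026, 2: 0.2593, 3: 0.3323, 4: 0.3985, 5: 0.4374,
          6: 0.4564, 7: 0.4616, 8: 0.4614, 9: 0.4562, 10: 0.4628}

# The epoch whose model survives as best-model.pt.
best_epoch = max(dev_f1, key=dev_f1.get)
```

Here `best_epoch` is 10 (dev F1 0.4628), which is consistent with the final "saving best model" entry in training.log.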
runs/events.out.tfevents.1697749444.46dc0c540dd0.4731.18 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fb1ea733be2438ab5cdd31e49712d7964b84a8e042867ec31f4e088cedda57f7
+ size 502461
test.tsv ADDED
The diff for this file is too large to render. See raw diff
 
training.log ADDED
@@ -0,0 +1,246 @@
+ 2023-10-19 21:04:04,067 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:04,067 Model: "SequenceTagger(
+ (embeddings): TransformerWordEmbeddings(
+ (model): BertModel(
+ (embeddings): BertEmbeddings(
+ (word_embeddings): Embedding(32001, 128)
+ (position_embeddings): Embedding(512, 128)
+ (token_type_embeddings): Embedding(2, 128)
+ (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (encoder): BertEncoder(
+ (layer): ModuleList(
+ (0-1): 2 x BertLayer(
+ (attention): BertAttention(
+ (self): BertSelfAttention(
+ (query): Linear(in_features=128, out_features=128, bias=True)
+ (key): Linear(in_features=128, out_features=128, bias=True)
+ (value): Linear(in_features=128, out_features=128, bias=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ (output): BertSelfOutput(
+ (dense): Linear(in_features=128, out_features=128, bias=True)
+ (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ (intermediate): BertIntermediate(
+ (dense): Linear(in_features=128, out_features=512, bias=True)
+ (intermediate_act_fn): GELUActivation()
+ )
+ (output): BertOutput(
+ (dense): Linear(in_features=512, out_features=128, bias=True)
+ (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
+ (dropout): Dropout(p=0.1, inplace=False)
+ )
+ )
+ )
+ )
+ (pooler): BertPooler(
+ (dense): Linear(in_features=128, out_features=128, bias=True)
+ (activation): Tanh()
+ )
+ )
+ )
+ (locked_dropout): LockedDropout(p=0.5)
+ (linear): Linear(in_features=128, out_features=17, bias=True)
+ (loss_function): CrossEntropyLoss()
+ )"
+ 2023-10-19 21:04:04,067 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:04,067 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
+ - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
+ 2023-10-19 21:04:04,067 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:04,067 Train: 7142 sentences
+ 2023-10-19 21:04:04,068 (train_with_dev=False, train_with_test=False)
+ 2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:04,068 Training Params:
+ 2023-10-19 21:04:04,068 - learning_rate: "3e-05"
+ 2023-10-19 21:04:04,068 - mini_batch_size: "8"
+ 2023-10-19 21:04:04,068 - max_epochs: "10"
+ 2023-10-19 21:04:04,068 - shuffle: "True"
+ 2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:04,068 Plugins:
+ 2023-10-19 21:04:04,068 - TensorboardLogger
+ 2023-10-19 21:04:04,068 - LinearScheduler | warmup_fraction: '0.1'
+ 2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:04,068 Final evaluation on model from best epoch (best-model.pt)
+ 2023-10-19 21:04:04,068 - metric: "('micro avg', 'f1-score')"
+ 2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:04,068 Computation:
+ 2023-10-19 21:04:04,068 - compute on device: cuda:0
+ 2023-10-19 21:04:04,068 - embedding storage: none
+ 2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:04,068 Model training base path: "hmbench-newseye/fr-dbmdz/bert-tiny-historic-multilingual-cased-bs8-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
+ 2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:04,068 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:04,068 Logging anything other than scalars to TensorBoard is currently not supported.
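The `LinearScheduler` plugin with `warmup_fraction: '0.1'` warms the learning rate up linearly over the first 10% of all mini-batch steps and then decays it linearly toward zero, which matches the `lr:` column of the iteration lines that follow (e.g. roughly 0.000003 at iter 89 of 893 in epoch 1, peaking at the configured 3e-05). A minimal sketch of such a schedule; the exact step conventions here are assumptions, not Flair's implementation:

```python
def linear_schedule(step: int, total_steps: int,
                    peak_lr: float = 3e-5, warmup_fraction: float = 0.1) -> float:
    """Linear warmup to peak_lr over the first warmup_fraction of steps,
    then linear decay to zero over the remaining steps."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

total = 893 * 10  # 893 mini-batches per epoch, 10 epochs
```

With these numbers, warmup spans exactly the first epoch (893 steps), consistent with the log reaching `lr: 0.000030` at the end of epoch 1 and decreasing thereafter.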
+ 2023-10-19 21:04:06,450 epoch 1 - iter 89/893 - loss 3.46516027 - time (sec): 2.38 - samples/sec: 10355.47 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 21:04:08,890 epoch 1 - iter 178/893 - loss 3.31633199 - time (sec): 4.82 - samples/sec: 10596.78 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 21:04:11,278 epoch 1 - iter 267/893 - loss 3.01321278 - time (sec): 7.21 - samples/sec: 10667.08 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 21:04:13,675 epoch 1 - iter 356/893 - loss 2.63035363 - time (sec): 9.61 - samples/sec: 10662.06 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 21:04:16,000 epoch 1 - iter 445/893 - loss 2.31455106 - time (sec): 11.93 - samples/sec: 10567.39 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 21:04:18,356 epoch 1 - iter 534/893 - loss 2.07618924 - time (sec): 14.29 - samples/sec: 10509.93 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 21:04:21,075 epoch 1 - iter 623/893 - loss 1.88468649 - time (sec): 17.01 - samples/sec: 10330.37 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 21:04:23,229 epoch 1 - iter 712/893 - loss 1.73722430 - time (sec): 19.16 - samples/sec: 10479.92 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 21:04:25,515 epoch 1 - iter 801/893 - loss 1.61864786 - time (sec): 21.45 - samples/sec: 10471.92 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 21:04:27,744 epoch 1 - iter 890/893 - loss 1.52299965 - time (sec): 23.67 - samples/sec: 10476.93 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-19 21:04:27,807 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:27,808 EPOCH 1 done: loss 1.5206 - lr: 0.000030
+ 2023-10-19 21:04:28,776 DEV : loss 0.36043813824653625 - f1-score (micro avg) 0.0026
+ 2023-10-19 21:04:28,791 saving best model
+ 2023-10-19 21:04:28,825 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:30,944 epoch 2 - iter 89/893 - loss 0.54958316 - time (sec): 2.12 - samples/sec: 11145.85 - lr: 0.000030 - momentum: 0.000000
+ 2023-10-19 21:04:33,239 epoch 2 - iter 178/893 - loss 0.53358732 - time (sec): 4.41 - samples/sec: 11059.90 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 21:04:35,553 epoch 2 - iter 267/893 - loss 0.52508342 - time (sec): 6.73 - samples/sec: 10890.14 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 21:04:37,785 epoch 2 - iter 356/893 - loss 0.51537618 - time (sec): 8.96 - samples/sec: 10867.78 - lr: 0.000029 - momentum: 0.000000
+ 2023-10-19 21:04:40,048 epoch 2 - iter 445/893 - loss 0.50151060 - time (sec): 11.22 - samples/sec: 10908.00 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 21:04:42,322 epoch 2 - iter 534/893 - loss 0.49081103 - time (sec): 13.50 - samples/sec: 10977.85 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 21:04:44,594 epoch 2 - iter 623/893 - loss 0.48334925 - time (sec): 15.77 - samples/sec: 11038.71 - lr: 0.000028 - momentum: 0.000000
+ 2023-10-19 21:04:46,841 epoch 2 - iter 712/893 - loss 0.48240244 - time (sec): 18.01 - samples/sec: 10944.63 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 21:04:49,143 epoch 2 - iter 801/893 - loss 0.47466459 - time (sec): 20.32 - samples/sec: 10920.37 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 21:04:51,445 epoch 2 - iter 890/893 - loss 0.47084872 - time (sec): 22.62 - samples/sec: 10964.45 - lr: 0.000027 - momentum: 0.000000
+ 2023-10-19 21:04:51,519 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:51,519 EPOCH 2 done: loss 0.4711 - lr: 0.000027
+ 2023-10-19 21:04:54,349 DEV : loss 0.2766191065311432 - f1-score (micro avg) 0.2593
+ 2023-10-19 21:04:54,364 saving best model
+ 2023-10-19 21:04:54,398 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:04:56,666 epoch 3 - iter 89/893 - loss 0.37256391 - time (sec): 2.27 - samples/sec: 11380.92 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 21:04:58,952 epoch 3 - iter 178/893 - loss 0.39452300 - time (sec): 4.55 - samples/sec: 10884.87 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 21:05:01,194 epoch 3 - iter 267/893 - loss 0.39982878 - time (sec): 6.80 - samples/sec: 10778.53 - lr: 0.000026 - momentum: 0.000000
+ 2023-10-19 21:05:03,441 epoch 3 - iter 356/893 - loss 0.39305945 - time (sec): 9.04 - samples/sec: 10883.19 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 21:05:05,683 epoch 3 - iter 445/893 - loss 0.39651155 - time (sec): 11.28 - samples/sec: 10974.19 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 21:05:07,927 epoch 3 - iter 534/893 - loss 0.39642494 - time (sec): 13.53 - samples/sec: 11049.80 - lr: 0.000025 - momentum: 0.000000
+ 2023-10-19 21:05:10,208 epoch 3 - iter 623/893 - loss 0.39408089 - time (sec): 15.81 - samples/sec: 11044.80 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 21:05:12,406 epoch 3 - iter 712/893 - loss 0.39397319 - time (sec): 18.01 - samples/sec: 11043.21 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 21:05:14,604 epoch 3 - iter 801/893 - loss 0.39353180 - time (sec): 20.21 - samples/sec: 11015.76 - lr: 0.000024 - momentum: 0.000000
+ 2023-10-19 21:05:16,849 epoch 3 - iter 890/893 - loss 0.39134018 - time (sec): 22.45 - samples/sec: 11048.56 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 21:05:16,926 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:05:16,926 EPOCH 3 done: loss 0.3922 - lr: 0.000023
+ 2023-10-19 21:05:19,751 DEV : loss 0.2515345811843872 - f1-score (micro avg) 0.3323
+ 2023-10-19 21:05:19,765 saving best model
+ 2023-10-19 21:05:19,799 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:05:22,171 epoch 4 - iter 89/893 - loss 0.36067185 - time (sec): 2.37 - samples/sec: 10806.53 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 21:05:24,441 epoch 4 - iter 178/893 - loss 0.35369799 - time (sec): 4.64 - samples/sec: 10634.18 - lr: 0.000023 - momentum: 0.000000
+ 2023-10-19 21:05:26,669 epoch 4 - iter 267/893 - loss 0.35728690 - time (sec): 6.87 - samples/sec: 10774.66 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 21:05:28,712 epoch 4 - iter 356/893 - loss 0.36526060 - time (sec): 8.91 - samples/sec: 10963.32 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 21:05:30,959 epoch 4 - iter 445/893 - loss 0.36088528 - time (sec): 11.16 - samples/sec: 10981.58 - lr: 0.000022 - momentum: 0.000000
+ 2023-10-19 21:05:33,190 epoch 4 - iter 534/893 - loss 0.36222174 - time (sec): 13.39 - samples/sec: 10925.92 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 21:05:35,432 epoch 4 - iter 623/893 - loss 0.35964595 - time (sec): 15.63 - samples/sec: 10947.56 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 21:05:37,503 epoch 4 - iter 712/893 - loss 0.35640730 - time (sec): 17.70 - samples/sec: 11213.08 - lr: 0.000021 - momentum: 0.000000
+ 2023-10-19 21:05:39,612 epoch 4 - iter 801/893 - loss 0.35414262 - time (sec): 19.81 - samples/sec: 11249.80 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 21:05:41,879 epoch 4 - iter 890/893 - loss 0.35252586 - time (sec): 22.08 - samples/sec: 11214.76 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 21:05:41,953 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:05:41,953 EPOCH 4 done: loss 0.3521 - lr: 0.000020
+ 2023-10-19 21:05:44,308 DEV : loss 0.24030107259750366 - f1-score (micro avg) 0.3985
+ 2023-10-19 21:05:44,322 saving best model
+ 2023-10-19 21:05:44,355 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:05:46,615 epoch 5 - iter 89/893 - loss 0.33201364 - time (sec): 2.26 - samples/sec: 10449.33 - lr: 0.000020 - momentum: 0.000000
+ 2023-10-19 21:05:48,898 epoch 5 - iter 178/893 - loss 0.32799586 - time (sec): 4.54 - samples/sec: 10848.84 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 21:05:51,137 epoch 5 - iter 267/893 - loss 0.33434089 - time (sec): 6.78 - samples/sec: 10828.88 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 21:05:53,351 epoch 5 - iter 356/893 - loss 0.33073391 - time (sec): 9.00 - samples/sec: 10953.80 - lr: 0.000019 - momentum: 0.000000
+ 2023-10-19 21:05:55,663 epoch 5 - iter 445/893 - loss 0.33199483 - time (sec): 11.31 - samples/sec: 10906.21 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 21:05:57,878 epoch 5 - iter 534/893 - loss 0.32808703 - time (sec): 13.52 - samples/sec: 10862.79 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 21:06:00,156 epoch 5 - iter 623/893 - loss 0.32729446 - time (sec): 15.80 - samples/sec: 11023.39 - lr: 0.000018 - momentum: 0.000000
+ 2023-10-19 21:06:02,408 epoch 5 - iter 712/893 - loss 0.32383033 - time (sec): 18.05 - samples/sec: 10996.11 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 21:06:04,750 epoch 5 - iter 801/893 - loss 0.32640604 - time (sec): 20.39 - samples/sec: 10960.58 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 21:06:07,022 epoch 5 - iter 890/893 - loss 0.32481877 - time (sec): 22.67 - samples/sec: 10929.86 - lr: 0.000017 - momentum: 0.000000
+ 2023-10-19 21:06:07,109 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:06:07,109 EPOCH 5 done: loss 0.3243 - lr: 0.000017
+ 2023-10-19 21:06:09,945 DEV : loss 0.22777576744556427 - f1-score (micro avg) 0.4374
+ 2023-10-19 21:06:09,958 saving best model
+ 2023-10-19 21:06:09,993 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:06:12,246 epoch 6 - iter 89/893 - loss 0.30901463 - time (sec): 2.25 - samples/sec: 10491.78 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 21:06:14,534 epoch 6 - iter 178/893 - loss 0.30776053 - time (sec): 4.54 - samples/sec: 10896.61 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 21:06:16,805 epoch 6 - iter 267/893 - loss 0.31279529 - time (sec): 6.81 - samples/sec: 10972.40 - lr: 0.000016 - momentum: 0.000000
+ 2023-10-19 21:06:19,052 epoch 6 - iter 356/893 - loss 0.31110898 - time (sec): 9.06 - samples/sec: 11069.80 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 21:06:21,259 epoch 6 - iter 445/893 - loss 0.31454604 - time (sec): 11.27 - samples/sec: 10919.77 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 21:06:23,421 epoch 6 - iter 534/893 - loss 0.31388124 - time (sec): 13.43 - samples/sec: 10919.47 - lr: 0.000015 - momentum: 0.000000
+ 2023-10-19 21:06:25,685 epoch 6 - iter 623/893 - loss 0.31175351 - time (sec): 15.69 - samples/sec: 10969.54 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 21:06:28,029 epoch 6 - iter 712/893 - loss 0.30862470 - time (sec): 18.04 - samples/sec: 10966.02 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 21:06:30,350 epoch 6 - iter 801/893 - loss 0.30704706 - time (sec): 20.36 - samples/sec: 10969.15 - lr: 0.000014 - momentum: 0.000000
+ 2023-10-19 21:06:32,616 epoch 6 - iter 890/893 - loss 0.30734708 - time (sec): 22.62 - samples/sec: 10971.21 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 21:06:32,682 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:06:32,682 EPOCH 6 done: loss 0.3076 - lr: 0.000013
+ 2023-10-19 21:06:35,530 DEV : loss 0.21941223740577698 - f1-score (micro avg) 0.4564
+ 2023-10-19 21:06:35,544 saving best model
+ 2023-10-19 21:06:35,582 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:06:37,807 epoch 7 - iter 89/893 - loss 0.30509027 - time (sec): 2.22 - samples/sec: 10655.57 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 21:06:40,129 epoch 7 - iter 178/893 - loss 0.28654601 - time (sec): 4.55 - samples/sec: 10694.39 - lr: 0.000013 - momentum: 0.000000
+ 2023-10-19 21:06:42,497 epoch 7 - iter 267/893 - loss 0.28403086 - time (sec): 6.91 - samples/sec: 10509.95 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 21:06:44,843 epoch 7 - iter 356/893 - loss 0.28827472 - time (sec): 9.26 - samples/sec: 10509.54 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 21:06:47,165 epoch 7 - iter 445/893 - loss 0.29045580 - time (sec): 11.58 - samples/sec: 10583.77 - lr: 0.000012 - momentum: 0.000000
+ 2023-10-19 21:06:49,400 epoch 7 - iter 534/893 - loss 0.30162964 - time (sec): 13.82 - samples/sec: 10697.48 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 21:06:51,632 epoch 7 - iter 623/893 - loss 0.30089497 - time (sec): 16.05 - samples/sec: 10683.90 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 21:06:53,942 epoch 7 - iter 712/893 - loss 0.29776092 - time (sec): 18.36 - samples/sec: 10762.55 - lr: 0.000011 - momentum: 0.000000
+ 2023-10-19 21:06:56,209 epoch 7 - iter 801/893 - loss 0.29561950 - time (sec): 20.63 - samples/sec: 10809.02 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 21:06:58,427 epoch 7 - iter 890/893 - loss 0.29349579 - time (sec): 22.84 - samples/sec: 10841.48 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 21:06:58,506 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:06:58,506 EPOCH 7 done: loss 0.2932 - lr: 0.000010
+ 2023-10-19 21:07:00,835 DEV : loss 0.2180010825395584 - f1-score (micro avg) 0.4616
+ 2023-10-19 21:07:00,849 saving best model
+ 2023-10-19 21:07:00,884 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:07:03,134 epoch 8 - iter 89/893 - loss 0.28270141 - time (sec): 2.25 - samples/sec: 11490.73 - lr: 0.000010 - momentum: 0.000000
+ 2023-10-19 21:07:05,412 epoch 8 - iter 178/893 - loss 0.28238330 - time (sec): 4.53 - samples/sec: 11628.05 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 21:07:07,718 epoch 8 - iter 267/893 - loss 0.27604606 - time (sec): 6.83 - samples/sec: 11561.75 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 21:07:10,009 epoch 8 - iter 356/893 - loss 0.27685070 - time (sec): 9.12 - samples/sec: 11354.16 - lr: 0.000009 - momentum: 0.000000
+ 2023-10-19 21:07:12,333 epoch 8 - iter 445/893 - loss 0.28502945 - time (sec): 11.45 - samples/sec: 11152.36 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 21:07:14,695 epoch 8 - iter 534/893 - loss 0.28106511 - time (sec): 13.81 - samples/sec: 11140.31 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 21:07:16,982 epoch 8 - iter 623/893 - loss 0.28187735 - time (sec): 16.10 - samples/sec: 10994.00 - lr: 0.000008 - momentum: 0.000000
+ 2023-10-19 21:07:19,237 epoch 8 - iter 712/893 - loss 0.27973353 - time (sec): 18.35 - samples/sec: 10944.00 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 21:07:21,454 epoch 8 - iter 801/893 - loss 0.28007865 - time (sec): 20.57 - samples/sec: 10940.20 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 21:07:23,659 epoch 8 - iter 890/893 - loss 0.28450041 - time (sec): 22.77 - samples/sec: 10890.11 - lr: 0.000007 - momentum: 0.000000
+ 2023-10-19 21:07:23,729 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:07:23,729 EPOCH 8 done: loss 0.2843 - lr: 0.000007
+ 2023-10-19 21:07:26,548 DEV : loss 0.21302838623523712 - f1-score (micro avg) 0.4614
+ 2023-10-19 21:07:26,562 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:07:28,823 epoch 9 - iter 89/893 - loss 0.27434130 - time (sec): 2.26 - samples/sec: 11242.35 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 21:07:31,078 epoch 9 - iter 178/893 - loss 0.27958386 - time (sec): 4.52 - samples/sec: 11145.59 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 21:07:33,324 epoch 9 - iter 267/893 - loss 0.28201334 - time (sec): 6.76 - samples/sec: 10968.95 - lr: 0.000006 - momentum: 0.000000
+ 2023-10-19 21:07:35,560 epoch 9 - iter 356/893 - loss 0.28336896 - time (sec): 9.00 - samples/sec: 10917.79 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 21:07:37,762 epoch 9 - iter 445/893 - loss 0.28043243 - time (sec): 11.20 - samples/sec: 10900.38 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 21:07:40,039 epoch 9 - iter 534/893 - loss 0.27712063 - time (sec): 13.48 - samples/sec: 11038.22 - lr: 0.000005 - momentum: 0.000000
+ 2023-10-19 21:07:42,253 epoch 9 - iter 623/893 - loss 0.27542232 - time (sec): 15.69 - samples/sec: 11006.18 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 21:07:44,549 epoch 9 - iter 712/893 - loss 0.27658911 - time (sec): 17.99 - samples/sec: 11007.29 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 21:07:46,811 epoch 9 - iter 801/893 - loss 0.27538194 - time (sec): 20.25 - samples/sec: 10997.20 - lr: 0.000004 - momentum: 0.000000
+ 2023-10-19 21:07:49,039 epoch 9 - iter 890/893 - loss 0.27416186 - time (sec): 22.48 - samples/sec: 11034.55 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 21:07:49,117 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:07:49,117 EPOCH 9 done: loss 0.2741 - lr: 0.000003
+ 2023-10-19 21:07:51,965 DEV : loss 0.21206232905387878 - f1-score (micro avg) 0.4562
+ 2023-10-19 21:07:51,980 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:07:54,185 epoch 10 - iter 89/893 - loss 0.25869512 - time (sec): 2.20 - samples/sec: 10712.67 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 21:07:56,391 epoch 10 - iter 178/893 - loss 0.26163812 - time (sec): 4.41 - samples/sec: 10800.80 - lr: 0.000003 - momentum: 0.000000
+ 2023-10-19 21:07:58,723 epoch 10 - iter 267/893 - loss 0.26555406 - time (sec): 6.74 - samples/sec: 11025.94 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 21:08:00,990 epoch 10 - iter 356/893 - loss 0.26148161 - time (sec): 9.01 - samples/sec: 10868.20 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 21:08:03,290 epoch 10 - iter 445/893 - loss 0.26190440 - time (sec): 11.31 - samples/sec: 10794.51 - lr: 0.000002 - momentum: 0.000000
+ 2023-10-19 21:08:05,593 epoch 10 - iter 534/893 - loss 0.26271915 - time (sec): 13.61 - samples/sec: 10856.21 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 21:08:07,843 epoch 10 - iter 623/893 - loss 0.26537037 - time (sec): 15.86 - samples/sec: 10830.62 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 21:08:10,148 epoch 10 - iter 712/893 - loss 0.27004143 - time (sec): 18.17 - samples/sec: 10879.72 - lr: 0.000001 - momentum: 0.000000
+ 2023-10-19 21:08:12,367 epoch 10 - iter 801/893 - loss 0.27091574 - time (sec): 20.39 - samples/sec: 10981.73 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-19 21:08:14,609 epoch 10 - iter 890/893 - loss 0.27304663 - time (sec): 22.63 - samples/sec: 10963.71 - lr: 0.000000 - momentum: 0.000000
+ 2023-10-19 21:08:14,679 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:08:14,679 EPOCH 10 done: loss 0.2730 - lr: 0.000000
+ 2023-10-19 21:08:17,043 DEV : loss 0.21014845371246338 - f1-score (micro avg) 0.4628
+ 2023-10-19 21:08:17,057 saving best model
+ 2023-10-19 21:08:17,117 ----------------------------------------------------------------------------------------------------
+ 2023-10-19 21:08:17,118 Loading model from best epoch ...
+ 2023-10-19 21:08:17,199 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
+ 2023-10-19 21:08:21,777
+ Results:
+ - F-score (micro) 0.3645
+ - F-score (macro) 0.2036
+ - Accuracy 0.2332
+
+ By class:
+ precision recall f1-score support
+
+ LOC 0.3682 0.4849 0.4186 1095
+ PER 0.3427 0.4219 0.3782 1012
+ ORG 0.0426 0.0112 0.0177 357
+ HumanProd 0.0000 0.0000 0.0000 33
+
+ micro avg 0.3458 0.3853 0.3645 2497
+ macro avg 0.1884 0.2295 0.2036 2497
+ weighted avg 0.3065 0.3853 0.3394 2497
+
+ 2023-10-19 21:08:21,778 ----------------------------------------------------------------------------------------------------
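As a sanity check, the summary scores in the test report follow directly from the per-class table: the micro F1 is the harmonic mean of the micro-averaged precision and recall, and the macro F1 is the unweighted mean of the per-class F1 scores. A quick sketch:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0 when both are 0."""
    denom = precision + recall
    return 2 * precision * recall / denom if denom else 0.0

# Per-class (precision, recall) from the test-set report above.
classes = {
    "LOC":       (0.3682, 0.4849),
    "PER":       (0.3427, 0.4219),
    "ORG":       (0.0426, 0.0112),
    "HumanProd": (0.0000, 0.0000),
}

micro_f1 = f1(0.3458, 0.3853)  # micro-averaged P/R from the report
macro_f1 = sum(f1(p, r) for p, r in classes.values()) / len(classes)
```

Both reproduce the reported values (0.3645 micro, 0.2036 macro) to four decimal places, confirming the report's averaging conventions.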