|
2023-10-13 08:39:45,554 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:39:45,555 Model: "SequenceTagger( |
|
(embeddings): TransformerWordEmbeddings( |
|
(model): BertModel( |
|
(embeddings): BertEmbeddings( |
|
(word_embeddings): Embedding(32001, 768) |
|
(position_embeddings): Embedding(512, 768) |
|
(token_type_embeddings): Embedding(2, 768) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(encoder): BertEncoder( |
|
(layer): ModuleList( |
|
(0-11): 12 x BertLayer( |
|
(attention): BertAttention( |
|
(self): BertSelfAttention( |
|
(query): Linear(in_features=768, out_features=768, bias=True) |
|
(key): Linear(in_features=768, out_features=768, bias=True) |
|
(value): Linear(in_features=768, out_features=768, bias=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
(output): BertSelfOutput( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
(intermediate): BertIntermediate( |
|
(dense): Linear(in_features=768, out_features=3072, bias=True) |
|
(intermediate_act_fn): GELUActivation() |
|
) |
|
(output): BertOutput( |
|
(dense): Linear(in_features=3072, out_features=768, bias=True) |
|
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True) |
|
(dropout): Dropout(p=0.1, inplace=False) |
|
) |
|
) |
|
) |
|
) |
|
(pooler): BertPooler( |
|
(dense): Linear(in_features=768, out_features=768, bias=True) |
|
(activation): Tanh() |
|
) |
|
) |
|
) |
|
(locked_dropout): LockedDropout(p=0.5) |
|
(linear): Linear(in_features=768, out_features=25, bias=True) |
|
(loss_function): CrossEntropyLoss() |
|
)" |
|
2023-10-13 08:39:45,555 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:39:45,555 MultiCorpus: 1100 train + 206 dev + 240 test sentences |
|
- NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator |
|
2023-10-13 08:39:45,556 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:39:45,556 Train: 1100 sentences |
|
2023-10-13 08:39:45,556 (train_with_dev=False, train_with_test=False) |
|
2023-10-13 08:39:45,556 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:39:45,556 Training Params: |
|
2023-10-13 08:39:45,556 - learning_rate: "3e-05" |
|
2023-10-13 08:39:45,556 - mini_batch_size: "4" |
|
2023-10-13 08:39:45,556 - max_epochs: "10" |
|
2023-10-13 08:39:45,556 - shuffle: "True" |
|
2023-10-13 08:39:45,556 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:39:45,556 Plugins: |
|
2023-10-13 08:39:45,556 - LinearScheduler | warmup_fraction: '0.1' |
|
2023-10-13 08:39:45,556 ---------------------------------------------------------------------------------------------------- |
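The `lr` values in the per-iteration lines of this log are consistent with a linear warmup followed by a linear decay. The block below is a minimal sketch of such a schedule, not the library's implementation. It assumes 275 optimizer steps per epoch (1100 training sentences at mini-batch size 4), warmup over the first 10% of all steps, and decay to zero at the last step. The name `linear_warmup_lr` and the step-counting convention are assumptions made for illustration.

```python
import math

TRAIN_SENTENCES = 1100   # from the "Train:" line of the log
MINI_BATCH_SIZE = 4      # from "Training Params"
MAX_EPOCHS = 10          # from "Training Params"
PEAK_LR = 3e-05          # from "Training Params"
WARMUP_FRACTION = 0.1    # from the LinearScheduler plugin line

STEPS_PER_EPOCH = math.ceil(TRAIN_SENTENCES / MINI_BATCH_SIZE)  # 275
TOTAL_STEPS = STEPS_PER_EPOCH * MAX_EPOCHS
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_FRACTION)


def linear_warmup_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps (assumed convention)."""
    if step < WARMUP_STEPS:
        # Warmup: rise linearly from 0 to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay: fall linearly from the peak to 0 at the final step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print a few points in the same six-decimal format the log uses.
    for epoch, iteration in [(1, 27), (1, 270), (2, 27), (10, 270)]:
        step = (epoch - 1) * STEPS_PER_EPOCH + iteration
        print(f"epoch {epoch} iter {iteration}: lr {linear_warmup_lr(step):.6f}")
```

Under these assumptions the peak of 3e-05 is reached at the end of epoch 1, which matches the point where the logged `lr` stops rising and starts to fall.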
|
2023-10-13 08:39:45,556 Final evaluation on model from best epoch (best-model.pt) |
|
2023-10-13 08:39:45,556 - metric: "('micro avg', 'f1-score')" |
|
2023-10-13 08:39:45,556 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:39:45,556 Computation: |
|
2023-10-13 08:39:45,556 - compute on device: cuda:0 |
|
2023-10-13 08:39:45,556 - embedding storage: none |
|
2023-10-13 08:39:45,556 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:39:45,556 Model training base path: "hmbench-ajmc/de-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-4" |
|
2023-10-13 08:39:45,556 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:39:45,556 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:39:46,790 epoch 1 - iter 27/275 - loss 3.34492708 - time (sec): 1.23 - samples/sec: 1750.17 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 08:39:47,978 epoch 1 - iter 54/275 - loss 2.98549674 - time (sec): 2.42 - samples/sec: 1825.23 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 08:39:49,165 epoch 1 - iter 81/275 - loss 2.39560273 - time (sec): 3.61 - samples/sec: 1828.02 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 08:39:50,389 epoch 1 - iter 108/275 - loss 1.98325714 - time (sec): 4.83 - samples/sec: 1816.83 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 08:39:51,611 epoch 1 - iter 135/275 - loss 1.71558303 - time (sec): 6.05 - samples/sec: 1845.41 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 08:39:52,892 epoch 1 - iter 162/275 - loss 1.52471697 - time (sec): 7.33 - samples/sec: 1822.66 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 08:39:54,072 epoch 1 - iter 189/275 - loss 1.35824731 - time (sec): 8.51 - samples/sec: 1846.57 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 08:39:55,267 epoch 1 - iter 216/275 - loss 1.22688036 - time (sec): 9.71 - samples/sec: 1858.03 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 08:39:56,481 epoch 1 - iter 243/275 - loss 1.13136521 - time (sec): 10.92 - samples/sec: 1846.19 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 08:39:57,668 epoch 1 - iter 270/275 - loss 1.04758440 - time (sec): 12.11 - samples/sec: 1840.99 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 08:39:57,903 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:39:57,903 EPOCH 1 done: loss 1.0321 - lr: 0.000029 |
|
2023-10-13 08:39:58,431 DEV : loss 0.2740706205368042 - f1-score (micro avg) 0.5938 |
|
2023-10-13 08:39:58,436 saving best model |
|
2023-10-13 08:39:58,817 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:39:59,987 epoch 2 - iter 27/275 - loss 0.28926253 - time (sec): 1.17 - samples/sec: 1781.92 - lr: 0.000030 - momentum: 0.000000 |
|
2023-10-13 08:40:01,165 epoch 2 - iter 54/275 - loss 0.23992950 - time (sec): 2.35 - samples/sec: 1866.76 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 08:40:02,369 epoch 2 - iter 81/275 - loss 0.22474376 - time (sec): 3.55 - samples/sec: 1828.65 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 08:40:03,619 epoch 2 - iter 108/275 - loss 0.21005906 - time (sec): 4.80 - samples/sec: 1842.77 - lr: 0.000029 - momentum: 0.000000 |
|
2023-10-13 08:40:04,881 epoch 2 - iter 135/275 - loss 0.20172376 - time (sec): 6.06 - samples/sec: 1825.64 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 08:40:06,115 epoch 2 - iter 162/275 - loss 0.20055598 - time (sec): 7.30 - samples/sec: 1850.68 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 08:40:07,349 epoch 2 - iter 189/275 - loss 0.18706140 - time (sec): 8.53 - samples/sec: 1832.71 - lr: 0.000028 - momentum: 0.000000 |
|
2023-10-13 08:40:08,561 epoch 2 - iter 216/275 - loss 0.18860587 - time (sec): 9.74 - samples/sec: 1851.88 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 08:40:09,744 epoch 2 - iter 243/275 - loss 0.18268234 - time (sec): 10.93 - samples/sec: 1855.54 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 08:40:10,926 epoch 2 - iter 270/275 - loss 0.17950041 - time (sec): 12.11 - samples/sec: 1840.47 - lr: 0.000027 - momentum: 0.000000 |
|
2023-10-13 08:40:11,158 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:40:11,158 EPOCH 2 done: loss 0.1813 - lr: 0.000027 |
|
2023-10-13 08:40:11,857 DEV : loss 0.13900607824325562 - f1-score (micro avg) 0.8047 |
|
2023-10-13 08:40:11,862 saving best model |
|
2023-10-13 08:40:12,515 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:40:13,678 epoch 3 - iter 27/275 - loss 0.09446228 - time (sec): 1.16 - samples/sec: 1980.46 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 08:40:14,878 epoch 3 - iter 54/275 - loss 0.10509112 - time (sec): 2.36 - samples/sec: 1903.10 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 08:40:16,054 epoch 3 - iter 81/275 - loss 0.11020384 - time (sec): 3.53 - samples/sec: 1895.80 - lr: 0.000026 - momentum: 0.000000 |
|
2023-10-13 08:40:17,252 epoch 3 - iter 108/275 - loss 0.10919792 - time (sec): 4.73 - samples/sec: 1899.21 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 08:40:18,436 epoch 3 - iter 135/275 - loss 0.10692203 - time (sec): 5.91 - samples/sec: 1898.30 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 08:40:19,618 epoch 3 - iter 162/275 - loss 0.10449658 - time (sec): 7.10 - samples/sec: 1894.76 - lr: 0.000025 - momentum: 0.000000 |
|
2023-10-13 08:40:20,817 epoch 3 - iter 189/275 - loss 0.10461865 - time (sec): 8.29 - samples/sec: 1921.91 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 08:40:22,002 epoch 3 - iter 216/275 - loss 0.09838979 - time (sec): 9.48 - samples/sec: 1911.69 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 08:40:23,220 epoch 3 - iter 243/275 - loss 0.09432405 - time (sec): 10.70 - samples/sec: 1906.52 - lr: 0.000024 - momentum: 0.000000 |
|
2023-10-13 08:40:24,429 epoch 3 - iter 270/275 - loss 0.09763774 - time (sec): 11.91 - samples/sec: 1885.15 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 08:40:24,657 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:40:24,657 EPOCH 3 done: loss 0.0969 - lr: 0.000023 |
|
2023-10-13 08:40:25,296 DEV : loss 0.14131943881511688 - f1-score (micro avg) 0.8547 |
|
2023-10-13 08:40:25,300 saving best model |
|
2023-10-13 08:40:25,754 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:40:26,928 epoch 4 - iter 27/275 - loss 0.04827930 - time (sec): 1.17 - samples/sec: 1870.32 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 08:40:28,133 epoch 4 - iter 54/275 - loss 0.06353083 - time (sec): 2.38 - samples/sec: 1813.08 - lr: 0.000023 - momentum: 0.000000 |
|
2023-10-13 08:40:29,327 epoch 4 - iter 81/275 - loss 0.07903506 - time (sec): 3.57 - samples/sec: 1779.84 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 08:40:30,548 epoch 4 - iter 108/275 - loss 0.07878090 - time (sec): 4.79 - samples/sec: 1769.53 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 08:40:31,755 epoch 4 - iter 135/275 - loss 0.07751083 - time (sec): 6.00 - samples/sec: 1783.48 - lr: 0.000022 - momentum: 0.000000 |
|
2023-10-13 08:40:32,968 epoch 4 - iter 162/275 - loss 0.07629198 - time (sec): 7.21 - samples/sec: 1796.70 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 08:40:34,202 epoch 4 - iter 189/275 - loss 0.07261307 - time (sec): 8.45 - samples/sec: 1766.74 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 08:40:35,369 epoch 4 - iter 216/275 - loss 0.07313839 - time (sec): 9.61 - samples/sec: 1785.81 - lr: 0.000021 - momentum: 0.000000 |
|
2023-10-13 08:40:36,561 epoch 4 - iter 243/275 - loss 0.07538749 - time (sec): 10.81 - samples/sec: 1825.46 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 08:40:37,776 epoch 4 - iter 270/275 - loss 0.07645860 - time (sec): 12.02 - samples/sec: 1856.66 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 08:40:37,995 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:40:37,995 EPOCH 4 done: loss 0.0761 - lr: 0.000020 |
|
2023-10-13 08:40:38,699 DEV : loss 0.14559561014175415 - f1-score (micro avg) 0.8656 |
|
2023-10-13 08:40:38,704 saving best model |
|
2023-10-13 08:40:39,189 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:40:40,424 epoch 5 - iter 27/275 - loss 0.05390686 - time (sec): 1.22 - samples/sec: 1804.58 - lr: 0.000020 - momentum: 0.000000 |
|
2023-10-13 08:40:41,626 epoch 5 - iter 54/275 - loss 0.04559267 - time (sec): 2.43 - samples/sec: 1899.76 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 08:40:42,837 epoch 5 - iter 81/275 - loss 0.04680050 - time (sec): 3.64 - samples/sec: 1853.64 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 08:40:44,051 epoch 5 - iter 108/275 - loss 0.05446991 - time (sec): 4.85 - samples/sec: 1878.52 - lr: 0.000019 - momentum: 0.000000 |
|
2023-10-13 08:40:45,234 epoch 5 - iter 135/275 - loss 0.05015221 - time (sec): 6.03 - samples/sec: 1894.64 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 08:40:46,414 epoch 5 - iter 162/275 - loss 0.04864770 - time (sec): 7.21 - samples/sec: 1879.46 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 08:40:47,604 epoch 5 - iter 189/275 - loss 0.04806670 - time (sec): 8.41 - samples/sec: 1837.40 - lr: 0.000018 - momentum: 0.000000 |
|
2023-10-13 08:40:48,806 epoch 5 - iter 216/275 - loss 0.05702823 - time (sec): 9.61 - samples/sec: 1847.71 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 08:40:49,985 epoch 5 - iter 243/275 - loss 0.05475300 - time (sec): 10.79 - samples/sec: 1845.76 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 08:40:51,186 epoch 5 - iter 270/275 - loss 0.05316789 - time (sec): 11.99 - samples/sec: 1853.62 - lr: 0.000017 - momentum: 0.000000 |
|
2023-10-13 08:40:51,410 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:40:51,410 EPOCH 5 done: loss 0.0543 - lr: 0.000017 |
|
2023-10-13 08:40:52,122 DEV : loss 0.15457984805107117 - f1-score (micro avg) 0.8707 |
|
2023-10-13 08:40:52,127 saving best model |
|
2023-10-13 08:40:52,597 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:40:53,788 epoch 6 - iter 27/275 - loss 0.00834565 - time (sec): 1.18 - samples/sec: 1908.10 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 08:40:55,006 epoch 6 - iter 54/275 - loss 0.02701269 - time (sec): 2.40 - samples/sec: 1919.62 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 08:40:56,211 epoch 6 - iter 81/275 - loss 0.03904077 - time (sec): 3.60 - samples/sec: 1911.62 - lr: 0.000016 - momentum: 0.000000 |
|
2023-10-13 08:40:57,411 epoch 6 - iter 108/275 - loss 0.03642560 - time (sec): 4.80 - samples/sec: 1858.63 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 08:40:58,631 epoch 6 - iter 135/275 - loss 0.03601588 - time (sec): 6.02 - samples/sec: 1882.34 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 08:40:59,814 epoch 6 - iter 162/275 - loss 0.03433120 - time (sec): 7.21 - samples/sec: 1879.51 - lr: 0.000015 - momentum: 0.000000 |
|
2023-10-13 08:41:01,007 epoch 6 - iter 189/275 - loss 0.03662864 - time (sec): 8.40 - samples/sec: 1886.94 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 08:41:02,182 epoch 6 - iter 216/275 - loss 0.03698452 - time (sec): 9.57 - samples/sec: 1874.16 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 08:41:03,366 epoch 6 - iter 243/275 - loss 0.03825393 - time (sec): 10.76 - samples/sec: 1882.59 - lr: 0.000014 - momentum: 0.000000 |
|
2023-10-13 08:41:04,534 epoch 6 - iter 270/275 - loss 0.03667517 - time (sec): 11.93 - samples/sec: 1882.16 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 08:41:04,754 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:41:04,755 EPOCH 6 done: loss 0.0375 - lr: 0.000013 |
|
2023-10-13 08:41:05,396 DEV : loss 0.1502944976091385 - f1-score (micro avg) 0.8729 |
|
2023-10-13 08:41:05,401 saving best model |
|
2023-10-13 08:41:05,888 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:41:07,112 epoch 7 - iter 27/275 - loss 0.02041576 - time (sec): 1.22 - samples/sec: 1600.53 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 08:41:08,316 epoch 7 - iter 54/275 - loss 0.03416922 - time (sec): 2.42 - samples/sec: 1704.67 - lr: 0.000013 - momentum: 0.000000 |
|
2023-10-13 08:41:09,519 epoch 7 - iter 81/275 - loss 0.03933149 - time (sec): 3.63 - samples/sec: 1780.34 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 08:41:10,762 epoch 7 - iter 108/275 - loss 0.04294870 - time (sec): 4.87 - samples/sec: 1811.96 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 08:41:11,981 epoch 7 - iter 135/275 - loss 0.03624465 - time (sec): 6.09 - samples/sec: 1814.33 - lr: 0.000012 - momentum: 0.000000 |
|
2023-10-13 08:41:13,188 epoch 7 - iter 162/275 - loss 0.03659759 - time (sec): 7.29 - samples/sec: 1820.53 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 08:41:14,389 epoch 7 - iter 189/275 - loss 0.03260483 - time (sec): 8.49 - samples/sec: 1849.24 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 08:41:15,586 epoch 7 - iter 216/275 - loss 0.03719664 - time (sec): 9.69 - samples/sec: 1875.70 - lr: 0.000011 - momentum: 0.000000 |
|
2023-10-13 08:41:16,810 epoch 7 - iter 243/275 - loss 0.03579088 - time (sec): 10.92 - samples/sec: 1846.59 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 08:41:18,079 epoch 7 - iter 270/275 - loss 0.03303033 - time (sec): 12.18 - samples/sec: 1827.51 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 08:41:18,313 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:41:18,313 EPOCH 7 done: loss 0.0333 - lr: 0.000010 |
|
2023-10-13 08:41:18,948 DEV : loss 0.15816469490528107 - f1-score (micro avg) 0.8841 |
|
2023-10-13 08:41:18,953 saving best model |
|
2023-10-13 08:41:19,419 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:41:20,613 epoch 8 - iter 27/275 - loss 0.01789827 - time (sec): 1.19 - samples/sec: 1820.05 - lr: 0.000010 - momentum: 0.000000 |
|
2023-10-13 08:41:21,843 epoch 8 - iter 54/275 - loss 0.02164511 - time (sec): 2.42 - samples/sec: 1751.53 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 08:41:23,007 epoch 8 - iter 81/275 - loss 0.01780909 - time (sec): 3.59 - samples/sec: 1791.00 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 08:41:24,190 epoch 8 - iter 108/275 - loss 0.02271763 - time (sec): 4.77 - samples/sec: 1836.00 - lr: 0.000009 - momentum: 0.000000 |
|
2023-10-13 08:41:25,365 epoch 8 - iter 135/275 - loss 0.02433439 - time (sec): 5.94 - samples/sec: 1839.26 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 08:41:26,546 epoch 8 - iter 162/275 - loss 0.02185385 - time (sec): 7.12 - samples/sec: 1825.82 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 08:41:27,723 epoch 8 - iter 189/275 - loss 0.02114702 - time (sec): 8.30 - samples/sec: 1841.08 - lr: 0.000008 - momentum: 0.000000 |
|
2023-10-13 08:41:28,924 epoch 8 - iter 216/275 - loss 0.02175805 - time (sec): 9.50 - samples/sec: 1867.75 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 08:41:30,118 epoch 8 - iter 243/275 - loss 0.02104962 - time (sec): 10.70 - samples/sec: 1880.63 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 08:41:31,309 epoch 8 - iter 270/275 - loss 0.02010803 - time (sec): 11.89 - samples/sec: 1873.25 - lr: 0.000007 - momentum: 0.000000 |
|
2023-10-13 08:41:31,547 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:41:31,547 EPOCH 8 done: loss 0.0222 - lr: 0.000007 |
|
2023-10-13 08:41:32,253 DEV : loss 0.1525907665491104 - f1-score (micro avg) 0.8868 |
|
2023-10-13 08:41:32,258 saving best model |
|
2023-10-13 08:41:32,757 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:41:34,168 epoch 9 - iter 27/275 - loss 0.00228100 - time (sec): 1.41 - samples/sec: 1422.83 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 08:41:35,499 epoch 9 - iter 54/275 - loss 0.03448114 - time (sec): 2.74 - samples/sec: 1523.55 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 08:41:36,895 epoch 9 - iter 81/275 - loss 0.02760678 - time (sec): 4.14 - samples/sec: 1543.92 - lr: 0.000006 - momentum: 0.000000 |
|
2023-10-13 08:41:38,286 epoch 9 - iter 108/275 - loss 0.02505676 - time (sec): 5.53 - samples/sec: 1582.30 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 08:41:39,649 epoch 9 - iter 135/275 - loss 0.02264791 - time (sec): 6.89 - samples/sec: 1606.19 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 08:41:41,003 epoch 9 - iter 162/275 - loss 0.02014999 - time (sec): 8.24 - samples/sec: 1621.65 - lr: 0.000005 - momentum: 0.000000 |
|
2023-10-13 08:41:42,340 epoch 9 - iter 189/275 - loss 0.01792590 - time (sec): 9.58 - samples/sec: 1614.02 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 08:41:43,767 epoch 9 - iter 216/275 - loss 0.01783111 - time (sec): 11.01 - samples/sec: 1621.48 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 08:41:45,240 epoch 9 - iter 243/275 - loss 0.01810447 - time (sec): 12.48 - samples/sec: 1621.95 - lr: 0.000004 - momentum: 0.000000 |
|
2023-10-13 08:41:46,634 epoch 9 - iter 270/275 - loss 0.02018950 - time (sec): 13.87 - samples/sec: 1606.63 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 08:41:46,882 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:41:46,883 EPOCH 9 done: loss 0.0201 - lr: 0.000003 |
|
2023-10-13 08:41:47,566 DEV : loss 0.15038155019283295 - f1-score (micro avg) 0.8889 |
|
2023-10-13 08:41:47,571 saving best model |
|
2023-10-13 08:41:48,024 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:41:49,299 epoch 10 - iter 27/275 - loss 0.00487508 - time (sec): 1.27 - samples/sec: 1597.62 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 08:41:50,558 epoch 10 - iter 54/275 - loss 0.01171812 - time (sec): 2.53 - samples/sec: 1731.39 - lr: 0.000003 - momentum: 0.000000 |
|
2023-10-13 08:41:51,839 epoch 10 - iter 81/275 - loss 0.00947176 - time (sec): 3.81 - samples/sec: 1707.63 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 08:41:53,089 epoch 10 - iter 108/275 - loss 0.01091956 - time (sec): 5.06 - samples/sec: 1724.90 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 08:41:54,287 epoch 10 - iter 135/275 - loss 0.01359143 - time (sec): 6.26 - samples/sec: 1808.40 - lr: 0.000002 - momentum: 0.000000 |
|
2023-10-13 08:41:55,487 epoch 10 - iter 162/275 - loss 0.01467793 - time (sec): 7.46 - samples/sec: 1772.72 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 08:41:56,784 epoch 10 - iter 189/275 - loss 0.01368588 - time (sec): 8.75 - samples/sec: 1785.40 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 08:41:57,986 epoch 10 - iter 216/275 - loss 0.01604266 - time (sec): 9.96 - samples/sec: 1799.19 - lr: 0.000001 - momentum: 0.000000 |
|
2023-10-13 08:41:59,170 epoch 10 - iter 243/275 - loss 0.01505167 - time (sec): 11.14 - samples/sec: 1789.24 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 08:42:00,378 epoch 10 - iter 270/275 - loss 0.01518889 - time (sec): 12.35 - samples/sec: 1808.44 - lr: 0.000000 - momentum: 0.000000 |
|
2023-10-13 08:42:00,598 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:42:00,598 EPOCH 10 done: loss 0.0156 - lr: 0.000000 |
|
2023-10-13 08:42:01,252 DEV : loss 0.1532716304063797 - f1-score (micro avg) 0.8793 |
|
2023-10-13 08:42:01,614 ---------------------------------------------------------------------------------------------------- |
|
2023-10-13 08:42:01,616 Loading model from best epoch ... |
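The checkpoint reloaded at this point is the one written at the last "saving best model" line above. The block below is a minimal sketch of that selection using the dev micro F1 values logged after each epoch. The rule that a checkpoint is written only on a strict improvement is an assumption inferred from this log, and `best_epoch` is a hypothetical helper, not a library function.

```python
# Dev micro-average F1 after each epoch, copied from the "DEV :" lines.
DEV_F1 = [0.5938, 0.8047, 0.8547, 0.8656, 0.8707,
          0.8729, 0.8841, 0.8868, 0.8889, 0.8793]


def best_epoch(scores):
    """Return (best epoch, epochs at which a checkpoint would be saved).

    Assumes a checkpoint is saved only when the dev score strictly improves.
    Epochs are numbered from 1, as in the log.
    """
    best, best_score = None, float("-inf")
    saved = []
    for epoch, score in enumerate(scores, start=1):
        if score > best_score:
            best, best_score = epoch, score
            saved.append(epoch)
    return best, saved


if __name__ == "__main__":
    epoch, saved = best_epoch(DEV_F1)
    print(f"best epoch: {epoch}, checkpoints saved after epochs: {saved}")
```

With the logged values, the dev score drops from 0.8889 to 0.8793 in the final epoch, so no checkpoint follows epoch 10 and the final test evaluation uses the epoch 9 model.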
|
2023-10-13 08:42:03,286 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date |
|
2023-10-13 08:42:04,059 |
|
Results: |
|
- F-score (micro) 0.9147 |
|
- F-score (macro) 0.8082 |
|
- Accuracy 0.8551 |
|
|
|
By class: |
|
              precision    recall  f1-score   support |
|
|
|
       scope     0.8852    0.9205    0.9025       176 |
|
        pers     0.9839    0.9531    0.9683       128 |
|
        work     0.8375    0.9054    0.8701        74 |
|
         loc     0.6667    1.0000    0.8000         2 |
|
      object     0.5000    0.5000    0.5000         2 |
|
|
|
   micro avg     0.9031    0.9267    0.9147       382 |
|
   macro avg     0.7747    0.8558    0.8082       382 |
|
weighted avg     0.9059    0.9267    0.9156       382 |
|
|
|
2023-10-13 08:42:04,059 ---------------------------------------------------------------------------------------------------- |
|
|
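The three averaged F-scores in the results table can be re-derived from the per-class rows. The block below is a minimal sketch under one assumption: the integer counts of correct and predicted spans are not logged, so they are inferred by rounding from the four-decimal precision and recall values. The helper names (`reconstruct_counts`, `f1`, `summarize`) are hypothetical, and the evaluation code that produced the log may compute these figures differently.

```python
# (precision, recall, support) per class, copied from the "By class" table.
BY_CLASS = {
    "scope": (0.8852, 0.9205, 176),
    "pers": (0.9839, 0.9531, 128),
    "work": (0.8375, 0.9054, 74),
    "loc": (0.6667, 1.0000, 2),
    "object": (0.5000, 0.5000, 2),
}


def reconstruct_counts(precision, recall, support):
    """Infer (true positives, predicted spans) by rounding; an assumption."""
    tp = round(recall * support)
    predicted = round(tp / precision)
    return tp, predicted


def f1(tp, predicted, support):
    """F1 written in terms of counts: 2*TP / (predicted + gold)."""
    return 2 * tp / (predicted + support)


def summarize(by_class):
    rows = []
    for precision, recall, support in by_class.values():
        tp, predicted = reconstruct_counts(precision, recall, support)
        rows.append((tp, predicted, support))
    total_tp = sum(r[0] for r in rows)
    total_pred = sum(r[1] for r in rows)
    total_gold = sum(r[2] for r in rows)
    per_class_f1 = [f1(*r) for r in rows]
    # Micro: pool the counts of all classes, then compute one F1.
    micro = f1(total_tp, total_pred, total_gold)
    # Macro: unweighted mean of the per-class F1 values.
    macro = sum(per_class_f1) / len(per_class_f1)
    # Weighted: per-class F1 weighted by the class support.
    weighted = sum(f * r[2] for f, r in zip(per_class_f1, rows)) / total_gold
    return micro, macro, weighted


if __name__ == "__main__":
    micro, macro, weighted = summarize(BY_CLASS)
    print(f"micro {micro:.4f}  macro {macro:.4f}  weighted {weighted:.4f}")
```

The gap between the micro and macro scores comes from `loc` and `object`, which have only two test spans each: they barely affect the pooled counts but carry the same weight as `scope` and `pers` in the unweighted mean.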