2023-10-17 12:46:59,129 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 12:46:59,130 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 12:46:59,130 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 Train: 7936 sentences
2023-10-17 12:46:59,130 (train_with_dev=False, train_with_test=False)
2023-10-17 12:46:59,130 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 Training Params:
2023-10-17 12:46:59,130 - learning_rate: "5e-05"
2023-10-17 12:46:59,130 - mini_batch_size: "4"
2023-10-17 12:46:59,130 - max_epochs: "10"
2023-10-17 12:46:59,130 - shuffle: "True"
2023-10-17 12:46:59,130 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 Plugins:
2023-10-17 12:46:59,130 - TensorboardLogger
2023-10-17 12:46:59,130 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 12:46:59,130 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:46:59,130 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:46:59,130 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 Computation:
2023-10-17 12:46:59,130 - compute on device: cuda:0
2023-10-17 12:46:59,131 - embedding storage: none
2023-10-17 12:46:59,131 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,131 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 12:46:59,131 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,131 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,131 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 12:47:08,412 epoch 1 - iter 198/1984 - loss 1.95138784 - time (sec): 9.28 - samples/sec: 1840.03 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:47:17,762 epoch 1 - iter 396/1984 - loss 1.16755511 - time (sec): 18.63 - samples/sec: 1773.29 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:47:27,276 epoch 1 - iter 594/1984 - loss 0.85631491 - time (sec): 28.14 - samples/sec: 1778.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:47:36,514 epoch 1 - iter 792/1984 - loss 0.69613444 - time (sec): 37.38 - samples/sec: 1762.40 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:47:45,647 epoch 1 - iter 990/1984 - loss 0.58337391 - time (sec): 46.52 - samples/sec: 1784.74 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:47:54,968 epoch 1 - iter 1188/1984 - loss 0.50718607 - time (sec): 55.84 - samples/sec: 1799.80 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:48:04,334 epoch 1 - iter 1386/1984 - loss 0.45379013 - time (sec): 65.20 - samples/sec: 1804.34 - lr: 0.000035 - momentum: 0.000000
2023-10-17 12:48:13,644 epoch 1 - iter 1584/1984 - loss 0.42012684 - time (sec): 74.51 - samples/sec: 1783.88 - lr: 0.000040 - momentum: 0.000000
2023-10-17 12:48:22,679 epoch 1 - iter 1782/1984 - loss 0.39242579 - time (sec): 83.55 - samples/sec: 1771.63 - lr: 0.000045 - momentum: 0.000000
2023-10-17 12:48:31,375 epoch 1 - iter 1980/1984 - loss 0.36710615 - time (sec): 92.24 - samples/sec: 1774.74 - lr: 0.000050 - momentum: 0.000000
2023-10-17 12:48:31,545 ----------------------------------------------------------------------------------------------------
2023-10-17 12:48:31,545 EPOCH 1 done: loss 0.3670 - lr: 0.000050
2023-10-17 12:48:34,796 DEV : loss 0.08705931901931763 - f1-score (micro avg) 0.7042
2023-10-17 12:48:34,817 saving best model
2023-10-17 12:48:35,263 ----------------------------------------------------------------------------------------------------
2023-10-17 12:48:44,219 epoch 2 - iter 198/1984 - loss 0.13447725 - time (sec): 8.95 - samples/sec: 1684.39 - lr: 0.000049 - momentum: 0.000000
2023-10-17 12:48:53,342 epoch 2 - iter 396/1984 - loss 0.12519486 - time (sec): 18.08 - samples/sec: 1746.79 - lr: 0.000049 - momentum: 0.000000
2023-10-17 12:49:02,617 epoch 2 - iter 594/1984 - loss 0.12768946 - time (sec): 27.35 - samples/sec: 1716.69 - lr: 0.000048 - momentum: 0.000000
2023-10-17 12:49:12,897 epoch 2 - iter 792/1984 - loss 0.12434640 - time (sec): 37.63 - samples/sec: 1686.98 - lr: 0.000048 - momentum: 0.000000
2023-10-17 12:49:22,703 epoch 2 - iter 990/1984 - loss 0.12438353 - time (sec): 47.44 - samples/sec: 1692.29 - lr: 0.000047 - momentum: 0.000000
2023-10-17 12:49:31,761 epoch 2 - iter 1188/1984 - loss 0.12321391 - time (sec): 56.50 - samples/sec: 1711.71 - lr: 0.000047 - momentum: 0.000000
2023-10-17 12:49:40,930 epoch 2 - iter 1386/1984 - loss 0.12223100 - time (sec): 65.67 - samples/sec: 1715.45 - lr: 0.000046 - momentum: 0.000000
2023-10-17 12:49:50,575 epoch 2 - iter 1584/1984 - loss 0.12393554 - time (sec): 75.31 - samples/sec: 1718.04 - lr: 0.000046 - momentum: 0.000000
2023-10-17 12:49:59,914 epoch 2 - iter 1782/1984 - loss 0.12338924 - time (sec): 84.65 - samples/sec: 1734.21 - lr: 0.000045 - momentum: 0.000000
2023-10-17 12:50:09,035 epoch 2 - iter 1980/1984 - loss 0.12187740 - time (sec): 93.77 - samples/sec: 1745.61 - lr: 0.000044 - momentum: 0.000000
2023-10-17 12:50:09,216 ----------------------------------------------------------------------------------------------------
2023-10-17 12:50:09,217 EPOCH 2 done: loss 0.1217 - lr: 0.000044
2023-10-17 12:50:13,258 DEV : loss 0.11133266240358353 - f1-score (micro avg) 0.7549
2023-10-17 12:50:13,279 saving best model
2023-10-17 12:50:13,803 ----------------------------------------------------------------------------------------------------
2023-10-17 12:50:23,285 epoch 3 - iter 198/1984 - loss 0.10677996 - time (sec): 9.48 - samples/sec: 1681.00 - lr: 0.000044 - momentum: 0.000000
2023-10-17 12:50:32,630 epoch 3 - iter 396/1984 - loss 0.10024302 - time (sec): 18.82 - samples/sec: 1736.37 - lr: 0.000043 - momentum: 0.000000
2023-10-17 12:50:41,612 epoch 3 - iter 594/1984 - loss 0.09584361 - time (sec): 27.80 - samples/sec: 1762.32 - lr: 0.000043 - momentum: 0.000000
2023-10-17 12:50:50,953 epoch 3 - iter 792/1984 - loss 0.09534706 - time (sec): 37.15 - samples/sec: 1757.15 - lr: 0.000042 - momentum: 0.000000
2023-10-17 12:51:00,042 epoch 3 - iter 990/1984 - loss 0.09438992 - time (sec): 46.23 - samples/sec: 1747.78 - lr: 0.000042 - momentum: 0.000000
2023-10-17 12:51:09,296 epoch 3 - iter 1188/1984 - loss 0.09441971 - time (sec): 55.49 - samples/sec: 1746.25 - lr: 0.000041 - momentum: 0.000000
2023-10-17 12:51:18,660 epoch 3 - iter 1386/1984 - loss 0.09258276 - time (sec): 64.85 - samples/sec: 1752.18 - lr: 0.000041 - momentum: 0.000000
2023-10-17 12:51:27,819 epoch 3 - iter 1584/1984 - loss 0.09280077 - time (sec): 74.01 - samples/sec: 1778.28 - lr: 0.000040 - momentum: 0.000000
2023-10-17 12:51:36,996 epoch 3 - iter 1782/1984 - loss 0.09287512 - time (sec): 83.19 - samples/sec: 1776.15 - lr: 0.000039 - momentum: 0.000000
2023-10-17 12:51:46,023 epoch 3 - iter 1980/1984 - loss 0.09227744 - time (sec): 92.22 - samples/sec: 1774.99 - lr: 0.000039 - momentum: 0.000000
2023-10-17 12:51:46,205 ----------------------------------------------------------------------------------------------------
2023-10-17 12:51:46,205 EPOCH 3 done: loss 0.0921 - lr: 0.000039
2023-10-17 12:51:49,815 DEV : loss 0.12508752942085266 - f1-score (micro avg) 0.7457
2023-10-17 12:51:49,839 ----------------------------------------------------------------------------------------------------
2023-10-17 12:51:59,362 epoch 4 - iter 198/1984 - loss 0.07168588 - time (sec): 9.52 - samples/sec: 1798.75 - lr: 0.000038 - momentum: 0.000000
2023-10-17 12:52:08,583 epoch 4 - iter 396/1984 - loss 0.06670880 - time (sec): 18.74 - samples/sec: 1756.76 - lr: 0.000038 - momentum: 0.000000
2023-10-17 12:52:17,944 epoch 4 - iter 594/1984 - loss 0.06287904 - time (sec): 28.10 - samples/sec: 1742.76 - lr: 0.000037 - momentum: 0.000000
2023-10-17 12:52:27,311 epoch 4 - iter 792/1984 - loss 0.06538020 - time (sec): 37.47 - samples/sec: 1735.71 - lr: 0.000037 - momentum: 0.000000
2023-10-17 12:52:36,780 epoch 4 - iter 990/1984 - loss 0.06605263 - time (sec): 46.94 - samples/sec: 1743.88 - lr: 0.000036 - momentum: 0.000000
2023-10-17 12:52:46,097 epoch 4 - iter 1188/1984 - loss 0.06749132 - time (sec): 56.26 - samples/sec: 1734.01 - lr: 0.000036 - momentum: 0.000000
2023-10-17 12:52:55,393 epoch 4 - iter 1386/1984 - loss 0.07064539 - time (sec): 65.55 - samples/sec: 1746.22 - lr: 0.000035 - momentum: 0.000000
2023-10-17 12:53:04,548 epoch 4 - iter 1584/1984 - loss 0.07083371 - time (sec): 74.71 - samples/sec: 1753.43 - lr: 0.000034 - momentum: 0.000000
2023-10-17 12:53:13,831 epoch 4 - iter 1782/1984 - loss 0.07085589 - time (sec): 83.99 - samples/sec: 1755.38 - lr: 0.000034 - momentum: 0.000000
2023-10-17 12:53:23,039 epoch 4 - iter 1980/1984 - loss 0.07179751 - time (sec): 93.20 - samples/sec: 1755.99 - lr: 0.000033 - momentum: 0.000000
2023-10-17 12:53:23,235 ----------------------------------------------------------------------------------------------------
2023-10-17 12:53:23,235 EPOCH 4 done: loss 0.0719 - lr: 0.000033
2023-10-17 12:53:26,874 DEV : loss 0.15007272362709045 - f1-score (micro avg) 0.7434
2023-10-17 12:53:26,898 ----------------------------------------------------------------------------------------------------
2023-10-17 12:53:36,403 epoch 5 - iter 198/1984 - loss 0.05191204 - time (sec): 9.50 - samples/sec: 1735.42 - lr: 0.000033 - momentum: 0.000000
2023-10-17 12:53:45,521 epoch 5 - iter 396/1984 - loss 0.04935558 - time (sec): 18.62 - samples/sec: 1774.10 - lr: 0.000032 - momentum: 0.000000
2023-10-17 12:53:54,441 epoch 5 - iter 594/1984 - loss 0.05488619 - time (sec): 27.54 - samples/sec: 1768.34 - lr: 0.000032 - momentum: 0.000000
2023-10-17 12:54:03,624 epoch 5 - iter 792/1984 - loss 0.05643838 - time (sec): 36.72 - samples/sec: 1766.61 - lr: 0.000031 - momentum: 0.000000
2023-10-17 12:54:12,819 epoch 5 - iter 990/1984 - loss 0.05590674 - time (sec): 45.92 - samples/sec: 1773.21 - lr: 0.000031 - momentum: 0.000000
2023-10-17 12:54:21,976 epoch 5 - iter 1188/1984 - loss 0.05482986 - time (sec): 55.08 - samples/sec: 1755.80 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:54:31,108 epoch 5 - iter 1386/1984 - loss 0.05386415 - time (sec): 64.21 - samples/sec: 1772.47 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:54:40,403 epoch 5 - iter 1584/1984 - loss 0.05340092 - time (sec): 73.50 - samples/sec: 1770.69 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:54:49,745 epoch 5 - iter 1782/1984 - loss 0.05379444 - time (sec): 82.85 - samples/sec: 1771.56 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:54:59,027 epoch 5 - iter 1980/1984 - loss 0.05432464 - time (sec): 92.13 - samples/sec: 1776.36 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:54:59,213 ----------------------------------------------------------------------------------------------------
2023-10-17 12:54:59,214 EPOCH 5 done: loss 0.0543 - lr: 0.000028
2023-10-17 12:55:02,841 DEV : loss 0.20846031606197357 - f1-score (micro avg) 0.7471
2023-10-17 12:55:02,866 ----------------------------------------------------------------------------------------------------
2023-10-17 12:55:12,114 epoch 6 - iter 198/1984 - loss 0.03953401 - time (sec): 9.25 - samples/sec: 1769.03 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:55:21,364 epoch 6 - iter 396/1984 - loss 0.03812773 - time (sec): 18.50 - samples/sec: 1765.39 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:55:30,551 epoch 6 - iter 594/1984 - loss 0.04268890 - time (sec): 27.68 - samples/sec: 1778.53 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:55:40,019 epoch 6 - iter 792/1984 - loss 0.04138806 - time (sec): 37.15 - samples/sec: 1769.41 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:55:49,073 epoch 6 - iter 990/1984 - loss 0.04025963 - time (sec): 46.21 - samples/sec: 1763.61 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:55:58,111 epoch 6 - iter 1188/1984 - loss 0.03971711 - time (sec): 55.24 - samples/sec: 1746.68 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:56:07,294 epoch 6 - iter 1386/1984 - loss 0.04025272 - time (sec): 64.43 - samples/sec: 1752.53 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:56:16,467 epoch 6 - iter 1584/1984 - loss 0.03959126 - time (sec): 73.60 - samples/sec: 1771.43 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:56:25,663 epoch 6 - iter 1782/1984 - loss 0.03963302 - time (sec): 82.80 - samples/sec: 1766.72 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:56:35,200 epoch 6 - iter 1980/1984 - loss 0.03953747 - time (sec): 92.33 - samples/sec: 1772.28 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:56:35,381 ----------------------------------------------------------------------------------------------------
2023-10-17 12:56:35,381 EPOCH 6 done: loss 0.0396 - lr: 0.000022
2023-10-17 12:56:39,041 DEV : loss 0.21264401078224182 - f1-score (micro avg) 0.7517
2023-10-17 12:56:39,065 ----------------------------------------------------------------------------------------------------
2023-10-17 12:56:48,296 epoch 7 - iter 198/1984 - loss 0.02510474 - time (sec): 9.23 - samples/sec: 1692.24 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:56:57,912 epoch 7 - iter 396/1984 - loss 0.02487195 - time (sec): 18.85 - samples/sec: 1760.14 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:57:07,100 epoch 7 - iter 594/1984 - loss 0.02627481 - time (sec): 28.03 - samples/sec: 1757.59 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:57:16,372 epoch 7 - iter 792/1984 - loss 0.03015138 - time (sec): 37.31 - samples/sec: 1746.05 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:57:25,611 epoch 7 - iter 990/1984 - loss 0.02986521 - time (sec): 46.54 - samples/sec: 1749.85 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:57:34,638 epoch 7 - iter 1188/1984 - loss 0.02921708 - time (sec): 55.57 - samples/sec: 1768.09 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:57:44,038 epoch 7 - iter 1386/1984 - loss 0.02876629 - time (sec): 64.97 - samples/sec: 1773.71 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:57:53,266 epoch 7 - iter 1584/1984 - loss 0.02894997 - time (sec): 74.20 - samples/sec: 1770.84 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:58:02,591 epoch 7 - iter 1782/1984 - loss 0.03019572 - time (sec): 83.52 - samples/sec: 1765.81 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:58:11,832 epoch 7 - iter 1980/1984 - loss 0.02983351 - time (sec): 92.77 - samples/sec: 1764.81 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:58:12,018 ----------------------------------------------------------------------------------------------------
2023-10-17 12:58:12,018 EPOCH 7 done: loss 0.0299 - lr: 0.000017
2023-10-17 12:58:16,123 DEV : loss 0.23210309445858002 - f1-score (micro avg) 0.7468
2023-10-17 12:58:16,144 ----------------------------------------------------------------------------------------------------
2023-10-17 12:58:25,261 epoch 8 - iter 198/1984 - loss 0.02662398 - time (sec): 9.12 - samples/sec: 1763.88 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:58:34,458 epoch 8 - iter 396/1984 - loss 0.02320812 - time (sec): 18.31 - samples/sec: 1753.52 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:58:43,623 epoch 8 - iter 594/1984 - loss 0.02078829 - time (sec): 27.48 - samples/sec: 1758.96 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:58:53,133 epoch 8 - iter 792/1984 - loss 0.01981068 - time (sec): 36.99 - samples/sec: 1754.70 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:59:02,118 epoch 8 - iter 990/1984 - loss 0.02018388 - time (sec): 45.97 - samples/sec: 1780.30 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:59:11,160 epoch 8 - iter 1188/1984 - loss 0.02143587 - time (sec): 55.02 - samples/sec: 1768.00 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:59:20,403 epoch 8 - iter 1386/1984 - loss 0.02108215 - time (sec): 64.26 - samples/sec: 1779.78 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:59:29,551 epoch 8 - iter 1584/1984 - loss 0.02050008 - time (sec): 73.41 - samples/sec: 1790.79 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:59:38,685 epoch 8 - iter 1782/1984 - loss 0.02034437 - time (sec): 82.54 - samples/sec: 1790.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:59:47,933 epoch 8 - iter 1980/1984 - loss 0.02018234 - time (sec): 91.79 - samples/sec: 1782.77 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:59:48,133 ----------------------------------------------------------------------------------------------------
2023-10-17 12:59:48,133 EPOCH 8 done: loss 0.0203 - lr: 0.000011
2023-10-17 12:59:51,750 DEV : loss 0.23719041049480438 - f1-score (micro avg) 0.7626
2023-10-17 12:59:51,774 saving best model
2023-10-17 12:59:52,342 ----------------------------------------------------------------------------------------------------
2023-10-17 13:00:01,612 epoch 9 - iter 198/1984 - loss 0.01128630 - time (sec): 9.27 - samples/sec: 1795.34 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:00:10,749 epoch 9 - iter 396/1984 - loss 0.00943051 - time (sec): 18.41 - samples/sec: 1820.75 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:00:19,875 epoch 9 - iter 594/1984 - loss 0.01123883 - time (sec): 27.53 - samples/sec: 1777.99 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:00:29,376 epoch 9 - iter 792/1984 - loss 0.01104369 - time (sec): 37.03 - samples/sec: 1767.64 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:00:38,816 epoch 9 - iter 990/1984 - loss 0.01110156 - time (sec): 46.47 - samples/sec: 1763.83 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:00:48,150 epoch 9 - iter 1188/1984 - loss 0.01314728 - time (sec): 55.81 - samples/sec: 1775.48 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:00:57,267 epoch 9 - iter 1386/1984 - loss 0.01317730 - time (sec): 64.92 - samples/sec: 1764.93 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:01:06,731 epoch 9 - iter 1584/1984 - loss 0.01382017 - time (sec): 74.39 - samples/sec: 1760.10 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:01:15,926 epoch 9 - iter 1782/1984 - loss 0.01358330 - time (sec): 83.58 - samples/sec: 1759.37 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:01:25,451 epoch 9 - iter 1980/1984 - loss 0.01373408 - time (sec): 93.11 - samples/sec: 1756.06 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:01:25,664 ----------------------------------------------------------------------------------------------------
2023-10-17 13:01:25,665 EPOCH 9 done: loss 0.0138 - lr: 0.000006
2023-10-17 13:01:29,408 DEV : loss 0.2416975200176239 - f1-score (micro avg) 0.7531
2023-10-17 13:01:29,434 ----------------------------------------------------------------------------------------------------
2023-10-17 13:01:38,722 epoch 10 - iter 198/1984 - loss 0.00724275 - time (sec): 9.29 - samples/sec: 1731.53 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:01:48,062 epoch 10 - iter 396/1984 - loss 0.00878620 - time (sec): 18.63 - samples/sec: 1741.75 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:01:57,341 epoch 10 - iter 594/1984 - loss 0.00792020 - time (sec): 27.91 - samples/sec: 1788.05 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:02:06,885 epoch 10 - iter 792/1984 - loss 0.00864729 - time (sec): 37.45 - samples/sec: 1766.05 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:02:16,926 epoch 10 - iter 990/1984 - loss 0.00875296 - time (sec): 47.49 - samples/sec: 1730.47 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:02:26,316 epoch 10 - iter 1188/1984 - loss 0.00957040 - time (sec): 56.88 - samples/sec: 1731.11 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:02:35,536 epoch 10 - iter 1386/1984 - loss 0.00980326 - time (sec): 66.10 - samples/sec: 1733.47 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:02:44,848 epoch 10 - iter 1584/1984 - loss 0.00931466 - time (sec): 75.41 - samples/sec: 1747.44 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:02:53,940 epoch 10 - iter 1782/1984 - loss 0.00906989 - time (sec): 84.50 - samples/sec: 1750.28 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:03:03,073 epoch 10 - iter 1980/1984 - loss 0.00906925 - time (sec): 93.64 - samples/sec: 1748.58 - lr: 0.000000 - momentum: 0.000000
2023-10-17 13:03:03,259 ----------------------------------------------------------------------------------------------------
2023-10-17 13:03:03,259 EPOCH 10 done: loss 0.0092 - lr: 0.000000
2023-10-17 13:03:06,977 DEV : loss 0.25551512837409973 - f1-score (micro avg) 0.7566
2023-10-17 13:03:07,490 ----------------------------------------------------------------------------------------------------
2023-10-17 13:03:07,492 Loading model from best epoch ...
2023-10-17 13:03:09,090 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 13:03:12,571
Results:
- F-score (micro) 0.7774
- F-score (macro) 0.6822
- Accuracy 0.6592
By class:
              precision    recall  f1-score   support

         LOC     0.8343    0.8611    0.8475       655
         PER     0.6920    0.7758    0.7315       223
         ORG     0.5192    0.4252    0.4675       127

   micro avg     0.7680    0.7871    0.7774      1005
   macro avg     0.6819    0.6874    0.6822      1005
weighted avg     0.7629    0.7871    0.7737      1005
2023-10-17 13:03:12,571 ----------------------------------------------------------------------------------------------------
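
The averaged rows in the final report can be cross-checked against the per-class rows. The sketch below is not part of the Flair output. It assumes that true-positive and predicted-span counts can be recovered by rounding recall * support and tp / precision, which holds only as far as the four-decimal values in the report allow. The names `report`, `recover_counts` and `averages` are hypothetical helpers introduced for illustration.

```python
# Per-class rows copied from the report above: (precision, recall, f1, support).
report = {
    "LOC": (0.8343, 0.8611, 0.8475, 655),
    "PER": (0.6920, 0.7758, 0.7315, 223),
    "ORG": (0.5192, 0.4252, 0.4675, 127),
}


def recover_counts(precision, recall, support):
    """Estimate integer counts from rounded scores (an assumption, see lead-in)."""
    true_positives = round(recall * support)      # recall = tp / gold spans
    predicted = round(true_positives / precision)  # precision = tp / predicted spans
    return true_positives, predicted


def averages(rows):
    """Recompute micro, macro and weighted scores from per-class rows."""
    tp_total = predicted_total = gold_total = 0
    for precision, recall, _f1, support in rows.values():
        tp, predicted = recover_counts(precision, recall, support)
        tp_total += tp
        predicted_total += predicted
        gold_total += support

    # Micro average: pool the counts over all classes, then score once.
    micro_p = tp_total / predicted_total
    micro_r = tp_total / gold_total
    micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

    # Macro average: unweighted mean of the per-class F1 scores.
    macro_f1 = sum(f1 for _p, _r, f1, _s in rows.values()) / len(rows)

    # Weighted average: per-class F1 weighted by class support.
    weighted_f1 = sum(f1 * s for _p, _r, f1, s in rows.values()) / gold_total

    return {
        "micro_precision": micro_p,
        "micro_recall": micro_r,
        "micro_f1": micro_f1,
        "macro_f1": macro_f1,
        "weighted_f1": weighted_f1,
    }


summary = averages(report)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Because ORG has the lowest F1 and the smallest support, it pulls the macro average well below the micro average, which is dominated by the much more frequent LOC class.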