2023-10-17 12:46:59,129 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=13, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 12:46:59,130 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 MultiCorpus: 7936 train + 992 dev + 992 test sentences
 - NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 12:46:59,130
----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 Train: 7936 sentences
2023-10-17 12:46:59,130 (train_with_dev=False, train_with_test=False)
2023-10-17 12:46:59,130 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 Training Params:
2023-10-17 12:46:59,130 - learning_rate: "5e-05"
2023-10-17 12:46:59,130 - mini_batch_size: "4"
2023-10-17 12:46:59,130 - max_epochs: "10"
2023-10-17 12:46:59,130 - shuffle: "True"
2023-10-17 12:46:59,130 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 Plugins:
2023-10-17 12:46:59,130 - TensorboardLogger
2023-10-17 12:46:59,130 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 12:46:59,130 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:46:59,130 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:46:59,130 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,130 Computation:
2023-10-17 12:46:59,130 - compute on device: cuda:0
2023-10-17 12:46:59,131 - embedding storage: none
2023-10-17 12:46:59,131 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,131 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs4-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 12:46:59,131 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,131 ----------------------------------------------------------------------------------------------------
2023-10-17 12:46:59,131 Logging anything other than scalars to TensorBoard is
currently not supported.
2023-10-17 12:47:08,412 epoch 1 - iter 198/1984 - loss 1.95138784 - time (sec): 9.28 - samples/sec: 1840.03 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:47:17,762 epoch 1 - iter 396/1984 - loss 1.16755511 - time (sec): 18.63 - samples/sec: 1773.29 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:47:27,276 epoch 1 - iter 594/1984 - loss 0.85631491 - time (sec): 28.14 - samples/sec: 1778.06 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:47:36,514 epoch 1 - iter 792/1984 - loss 0.69613444 - time (sec): 37.38 - samples/sec: 1762.40 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:47:45,647 epoch 1 - iter 990/1984 - loss 0.58337391 - time (sec): 46.52 - samples/sec: 1784.74 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:47:54,968 epoch 1 - iter 1188/1984 - loss 0.50718607 - time (sec): 55.84 - samples/sec: 1799.80 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:48:04,334 epoch 1 - iter 1386/1984 - loss 0.45379013 - time (sec): 65.20 - samples/sec: 1804.34 - lr: 0.000035 - momentum: 0.000000
2023-10-17 12:48:13,644 epoch 1 - iter 1584/1984 - loss 0.42012684 - time (sec): 74.51 - samples/sec: 1783.88 - lr: 0.000040 - momentum: 0.000000
2023-10-17 12:48:22,679 epoch 1 - iter 1782/1984 - loss 0.39242579 - time (sec): 83.55 - samples/sec: 1771.63 - lr: 0.000045 - momentum: 0.000000
2023-10-17 12:48:31,375 epoch 1 - iter 1980/1984 - loss 0.36710615 - time (sec): 92.24 - samples/sec: 1774.74 - lr: 0.000050 - momentum: 0.000000
2023-10-17 12:48:31,545 ----------------------------------------------------------------------------------------------------
2023-10-17 12:48:31,545 EPOCH 1 done: loss 0.3670 - lr: 0.000050
2023-10-17 12:48:34,796 DEV : loss 0.08705931901931763 - f1-score (micro avg) 0.7042
2023-10-17 12:48:34,817 saving best model
2023-10-17 12:48:35,263 ----------------------------------------------------------------------------------------------------
2023-10-17 12:48:44,219 epoch 2 - iter 198/1984 - loss 0.13447725 - time
(sec): 8.95 - samples/sec: 1684.39 - lr: 0.000049 - momentum: 0.000000
2023-10-17 12:48:53,342 epoch 2 - iter 396/1984 - loss 0.12519486 - time (sec): 18.08 - samples/sec: 1746.79 - lr: 0.000049 - momentum: 0.000000
2023-10-17 12:49:02,617 epoch 2 - iter 594/1984 - loss 0.12768946 - time (sec): 27.35 - samples/sec: 1716.69 - lr: 0.000048 - momentum: 0.000000
2023-10-17 12:49:12,897 epoch 2 - iter 792/1984 - loss 0.12434640 - time (sec): 37.63 - samples/sec: 1686.98 - lr: 0.000048 - momentum: 0.000000
2023-10-17 12:49:22,703 epoch 2 - iter 990/1984 - loss 0.12438353 - time (sec): 47.44 - samples/sec: 1692.29 - lr: 0.000047 - momentum: 0.000000
2023-10-17 12:49:31,761 epoch 2 - iter 1188/1984 - loss 0.12321391 - time (sec): 56.50 - samples/sec: 1711.71 - lr: 0.000047 - momentum: 0.000000
2023-10-17 12:49:40,930 epoch 2 - iter 1386/1984 - loss 0.12223100 - time (sec): 65.67 - samples/sec: 1715.45 - lr: 0.000046 - momentum: 0.000000
2023-10-17 12:49:50,575 epoch 2 - iter 1584/1984 - loss 0.12393554 - time (sec): 75.31 - samples/sec: 1718.04 - lr: 0.000046 - momentum: 0.000000
2023-10-17 12:49:59,914 epoch 2 - iter 1782/1984 - loss 0.12338924 - time (sec): 84.65 - samples/sec: 1734.21 - lr: 0.000045 - momentum: 0.000000
2023-10-17 12:50:09,035 epoch 2 - iter 1980/1984 - loss 0.12187740 - time (sec): 93.77 - samples/sec: 1745.61 - lr: 0.000044 - momentum: 0.000000
2023-10-17 12:50:09,216 ----------------------------------------------------------------------------------------------------
2023-10-17 12:50:09,217 EPOCH 2 done: loss 0.1217 - lr: 0.000044
2023-10-17 12:50:13,258 DEV : loss 0.11133266240358353 - f1-score (micro avg) 0.7549
2023-10-17 12:50:13,279 saving best model
2023-10-17 12:50:13,803 ----------------------------------------------------------------------------------------------------
2023-10-17 12:50:23,285 epoch 3 - iter 198/1984 - loss 0.10677996 - time (sec): 9.48 - samples/sec: 1681.00 - lr: 0.000044 - momentum: 0.000000
2023-10-17 12:50:32,630 epoch 3
- iter 396/1984 - loss 0.10024302 - time (sec): 18.82 - samples/sec: 1736.37 - lr: 0.000043 - momentum: 0.000000
2023-10-17 12:50:41,612 epoch 3 - iter 594/1984 - loss 0.09584361 - time (sec): 27.80 - samples/sec: 1762.32 - lr: 0.000043 - momentum: 0.000000
2023-10-17 12:50:50,953 epoch 3 - iter 792/1984 - loss 0.09534706 - time (sec): 37.15 - samples/sec: 1757.15 - lr: 0.000042 - momentum: 0.000000
2023-10-17 12:51:00,042 epoch 3 - iter 990/1984 - loss 0.09438992 - time (sec): 46.23 - samples/sec: 1747.78 - lr: 0.000042 - momentum: 0.000000
2023-10-17 12:51:09,296 epoch 3 - iter 1188/1984 - loss 0.09441971 - time (sec): 55.49 - samples/sec: 1746.25 - lr: 0.000041 - momentum: 0.000000
2023-10-17 12:51:18,660 epoch 3 - iter 1386/1984 - loss 0.09258276 - time (sec): 64.85 - samples/sec: 1752.18 - lr: 0.000041 - momentum: 0.000000
2023-10-17 12:51:27,819 epoch 3 - iter 1584/1984 - loss 0.09280077 - time (sec): 74.01 - samples/sec: 1778.28 - lr: 0.000040 - momentum: 0.000000
2023-10-17 12:51:36,996 epoch 3 - iter 1782/1984 - loss 0.09287512 - time (sec): 83.19 - samples/sec: 1776.15 - lr: 0.000039 - momentum: 0.000000
2023-10-17 12:51:46,023 epoch 3 - iter 1980/1984 - loss 0.09227744 - time (sec): 92.22 - samples/sec: 1774.99 - lr: 0.000039 - momentum: 0.000000
2023-10-17 12:51:46,205 ----------------------------------------------------------------------------------------------------
2023-10-17 12:51:46,205 EPOCH 3 done: loss 0.0921 - lr: 0.000039
2023-10-17 12:51:49,815 DEV : loss 0.12508752942085266 - f1-score (micro avg) 0.7457
2023-10-17 12:51:49,839 ----------------------------------------------------------------------------------------------------
2023-10-17 12:51:59,362 epoch 4 - iter 198/1984 - loss 0.07168588 - time (sec): 9.52 - samples/sec: 1798.75 - lr: 0.000038 - momentum: 0.000000
2023-10-17 12:52:08,583 epoch 4 - iter 396/1984 - loss 0.06670880 - time (sec): 18.74 - samples/sec: 1756.76 - lr: 0.000038 - momentum: 0.000000
2023-10-17 12:52:17,944 epoch 4
- iter 594/1984 - loss 0.06287904 - time (sec): 28.10 - samples/sec: 1742.76 - lr: 0.000037 - momentum: 0.000000
2023-10-17 12:52:27,311 epoch 4 - iter 792/1984 - loss 0.06538020 - time (sec): 37.47 - samples/sec: 1735.71 - lr: 0.000037 - momentum: 0.000000
2023-10-17 12:52:36,780 epoch 4 - iter 990/1984 - loss 0.06605263 - time (sec): 46.94 - samples/sec: 1743.88 - lr: 0.000036 - momentum: 0.000000
2023-10-17 12:52:46,097 epoch 4 - iter 1188/1984 - loss 0.06749132 - time (sec): 56.26 - samples/sec: 1734.01 - lr: 0.000036 - momentum: 0.000000
2023-10-17 12:52:55,393 epoch 4 - iter 1386/1984 - loss 0.07064539 - time (sec): 65.55 - samples/sec: 1746.22 - lr: 0.000035 - momentum: 0.000000
2023-10-17 12:53:04,548 epoch 4 - iter 1584/1984 - loss 0.07083371 - time (sec): 74.71 - samples/sec: 1753.43 - lr: 0.000034 - momentum: 0.000000
2023-10-17 12:53:13,831 epoch 4 - iter 1782/1984 - loss 0.07085589 - time (sec): 83.99 - samples/sec: 1755.38 - lr: 0.000034 - momentum: 0.000000
2023-10-17 12:53:23,039 epoch 4 - iter 1980/1984 - loss 0.07179751 - time (sec): 93.20 - samples/sec: 1755.99 - lr: 0.000033 - momentum: 0.000000
2023-10-17 12:53:23,235 ----------------------------------------------------------------------------------------------------
2023-10-17 12:53:23,235 EPOCH 4 done: loss 0.0719 - lr: 0.000033
2023-10-17 12:53:26,874 DEV : loss 0.15007272362709045 - f1-score (micro avg) 0.7434
2023-10-17 12:53:26,898 ----------------------------------------------------------------------------------------------------
2023-10-17 12:53:36,403 epoch 5 - iter 198/1984 - loss 0.05191204 - time (sec): 9.50 - samples/sec: 1735.42 - lr: 0.000033 - momentum: 0.000000
2023-10-17 12:53:45,521 epoch 5 - iter 396/1984 - loss 0.04935558 - time (sec): 18.62 - samples/sec: 1774.10 - lr: 0.000032 - momentum: 0.000000
2023-10-17 12:53:54,441 epoch 5 - iter 594/1984 - loss 0.05488619 - time (sec): 27.54 - samples/sec: 1768.34 - lr: 0.000032 - momentum: 0.000000
2023-10-17 12:54:03,624 epoch 5
- iter 792/1984 - loss 0.05643838 - time (sec): 36.72 - samples/sec: 1766.61 - lr: 0.000031 - momentum: 0.000000
2023-10-17 12:54:12,819 epoch 5 - iter 990/1984 - loss 0.05590674 - time (sec): 45.92 - samples/sec: 1773.21 - lr: 0.000031 - momentum: 0.000000
2023-10-17 12:54:21,976 epoch 5 - iter 1188/1984 - loss 0.05482986 - time (sec): 55.08 - samples/sec: 1755.80 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:54:31,108 epoch 5 - iter 1386/1984 - loss 0.05386415 - time (sec): 64.21 - samples/sec: 1772.47 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:54:40,403 epoch 5 - iter 1584/1984 - loss 0.05340092 - time (sec): 73.50 - samples/sec: 1770.69 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:54:49,745 epoch 5 - iter 1782/1984 - loss 0.05379444 - time (sec): 82.85 - samples/sec: 1771.56 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:54:59,027 epoch 5 - iter 1980/1984 - loss 0.05432464 - time (sec): 92.13 - samples/sec: 1776.36 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:54:59,213 ----------------------------------------------------------------------------------------------------
2023-10-17 12:54:59,214 EPOCH 5 done: loss 0.0543 - lr: 0.000028
2023-10-17 12:55:02,841 DEV : loss 0.20846031606197357 - f1-score (micro avg) 0.7471
2023-10-17 12:55:02,866 ----------------------------------------------------------------------------------------------------
2023-10-17 12:55:12,114 epoch 6 - iter 198/1984 - loss 0.03953401 - time (sec): 9.25 - samples/sec: 1769.03 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:55:21,364 epoch 6 - iter 396/1984 - loss 0.03812773 - time (sec): 18.50 - samples/sec: 1765.39 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:55:30,551 epoch 6 - iter 594/1984 - loss 0.04268890 - time (sec): 27.68 - samples/sec: 1778.53 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:55:40,019 epoch 6 - iter 792/1984 - loss 0.04138806 - time (sec): 37.15 - samples/sec: 1769.41 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:55:49,073 epoch 6
- iter 990/1984 - loss 0.04025963 - time (sec): 46.21 - samples/sec: 1763.61 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:55:58,111 epoch 6 - iter 1188/1984 - loss 0.03971711 - time (sec): 55.24 - samples/sec: 1746.68 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:56:07,294 epoch 6 - iter 1386/1984 - loss 0.04025272 - time (sec): 64.43 - samples/sec: 1752.53 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:56:16,467 epoch 6 - iter 1584/1984 - loss 0.03959126 - time (sec): 73.60 - samples/sec: 1771.43 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:56:25,663 epoch 6 - iter 1782/1984 - loss 0.03963302 - time (sec): 82.80 - samples/sec: 1766.72 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:56:35,200 epoch 6 - iter 1980/1984 - loss 0.03953747 - time (sec): 92.33 - samples/sec: 1772.28 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:56:35,381 ----------------------------------------------------------------------------------------------------
2023-10-17 12:56:35,381 EPOCH 6 done: loss 0.0396 - lr: 0.000022
2023-10-17 12:56:39,041 DEV : loss 0.21264401078224182 - f1-score (micro avg) 0.7517
2023-10-17 12:56:39,065 ----------------------------------------------------------------------------------------------------
2023-10-17 12:56:48,296 epoch 7 - iter 198/1984 - loss 0.02510474 - time (sec): 9.23 - samples/sec: 1692.24 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:56:57,912 epoch 7 - iter 396/1984 - loss 0.02487195 - time (sec): 18.85 - samples/sec: 1760.14 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:57:07,100 epoch 7 - iter 594/1984 - loss 0.02627481 - time (sec): 28.03 - samples/sec: 1757.59 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:57:16,372 epoch 7 - iter 792/1984 - loss 0.03015138 - time (sec): 37.31 - samples/sec: 1746.05 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:57:25,611 epoch 7 - iter 990/1984 - loss 0.02986521 - time (sec): 46.54 - samples/sec: 1749.85 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:57:34,638 epoch 7
- iter 1188/1984 - loss 0.02921708 - time (sec): 55.57 - samples/sec: 1768.09 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:57:44,038 epoch 7 - iter 1386/1984 - loss 0.02876629 - time (sec): 64.97 - samples/sec: 1773.71 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:57:53,266 epoch 7 - iter 1584/1984 - loss 0.02894997 - time (sec): 74.20 - samples/sec: 1770.84 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:58:02,591 epoch 7 - iter 1782/1984 - loss 0.03019572 - time (sec): 83.52 - samples/sec: 1765.81 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:58:11,832 epoch 7 - iter 1980/1984 - loss 0.02983351 - time (sec): 92.77 - samples/sec: 1764.81 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:58:12,018 ----------------------------------------------------------------------------------------------------
2023-10-17 12:58:12,018 EPOCH 7 done: loss 0.0299 - lr: 0.000017
2023-10-17 12:58:16,123 DEV : loss 0.23210309445858002 - f1-score (micro avg) 0.7468
2023-10-17 12:58:16,144 ----------------------------------------------------------------------------------------------------
2023-10-17 12:58:25,261 epoch 8 - iter 198/1984 - loss 0.02662398 - time (sec): 9.12 - samples/sec: 1763.88 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:58:34,458 epoch 8 - iter 396/1984 - loss 0.02320812 - time (sec): 18.31 - samples/sec: 1753.52 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:58:43,623 epoch 8 - iter 594/1984 - loss 0.02078829 - time (sec): 27.48 - samples/sec: 1758.96 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:58:53,133 epoch 8 - iter 792/1984 - loss 0.01981068 - time (sec): 36.99 - samples/sec: 1754.70 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:59:02,118 epoch 8 - iter 990/1984 - loss 0.02018388 - time (sec): 45.97 - samples/sec: 1780.30 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:59:11,160 epoch 8 - iter 1188/1984 - loss 0.02143587 - time (sec): 55.02 - samples/sec: 1768.00 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:59:20,403 epoch 8
- iter 1386/1984 - loss 0.02108215 - time (sec): 64.26 - samples/sec: 1779.78 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:59:29,551 epoch 8 - iter 1584/1984 - loss 0.02050008 - time (sec): 73.41 - samples/sec: 1790.79 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:59:38,685 epoch 8 - iter 1782/1984 - loss 0.02034437 - time (sec): 82.54 - samples/sec: 1790.41 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:59:47,933 epoch 8 - iter 1980/1984 - loss 0.02018234 - time (sec): 91.79 - samples/sec: 1782.77 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:59:48,133 ----------------------------------------------------------------------------------------------------
2023-10-17 12:59:48,133 EPOCH 8 done: loss 0.0203 - lr: 0.000011
2023-10-17 12:59:51,750 DEV : loss 0.23719041049480438 - f1-score (micro avg) 0.7626
2023-10-17 12:59:51,774 saving best model
2023-10-17 12:59:52,342 ----------------------------------------------------------------------------------------------------
2023-10-17 13:00:01,612 epoch 9 - iter 198/1984 - loss 0.01128630 - time (sec): 9.27 - samples/sec: 1795.34 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:00:10,749 epoch 9 - iter 396/1984 - loss 0.00943051 - time (sec): 18.41 - samples/sec: 1820.75 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:00:19,875 epoch 9 - iter 594/1984 - loss 0.01123883 - time (sec): 27.53 - samples/sec: 1777.99 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:00:29,376 epoch 9 - iter 792/1984 - loss 0.01104369 - time (sec): 37.03 - samples/sec: 1767.64 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:00:38,816 epoch 9 - iter 990/1984 - loss 0.01110156 - time (sec): 46.47 - samples/sec: 1763.83 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:00:48,150 epoch 9 - iter 1188/1984 - loss 0.01314728 - time (sec): 55.81 - samples/sec: 1775.48 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:00:57,267 epoch 9 - iter 1386/1984 - loss 0.01317730 - time (sec): 64.92 - samples/sec: 1764.93 - lr: 0.000007 -
momentum: 0.000000
2023-10-17 13:01:06,731 epoch 9 - iter 1584/1984 - loss 0.01382017 - time (sec): 74.39 - samples/sec: 1760.10 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:01:15,926 epoch 9 - iter 1782/1984 - loss 0.01358330 - time (sec): 83.58 - samples/sec: 1759.37 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:01:25,451 epoch 9 - iter 1980/1984 - loss 0.01373408 - time (sec): 93.11 - samples/sec: 1756.06 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:01:25,664 ----------------------------------------------------------------------------------------------------
2023-10-17 13:01:25,665 EPOCH 9 done: loss 0.0138 - lr: 0.000006
2023-10-17 13:01:29,408 DEV : loss 0.2416975200176239 - f1-score (micro avg) 0.7531
2023-10-17 13:01:29,434 ----------------------------------------------------------------------------------------------------
2023-10-17 13:01:38,722 epoch 10 - iter 198/1984 - loss 0.00724275 - time (sec): 9.29 - samples/sec: 1731.53 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:01:48,062 epoch 10 - iter 396/1984 - loss 0.00878620 - time (sec): 18.63 - samples/sec: 1741.75 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:01:57,341 epoch 10 - iter 594/1984 - loss 0.00792020 - time (sec): 27.91 - samples/sec: 1788.05 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:02:06,885 epoch 10 - iter 792/1984 - loss 0.00864729 - time (sec): 37.45 - samples/sec: 1766.05 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:02:16,926 epoch 10 - iter 990/1984 - loss 0.00875296 - time (sec): 47.49 - samples/sec: 1730.47 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:02:26,316 epoch 10 - iter 1188/1984 - loss 0.00957040 - time (sec): 56.88 - samples/sec: 1731.11 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:02:35,536 epoch 10 - iter 1386/1984 - loss 0.00980326 - time (sec): 66.10 - samples/sec: 1733.47 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:02:44,848 epoch 10 - iter 1584/1984 - loss 0.00931466 - time (sec): 75.41 - samples/sec: 1747.44 - lr:
0.000001 - momentum: 0.000000
2023-10-17 13:02:53,940 epoch 10 - iter 1782/1984 - loss 0.00906989 - time (sec): 84.50 - samples/sec: 1750.28 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:03:03,073 epoch 10 - iter 1980/1984 - loss 0.00906925 - time (sec): 93.64 - samples/sec: 1748.58 - lr: 0.000000 - momentum: 0.000000
2023-10-17 13:03:03,259 ----------------------------------------------------------------------------------------------------
2023-10-17 13:03:03,259 EPOCH 10 done: loss 0.0092 - lr: 0.000000
2023-10-17 13:03:06,977 DEV : loss 0.25551512837409973 - f1-score (micro avg) 0.7566
2023-10-17 13:03:07,490 ----------------------------------------------------------------------------------------------------
2023-10-17 13:03:07,492 Loading model from best epoch ...
2023-10-17 13:03:09,090 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
2023-10-17 13:03:12,571
Results:
- F-score (micro) 0.7774
- F-score (macro) 0.6822
- Accuracy 0.6592

By class:
              precision    recall  f1-score   support

         LOC     0.8343    0.8611    0.8475       655
         PER     0.6920    0.7758    0.7315       223
         ORG     0.5192    0.4252    0.4675       127

   micro avg     0.7680    0.7871    0.7774      1005
   macro avg     0.6819    0.6874    0.6822      1005
weighted avg     0.7629    0.7871    0.7737      1005

2023-10-17 13:03:12,571 ----------------------------------------------------------------------------------------------------
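
Note on the aggregate rows: the sketch below is not part of the training run. It recomputes the micro, macro and weighted F1 values from the per-class rows of the "By class" table as a sanity check. It assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro as the unweighted mean over classes, weighted as the support-weighted mean); the names `per_class`, `f1` and the three result variables are illustrative only, and the inputs are the rounded figures from the log, so results can differ in the last digit.

```python
# Rows copied from the "By class" table: (precision, recall, f1, support).
per_class = {
    "LOC": (0.8343, 0.8611, 0.8475, 655),
    "PER": (0.6920, 0.7758, 0.7315, 223),
    "ORG": (0.5192, 0.4252, 0.4675, 127),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Macro average: unweighted mean of the per-class F1 scores.
macro_f1 = sum(row[2] for row in per_class.values()) / len(per_class)

# Weighted average: per-class F1 weighted by support (number of gold entities).
total_support = sum(row[3] for row in per_class.values())
weighted_f1 = sum(row[2] * row[3] for row in per_class.values()) / total_support

# Micro average: F1 of the pooled precision and recall from the "micro avg" row.
micro_f1 = f1(0.7680, 0.7871)

print(f"micro {micro_f1:.4f}  macro {macro_f1:.4f}  weighted {weighted_f1:.4f}")
```

Within rounding, these agree with the values reported in the log (0.7774, 0.6822 and 0.7737). The gap between micro and macro F1 comes mostly from ORG, which has the lowest score and the smallest support.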