2023-10-16 09:41:24,547 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Train:  7142 sentences
2023-10-16 09:41:24,548         (train_with_dev=False, train_with_test=False)
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Training Params:
2023-10-16 09:41:24,548  - learning_rate: "3e-05"
2023-10-16 09:41:24,548  - mini_batch_size: "4"
2023-10-16 09:41:24,548  - max_epochs: "10"
2023-10-16 09:41:24,548  - shuffle: "True"
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Plugins:
2023-10-16 09:41:24,548  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 09:41:24,548  - metric: "('micro avg', 'f1-score')"
2023-10-16 09:41:24,549 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,549 Computation:
2023-10-16 09:41:24,549  - compute on device: cuda:0
2023-10-16 09:41:24,549  - embedding storage: none
2023-10-16 09:41:24,549 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,549 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-16 09:41:24,549 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,549 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:33,533 epoch 1 - iter 178/1786 - loss 2.31725358 - time (sec): 8.98 - samples/sec: 2906.50 - lr: 0.000003 - momentum: 0.000000
2023-10-16 09:41:42,484 epoch 1 - iter 356/1786 - loss 1.47845564 - time (sec): 17.93 - samples/sec: 2820.10 - lr: 0.000006 - momentum: 0.000000
2023-10-16 09:41:51,268 epoch 1 - iter 534/1786 - loss 1.11555514 - time (sec): 26.72 - samples/sec: 2815.66 - lr: 0.000009 - momentum: 0.000000
2023-10-16 09:42:00,133 epoch 1 - iter 712/1786 - loss 0.91749738 - time (sec): 35.58 - samples/sec: 2796.69 - lr: 0.000012 - momentum: 0.000000
2023-10-16 09:42:08,756 epoch 1 - iter 890/1786 - loss 0.78960784 - time (sec): 44.21 - samples/sec: 2780.89 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:42:17,850 epoch 1 - iter 1068/1786 - loss 0.69748291 - time (sec): 53.30 - samples/sec: 2751.43 - lr: 0.000018 - momentum: 0.000000
2023-10-16 09:42:27,008 epoch 1 - iter 1246/1786 - loss 0.62001255 - time (sec): 62.46 - samples/sec: 2766.94 - lr: 0.000021 - momentum: 0.000000
2023-10-16 09:42:36,105 epoch 1 - iter 1424/1786 - loss 0.55925070 - time (sec): 71.56 - samples/sec: 2773.56 - lr: 0.000024 - momentum: 0.000000
2023-10-16 09:42:45,037 epoch 1 - iter 1602/1786 - loss 0.51380389 - time (sec): 80.49 - samples/sec: 2771.52 - lr: 0.000027 - momentum: 0.000000
2023-10-16 09:42:54,388 epoch 1 - iter 1780/1786 - loss 0.47570966 - time (sec): 89.84 - samples/sec: 2760.26 - lr: 0.000030 - momentum: 0.000000
2023-10-16 09:42:54,673 ----------------------------------------------------------------------------------------------------
2023-10-16 09:42:54,673 EPOCH 1 done: loss 0.4746 - lr: 0.000030
2023-10-16 09:42:57,223 DEV : loss 0.14207538962364197 - f1-score (micro avg) 0.6781
2023-10-16 09:42:57,239 saving best model
2023-10-16 09:42:57,669 ----------------------------------------------------------------------------------------------------
2023-10-16 09:43:06,735 epoch 2 - iter 178/1786 - loss 0.10458942 - time (sec): 9.06 - samples/sec: 2828.77 - lr: 0.000030 - momentum: 0.000000
2023-10-16 09:43:15,763 epoch 2 - iter 356/1786 - loss 0.11289530 - time (sec): 18.09 - samples/sec: 2789.86 - lr: 0.000029 - momentum: 0.000000
2023-10-16 09:43:24,751 epoch 2 - iter 534/1786 - loss 0.11838363 - time (sec): 27.08 - samples/sec: 2726.46 - lr: 0.000029 - momentum: 0.000000
2023-10-16 09:43:33,440 epoch 2 - iter 712/1786 - loss 0.11649418 - time (sec): 35.77 - samples/sec: 2770.55 - lr: 0.000029 - momentum: 0.000000
2023-10-16 09:43:42,555 epoch 2 - iter 890/1786 - loss 0.11836022 - time (sec): 44.88 - samples/sec: 2788.32 - lr: 0.000028 - momentum: 0.000000
2023-10-16 09:43:51,148 epoch 2 - iter 1068/1786 - loss 0.11714005 - time (sec): 53.48 - samples/sec: 2794.44 - lr: 0.000028 - momentum: 0.000000
2023-10-16 09:43:59,933 epoch 2 - iter 1246/1786 - loss 0.11797601 - time (sec): 62.26 - samples/sec: 2784.34 - lr: 0.000028 - momentum: 0.000000
2023-10-16 09:44:08,599 epoch 2 - iter 1424/1786 - loss 0.11713784 - time (sec): 70.93 - samples/sec: 2792.29 - lr: 0.000027 - momentum: 0.000000
2023-10-16 09:44:17,368 epoch 2 - iter 1602/1786 - loss 0.11647325 - time (sec): 79.70 - samples/sec: 2814.52 - lr: 0.000027 - momentum: 0.000000
2023-10-16 09:44:26,201 epoch 2 - iter 1780/1786 - loss 0.11414611 - time (sec): 88.53 - samples/sec: 2803.96 - lr: 0.000027 - momentum: 0.000000
2023-10-16 09:44:26,489 ----------------------------------------------------------------------------------------------------
2023-10-16 09:44:26,489 EPOCH 2 done: loss 0.1141 - lr: 0.000027
2023-10-16 09:44:31,250 DEV : loss 0.12184497714042664 - f1-score (micro avg) 0.7654
2023-10-16 09:44:31,266 saving best model
2023-10-16 09:44:31,758 ----------------------------------------------------------------------------------------------------
2023-10-16 09:44:40,509 epoch 3 - iter 178/1786 - loss 0.07028279 - time (sec): 8.75 - samples/sec: 2717.53 - lr: 0.000026 - momentum: 0.000000
2023-10-16 09:44:49,387 epoch 3 - iter 356/1786 - loss 0.07414788 - time (sec): 17.63 - samples/sec: 2805.34 - lr: 0.000026 - momentum: 0.000000
2023-10-16 09:44:58,046 epoch 3 - iter 534/1786 - loss 0.07616466 - time (sec): 26.29 - samples/sec: 2829.08 - lr: 0.000026 - momentum: 0.000000
2023-10-16 09:45:07,014 epoch 3 - iter 712/1786 - loss 0.07702373 - time (sec): 35.25 - samples/sec: 2857.75 - lr: 0.000025 - momentum: 0.000000
2023-10-16 09:45:15,560 epoch 3 - iter 890/1786 - loss 0.07733931 - time (sec): 43.80 - samples/sec: 2857.14 - lr: 0.000025 - momentum: 0.000000
2023-10-16 09:45:24,052 epoch 3 - iter 1068/1786 - loss 0.07927594 - time (sec): 52.29 - samples/sec: 2845.37 - lr: 0.000025 - momentum: 0.000000
2023-10-16 09:45:32,889 epoch 3 - iter 1246/1786 - loss 0.07824875 - time (sec): 61.13 - samples/sec: 2824.98 - lr: 0.000024 - momentum: 0.000000
2023-10-16 09:45:41,842 epoch 3 - iter 1424/1786 - loss 0.07833523 - time (sec): 70.08 - samples/sec: 2817.66 - lr: 0.000024 - momentum: 0.000000
2023-10-16 09:45:50,813 epoch 3 - iter 1602/1786 - loss 0.07784066 - time (sec): 79.05 - samples/sec: 2805.85 - lr: 0.000024 - momentum: 0.000000
2023-10-16 09:45:59,884 epoch 3 - iter 1780/1786 - loss 0.07736904 - time (sec): 88.12 - samples/sec: 2811.90 - lr: 0.000023 - momentum: 0.000000
2023-10-16 09:46:00,242 ----------------------------------------------------------------------------------------------------
2023-10-16 09:46:00,242 EPOCH 3 done: loss 0.0775 - lr: 0.000023
2023-10-16 09:46:04,416 DEV : loss 0.12185992300510406 - f1-score (micro avg) 0.7936
2023-10-16 09:46:04,433 saving best model
2023-10-16 09:46:04,894 ----------------------------------------------------------------------------------------------------
2023-10-16 09:46:14,224 epoch 4 - iter 178/1786 - loss 0.04631419 - time (sec): 9.33 - samples/sec: 2653.86 - lr: 0.000023 - momentum: 0.000000
2023-10-16 09:46:22,793 epoch 4 - iter 356/1786 - loss 0.05317882 - time (sec): 17.89 - samples/sec: 2767.12 - lr: 0.000023 - momentum: 0.000000
2023-10-16 09:46:31,654 epoch 4 - iter 534/1786 - loss 0.05386557 - time (sec): 26.76 - samples/sec: 2755.06 - lr: 0.000022 - momentum: 0.000000
2023-10-16 09:46:40,697 epoch 4 - iter 712/1786 - loss 0.05265052 - time (sec): 35.80 - samples/sec: 2794.02 - lr: 0.000022 - momentum: 0.000000
2023-10-16 09:46:49,295 epoch 4 - iter 890/1786 - loss 0.05513144 - time (sec): 44.40 - samples/sec: 2788.99 - lr: 0.000022 - momentum: 0.000000
2023-10-16 09:46:57,956 epoch 4 - iter 1068/1786 - loss 0.05490959 - time (sec): 53.06 - samples/sec: 2787.52 - lr: 0.000021 - momentum: 0.000000
2023-10-16 09:47:06,623 epoch 4 - iter 1246/1786 - loss 0.05446774 - time (sec): 61.72 - samples/sec: 2798.70 - lr: 0.000021 - momentum: 0.000000
2023-10-16 09:47:15,185 epoch 4 - iter 1424/1786 - loss 0.05609331 - time (sec): 70.29 - samples/sec: 2801.47 - lr: 0.000021 - momentum: 0.000000
2023-10-16 09:47:23,833 epoch 4 - iter 1602/1786 - loss 0.05699963 - time (sec): 78.93 - samples/sec: 2789.90 - lr: 0.000020 - momentum: 0.000000
2023-10-16 09:47:33,473 epoch 4 - iter 1780/1786 - loss 0.05709770 - time (sec): 88.57 - samples/sec: 2799.92 - lr: 0.000020 - momentum: 0.000000
2023-10-16 09:47:33,791 ----------------------------------------------------------------------------------------------------
2023-10-16 09:47:33,791 EPOCH 4 done: loss 0.0572 - lr: 0.000020
2023-10-16 09:47:37,982 DEV : loss 0.1453891098499298 - f1-score (micro avg) 0.8098
2023-10-16 09:47:38,000 saving best model
2023-10-16 09:47:38,481 ----------------------------------------------------------------------------------------------------
2023-10-16 09:47:47,325 epoch 5 - iter 178/1786 - loss 0.03591331 - time (sec): 8.84 - samples/sec: 2566.86 - lr: 0.000020 - momentum: 0.000000
2023-10-16 09:47:56,198 epoch 5 - iter 356/1786 - loss 0.04195969 - time (sec): 17.71 - samples/sec: 2718.76 - lr: 0.000019 - momentum: 0.000000
2023-10-16 09:48:05,110 epoch 5 - iter 534/1786 - loss 0.03788302 - time (sec): 26.62 - samples/sec: 2797.45 - lr: 0.000019 - momentum: 0.000000
2023-10-16 09:48:13,656 epoch 5 - iter 712/1786 - loss 0.03811789 - time (sec): 35.17 - samples/sec: 2784.76 - lr: 0.000019 - momentum: 0.000000
2023-10-16 09:48:22,647 epoch 5 - iter 890/1786 - loss 0.04117165 - time (sec): 44.16 - samples/sec: 2805.22 - lr: 0.000018 - momentum: 0.000000
2023-10-16 09:48:31,231 epoch 5 - iter 1068/1786 - loss 0.04034710 - time (sec): 52.74 - samples/sec: 2780.00 - lr: 0.000018 - momentum: 0.000000
2023-10-16 09:48:40,288 epoch 5 - iter 1246/1786 - loss 0.04037781 - time (sec): 61.80 - samples/sec: 2786.91 - lr: 0.000018 - momentum: 0.000000
2023-10-16 09:48:49,175 epoch 5 - iter 1424/1786 - loss 0.04044669 - time (sec): 70.69 - samples/sec: 2783.63 - lr: 0.000017 - momentum: 0.000000
2023-10-16 09:48:58,146 epoch 5 - iter 1602/1786 - loss 0.04252121 - time (sec): 79.66 - samples/sec: 2801.41 - lr: 0.000017 - momentum: 0.000000
2023-10-16 09:49:06,873 epoch 5 - iter 1780/1786 - loss 0.04260854 - time (sec): 88.39 - samples/sec: 2805.68 - lr: 0.000017 - momentum: 0.000000
2023-10-16 09:49:07,179 ----------------------------------------------------------------------------------------------------
2023-10-16 09:49:07,180 EPOCH 5 done: loss 0.0427 - lr: 0.000017
2023-10-16 09:49:12,015 DEV : loss 0.16638240218162537 - f1-score (micro avg) 0.8147
2023-10-16 09:49:12,032 saving best model
2023-10-16 09:49:12,511 ----------------------------------------------------------------------------------------------------
2023-10-16 09:49:21,429 epoch 6 - iter 178/1786 - loss 0.02981234 - time (sec): 8.91 - samples/sec: 2985.12 - lr: 0.000016 - momentum: 0.000000
2023-10-16 09:49:30,086 epoch 6 - iter 356/1786 - loss 0.02843138 - time (sec): 17.57 - samples/sec: 2895.44 - lr: 0.000016 - momentum: 0.000000
2023-10-16 09:49:39,196 epoch 6 - iter 534/1786 - loss 0.02832788 - time (sec): 26.68 - samples/sec: 2815.02 - lr: 0.000016 - momentum: 0.000000
2023-10-16 09:49:47,947 epoch 6 - iter 712/1786 - loss 0.03080102 - time (sec): 35.43 - samples/sec: 2804.26 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:49:56,775 epoch 6 - iter 890/1786 - loss 0.03150369 - time (sec): 44.26 - samples/sec: 2792.99 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:50:05,560 epoch 6 - iter 1068/1786 - loss 0.03084312 - time (sec): 53.04 - samples/sec: 2823.20 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:50:14,018 epoch 6 - iter 1246/1786 - loss 0.03053296 - time (sec): 61.50 - samples/sec: 2812.53 - lr: 0.000014 - momentum: 0.000000
2023-10-16 09:50:22,886 epoch 6 - iter 1424/1786 - loss 0.03100959 - time (sec): 70.37 - samples/sec: 2805.48 - lr: 0.000014 - momentum: 0.000000
2023-10-16 09:50:31,646 epoch 6 - iter 1602/1786 - loss 0.03088695 - time (sec): 79.13 - samples/sec: 2810.14 - lr: 0.000014 - momentum: 0.000000
2023-10-16 09:50:40,534 epoch 6 - iter 1780/1786 - loss 0.03238525 - time (sec): 88.02 - samples/sec: 2815.78 - lr: 0.000013 - momentum: 0.000000
2023-10-16 09:50:40,794 ----------------------------------------------------------------------------------------------------
2023-10-16 09:50:40,794 EPOCH 6 done: loss 0.0323 - lr: 0.000013
2023-10-16 09:50:45,022 DEV : loss 0.19458912312984467 - f1-score (micro avg) 0.7899
2023-10-16 09:50:45,041 ----------------------------------------------------------------------------------------------------
2023-10-16 09:50:53,864 epoch 7 - iter 178/1786 - loss 0.02985635 - time (sec): 8.82 - samples/sec: 2807.19 - lr: 0.000013 - momentum: 0.000000
2023-10-16 09:51:02,562 epoch 7 - iter 356/1786 - loss 0.02581944 - time (sec): 17.52 - samples/sec: 2859.72 - lr: 0.000013 - momentum: 0.000000
2023-10-16 09:51:11,349 epoch 7 - iter 534/1786 - loss 0.02750578 - time (sec): 26.31 - samples/sec: 2859.61 - lr: 0.000012 - momentum: 0.000000
2023-10-16 09:51:20,019 epoch 7 - iter 712/1786 - loss 0.02534929 - time (sec): 34.98 - samples/sec: 2825.31 - lr: 0.000012 - momentum: 0.000000
2023-10-16 09:51:28,872 epoch 7 - iter 890/1786 - loss 0.02438722 - time (sec): 43.83 - samples/sec: 2823.89 - lr: 0.000012 - momentum: 0.000000
2023-10-16 09:51:37,782 epoch 7 - iter 1068/1786 - loss 0.02540106 - time (sec): 52.74 - samples/sec: 2813.95 - lr: 0.000011 - momentum: 0.000000
2023-10-16 09:51:46,965 epoch 7 - iter 1246/1786 - loss 0.02499926 - time (sec): 61.92 - samples/sec: 2809.89 - lr: 0.000011 - momentum: 0.000000
2023-10-16 09:51:55,652 epoch 7 - iter 1424/1786 - loss 0.02543629 - time (sec): 70.61 - samples/sec: 2793.82 - lr: 0.000011 - momentum: 0.000000
2023-10-16 09:52:04,706 epoch 7 - iter 1602/1786 - loss 0.02551404 - time (sec): 79.66 - samples/sec: 2803.81 - lr: 0.000010 - momentum: 0.000000
2023-10-16 09:52:13,431 epoch 7 - iter 1780/1786 - loss 0.02530936 - time (sec): 88.39 - samples/sec: 2805.62 - lr: 0.000010 - momentum: 0.000000
2023-10-16 09:52:13,715 ----------------------------------------------------------------------------------------------------
2023-10-16 09:52:13,715 EPOCH 7 done: loss 0.0253 - lr: 0.000010
2023-10-16 09:52:18,560 DEV : loss 0.1818549633026123 - f1-score (micro avg) 0.8199
2023-10-16 09:52:18,576 saving best model
2023-10-16 09:52:19,113 ----------------------------------------------------------------------------------------------------
2023-10-16 09:52:28,623 epoch 8 - iter 178/1786 - loss 0.02072833 - time (sec): 9.51 - samples/sec: 2849.01 - lr: 0.000010 - momentum: 0.000000
2023-10-16 09:52:37,582 epoch 8 - iter 356/1786 - loss 0.01525016 - time (sec): 18.47 - samples/sec: 2835.03 - lr: 0.000009 - momentum: 0.000000
2023-10-16 09:52:46,252 epoch 8 - iter 534/1786 - loss 0.01644632 - time (sec): 27.14 - samples/sec: 2846.40 - lr: 0.000009 - momentum: 0.000000
2023-10-16 09:52:54,980 epoch 8 - iter 712/1786 - loss 0.01665271 - time (sec): 35.87 - samples/sec: 2813.13 - lr: 0.000009 - momentum: 0.000000
2023-10-16 09:53:04,188 epoch 8 - iter 890/1786 - loss 0.01765939 - time (sec): 45.07 - samples/sec: 2771.55 - lr: 0.000008 - momentum: 0.000000
2023-10-16 09:53:13,163 epoch 8 - iter 1068/1786 - loss 0.01734742 - time (sec): 54.05 - samples/sec: 2745.30 - lr: 0.000008 - momentum: 0.000000
2023-10-16 09:53:22,142 epoch 8 - iter 1246/1786 - loss 0.01734634 - time (sec): 63.03 - samples/sec: 2776.57 - lr: 0.000008 - momentum: 0.000000
2023-10-16 09:53:30,968 epoch 8 - iter 1424/1786 - loss 0.01739432 - time (sec): 71.85 - samples/sec: 2772.27 - lr: 0.000007 - momentum: 0.000000
2023-10-16 09:53:39,721 epoch 8 - iter 1602/1786 - loss 0.01721767 - time (sec): 80.61 - samples/sec: 2754.32 - lr: 0.000007 - momentum: 0.000000
2023-10-16 09:53:48,509 epoch 8 - iter 1780/1786 - loss 0.01770109 - time (sec): 89.39 - samples/sec: 2775.47 - lr: 0.000007 - momentum: 0.000000
2023-10-16 09:53:48,777 ----------------------------------------------------------------------------------------------------
2023-10-16 09:53:48,777 EPOCH 8 done: loss 0.0177 - lr: 0.000007
2023-10-16 09:53:53,564 DEV : loss 0.18963442742824554 - f1-score (micro avg) 0.8128
2023-10-16 09:53:53,580 ----------------------------------------------------------------------------------------------------
2023-10-16 09:54:02,337 epoch 9 - iter 178/1786 - loss 0.01007676 - time (sec): 8.76 - samples/sec: 2909.33 - lr: 0.000006 - momentum: 0.000000
2023-10-16 09:54:11,003 epoch 9 - iter 356/1786 - loss 0.01038145 - time (sec): 17.42 - samples/sec: 2821.31 - lr: 0.000006 - momentum: 0.000000
2023-10-16 09:54:19,677 epoch 9 - iter 534/1786 - loss 0.00929853 - time (sec): 26.10 - samples/sec: 2830.26 - lr: 0.000006 - momentum: 0.000000
2023-10-16 09:54:28,542 epoch 9 - iter 712/1786 - loss 0.00942723 - time (sec): 34.96 - samples/sec: 2845.40 - lr: 0.000005 - momentum: 0.000000
2023-10-16 09:54:37,262 epoch 9 - iter 890/1786 - loss 0.01080356 - time (sec): 43.68 - samples/sec: 2813.33 - lr: 0.000005 - momentum: 0.000000
2023-10-16 09:54:45,873 epoch 9 - iter 1068/1786 - loss 0.01145128 - time (sec): 52.29 - samples/sec: 2815.46 - lr: 0.000005 - momentum: 0.000000
2023-10-16 09:54:54,450 epoch 9 - iter 1246/1786 - loss 0.01179870 - time (sec): 60.87 - samples/sec: 2823.11 - lr: 0.000004 - momentum: 0.000000
2023-10-16 09:55:03,201 epoch 9 - iter 1424/1786 - loss 0.01186486 - time (sec): 69.62 - samples/sec: 2824.56 - lr: 0.000004 - momentum: 0.000000
2023-10-16 09:55:11,956 epoch 9 - iter 1602/1786 - loss 0.01252162 - time (sec): 78.37 - samples/sec: 2820.68 - lr: 0.000004 - momentum: 0.000000
2023-10-16 09:55:21,133 epoch 9 - iter 1780/1786 - loss 0.01260169 - time (sec): 87.55 - samples/sec: 2832.02 - lr: 0.000003 - momentum: 0.000000
2023-10-16 09:55:21,408 ----------------------------------------------------------------------------------------------------
2023-10-16 09:55:21,408 EPOCH 9 done: loss 0.0126 - lr: 0.000003
2023-10-16 09:55:25,491 DEV : loss 0.20145297050476074 - f1-score (micro avg) 0.8042
2023-10-16 09:55:25,507 ----------------------------------------------------------------------------------------------------
2023-10-16 09:55:34,238 epoch 10 - iter 178/1786 - loss 0.00640268 - time (sec): 8.73 - samples/sec: 2762.39 - lr: 0.000003 - momentum: 0.000000
2023-10-16 09:55:43,024 epoch 10 - iter 356/1786 - loss 0.00598649 - time (sec): 17.52 - samples/sec: 2823.54 - lr: 0.000003 - momentum: 0.000000
2023-10-16 09:55:51,861 epoch 10 - iter 534/1786 - loss 0.00789000 - time (sec): 26.35 - samples/sec: 2827.16 - lr: 0.000002 - momentum: 0.000000
2023-10-16 09:56:00,615 epoch 10 - iter 712/1786 - loss 0.00896349 - time (sec): 35.11 - samples/sec: 2836.69 - lr: 0.000002 - momentum: 0.000000
2023-10-16 09:56:09,212 epoch 10 - iter 890/1786 - loss 0.01000578 - time (sec): 43.70 - samples/sec: 2850.59 - lr: 0.000002 - momentum: 0.000000
2023-10-16 09:56:18,040 epoch 10 - iter 1068/1786 - loss 0.00947271 - time (sec): 52.53 - samples/sec: 2855.88 - lr: 0.000001 - momentum: 0.000000
2023-10-16 09:56:27,011 epoch 10 - iter 1246/1786 - loss 0.00883309 - time (sec): 61.50 - samples/sec: 2856.10 - lr: 0.000001 - momentum: 0.000000
2023-10-16 09:56:35,737 epoch 10 - iter 1424/1786 - loss 0.00874636 - time (sec): 70.23 - samples/sec: 2845.64 - lr: 0.000001 - momentum: 0.000000
2023-10-16 09:56:44,623 epoch 10 - iter 1602/1786 - loss 0.00833858 - time (sec): 79.11 - samples/sec: 2839.53 - lr: 0.000000 - momentum: 0.000000
2023-10-16 09:56:53,522 epoch 10 - iter 1780/1786 - loss 0.00829271 - time (sec): 88.01 - samples/sec: 2820.09 - lr: 0.000000 - momentum: 0.000000
2023-10-16 09:56:53,791 ----------------------------------------------------------------------------------------------------
2023-10-16 09:56:53,791 EPOCH 10 done: loss 0.0083 - lr: 0.000000
2023-10-16 09:56:58,429 DEV : loss 0.20914104580879211 - f1-score (micro avg) 0.8046
2023-10-16 09:56:58,857 ----------------------------------------------------------------------------------------------------
2023-10-16 09:56:58,858 Loading model from best epoch ...
2023-10-16 09:57:00,383 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 09:57:09,993 Results:
- F-score (micro) 0.6947
- F-score (macro) 0.6317
- Accuracy 0.546

By class:
              precision    recall  f1-score   support

         LOC     0.6816    0.7078    0.6944      1095
         PER     0.7505    0.7757    0.7629      1012
         ORG     0.4963    0.5686    0.5300       357
   HumanProd     0.4286    0.7273    0.5393        33

   micro avg     0.6748    0.7157    0.6947      2497
   macro avg     0.5893    0.6948    0.6317      2497
weighted avg     0.6797    0.7157    0.6966      2497

2023-10-16 09:57:09,993 ----------------------------------------------------------------------------------------------------
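The aggregate scores in the final table and the logged learning-rate trajectory can both be re-derived from values in the log. The sketch below is a standalone sanity check, not part of the training run: the per-class numbers are copied from the results table, and the schedule formula (linear warmup over the first 10% of the 10 x 1786 = 17860 steps to the 3e-05 peak, then linear decay to zero) is an assumption about how the LinearScheduler plugin with warmup_fraction 0.1 behaves, not code taken from Flair.

```python
# class -> (precision, recall, f1-score, support), copied from the log above
per_class = {
    "LOC":       (0.6816, 0.7078, 0.6944, 1095),
    "PER":       (0.7505, 0.7757, 0.7629, 1012),
    "ORG":       (0.4963, 0.5686, 0.5300,  357),
    "HumanProd": (0.4286, 0.7273, 0.5393,   33),
}

# Micro F1 is the harmonic mean of the pooled (micro) precision and recall.
micro_p, micro_r = 0.6748, 0.7157
micro_f1 = 2 * micro_p * micro_r / (micro_p + micro_r)

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1 for _, _, f1, _ in per_class.values()) / len(per_class)

# Weighted F1 weights each per-class F1 by its support.
support = sum(s for *_, s in per_class.values())
weighted_f1 = sum(f1 * s for _, _, f1, s in per_class.values()) / support

# Assumed schedule: linear warmup for the first 10% of steps, linear decay after.
TOTAL_STEPS = 10 * 1786
WARMUP_STEPS = 0.1 * TOTAL_STEPS

def scheduled_lr(step, peak=3e-05):
    """Learning rate after `step` optimizer steps under the assumed schedule."""
    if step < WARMUP_STEPS:
        return peak * step / WARMUP_STEPS
    return peak * (TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

print(f"micro {micro_f1:.4f}  macro {macro_f1:.4f}  weighted {weighted_f1:.4f}")
print(f"lr@178 {scheduled_lr(178):.6f}  lr@3572 {scheduled_lr(3572):.6f}")
```

The recomputed averages agree with the table to rounding (micro/macro/weighted roughly 0.6947 / 0.6317 / 0.6966 over 2497 test entities), and the two printed learning rates match the rounded log values at epoch 1 iter 178 (0.000003) and at the end of epoch 2 (0.000027), which supports the warmup-then-decay reading of the lr column.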