2023-10-16 09:41:24,547 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Train: 7142 sentences
2023-10-16 09:41:24,548 (train_with_dev=False, train_with_test=False)
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Training Params:
2023-10-16 09:41:24,548 - learning_rate: "3e-05"
2023-10-16 09:41:24,548 - mini_batch_size: "4"
2023-10-16 09:41:24,548 - max_epochs: "10"
2023-10-16 09:41:24,548 - shuffle: "True"
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Plugins:
2023-10-16 09:41:24,548 - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 09:41:24,548 - metric: "('micro avg', 'f1-score')"
2023-10-16 09:41:24,549 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,549 Computation:
2023-10-16 09:41:24,549 - compute on device: cuda:0
2023-10-16 09:41:24,549 - embedding storage: none
2023-10-16 09:41:24,549 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,549 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-16 09:41:24,549 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,549 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:33,533 epoch 1 - iter 178/1786 - loss 2.31725358 - time (sec): 8.98 - samples/sec: 2906.50 - lr: 0.000003 - momentum: 0.000000
2023-10-16 09:41:42,484 epoch 1 - iter 356/1786 - loss 1.47845564 - time (sec): 17.93 - samples/sec: 2820.10 - lr: 0.000006 - momentum: 0.000000
2023-10-16 09:41:51,268 epoch 1 - iter 534/1786 - loss 1.11555514 - time (sec): 26.72 - samples/sec: 2815.66 - lr: 0.000009 - momentum: 0.000000
2023-10-16 09:42:00,133 epoch 1 - iter 712/1786 - loss 0.91749738 - time (sec): 35.58 - samples/sec: 2796.69 - lr: 0.000012 - momentum: 0.000000
2023-10-16 09:42:08,756 epoch 1 - iter 890/1786 - loss 0.78960784 - time (sec): 44.21 - samples/sec: 2780.89 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:42:17,850 epoch 1 - iter 1068/1786 - loss 0.69748291 - time (sec): 53.30 - samples/sec: 2751.43 - lr: 0.000018 - momentum: 0.000000
2023-10-16 09:42:27,008 epoch 1 - iter 1246/1786 - loss 0.62001255 - time (sec): 62.46 - samples/sec: 2766.94 - lr: 0.000021 - momentum: 0.000000
2023-10-16 09:42:36,105 epoch 1 - iter 1424/1786 - loss 0.55925070 - time (sec): 71.56 - samples/sec: 2773.56 - lr: 0.000024 - momentum: 0.000000
2023-10-16 09:42:45,037 epoch 1 - iter 1602/1786 - loss 0.51380389 - time (sec): 80.49 - samples/sec: 2771.52 - lr: 0.000027 - momentum: 0.000000
2023-10-16 09:42:54,388 epoch 1 - iter 1780/1786 - loss 0.47570966 - time (sec): 89.84 - samples/sec: 2760.26 - lr: 0.000030 - momentum: 0.000000
2023-10-16 09:42:54,673 ----------------------------------------------------------------------------------------------------
2023-10-16 09:42:54,673 EPOCH 1 done: loss 0.4746 - lr: 0.000030
2023-10-16 09:42:57,223 DEV : loss 0.14207538962364197 - f1-score (micro avg) 0.6781
2023-10-16 09:42:57,239 saving best model
2023-10-16 09:42:57,669 ----------------------------------------------------------------------------------------------------
2023-10-16 09:43:06,735 epoch 2 - iter 178/1786 - loss 0.10458942 - time (sec): 9.06 - samples/sec: 2828.77 - lr: 0.000030 - momentum: 0.000000
2023-10-16 09:43:15,763 epoch 2 - iter 356/1786 - loss 0.11289530 - time (sec): 18.09 - samples/sec: 2789.86 - lr: 0.000029 - momentum: 0.000000
2023-10-16 09:43:24,751 epoch 2 - iter 534/1786 - loss 0.11838363 - time (sec): 27.08 - samples/sec: 2726.46 - lr: 0.000029 - momentum: 0.000000
2023-10-16 09:43:33,440 epoch 2 - iter 712/1786 - loss 0.11649418 - time (sec): 35.77 - samples/sec: 2770.55 - lr: 0.000029 - momentum: 0.000000
2023-10-16 09:43:42,555 epoch 2 - iter 890/1786 - loss 0.11836022 - time (sec): 44.88 - samples/sec: 2788.32 - lr: 0.000028 - momentum: 0.000000
2023-10-16 09:43:51,148 epoch 2 - iter 1068/1786 - loss 0.11714005 - time (sec): 53.48 - samples/sec: 2794.44 - lr: 0.000028 - momentum: 0.000000
2023-10-16 09:43:59,933 epoch 2 - iter 1246/1786 - loss 0.11797601 - time (sec): 62.26 - samples/sec: 2784.34 - lr: 0.000028 - momentum: 0.000000
2023-10-16 09:44:08,599 epoch 2 - iter 1424/1786 - loss 0.11713784 - time (sec): 70.93 - samples/sec: 2792.29 - lr: 0.000027 - momentum: 0.000000
2023-10-16 09:44:17,368 epoch 2 - iter 1602/1786 - loss 0.11647325 - time (sec): 79.70 - samples/sec: 2814.52 - lr: 0.000027 - momentum: 0.000000
2023-10-16 09:44:26,201 epoch 2 - iter 1780/1786 - loss 0.11414611 - time (sec): 88.53 - samples/sec: 2803.96 - lr: 0.000027 - momentum: 0.000000
2023-10-16 09:44:26,489 ----------------------------------------------------------------------------------------------------
2023-10-16 09:44:26,489 EPOCH 2 done: loss 0.1141 - lr: 0.000027
2023-10-16 09:44:31,250 DEV : loss 0.12184497714042664 - f1-score (micro avg) 0.7654
2023-10-16 09:44:31,266 saving best model
2023-10-16 09:44:31,758 ----------------------------------------------------------------------------------------------------
2023-10-16 09:44:40,509 epoch 3 - iter 178/1786 - loss 0.07028279 - time (sec): 8.75 - samples/sec: 2717.53 - lr: 0.000026 - momentum: 0.000000
2023-10-16 09:44:49,387 epoch 3 - iter 356/1786 - loss 0.07414788 - time (sec): 17.63 - samples/sec: 2805.34 - lr: 0.000026 - momentum: 0.000000
2023-10-16 09:44:58,046 epoch 3 - iter 534/1786 - loss 0.07616466 - time (sec): 26.29 - samples/sec: 2829.08 - lr: 0.000026 - momentum: 0.000000
2023-10-16 09:45:07,014 epoch 3 - iter 712/1786 - loss 0.07702373 - time (sec): 35.25 - samples/sec: 2857.75 - lr: 0.000025 - momentum: 0.000000
2023-10-16 09:45:15,560 epoch 3 - iter 890/1786 - loss 0.07733931 - time (sec): 43.80 - samples/sec: 2857.14 - lr: 0.000025 - momentum: 0.000000
2023-10-16 09:45:24,052 epoch 3 - iter 1068/1786 - loss 0.07927594 - time (sec): 52.29 - samples/sec: 2845.37 - lr: 0.000025 - momentum: 0.000000
2023-10-16 09:45:32,889 epoch 3 - iter 1246/1786 - loss 0.07824875 - time (sec): 61.13 - samples/sec: 2824.98 - lr: 0.000024 - momentum: 0.000000
2023-10-16 09:45:41,842 epoch 3 - iter 1424/1786 - loss 0.07833523 - time (sec): 70.08 - samples/sec: 2817.66 - lr: 0.000024 - momentum: 0.000000
2023-10-16 09:45:50,813 epoch 3 - iter 1602/1786 - loss 0.07784066 - time (sec): 79.05 - samples/sec: 2805.85 - lr: 0.000024 - momentum: 0.000000
2023-10-16 09:45:59,884 epoch 3 - iter 1780/1786 - loss 0.07736904 - time (sec): 88.12 - samples/sec: 2811.90 - lr: 0.000023 - momentum: 0.000000
2023-10-16 09:46:00,242 ----------------------------------------------------------------------------------------------------
2023-10-16 09:46:00,242 EPOCH 3 done: loss 0.0775 - lr: 0.000023
2023-10-16 09:46:04,416 DEV : loss 0.12185992300510406 - f1-score (micro avg) 0.7936
2023-10-16 09:46:04,433 saving best model
2023-10-16 09:46:04,894 ----------------------------------------------------------------------------------------------------
2023-10-16 09:46:14,224 epoch 4 - iter 178/1786 - loss 0.04631419 - time (sec): 9.33 - samples/sec: 2653.86 - lr: 0.000023 - momentum: 0.000000
2023-10-16 09:46:22,793 epoch 4 - iter 356/1786 - loss 0.05317882 - time (sec): 17.89 - samples/sec: 2767.12 - lr: 0.000023 - momentum: 0.000000
2023-10-16 09:46:31,654 epoch 4 - iter 534/1786 - loss 0.05386557 - time (sec): 26.76 - samples/sec: 2755.06 - lr: 0.000022 - momentum: 0.000000
2023-10-16 09:46:40,697 epoch 4 - iter 712/1786 - loss 0.05265052 - time (sec): 35.80 - samples/sec: 2794.02 - lr: 0.000022 - momentum: 0.000000
2023-10-16 09:46:49,295 epoch 4 - iter 890/1786 - loss 0.05513144 - time (sec): 44.40 - samples/sec: 2788.99 - lr: 0.000022 - momentum: 0.000000
2023-10-16 09:46:57,956 epoch 4 - iter 1068/1786 - loss 0.05490959 - time (sec): 53.06 - samples/sec: 2787.52 - lr: 0.000021 - momentum: 0.000000
2023-10-16 09:47:06,623 epoch 4 - iter 1246/1786 - loss 0.05446774 - time (sec): 61.72 - samples/sec: 2798.70 - lr: 0.000021 - momentum: 0.000000
2023-10-16 09:47:15,185 epoch 4 - iter 1424/1786 - loss 0.05609331 - time (sec): 70.29 - samples/sec: 2801.47 - lr: 0.000021 - momentum: 0.000000
2023-10-16 09:47:23,833 epoch 4 - iter 1602/1786 - loss 0.05699963 - time (sec): 78.93 - samples/sec: 2789.90 - lr: 0.000020 - momentum: 0.000000
2023-10-16 09:47:33,473 epoch 4 - iter 1780/1786 - loss 0.05709770 - time (sec): 88.57 - samples/sec: 2799.92 - lr: 0.000020 - momentum: 0.000000
2023-10-16 09:47:33,791 ----------------------------------------------------------------------------------------------------
2023-10-16 09:47:33,791 EPOCH 4 done: loss 0.0572 - lr: 0.000020
2023-10-16 09:47:37,982 DEV : loss 0.1453891098499298 - f1-score (micro avg) 0.8098
2023-10-16 09:47:38,000 saving best model
2023-10-16 09:47:38,481 ----------------------------------------------------------------------------------------------------
2023-10-16 09:47:47,325 epoch 5 - iter 178/1786 - loss 0.03591331 - time (sec): 8.84 - samples/sec: 2566.86 - lr: 0.000020 - momentum: 0.000000
2023-10-16 09:47:56,198 epoch 5 - iter 356/1786 - loss 0.04195969 - time (sec): 17.71 - samples/sec: 2718.76 - lr: 0.000019 - momentum: 0.000000
2023-10-16 09:48:05,110 epoch 5 - iter 534/1786 - loss 0.03788302 - time (sec): 26.62 - samples/sec: 2797.45 - lr: 0.000019 - momentum: 0.000000
2023-10-16 09:48:13,656 epoch 5 - iter 712/1786 - loss 0.03811789 - time (sec): 35.17 - samples/sec: 2784.76 - lr: 0.000019 - momentum: 0.000000
2023-10-16 09:48:22,647 epoch 5 - iter 890/1786 - loss 0.04117165 - time (sec): 44.16 - samples/sec: 2805.22 - lr: 0.000018 - momentum: 0.000000
2023-10-16 09:48:31,231 epoch 5 - iter 1068/1786 - loss 0.04034710 - time (sec): 52.74 - samples/sec: 2780.00 - lr: 0.000018 - momentum: 0.000000
2023-10-16 09:48:40,288 epoch 5 - iter 1246/1786 - loss 0.04037781 - time (sec): 61.80 - samples/sec: 2786.91 - lr: 0.000018 - momentum: 0.000000
2023-10-16 09:48:49,175 epoch 5 - iter 1424/1786 - loss 0.04044669 - time (sec): 70.69 - samples/sec: 2783.63 - lr: 0.000017 - momentum: 0.000000
2023-10-16 09:48:58,146 epoch 5 - iter 1602/1786 - loss 0.04252121 - time (sec): 79.66 - samples/sec: 2801.41 - lr: 0.000017 - momentum: 0.000000
2023-10-16 09:49:06,873 epoch 5 - iter 1780/1786 - loss 0.04260854 - time (sec): 88.39 - samples/sec: 2805.68 - lr: 0.000017 - momentum: 0.000000
2023-10-16 09:49:07,179 ----------------------------------------------------------------------------------------------------
2023-10-16 09:49:07,180 EPOCH 5 done: loss 0.0427 - lr: 0.000017
2023-10-16 09:49:12,015 DEV : loss 0.16638240218162537 - f1-score (micro avg) 0.8147
2023-10-16 09:49:12,032 saving best model
2023-10-16 09:49:12,511 ----------------------------------------------------------------------------------------------------
2023-10-16 09:49:21,429 epoch 6 - iter 178/1786 - loss 0.02981234 - time (sec): 8.91 - samples/sec: 2985.12 - lr: 0.000016 - momentum: 0.000000
2023-10-16 09:49:30,086 epoch 6 - iter 356/1786 - loss 0.02843138 - time (sec): 17.57 - samples/sec: 2895.44 - lr: 0.000016 - momentum: 0.000000
2023-10-16 09:49:39,196 epoch 6 - iter 534/1786 - loss 0.02832788 - time (sec): 26.68 - samples/sec: 2815.02 - lr: 0.000016 - momentum: 0.000000
2023-10-16 09:49:47,947 epoch 6 - iter 712/1786 - loss 0.03080102 - time (sec): 35.43 - samples/sec: 2804.26 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:49:56,775 epoch 6 - iter 890/1786 - loss 0.03150369 - time (sec): 44.26 - samples/sec: 2792.99 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:50:05,560 epoch 6 - iter 1068/1786 - loss 0.03084312 - time (sec): 53.04 - samples/sec: 2823.20 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:50:14,018 epoch 6 - iter 1246/1786 - loss 0.03053296 - time (sec): 61.50 - samples/sec: 2812.53 - lr: 0.000014 - momentum: 0.000000
2023-10-16 09:50:22,886 epoch 6 - iter 1424/1786 - loss 0.03100959 - time (sec): 70.37 - samples/sec: 2805.48 - lr: 0.000014 - momentum: 0.000000
2023-10-16 09:50:31,646 epoch 6 - iter 1602/1786 - loss 0.03088695 - time (sec): 79.13 - samples/sec: 2810.14 - lr: 0.000014 - momentum: 0.000000
2023-10-16 09:50:40,534 epoch 6 - iter 1780/1786 - loss 0.03238525 - time (sec): 88.02 - samples/sec: 2815.78 - lr: 0.000013 - momentum: 0.000000
2023-10-16 09:50:40,794 ----------------------------------------------------------------------------------------------------
2023-10-16 09:50:40,794 EPOCH 6 done: loss 0.0323 - lr: 0.000013
2023-10-16 09:50:45,022 DEV : loss 0.19458912312984467 - f1-score (micro avg) 0.7899
2023-10-16 09:50:45,041 ----------------------------------------------------------------------------------------------------
2023-10-16 09:50:53,864 epoch 7 - iter 178/1786 - loss 0.02985635 - time (sec): 8.82 - samples/sec: 2807.19 - lr: 0.000013 - momentum: 0.000000
2023-10-16 09:51:02,562 epoch 7 - iter 356/1786 - loss 0.02581944 - time (sec): 17.52 - samples/sec: 2859.72 - lr: 0.000013 - momentum: 0.000000
2023-10-16 09:51:11,349 epoch 7 - iter 534/1786 - loss 0.02750578 - time (sec): 26.31 - samples/sec: 2859.61 - lr: 0.000012 - momentum: 0.000000
2023-10-16 09:51:20,019 epoch 7 - iter 712/1786 - loss 0.02534929 - time (sec): 34.98 - samples/sec: 2825.31 - lr: 0.000012 - momentum: 0.000000
2023-10-16 09:51:28,872 epoch 7 - iter 890/1786 - loss 0.02438722 - time (sec): 43.83 - samples/sec: 2823.89 - lr: 0.000012 - momentum: 0.000000
2023-10-16 09:51:37,782 epoch 7 - iter 1068/1786 - loss 0.02540106 - time (sec): 52.74 - samples/sec: 2813.95 - lr: 0.000011 - momentum: 0.000000
2023-10-16 09:51:46,965 epoch 7 - iter 1246/1786 - loss 0.02499926 - time (sec): 61.92 - samples/sec: 2809.89 - lr: 0.000011 - momentum: 0.000000
2023-10-16 09:51:55,652 epoch 7 - iter 1424/1786 - loss 0.02543629 - time (sec): 70.61 - samples/sec: 2793.82 - lr: 0.000011 - momentum: 0.000000
2023-10-16 09:52:04,706 epoch 7 - iter 1602/1786 - loss 0.02551404 - time (sec): 79.66 - samples/sec: 2803.81 - lr: 0.000010 - momentum: 0.000000
2023-10-16 09:52:13,431 epoch 7 - iter 1780/1786 - loss 0.02530936 - time (sec): 88.39 - samples/sec: 2805.62 - lr: 0.000010 - momentum: 0.000000
2023-10-16 09:52:13,715 ----------------------------------------------------------------------------------------------------
2023-10-16 09:52:13,715 EPOCH 7 done: loss 0.0253 - lr: 0.000010
2023-10-16 09:52:18,560 DEV : loss 0.1818549633026123 - f1-score (micro avg) 0.8199
2023-10-16 09:52:18,576 saving best model
2023-10-16 09:52:19,113 ----------------------------------------------------------------------------------------------------
2023-10-16 09:52:28,623 epoch 8 - iter 178/1786 - loss 0.02072833 - time (sec): 9.51 - samples/sec: 2849.01 - lr: 0.000010 - momentum: 0.000000
2023-10-16 09:52:37,582 epoch 8 - iter 356/1786 - loss 0.01525016 - time (sec): 18.47 - samples/sec: 2835.03 - lr: 0.000009 - momentum: 0.000000
2023-10-16 09:52:46,252 epoch 8 - iter 534/1786 - loss 0.01644632 - time (sec): 27.14 - samples/sec: 2846.40 - lr: 0.000009 - momentum: 0.000000
2023-10-16 09:52:54,980 epoch 8 - iter 712/1786 - loss 0.01665271 - time (sec): 35.87 - samples/sec: 2813.13 - lr: 0.000009 - momentum: 0.000000
2023-10-16 09:53:04,188 epoch 8 - iter 890/1786 - loss 0.01765939 - time (sec): 45.07 - samples/sec: 2771.55 - lr: 0.000008 - momentum: 0.000000
2023-10-16 09:53:13,163 epoch 8 - iter 1068/1786 - loss 0.01734742 - time (sec): 54.05 - samples/sec: 2745.30 - lr: 0.000008 - momentum: 0.000000
2023-10-16 09:53:22,142 epoch 8 - iter 1246/1786 - loss 0.01734634 - time (sec): 63.03 - samples/sec: 2776.57 - lr: 0.000008 - momentum: 0.000000
2023-10-16 09:53:30,968 epoch 8 - iter 1424/1786 - loss 0.01739432 - time (sec): 71.85 - samples/sec: 2772.27 - lr: 0.000007 - momentum: 0.000000
2023-10-16 09:53:39,721 epoch 8 - iter 1602/1786 - loss 0.01721767 - time (sec): 80.61 - samples/sec: 2754.32 - lr: 0.000007 - momentum: 0.000000
2023-10-16 09:53:48,509 epoch 8 - iter 1780/1786 - loss 0.01770109 - time (sec): 89.39 - samples/sec: 2775.47 - lr: 0.000007 - momentum: 0.000000
2023-10-16 09:53:48,777 ----------------------------------------------------------------------------------------------------
2023-10-16 09:53:48,777 EPOCH 8 done: loss 0.0177 - lr: 0.000007
2023-10-16 09:53:53,564 DEV : loss 0.18963442742824554 - f1-score (micro avg) 0.8128
2023-10-16 09:53:53,580 ----------------------------------------------------------------------------------------------------
2023-10-16 09:54:02,337 epoch 9 - iter 178/1786 - loss 0.01007676 - time (sec): 8.76 - samples/sec: 2909.33 - lr: 0.000006 - momentum: 0.000000
2023-10-16 09:54:11,003 epoch 9 - iter 356/1786 - loss 0.01038145 - time (sec): 17.42 - samples/sec: 2821.31 - lr: 0.000006 - momentum: 0.000000
2023-10-16 09:54:19,677 epoch 9 - iter 534/1786 - loss 0.00929853 - time (sec): 26.10 - samples/sec: 2830.26 - lr: 0.000006 - momentum: 0.000000
2023-10-16 09:54:28,542 epoch 9 - iter 712/1786 - loss 0.00942723 - time (sec): 34.96 - samples/sec: 2845.40 - lr: 0.000005 - momentum: 0.000000
2023-10-16 09:54:37,262 epoch 9 - iter 890/1786 - loss 0.01080356 - time (sec): 43.68 - samples/sec: 2813.33 - lr: 0.000005 - momentum: 0.000000
2023-10-16 09:54:45,873 epoch 9 - iter 1068/1786 - loss 0.01145128 - time (sec): 52.29 - samples/sec: 2815.46 - lr: 0.000005 - momentum: 0.000000
2023-10-16 09:54:54,450 epoch 9 - iter 1246/1786 - loss 0.01179870 - time (sec): 60.87 - samples/sec: 2823.11 - lr: 0.000004 - momentum: 0.000000
2023-10-16 09:55:03,201 epoch 9 - iter 1424/1786 - loss 0.01186486 - time (sec): 69.62 - samples/sec: 2824.56 - lr: 0.000004 - momentum: 0.000000
2023-10-16 09:55:11,956 epoch 9 - iter 1602/1786 - loss 0.01252162 - time (sec): 78.37 - samples/sec: 2820.68 - lr: 0.000004 - momentum: 0.000000
2023-10-16 09:55:21,133 epoch 9 - iter 1780/1786 - loss 0.01260169 - time (sec): 87.55 - samples/sec: 2832.02 - lr: 0.000003 - momentum: 0.000000
2023-10-16 09:55:21,408 ----------------------------------------------------------------------------------------------------
2023-10-16 09:55:21,408 EPOCH 9 done: loss 0.0126 - lr: 0.000003
2023-10-16 09:55:25,491 DEV : loss 0.20145297050476074 - f1-score (micro avg) 0.8042
2023-10-16 09:55:25,507 ----------------------------------------------------------------------------------------------------
2023-10-16 09:55:34,238 epoch 10 - iter 178/1786 - loss 0.00640268 - time (sec): 8.73 - samples/sec: 2762.39 - lr: 0.000003 - momentum: 0.000000
2023-10-16 09:55:43,024 epoch 10 - iter 356/1786 - loss 0.00598649 - time (sec): 17.52 - samples/sec: 2823.54 - lr: 0.000003 - momentum: 0.000000
2023-10-16 09:55:51,861 epoch 10 - iter 534/1786 - loss 0.00789000 - time (sec): 26.35 - samples/sec: 2827.16 - lr: 0.000002 - momentum: 0.000000
2023-10-16 09:56:00,615 epoch 10 - iter 712/1786 - loss 0.00896349 - time (sec): 35.11 - samples/sec: 2836.69 - lr: 0.000002 - momentum: 0.000000
2023-10-16 09:56:09,212 epoch 10 - iter 890/1786 - loss 0.01000578 - time (sec): 43.70 - samples/sec: 2850.59 - lr: 0.000002 - momentum: 0.000000
2023-10-16 09:56:18,040 epoch 10 - iter 1068/1786 - loss 0.00947271 - time (sec): 52.53 - samples/sec: 2855.88 - lr: 0.000001 - momentum: 0.000000
2023-10-16 09:56:27,011 epoch 10 - iter 1246/1786 - loss 0.00883309 - time (sec): 61.50 - samples/sec: 2856.10 - lr: 0.000001 - momentum: 0.000000
2023-10-16 09:56:35,737 epoch 10 - iter 1424/1786 - loss 0.00874636 - time (sec): 70.23 - samples/sec: 2845.64 - lr: 0.000001 - momentum: 0.000000
2023-10-16 09:56:44,623 epoch 10 - iter 1602/1786 - loss 0.00833858 - time (sec): 79.11 - samples/sec: 2839.53 - lr: 0.000000 - momentum: 0.000000
2023-10-16 09:56:53,522 epoch 10 - iter 1780/1786 - loss 0.00829271 - time (sec): 88.01 - samples/sec: 2820.09 - lr: 0.000000 - momentum: 0.000000
2023-10-16 09:56:53,791 ----------------------------------------------------------------------------------------------------
2023-10-16 09:56:53,791 EPOCH 10 done: loss 0.0083 - lr: 0.000000
2023-10-16 09:56:58,429 DEV : loss 0.20914104580879211 - f1-score (micro avg) 0.8046
2023-10-16 09:56:58,857 ----------------------------------------------------------------------------------------------------
2023-10-16 09:56:58,858 Loading model from best epoch ...
2023-10-16 09:57:00,383 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 09:57:09,993
Results:
- F-score (micro) 0.6947
- F-score (macro) 0.6317
- Accuracy 0.546

By class:
              precision    recall  f1-score   support

         LOC     0.6816    0.7078    0.6944      1095
         PER     0.7505    0.7757    0.7629      1012
         ORG     0.4963    0.5686    0.5300       357
   HumanProd     0.4286    0.7273    0.5393        33

   micro avg     0.6748    0.7157    0.6947      2497
   macro avg     0.5893    0.6948    0.6317      2497
weighted avg     0.6797    0.7157    0.6966      2497
2023-10-16 09:57:09,993 ----------------------------------------------------------------------------------------------------