2023-10-16 09:41:24,547 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=768, out_features=768, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Train:  7142 sentences
2023-10-16 09:41:24,548         (train_with_dev=False, train_with_test=False)
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Training Params:
2023-10-16 09:41:24,548  - learning_rate: "3e-05"
2023-10-16 09:41:24,548  - mini_batch_size: "4"
2023-10-16 09:41:24,548  - max_epochs: "10"
2023-10-16 09:41:24,548  - shuffle: "True"
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Plugins:
2023-10-16 09:41:24,548  - LinearScheduler | warmup_fraction: '0.1'
2023-10-16 09:41:24,548 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,548 Final evaluation on model from best epoch (best-model.pt)
2023-10-16 09:41:24,548  - metric: "('micro avg', 'f1-score')"
2023-10-16 09:41:24,549 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,549 Computation:
2023-10-16 09:41:24,549  - compute on device: cuda:0
2023-10-16 09:41:24,549  - embedding storage: none
2023-10-16 09:41:24,549 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,549 Model training base path: "hmbench-newseye/fr-dbmdz/bert-base-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-2"
2023-10-16 09:41:24,549 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:24,549 ----------------------------------------------------------------------------------------------------
2023-10-16 09:41:33,533 epoch 1 - iter 178/1786 - loss 2.31725358 - time (sec): 8.98 - samples/sec: 2906.50 - lr: 0.000003 - momentum: 0.000000
2023-10-16 09:41:42,484 epoch 1 - iter 356/1786 - loss 1.47845564 - time (sec): 17.93 - samples/sec: 2820.10 - lr: 0.000006 - momentum: 0.000000
2023-10-16 09:41:51,268 epoch 1 - iter 534/1786 - loss 1.11555514 - time (sec): 26.72 - samples/sec: 2815.66 - lr: 0.000009 - momentum: 0.000000
2023-10-16 09:42:00,133 epoch 1 - iter 712/1786 - loss 0.91749738 - time (sec): 35.58 - samples/sec: 2796.69 - lr: 0.000012 - momentum: 0.000000
2023-10-16 09:42:08,756 epoch 1 - iter 890/1786 - loss 0.78960784 - time (sec): 44.21 - samples/sec: 2780.89 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:42:17,850 epoch 1 - iter 1068/1786 - loss 0.69748291 - time (sec): 53.30 - samples/sec: 2751.43 - lr: 0.000018 - momentum: 0.000000
2023-10-16 09:42:27,008 epoch 1 - iter 1246/1786 - loss 0.62001255 - time (sec): 62.46 - samples/sec: 2766.94 - lr: 0.000021 - momentum: 0.000000
2023-10-16 09:42:36,105 epoch 1 - iter 1424/1786 - loss 0.55925070 - time (sec): 71.56 - samples/sec: 2773.56 - lr: 0.000024 - momentum: 0.000000
2023-10-16 09:42:45,037 epoch 1 - iter 1602/1786 - loss 0.51380389 - time (sec): 80.49 - samples/sec: 2771.52 - lr: 0.000027 - momentum: 0.000000
2023-10-16 09:42:54,388 epoch 1 - iter 1780/1786 - loss 0.47570966 - time (sec): 89.84 - samples/sec: 2760.26 - lr: 0.000030 - momentum: 0.000000
2023-10-16 09:42:54,673 ----------------------------------------------------------------------------------------------------
2023-10-16 09:42:54,673 EPOCH 1 done: loss 0.4746 - lr: 0.000030
2023-10-16 09:42:57,223 DEV : loss 0.14207538962364197 - f1-score (micro avg)  0.6781
2023-10-16 09:42:57,239 saving best model
2023-10-16 09:42:57,669 ----------------------------------------------------------------------------------------------------
2023-10-16 09:43:06,735 epoch 2 - iter 178/1786 - loss 0.10458942 - time (sec): 9.06 - samples/sec: 2828.77 - lr: 0.000030 - momentum: 0.000000
2023-10-16 09:43:15,763 epoch 2 - iter 356/1786 - loss 0.11289530 - time (sec): 18.09 - samples/sec: 2789.86 - lr: 0.000029 - momentum: 0.000000
2023-10-16 09:43:24,751 epoch 2 - iter 534/1786 - loss 0.11838363 - time (sec): 27.08 - samples/sec: 2726.46 - lr: 0.000029 - momentum: 0.000000
2023-10-16 09:43:33,440 epoch 2 - iter 712/1786 - loss 0.11649418 - time (sec): 35.77 - samples/sec: 2770.55 - lr: 0.000029 - momentum: 0.000000
2023-10-16 09:43:42,555 epoch 2 - iter 890/1786 - loss 0.11836022 - time (sec): 44.88 - samples/sec: 2788.32 - lr: 0.000028 - momentum: 0.000000
2023-10-16 09:43:51,148 epoch 2 - iter 1068/1786 - loss 0.11714005 - time (sec): 53.48 - samples/sec: 2794.44 - lr: 0.000028 - momentum: 0.000000
2023-10-16 09:43:59,933 epoch 2 - iter 1246/1786 - loss 0.11797601 - time (sec): 62.26 - samples/sec: 2784.34 - lr: 0.000028 - momentum: 0.000000
2023-10-16 09:44:08,599 epoch 2 - iter 1424/1786 - loss 0.11713784 - time (sec): 70.93 - samples/sec: 2792.29 - lr: 0.000027 - momentum: 0.000000
2023-10-16 09:44:17,368 epoch 2 - iter 1602/1786 - loss 0.11647325 - time (sec): 79.70 - samples/sec: 2814.52 - lr: 0.000027 - momentum: 0.000000
2023-10-16 09:44:26,201 epoch 2 - iter 1780/1786 - loss 0.11414611 - time (sec): 88.53 - samples/sec: 2803.96 - lr: 0.000027 - momentum: 0.000000
2023-10-16 09:44:26,489 ----------------------------------------------------------------------------------------------------
2023-10-16 09:44:26,489 EPOCH 2 done: loss 0.1141 - lr: 0.000027
2023-10-16 09:44:31,250 DEV : loss 0.12184497714042664 - f1-score (micro avg)  0.7654
2023-10-16 09:44:31,266 saving best model
2023-10-16 09:44:31,758 ----------------------------------------------------------------------------------------------------
2023-10-16 09:44:40,509 epoch 3 - iter 178/1786 - loss 0.07028279 - time (sec): 8.75 - samples/sec: 2717.53 - lr: 0.000026 - momentum: 0.000000
2023-10-16 09:44:49,387 epoch 3 - iter 356/1786 - loss 0.07414788 - time (sec): 17.63 - samples/sec: 2805.34 - lr: 0.000026 - momentum: 0.000000
2023-10-16 09:44:58,046 epoch 3 - iter 534/1786 - loss 0.07616466 - time (sec): 26.29 - samples/sec: 2829.08 - lr: 0.000026 - momentum: 0.000000
2023-10-16 09:45:07,014 epoch 3 - iter 712/1786 - loss 0.07702373 - time (sec): 35.25 - samples/sec: 2857.75 - lr: 0.000025 - momentum: 0.000000
2023-10-16 09:45:15,560 epoch 3 - iter 890/1786 - loss 0.07733931 - time (sec): 43.80 - samples/sec: 2857.14 - lr: 0.000025 - momentum: 0.000000
2023-10-16 09:45:24,052 epoch 3 - iter 1068/1786 - loss 0.07927594 - time (sec): 52.29 - samples/sec: 2845.37 - lr: 0.000025 - momentum: 0.000000
2023-10-16 09:45:32,889 epoch 3 - iter 1246/1786 - loss 0.07824875 - time (sec): 61.13 - samples/sec: 2824.98 - lr: 0.000024 - momentum: 0.000000
2023-10-16 09:45:41,842 epoch 3 - iter 1424/1786 - loss 0.07833523 - time (sec): 70.08 - samples/sec: 2817.66 - lr: 0.000024 - momentum: 0.000000
2023-10-16 09:45:50,813 epoch 3 - iter 1602/1786 - loss 0.07784066 - time (sec): 79.05 - samples/sec: 2805.85 - lr: 0.000024 - momentum: 0.000000
2023-10-16 09:45:59,884 epoch 3 - iter 1780/1786 - loss 0.07736904 - time (sec): 88.12 - samples/sec: 2811.90 - lr: 0.000023 - momentum: 0.000000
2023-10-16 09:46:00,242 ----------------------------------------------------------------------------------------------------
2023-10-16 09:46:00,242 EPOCH 3 done: loss 0.0775 - lr: 0.000023
2023-10-16 09:46:04,416 DEV : loss 0.12185992300510406 - f1-score (micro avg)  0.7936
2023-10-16 09:46:04,433 saving best model
2023-10-16 09:46:04,894 ----------------------------------------------------------------------------------------------------
2023-10-16 09:46:14,224 epoch 4 - iter 178/1786 - loss 0.04631419 - time (sec): 9.33 - samples/sec: 2653.86 - lr: 0.000023 - momentum: 0.000000
2023-10-16 09:46:22,793 epoch 4 - iter 356/1786 - loss 0.05317882 - time (sec): 17.89 - samples/sec: 2767.12 - lr: 0.000023 - momentum: 0.000000
2023-10-16 09:46:31,654 epoch 4 - iter 534/1786 - loss 0.05386557 - time (sec): 26.76 - samples/sec: 2755.06 - lr: 0.000022 - momentum: 0.000000
2023-10-16 09:46:40,697 epoch 4 - iter 712/1786 - loss 0.05265052 - time (sec): 35.80 - samples/sec: 2794.02 - lr: 0.000022 - momentum: 0.000000
2023-10-16 09:46:49,295 epoch 4 - iter 890/1786 - loss 0.05513144 - time (sec): 44.40 - samples/sec: 2788.99 - lr: 0.000022 - momentum: 0.000000
2023-10-16 09:46:57,956 epoch 4 - iter 1068/1786 - loss 0.05490959 - time (sec): 53.06 - samples/sec: 2787.52 - lr: 0.000021 - momentum: 0.000000
2023-10-16 09:47:06,623 epoch 4 - iter 1246/1786 - loss 0.05446774 - time (sec): 61.72 - samples/sec: 2798.70 - lr: 0.000021 - momentum: 0.000000
2023-10-16 09:47:15,185 epoch 4 - iter 1424/1786 - loss 0.05609331 - time (sec): 70.29 - samples/sec: 2801.47 - lr: 0.000021 - momentum: 0.000000
2023-10-16 09:47:23,833 epoch 4 - iter 1602/1786 - loss 0.05699963 - time (sec): 78.93 - samples/sec: 2789.90 - lr: 0.000020 - momentum: 0.000000
2023-10-16 09:47:33,473 epoch 4 - iter 1780/1786 - loss 0.05709770 - time (sec): 88.57 - samples/sec: 2799.92 - lr: 0.000020 - momentum: 0.000000
2023-10-16 09:47:33,791 ----------------------------------------------------------------------------------------------------
2023-10-16 09:47:33,791 EPOCH 4 done: loss 0.0572 - lr: 0.000020
2023-10-16 09:47:37,982 DEV : loss 0.1453891098499298 - f1-score (micro avg)  0.8098
2023-10-16 09:47:38,000 saving best model
2023-10-16 09:47:38,481 ----------------------------------------------------------------------------------------------------
2023-10-16 09:47:47,325 epoch 5 - iter 178/1786 - loss 0.03591331 - time (sec): 8.84 - samples/sec: 2566.86 - lr: 0.000020 - momentum: 0.000000
2023-10-16 09:47:56,198 epoch 5 - iter 356/1786 - loss 0.04195969 - time (sec): 17.71 - samples/sec: 2718.76 - lr: 0.000019 - momentum: 0.000000
2023-10-16 09:48:05,110 epoch 5 - iter 534/1786 - loss 0.03788302 - time (sec): 26.62 - samples/sec: 2797.45 - lr: 0.000019 - momentum: 0.000000
2023-10-16 09:48:13,656 epoch 5 - iter 712/1786 - loss 0.03811789 - time (sec): 35.17 - samples/sec: 2784.76 - lr: 0.000019 - momentum: 0.000000
2023-10-16 09:48:22,647 epoch 5 - iter 890/1786 - loss 0.04117165 - time (sec): 44.16 - samples/sec: 2805.22 - lr: 0.000018 - momentum: 0.000000
2023-10-16 09:48:31,231 epoch 5 - iter 1068/1786 - loss 0.04034710 - time (sec): 52.74 - samples/sec: 2780.00 - lr: 0.000018 - momentum: 0.000000
2023-10-16 09:48:40,288 epoch 5 - iter 1246/1786 - loss 0.04037781 - time (sec): 61.80 - samples/sec: 2786.91 - lr: 0.000018 - momentum: 0.000000
2023-10-16 09:48:49,175 epoch 5 - iter 1424/1786 - loss 0.04044669 - time (sec): 70.69 - samples/sec: 2783.63 - lr: 0.000017 - momentum: 0.000000
2023-10-16 09:48:58,146 epoch 5 - iter 1602/1786 - loss 0.04252121 - time (sec): 79.66 - samples/sec: 2801.41 - lr: 0.000017 - momentum: 0.000000
2023-10-16 09:49:06,873 epoch 5 - iter 1780/1786 - loss 0.04260854 - time (sec): 88.39 - samples/sec: 2805.68 - lr: 0.000017 - momentum: 0.000000
2023-10-16 09:49:07,179 ----------------------------------------------------------------------------------------------------
2023-10-16 09:49:07,180 EPOCH 5 done: loss 0.0427 - lr: 0.000017
2023-10-16 09:49:12,015 DEV : loss 0.16638240218162537 - f1-score (micro avg)  0.8147
2023-10-16 09:49:12,032 saving best model
2023-10-16 09:49:12,511 ----------------------------------------------------------------------------------------------------
2023-10-16 09:49:21,429 epoch 6 - iter 178/1786 - loss 0.02981234 - time (sec): 8.91 - samples/sec: 2985.12 - lr: 0.000016 - momentum: 0.000000
2023-10-16 09:49:30,086 epoch 6 - iter 356/1786 - loss 0.02843138 - time (sec): 17.57 - samples/sec: 2895.44 - lr: 0.000016 - momentum: 0.000000
2023-10-16 09:49:39,196 epoch 6 - iter 534/1786 - loss 0.02832788 - time (sec): 26.68 - samples/sec: 2815.02 - lr: 0.000016 - momentum: 0.000000
2023-10-16 09:49:47,947 epoch 6 - iter 712/1786 - loss 0.03080102 - time (sec): 35.43 - samples/sec: 2804.26 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:49:56,775 epoch 6 - iter 890/1786 - loss 0.03150369 - time (sec): 44.26 - samples/sec: 2792.99 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:50:05,560 epoch 6 - iter 1068/1786 - loss 0.03084312 - time (sec): 53.04 - samples/sec: 2823.20 - lr: 0.000015 - momentum: 0.000000
2023-10-16 09:50:14,018 epoch 6 - iter 1246/1786 - loss 0.03053296 - time (sec): 61.50 - samples/sec: 2812.53 - lr: 0.000014 - momentum: 0.000000
2023-10-16 09:50:22,886 epoch 6 - iter 1424/1786 - loss 0.03100959 - time (sec): 70.37 - samples/sec: 2805.48 - lr: 0.000014 - momentum: 0.000000
2023-10-16 09:50:31,646 epoch 6 - iter 1602/1786 - loss 0.03088695 - time (sec): 79.13 - samples/sec: 2810.14 - lr: 0.000014 - momentum: 0.000000
2023-10-16 09:50:40,534 epoch 6 - iter 1780/1786 - loss 0.03238525 - time (sec): 88.02 - samples/sec: 2815.78 - lr: 0.000013 - momentum: 0.000000
2023-10-16 09:50:40,794 ----------------------------------------------------------------------------------------------------
2023-10-16 09:50:40,794 EPOCH 6 done: loss 0.0323 - lr: 0.000013
2023-10-16 09:50:45,022 DEV : loss 0.19458912312984467 - f1-score (micro avg)  0.7899
2023-10-16 09:50:45,041 ----------------------------------------------------------------------------------------------------
2023-10-16 09:50:53,864 epoch 7 - iter 178/1786 - loss 0.02985635 - time (sec): 8.82 - samples/sec: 2807.19 - lr: 0.000013 - momentum: 0.000000
2023-10-16 09:51:02,562 epoch 7 - iter 356/1786 - loss 0.02581944 - time (sec): 17.52 - samples/sec: 2859.72 - lr: 0.000013 - momentum: 0.000000
2023-10-16 09:51:11,349 epoch 7 - iter 534/1786 - loss 0.02750578 - time (sec): 26.31 - samples/sec: 2859.61 - lr: 0.000012 - momentum: 0.000000
2023-10-16 09:51:20,019 epoch 7 - iter 712/1786 - loss 0.02534929 - time (sec): 34.98 - samples/sec: 2825.31 - lr: 0.000012 - momentum: 0.000000
2023-10-16 09:51:28,872 epoch 7 - iter 890/1786 - loss 0.02438722 - time (sec): 43.83 - samples/sec: 2823.89 - lr: 0.000012 - momentum: 0.000000
2023-10-16 09:51:37,782 epoch 7 - iter 1068/1786 - loss 0.02540106 - time (sec): 52.74 - samples/sec: 2813.95 - lr: 0.000011 - momentum: 0.000000
2023-10-16 09:51:46,965 epoch 7 - iter 1246/1786 - loss 0.02499926 - time (sec): 61.92 - samples/sec: 2809.89 - lr: 0.000011 - momentum: 0.000000
2023-10-16 09:51:55,652 epoch 7 - iter 1424/1786 - loss 0.02543629 - time (sec): 70.61 - samples/sec: 2793.82 - lr: 0.000011 - momentum: 0.000000
2023-10-16 09:52:04,706 epoch 7 - iter 1602/1786 - loss 0.02551404 - time (sec): 79.66 - samples/sec: 2803.81 - lr: 0.000010 - momentum: 0.000000
2023-10-16 09:52:13,431 epoch 7 - iter 1780/1786 - loss 0.02530936 - time (sec): 88.39 - samples/sec: 2805.62 - lr: 0.000010 - momentum: 0.000000
2023-10-16 09:52:13,715 ----------------------------------------------------------------------------------------------------
2023-10-16 09:52:13,715 EPOCH 7 done: loss 0.0253 - lr: 0.000010
2023-10-16 09:52:18,560 DEV : loss 0.1818549633026123 - f1-score (micro avg)  0.8199
2023-10-16 09:52:18,576 saving best model
2023-10-16 09:52:19,113 ----------------------------------------------------------------------------------------------------
2023-10-16 09:52:28,623 epoch 8 - iter 178/1786 - loss 0.02072833 - time (sec): 9.51 - samples/sec: 2849.01 - lr: 0.000010 - momentum: 0.000000
2023-10-16 09:52:37,582 epoch 8 - iter 356/1786 - loss 0.01525016 - time (sec): 18.47 - samples/sec: 2835.03 - lr: 0.000009 - momentum: 0.000000
2023-10-16 09:52:46,252 epoch 8 - iter 534/1786 - loss 0.01644632 - time (sec): 27.14 - samples/sec: 2846.40 - lr: 0.000009 - momentum: 0.000000
2023-10-16 09:52:54,980 epoch 8 - iter 712/1786 - loss 0.01665271 - time (sec): 35.87 - samples/sec: 2813.13 - lr: 0.000009 - momentum: 0.000000
2023-10-16 09:53:04,188 epoch 8 - iter 890/1786 - loss 0.01765939 - time (sec): 45.07 - samples/sec: 2771.55 - lr: 0.000008 - momentum: 0.000000
2023-10-16 09:53:13,163 epoch 8 - iter 1068/1786 - loss 0.01734742 - time (sec): 54.05 - samples/sec: 2745.30 - lr: 0.000008 - momentum: 0.000000
2023-10-16 09:53:22,142 epoch 8 - iter 1246/1786 - loss 0.01734634 - time (sec): 63.03 - samples/sec: 2776.57 - lr: 0.000008 - momentum: 0.000000
2023-10-16 09:53:30,968 epoch 8 - iter 1424/1786 - loss 0.01739432 - time (sec): 71.85 - samples/sec: 2772.27 - lr: 0.000007 - momentum: 0.000000
2023-10-16 09:53:39,721 epoch 8 - iter 1602/1786 - loss 0.01721767 - time (sec): 80.61 - samples/sec: 2754.32 - lr: 0.000007 - momentum: 0.000000
2023-10-16 09:53:48,509 epoch 8 - iter 1780/1786 - loss 0.01770109 - time (sec): 89.39 - samples/sec: 2775.47 - lr: 0.000007 - momentum: 0.000000
2023-10-16 09:53:48,777 ----------------------------------------------------------------------------------------------------
2023-10-16 09:53:48,777 EPOCH 8 done: loss 0.0177 - lr: 0.000007
2023-10-16 09:53:53,564 DEV : loss 0.18963442742824554 - f1-score (micro avg)  0.8128
2023-10-16 09:53:53,580 ----------------------------------------------------------------------------------------------------
2023-10-16 09:54:02,337 epoch 9 - iter 178/1786 - loss 0.01007676 - time (sec): 8.76 - samples/sec: 2909.33 - lr: 0.000006 - momentum: 0.000000
2023-10-16 09:54:11,003 epoch 9 - iter 356/1786 - loss 0.01038145 - time (sec): 17.42 - samples/sec: 2821.31 - lr: 0.000006 - momentum: 0.000000
2023-10-16 09:54:19,677 epoch 9 - iter 534/1786 - loss 0.00929853 - time (sec): 26.10 - samples/sec: 2830.26 - lr: 0.000006 - momentum: 0.000000
2023-10-16 09:54:28,542 epoch 9 - iter 712/1786 - loss 0.00942723 - time (sec): 34.96 - samples/sec: 2845.40 - lr: 0.000005 - momentum: 0.000000
2023-10-16 09:54:37,262 epoch 9 - iter 890/1786 - loss 0.01080356 - time (sec): 43.68 - samples/sec: 2813.33 - lr: 0.000005 - momentum: 0.000000
2023-10-16 09:54:45,873 epoch 9 - iter 1068/1786 - loss 0.01145128 - time (sec): 52.29 - samples/sec: 2815.46 - lr: 0.000005 - momentum: 0.000000
2023-10-16 09:54:54,450 epoch 9 - iter 1246/1786 - loss 0.01179870 - time (sec): 60.87 - samples/sec: 2823.11 - lr: 0.000004 - momentum: 0.000000
2023-10-16 09:55:03,201 epoch 9 - iter 1424/1786 - loss 0.01186486 - time (sec): 69.62 - samples/sec: 2824.56 - lr: 0.000004 - momentum: 0.000000
2023-10-16 09:55:11,956 epoch 9 - iter 1602/1786 - loss 0.01252162 - time (sec): 78.37 - samples/sec: 2820.68 - lr: 0.000004 - momentum: 0.000000
2023-10-16 09:55:21,133 epoch 9 - iter 1780/1786 - loss 0.01260169 - time (sec): 87.55 - samples/sec: 2832.02 - lr: 0.000003 - momentum: 0.000000
2023-10-16 09:55:21,408 ----------------------------------------------------------------------------------------------------
2023-10-16 09:55:21,408 EPOCH 9 done: loss 0.0126 - lr: 0.000003
2023-10-16 09:55:25,491 DEV : loss 0.20145297050476074 - f1-score (micro avg)  0.8042
2023-10-16 09:55:25,507 ----------------------------------------------------------------------------------------------------
2023-10-16 09:55:34,238 epoch 10 - iter 178/1786 - loss 0.00640268 - time (sec): 8.73 - samples/sec: 2762.39 - lr: 0.000003 - momentum: 0.000000
2023-10-16 09:55:43,024 epoch 10 - iter 356/1786 - loss 0.00598649 - time (sec): 17.52 - samples/sec: 2823.54 - lr: 0.000003 - momentum: 0.000000
2023-10-16 09:55:51,861 epoch 10 - iter 534/1786 - loss 0.00789000 - time (sec): 26.35 - samples/sec: 2827.16 - lr: 0.000002 - momentum: 0.000000
2023-10-16 09:56:00,615 epoch 10 - iter 712/1786 - loss 0.00896349 - time (sec): 35.11 - samples/sec: 2836.69 - lr: 0.000002 - momentum: 0.000000
2023-10-16 09:56:09,212 epoch 10 - iter 890/1786 - loss 0.01000578 - time (sec): 43.70 - samples/sec: 2850.59 - lr: 0.000002 - momentum: 0.000000
2023-10-16 09:56:18,040 epoch 10 - iter 1068/1786 - loss 0.00947271 - time (sec): 52.53 - samples/sec: 2855.88 - lr: 0.000001 - momentum: 0.000000
2023-10-16 09:56:27,011 epoch 10 - iter 1246/1786 - loss 0.00883309 - time (sec): 61.50 - samples/sec: 2856.10 - lr: 0.000001 - momentum: 0.000000
2023-10-16 09:56:35,737 epoch 10 - iter 1424/1786 - loss 0.00874636 - time (sec): 70.23 - samples/sec: 2845.64 - lr: 0.000001 - momentum: 0.000000
2023-10-16 09:56:44,623 epoch 10 - iter 1602/1786 - loss 0.00833858 - time (sec): 79.11 - samples/sec: 2839.53 - lr: 0.000000 - momentum: 0.000000
2023-10-16 09:56:53,522 epoch 10 - iter 1780/1786 - loss 0.00829271 - time (sec): 88.01 - samples/sec: 2820.09 - lr: 0.000000 - momentum: 0.000000
2023-10-16 09:56:53,791 ----------------------------------------------------------------------------------------------------
2023-10-16 09:56:53,791 EPOCH 10 done: loss 0.0083 - lr: 0.000000
2023-10-16 09:56:58,429 DEV : loss 0.20914104580879211 - f1-score (micro avg)  0.8046
2023-10-16 09:56:58,857 ----------------------------------------------------------------------------------------------------
2023-10-16 09:56:58,858 Loading model from best epoch ...
2023-10-16 09:57:00,383 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-16 09:57:09,993
Results:
- F-score (micro) 0.6947
- F-score (macro) 0.6317
- Accuracy 0.546

By class:
              precision    recall  f1-score   support

         LOC     0.6816    0.7078    0.6944      1095
         PER     0.7505    0.7757    0.7629      1012
         ORG     0.4963    0.5686    0.5300       357
   HumanProd     0.4286    0.7273    0.5393        33

   micro avg     0.6748    0.7157    0.6947      2497
   macro avg     0.5893    0.6948    0.6317      2497
weighted avg     0.6797    0.7157    0.6966      2497

2023-10-16 09:57:09,993 ----------------------------------------------------------------------------------------------------