2023-10-17 12:25:36,997 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,998 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): ElectraModel(
      (embeddings): ElectraEmbeddings(
        (word_embeddings): Embedding(32001, 768)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): ElectraEncoder(
        (layer): ModuleList(
          (0-11): 12 x ElectraLayer(
            (attention): ElectraAttention(
              (self): ElectraSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): ElectraSelfOutput(
                (dense): Linear(in_features=768, out_features=768, bias=True)
                (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): ElectraIntermediate(
              (dense): Linear(in_features=768, out_features=3072, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): ElectraOutput(
              (dense): Linear(in_features=3072, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=768, out_features=17, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-17 12:25:36,998 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,998 MultiCorpus: 7142 train + 698 dev + 2570 test sentences
 - NER_HIPE_2022 Corpus: 7142 train + 698 dev + 2570 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/newseye/fr/with_doc_seperator
2023-10-17 12:25:36,998 
----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,998 Train: 7142 sentences
2023-10-17 12:25:36,998 (train_with_dev=False, train_with_test=False)
2023-10-17 12:25:36,998 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,998 Training Params:
2023-10-17 12:25:36,998 - learning_rate: "5e-05"
2023-10-17 12:25:36,998 - mini_batch_size: "8"
2023-10-17 12:25:36,998 - max_epochs: "10"
2023-10-17 12:25:36,998 - shuffle: "True"
2023-10-17 12:25:36,998 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,998 Plugins:
2023-10-17 12:25:36,998 - TensorboardLogger
2023-10-17 12:25:36,998 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 12:25:36,999 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,999 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 12:25:36,999 - metric: "('micro avg', 'f1-score')"
2023-10-17 12:25:36,999 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,999 Computation:
2023-10-17 12:25:36,999 - compute on device: cuda:0
2023-10-17 12:25:36,999 - embedding storage: none
2023-10-17 12:25:36,999 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,999 Model training base path: "hmbench-newseye/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-1"
2023-10-17 12:25:36,999 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,999 ----------------------------------------------------------------------------------------------------
2023-10-17 12:25:36,999 Logging anything other than scalars to TensorBoard 
is currently not supported.
2023-10-17 12:25:44,054 epoch 1 - iter 89/893 - loss 2.51270054 - time (sec): 7.05 - samples/sec: 3473.60 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:25:50,693 epoch 1 - iter 178/893 - loss 1.56797013 - time (sec): 13.69 - samples/sec: 3627.57 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:25:57,490 epoch 1 - iter 267/893 - loss 1.16082935 - time (sec): 20.49 - samples/sec: 3664.35 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:26:04,127 epoch 1 - iter 356/893 - loss 0.95358020 - time (sec): 27.13 - samples/sec: 3623.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:26:11,127 epoch 1 - iter 445/893 - loss 0.80942724 - time (sec): 34.13 - samples/sec: 3607.41 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:26:18,044 epoch 1 - iter 534/893 - loss 0.70411281 - time (sec): 41.04 - samples/sec: 3613.53 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:26:24,723 epoch 1 - iter 623/893 - loss 0.63046341 - time (sec): 47.72 - samples/sec: 3617.11 - lr: 0.000035 - momentum: 0.000000
2023-10-17 12:26:31,917 epoch 1 - iter 712/893 - loss 0.56581935 - time (sec): 54.92 - samples/sec: 3609.36 - lr: 0.000040 - momentum: 0.000000
2023-10-17 12:26:39,013 epoch 1 - iter 801/893 - loss 0.52137910 - time (sec): 62.01 - samples/sec: 3587.67 - lr: 0.000045 - momentum: 0.000000
2023-10-17 12:26:46,067 epoch 1 - iter 890/893 - loss 0.48321528 - time (sec): 69.07 - samples/sec: 3590.42 - lr: 0.000050 - momentum: 0.000000
2023-10-17 12:26:46,259 ----------------------------------------------------------------------------------------------------
2023-10-17 12:26:46,260 EPOCH 1 done: loss 0.4823 - lr: 0.000050
2023-10-17 12:26:49,326 DEV : loss 0.10747521370649338 - f1-score (micro avg) 0.7346
2023-10-17 12:26:49,342 saving best model
2023-10-17 12:26:49,698 ----------------------------------------------------------------------------------------------------
2023-10-17 12:26:55,961 epoch 2 - iter 89/893 - loss 0.13115492 - time (sec): 6.26 - 
samples/sec: 3759.94 - lr: 0.000049 - momentum: 0.000000
2023-10-17 12:27:02,936 epoch 2 - iter 178/893 - loss 0.12309626 - time (sec): 13.24 - samples/sec: 3673.41 - lr: 0.000049 - momentum: 0.000000
2023-10-17 12:27:10,293 epoch 2 - iter 267/893 - loss 0.11730129 - time (sec): 20.59 - samples/sec: 3585.41 - lr: 0.000048 - momentum: 0.000000
2023-10-17 12:27:17,302 epoch 2 - iter 356/893 - loss 0.11326307 - time (sec): 27.60 - samples/sec: 3571.89 - lr: 0.000048 - momentum: 0.000000
2023-10-17 12:27:24,123 epoch 2 - iter 445/893 - loss 0.11005394 - time (sec): 34.42 - samples/sec: 3578.68 - lr: 0.000047 - momentum: 0.000000
2023-10-17 12:27:31,020 epoch 2 - iter 534/893 - loss 0.11030052 - time (sec): 41.32 - samples/sec: 3591.15 - lr: 0.000047 - momentum: 0.000000
2023-10-17 12:27:38,052 epoch 2 - iter 623/893 - loss 0.11047052 - time (sec): 48.35 - samples/sec: 3563.48 - lr: 0.000046 - momentum: 0.000000
2023-10-17 12:27:44,943 epoch 2 - iter 712/893 - loss 0.10912533 - time (sec): 55.24 - samples/sec: 3573.03 - lr: 0.000046 - momentum: 0.000000
2023-10-17 12:27:52,179 epoch 2 - iter 801/893 - loss 0.10854466 - time (sec): 62.48 - samples/sec: 3599.44 - lr: 0.000045 - momentum: 0.000000
2023-10-17 12:27:59,024 epoch 2 - iter 890/893 - loss 0.10807744 - time (sec): 69.32 - samples/sec: 3576.74 - lr: 0.000044 - momentum: 0.000000
2023-10-17 12:27:59,280 ----------------------------------------------------------------------------------------------------
2023-10-17 12:27:59,280 EPOCH 2 done: loss 0.1081 - lr: 0.000044
2023-10-17 12:28:03,929 DEV : loss 0.10729347169399261 - f1-score (micro avg) 0.7712
2023-10-17 12:28:03,944 saving best model
2023-10-17 12:28:04,550 ----------------------------------------------------------------------------------------------------
2023-10-17 12:28:11,804 epoch 3 - iter 89/893 - loss 0.07385466 - time (sec): 7.25 - samples/sec: 3586.68 - lr: 0.000044 - momentum: 0.000000
2023-10-17 12:28:18,766 epoch 3 - iter 178/893 - loss 
0.06943125 - time (sec): 14.21 - samples/sec: 3555.12 - lr: 0.000043 - momentum: 0.000000
2023-10-17 12:28:26,515 epoch 3 - iter 267/893 - loss 0.07106101 - time (sec): 21.96 - samples/sec: 3470.07 - lr: 0.000043 - momentum: 0.000000
2023-10-17 12:28:33,657 epoch 3 - iter 356/893 - loss 0.07024538 - time (sec): 29.11 - samples/sec: 3536.22 - lr: 0.000042 - momentum: 0.000000
2023-10-17 12:28:40,826 epoch 3 - iter 445/893 - loss 0.07099337 - time (sec): 36.27 - samples/sec: 3534.14 - lr: 0.000042 - momentum: 0.000000
2023-10-17 12:28:47,363 epoch 3 - iter 534/893 - loss 0.07156220 - time (sec): 42.81 - samples/sec: 3553.29 - lr: 0.000041 - momentum: 0.000000
2023-10-17 12:28:54,140 epoch 3 - iter 623/893 - loss 0.07248512 - time (sec): 49.59 - samples/sec: 3563.39 - lr: 0.000041 - momentum: 0.000000
2023-10-17 12:29:00,842 epoch 3 - iter 712/893 - loss 0.07186031 - time (sec): 56.29 - samples/sec: 3564.65 - lr: 0.000040 - momentum: 0.000000
2023-10-17 12:29:07,960 epoch 3 - iter 801/893 - loss 0.07047222 - time (sec): 63.41 - samples/sec: 3548.23 - lr: 0.000039 - momentum: 0.000000
2023-10-17 12:29:14,213 epoch 3 - iter 890/893 - loss 0.07129680 - time (sec): 69.66 - samples/sec: 3560.12 - lr: 0.000039 - momentum: 0.000000
2023-10-17 12:29:14,438 ----------------------------------------------------------------------------------------------------
2023-10-17 12:29:14,438 EPOCH 3 done: loss 0.0712 - lr: 0.000039
2023-10-17 12:29:18,643 DEV : loss 0.10302536189556122 - f1-score (micro avg) 0.8027
2023-10-17 12:29:18,660 saving best model
2023-10-17 12:29:19,131 ----------------------------------------------------------------------------------------------------
2023-10-17 12:29:26,034 epoch 4 - iter 89/893 - loss 0.04394687 - time (sec): 6.90 - samples/sec: 3629.74 - lr: 0.000038 - momentum: 0.000000
2023-10-17 12:29:32,980 epoch 4 - iter 178/893 - loss 0.04673629 - time (sec): 13.85 - samples/sec: 3600.47 - lr: 0.000038 - momentum: 0.000000
2023-10-17 12:29:40,460 epoch 
4 - iter 267/893 - loss 0.04981548 - time (sec): 21.33 - samples/sec: 3539.04 - lr: 0.000037 - momentum: 0.000000
2023-10-17 12:29:47,170 epoch 4 - iter 356/893 - loss 0.05110971 - time (sec): 28.04 - samples/sec: 3559.54 - lr: 0.000037 - momentum: 0.000000
2023-10-17 12:29:54,331 epoch 4 - iter 445/893 - loss 0.05026700 - time (sec): 35.20 - samples/sec: 3546.87 - lr: 0.000036 - momentum: 0.000000
2023-10-17 12:30:01,406 epoch 4 - iter 534/893 - loss 0.05089379 - time (sec): 42.27 - samples/sec: 3562.51 - lr: 0.000036 - momentum: 0.000000
2023-10-17 12:30:08,585 epoch 4 - iter 623/893 - loss 0.04964456 - time (sec): 49.45 - samples/sec: 3552.29 - lr: 0.000035 - momentum: 0.000000
2023-10-17 12:30:15,076 epoch 4 - iter 712/893 - loss 0.04857108 - time (sec): 55.94 - samples/sec: 3550.83 - lr: 0.000034 - momentum: 0.000000
2023-10-17 12:30:21,863 epoch 4 - iter 801/893 - loss 0.04927161 - time (sec): 62.73 - samples/sec: 3550.83 - lr: 0.000034 - momentum: 0.000000
2023-10-17 12:30:28,909 epoch 4 - iter 890/893 - loss 0.04890995 - time (sec): 69.77 - samples/sec: 3556.09 - lr: 0.000033 - momentum: 0.000000
2023-10-17 12:30:29,111 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:29,112 EPOCH 4 done: loss 0.0489 - lr: 0.000033
2023-10-17 12:30:33,260 DEV : loss 0.1494728922843933 - f1-score (micro avg) 0.7642
2023-10-17 12:30:33,277 ----------------------------------------------------------------------------------------------------
2023-10-17 12:30:39,784 epoch 5 - iter 89/893 - loss 0.03866759 - time (sec): 6.51 - samples/sec: 3762.35 - lr: 0.000033 - momentum: 0.000000
2023-10-17 12:30:46,083 epoch 5 - iter 178/893 - loss 0.03849278 - time (sec): 12.81 - samples/sec: 3724.90 - lr: 0.000032 - momentum: 0.000000
2023-10-17 12:30:53,079 epoch 5 - iter 267/893 - loss 0.04490770 - time (sec): 19.80 - samples/sec: 3670.89 - lr: 0.000032 - momentum: 0.000000
2023-10-17 12:31:00,158 epoch 5 - iter 356/893 - 
loss 0.04415524 - time (sec): 26.88 - samples/sec: 3637.83 - lr: 0.000031 - momentum: 0.000000
2023-10-17 12:31:07,534 epoch 5 - iter 445/893 - loss 0.04553621 - time (sec): 34.26 - samples/sec: 3633.96 - lr: 0.000031 - momentum: 0.000000
2023-10-17 12:31:14,613 epoch 5 - iter 534/893 - loss 0.04328694 - time (sec): 41.34 - samples/sec: 3609.67 - lr: 0.000030 - momentum: 0.000000
2023-10-17 12:31:22,041 epoch 5 - iter 623/893 - loss 0.04161932 - time (sec): 48.76 - samples/sec: 3587.47 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:31:28,696 epoch 5 - iter 712/893 - loss 0.04050119 - time (sec): 55.42 - samples/sec: 3603.61 - lr: 0.000029 - momentum: 0.000000
2023-10-17 12:31:35,963 epoch 5 - iter 801/893 - loss 0.04014660 - time (sec): 62.69 - samples/sec: 3587.44 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:31:42,520 epoch 5 - iter 890/893 - loss 0.03909834 - time (sec): 69.24 - samples/sec: 3583.50 - lr: 0.000028 - momentum: 0.000000
2023-10-17 12:31:42,686 ----------------------------------------------------------------------------------------------------
2023-10-17 12:31:42,686 EPOCH 5 done: loss 0.0392 - lr: 0.000028
2023-10-17 12:31:47,303 DEV : loss 0.1533377468585968 - f1-score (micro avg) 0.81
2023-10-17 12:31:47,319 saving best model
2023-10-17 12:31:47,795 ----------------------------------------------------------------------------------------------------
2023-10-17 12:31:54,920 epoch 6 - iter 89/893 - loss 0.02756819 - time (sec): 7.12 - samples/sec: 3532.05 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:32:01,948 epoch 6 - iter 178/893 - loss 0.02664455 - time (sec): 14.15 - samples/sec: 3607.81 - lr: 0.000027 - momentum: 0.000000
2023-10-17 12:32:08,603 epoch 6 - iter 267/893 - loss 0.02870236 - time (sec): 20.80 - samples/sec: 3631.55 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:32:15,508 epoch 6 - iter 356/893 - loss 0.02957670 - time (sec): 27.71 - samples/sec: 3612.67 - lr: 0.000026 - momentum: 0.000000
2023-10-17 12:32:22,842 
epoch 6 - iter 445/893 - loss 0.02928950 - time (sec): 35.04 - samples/sec: 3582.35 - lr: 0.000025 - momentum: 0.000000
2023-10-17 12:32:30,106 epoch 6 - iter 534/893 - loss 0.03036133 - time (sec): 42.31 - samples/sec: 3591.43 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:32:36,998 epoch 6 - iter 623/893 - loss 0.03000592 - time (sec): 49.20 - samples/sec: 3584.54 - lr: 0.000024 - momentum: 0.000000
2023-10-17 12:32:43,597 epoch 6 - iter 712/893 - loss 0.02940656 - time (sec): 55.80 - samples/sec: 3586.57 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:32:50,304 epoch 6 - iter 801/893 - loss 0.02915799 - time (sec): 62.50 - samples/sec: 3577.00 - lr: 0.000023 - momentum: 0.000000
2023-10-17 12:32:57,362 epoch 6 - iter 890/893 - loss 0.02938837 - time (sec): 69.56 - samples/sec: 3565.24 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:32:57,557 ----------------------------------------------------------------------------------------------------
2023-10-17 12:32:57,558 EPOCH 6 done: loss 0.0294 - lr: 0.000022
2023-10-17 12:33:02,212 DEV : loss 0.20796504616737366 - f1-score (micro avg) 0.8062
2023-10-17 12:33:02,229 ----------------------------------------------------------------------------------------------------
2023-10-17 12:33:08,958 epoch 7 - iter 89/893 - loss 0.01719836 - time (sec): 6.73 - samples/sec: 3465.65 - lr: 0.000022 - momentum: 0.000000
2023-10-17 12:33:16,469 epoch 7 - iter 178/893 - loss 0.01793337 - time (sec): 14.24 - samples/sec: 3498.55 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:33:23,220 epoch 7 - iter 267/893 - loss 0.01837316 - time (sec): 20.99 - samples/sec: 3516.43 - lr: 0.000021 - momentum: 0.000000
2023-10-17 12:33:30,566 epoch 7 - iter 356/893 - loss 0.01931535 - time (sec): 28.34 - samples/sec: 3552.60 - lr: 0.000020 - momentum: 0.000000
2023-10-17 12:33:37,346 epoch 7 - iter 445/893 - loss 0.01987539 - time (sec): 35.12 - samples/sec: 3592.41 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:33:43,813 epoch 7 - iter 
534/893 - loss 0.02129779 - time (sec): 41.58 - samples/sec: 3572.74 - lr: 0.000019 - momentum: 0.000000
2023-10-17 12:33:50,667 epoch 7 - iter 623/893 - loss 0.02255359 - time (sec): 48.44 - samples/sec: 3551.20 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:33:57,376 epoch 7 - iter 712/893 - loss 0.02186109 - time (sec): 55.15 - samples/sec: 3557.59 - lr: 0.000018 - momentum: 0.000000
2023-10-17 12:34:04,807 epoch 7 - iter 801/893 - loss 0.02137142 - time (sec): 62.58 - samples/sec: 3550.20 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:34:11,865 epoch 7 - iter 890/893 - loss 0.02088132 - time (sec): 69.63 - samples/sec: 3563.55 - lr: 0.000017 - momentum: 0.000000
2023-10-17 12:34:12,047 ----------------------------------------------------------------------------------------------------
2023-10-17 12:34:12,047 EPOCH 7 done: loss 0.0208 - lr: 0.000017
2023-10-17 12:34:16,135 DEV : loss 0.20536133646965027 - f1-score (micro avg) 0.8148
2023-10-17 12:34:16,151 saving best model
2023-10-17 12:34:16,630 ----------------------------------------------------------------------------------------------------
2023-10-17 12:34:23,808 epoch 8 - iter 89/893 - loss 0.01542112 - time (sec): 7.18 - samples/sec: 3650.48 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:34:30,937 epoch 8 - iter 178/893 - loss 0.01339288 - time (sec): 14.30 - samples/sec: 3587.50 - lr: 0.000016 - momentum: 0.000000
2023-10-17 12:34:37,882 epoch 8 - iter 267/893 - loss 0.01315703 - time (sec): 21.25 - samples/sec: 3583.44 - lr: 0.000015 - momentum: 0.000000
2023-10-17 12:34:44,510 epoch 8 - iter 356/893 - loss 0.01462132 - time (sec): 27.88 - samples/sec: 3579.37 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:34:51,613 epoch 8 - iter 445/893 - loss 0.01468807 - time (sec): 34.98 - samples/sec: 3567.31 - lr: 0.000014 - momentum: 0.000000
2023-10-17 12:34:58,772 epoch 8 - iter 534/893 - loss 0.01541694 - time (sec): 42.14 - samples/sec: 3563.31 - lr: 0.000013 - momentum: 0.000000
2023-10-17 
12:35:05,922 epoch 8 - iter 623/893 - loss 0.01464883 - time (sec): 49.29 - samples/sec: 3564.81 - lr: 0.000013 - momentum: 0.000000
2023-10-17 12:35:13,086 epoch 8 - iter 712/893 - loss 0.01436159 - time (sec): 56.45 - samples/sec: 3586.05 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:35:19,554 epoch 8 - iter 801/893 - loss 0.01472488 - time (sec): 62.92 - samples/sec: 3591.16 - lr: 0.000012 - momentum: 0.000000
2023-10-17 12:35:25,816 epoch 8 - iter 890/893 - loss 0.01463998 - time (sec): 69.18 - samples/sec: 3585.66 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:35:26,039 ----------------------------------------------------------------------------------------------------
2023-10-17 12:35:26,040 EPOCH 8 done: loss 0.0146 - lr: 0.000011
2023-10-17 12:35:30,693 DEV : loss 0.2075384557247162 - f1-score (micro avg) 0.8237
2023-10-17 12:35:30,710 saving best model
2023-10-17 12:35:31,198 ----------------------------------------------------------------------------------------------------
2023-10-17 12:35:38,366 epoch 9 - iter 89/893 - loss 0.01253730 - time (sec): 7.17 - samples/sec: 3569.12 - lr: 0.000011 - momentum: 0.000000
2023-10-17 12:35:45,712 epoch 9 - iter 178/893 - loss 0.01262600 - time (sec): 14.51 - samples/sec: 3506.66 - lr: 0.000010 - momentum: 0.000000
2023-10-17 12:35:52,657 epoch 9 - iter 267/893 - loss 0.01162607 - time (sec): 21.46 - samples/sec: 3567.08 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:35:59,395 epoch 9 - iter 356/893 - loss 0.01094062 - time (sec): 28.20 - samples/sec: 3564.61 - lr: 0.000009 - momentum: 0.000000
2023-10-17 12:36:06,132 epoch 9 - iter 445/893 - loss 0.01073676 - time (sec): 34.93 - samples/sec: 3590.56 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:36:12,723 epoch 9 - iter 534/893 - loss 0.01115983 - time (sec): 41.52 - samples/sec: 3613.46 - lr: 0.000008 - momentum: 0.000000
2023-10-17 12:36:19,548 epoch 9 - iter 623/893 - loss 0.01126760 - time (sec): 48.35 - samples/sec: 3607.99 - lr: 0.000007 - 
momentum: 0.000000
2023-10-17 12:36:26,185 epoch 9 - iter 712/893 - loss 0.01160709 - time (sec): 54.99 - samples/sec: 3597.68 - lr: 0.000007 - momentum: 0.000000
2023-10-17 12:36:33,180 epoch 9 - iter 801/893 - loss 0.01166356 - time (sec): 61.98 - samples/sec: 3599.54 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:36:40,480 epoch 9 - iter 890/893 - loss 0.01145458 - time (sec): 69.28 - samples/sec: 3581.17 - lr: 0.000006 - momentum: 0.000000
2023-10-17 12:36:40,683 ----------------------------------------------------------------------------------------------------
2023-10-17 12:36:40,683 EPOCH 9 done: loss 0.0115 - lr: 0.000006
2023-10-17 12:36:45,383 DEV : loss 0.22258678078651428 - f1-score (micro avg) 0.8193
2023-10-17 12:36:45,400 ----------------------------------------------------------------------------------------------------
2023-10-17 12:36:52,507 epoch 10 - iter 89/893 - loss 0.00776903 - time (sec): 7.11 - samples/sec: 3560.36 - lr: 0.000005 - momentum: 0.000000
2023-10-17 12:36:59,560 epoch 10 - iter 178/893 - loss 0.00915230 - time (sec): 14.16 - samples/sec: 3535.40 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:37:06,350 epoch 10 - iter 267/893 - loss 0.00784800 - time (sec): 20.95 - samples/sec: 3548.45 - lr: 0.000004 - momentum: 0.000000
2023-10-17 12:37:13,531 epoch 10 - iter 356/893 - loss 0.00767143 - time (sec): 28.13 - samples/sec: 3559.37 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:37:20,242 epoch 10 - iter 445/893 - loss 0.00765126 - time (sec): 34.84 - samples/sec: 3582.01 - lr: 0.000003 - momentum: 0.000000
2023-10-17 12:37:27,267 epoch 10 - iter 534/893 - loss 0.00790874 - time (sec): 41.87 - samples/sec: 3545.89 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:37:33,931 epoch 10 - iter 623/893 - loss 0.00716107 - time (sec): 48.53 - samples/sec: 3552.59 - lr: 0.000002 - momentum: 0.000000
2023-10-17 12:37:40,932 epoch 10 - iter 712/893 - loss 0.00685343 - time (sec): 55.53 - samples/sec: 3536.99 - lr: 0.000001 - momentum: 
0.000000
2023-10-17 12:37:47,813 epoch 10 - iter 801/893 - loss 0.00683376 - time (sec): 62.41 - samples/sec: 3547.70 - lr: 0.000001 - momentum: 0.000000
2023-10-17 12:37:55,138 epoch 10 - iter 890/893 - loss 0.00691368 - time (sec): 69.74 - samples/sec: 3557.99 - lr: 0.000000 - momentum: 0.000000
2023-10-17 12:37:55,342 ----------------------------------------------------------------------------------------------------
2023-10-17 12:37:55,342 EPOCH 10 done: loss 0.0069 - lr: 0.000000
2023-10-17 12:37:59,528 DEV : loss 0.22205151617527008 - f1-score (micro avg) 0.8235
2023-10-17 12:37:59,904 ----------------------------------------------------------------------------------------------------
2023-10-17 12:37:59,906 Loading model from best epoch ...
2023-10-17 12:38:01,386 SequenceTagger predicts: Dictionary with 17 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG, S-HumanProd, B-HumanProd, E-HumanProd, I-HumanProd
2023-10-17 12:38:10,955 Results:
- F-score (micro) 0.703
- F-score (macro) 0.6203
- Accuracy 0.56

By class:
              precision    recall  f1-score   support

         LOC     0.7466    0.6995    0.7223      1095
         PER     0.7818    0.7648    0.7732      1012
         ORG     0.4431    0.5994    0.5095       357
   HumanProd     0.3922    0.6061    0.4762        33

   micro avg     0.6957    0.7105    0.7030      2497
   macro avg     0.5909    0.6675    0.6203      2497
weighted avg     0.7128    0.7105    0.7093      2497

2023-10-17 12:38:10,955 ----------------------------------------------------------------------------------------------------
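Note on model size: the encoder shapes in the model printout at the top of this log (hidden size 768, intermediate size 3072, 12 layers) already determine the transformer encoder's parameter count. A minimal sanity check in plain Python, counting only the Linear and LayerNorm modules shown in the printout (embeddings and the 17-way linear head excluded):

```python
# Parameter count implied by the encoder shapes in the model printout above.
hidden, ffn, layers = 768, 3072, 12

def linear_params(n_in, n_out):
    # weight matrix + bias vector of a Linear(n_in, n_out, bias=True)
    return n_in * n_out + n_out

attention = 4 * linear_params(hidden, hidden)            # query, key, value, output dense
feed_forward = linear_params(hidden, ffn) + linear_params(ffn, hidden)
layer_norms = 2 * (2 * hidden)                           # two LayerNorms, weight + bias each
per_layer = attention + feed_forward + layer_norms

print(per_layer, layers * per_layer)  # 7087872 85054464
```

The roughly 85M encoder parameters match a standard BERT/ELECTRA-base configuration.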
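Note on the learning-rate column: the LinearScheduler plugin with warmup_fraction '0.1' ramps the learning rate linearly from 0 to 5e-05 over the first 10% of the 893 × 10 = 8930 optimizer steps (i.e. exactly epoch 1), then decays it linearly back to 0, which is why lr peaks at 0.000050 at the end of epoch 1 and reaches 0.000000 at the end of epoch 10. A sketch of that schedule (the function name and plain-Python form are illustrative, not Flair's implementation):

```python
def linear_warmup_lr(step, total_steps=8930, peak_lr=5e-05, warmup_fraction=0.1):
    """Linear warmup to peak_lr, then linear decay to 0 (the shape seen in the log)."""
    warmup_steps = int(total_steps * warmup_fraction)  # 893 steps = epoch 1
    if step < warmup_steps:
        return peak_lr * step / warmup_steps           # ramp up during epoch 1
    # linear decay over the remaining 90% of training
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)
```

For example, step 89 (epoch 1, iter 89) gives ~0.000005 and step 982 (epoch 2, iter 89) gives ~0.000049, matching the logged values.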
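Note on the averages in the final report: macro-F1 is the unweighted mean of the four per-class F1 scores, while weighted-F1 weights each class by its support, which is why the ORG and HumanProd classes drag the macro average well below the micro and weighted ones. A quick cross-check in plain Python, with the values copied from the By-class table:

```python
# Per-class f1-score and support, copied from the final evaluation table.
classes = {
    "LOC":       (0.7223, 1095),
    "PER":       (0.7732, 1012),
    "ORG":       (0.5095, 357),
    "HumanProd": (0.4762, 33),
}

macro_f1 = sum(f1 for f1, _ in classes.values()) / len(classes)
total_support = sum(n for _, n in classes.values())
weighted_f1 = sum(f1 * n for f1, n in classes.values()) / total_support

print(round(macro_f1, 4), round(weighted_f1, 4))  # 0.6203 0.7093, as in the log
```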