2023-10-17 13:14:59,271 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:59,272 Model: "SequenceTagger(
(embeddings): TransformerWordEmbeddings(
(model): ElectraModel(
(embeddings): ElectraEmbeddings(
(word_embeddings): Embedding(32001, 768)
(position_embeddings): Embedding(512, 768)
(token_type_embeddings): Embedding(2, 768)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(encoder): ElectraEncoder(
(layer): ModuleList(
(0-11): 12 x ElectraLayer(
(attention): ElectraAttention(
(self): ElectraSelfAttention(
(query): Linear(in_features=768, out_features=768, bias=True)
(key): Linear(in_features=768, out_features=768, bias=True)
(value): Linear(in_features=768, out_features=768, bias=True)
(dropout): Dropout(p=0.1, inplace=False)
)
(output): ElectraSelfOutput(
(dense): Linear(in_features=768, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
(intermediate): ElectraIntermediate(
(dense): Linear(in_features=768, out_features=3072, bias=True)
(intermediate_act_fn): GELUActivation()
)
(output): ElectraOutput(
(dense): Linear(in_features=3072, out_features=768, bias=True)
(LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
(dropout): Dropout(p=0.1, inplace=False)
)
)
)
)
)
)
(locked_dropout): LockedDropout(p=0.5)
(linear): Linear(in_features=768, out_features=13, bias=True)
(loss_function): CrossEntropyLoss()
)"
2023-10-17 13:14:59,272 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:59,273 MultiCorpus: 7936 train + 992 dev + 992 test sentences
- NER_ICDAR_EUROPEANA Corpus: 7936 train + 992 dev + 992 test sentences - /root/.flair/datasets/ner_icdar_europeana/fr
2023-10-17 13:14:59,273 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:59,273 Train: 7936 sentences
2023-10-17 13:14:59,273 (train_with_dev=False, train_with_test=False)
2023-10-17 13:14:59,273 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:59,273 Training Params:
2023-10-17 13:14:59,273 - learning_rate: "5e-05"
2023-10-17 13:14:59,273 - mini_batch_size: "8"
2023-10-17 13:14:59,273 - max_epochs: "10"
2023-10-17 13:14:59,273 - shuffle: "True"
2023-10-17 13:14:59,273 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:59,273 Plugins:
2023-10-17 13:14:59,273 - TensorboardLogger
2023-10-17 13:14:59,273 - LinearScheduler | warmup_fraction: '0.1'
2023-10-17 13:14:59,273 ----------------------------------------------------------------------------------------------------
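The lr values printed in the epoch logs below follow the LinearScheduler plugin with `warmup_fraction: 0.1`: the rate ramps linearly to 5e-05 over the first 10% of updates (992 of the 9920 total), then decays linearly to zero. A hypothetical pure-Python sketch of that schedule (function name and defaults are illustrative, not Flair's internals):

```python
# Sketch of a linear warmup + linear decay schedule, assuming
# warmup_fraction=0.1 over 992 iters/epoch x 10 epochs = 9920 updates.
def linear_lr(step, base_lr=5e-5, steps_per_epoch=992, max_epochs=10,
              warmup_fraction=0.1):
    total = steps_per_epoch * max_epochs       # 9920 total updates
    warmup = int(total * warmup_fraction)      # 992 warmup updates
    if step < warmup:
        return base_lr * step / warmup         # linear ramp-up
    # linear decay from base_lr down to 0 over the remaining updates
    return base_lr * (total - step) / (total - warmup)

print(linear_lr(99))    # ~5e-06, cf. "lr: 0.000005" at epoch 1 iter 99/992
print(linear_lr(992))   # 5e-05, the peak reached at the end of epoch 1
```

This reproduces the logged ramp 0.000005 → 0.000050 across epoch 1 and the slow decay toward 0.000000 by epoch 10.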
2023-10-17 13:14:59,273 Final evaluation on model from best epoch (best-model.pt)
2023-10-17 13:14:59,273 - metric: "('micro avg', 'f1-score')"
2023-10-17 13:14:59,273 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:59,273 Computation:
2023-10-17 13:14:59,273 - compute on device: cuda:0
2023-10-17 13:14:59,273 - embedding storage: none
2023-10-17 13:14:59,273 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:59,273 Model training base path: "hmbench-icdar/fr-hmteams/teams-base-historic-multilingual-discriminator-bs8-wsFalse-e10-lr5e-05-poolingfirst-layers-1-crfFalse-3"
2023-10-17 13:14:59,273 ----------------------------------------------------------------------------------------------------
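The parameters above map onto Flair's fine-tuning API roughly as sketched below. This is a hedged reconstruction from the log, not the script that produced it: the embedding identifier is inferred from the base path, the output path is shortened, and the exact Flair version may expose slightly different keyword names.

```python
# Hypothetical reconstruction of this run's setup from the logged
# hyperparameters; identifiers are assumptions taken from the log.
HPARAMS = {
    "learning_rate": 5e-5,
    "mini_batch_size": 8,
    "max_epochs": 10,
    "shuffle": True,
}

def train(hparams=HPARAMS):
    # flair imports kept inside the function so HPARAMS stays importable
    from flair.datasets import NER_ICDAR_EUROPEANA
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    corpus = NER_ICDAR_EUROPEANA(language="fr")
    embeddings = TransformerWordEmbeddings(
        "hmteams/teams-base-historic-multilingual-discriminator",
        layers="-1", subtoken_pooling="first", fine_tune=True)
    # no CRF and no RNN, matching the plain Linear head in the model dump
    tagger = SequenceTagger(
        hidden_size=256, embeddings=embeddings,
        tag_dictionary=corpus.make_label_dictionary(label_type="ner"),
        tag_type="ner", use_crf=False, use_rnn=False)
    ModelTrainer(tagger, corpus).fine_tune("hmbench-icdar/fr-hmteams-bs8-e10",
                                           **hparams)
```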
2023-10-17 13:14:59,273 ----------------------------------------------------------------------------------------------------
2023-10-17 13:14:59,273 Logging anything other than scalars to TensorBoard is currently not supported.
2023-10-17 13:15:05,295 epoch 1 - iter 99/992 - loss 2.37029663 - time (sec): 6.02 - samples/sec: 2836.14 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:15:11,137 epoch 1 - iter 198/992 - loss 1.42416275 - time (sec): 11.86 - samples/sec: 2784.92 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:15:17,507 epoch 1 - iter 297/992 - loss 1.03740415 - time (sec): 18.23 - samples/sec: 2744.60 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:15:23,330 epoch 1 - iter 396/992 - loss 0.83714066 - time (sec): 24.06 - samples/sec: 2738.69 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:15:29,614 epoch 1 - iter 495/992 - loss 0.69593175 - time (sec): 30.34 - samples/sec: 2736.27 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:15:35,822 epoch 1 - iter 594/992 - loss 0.59927064 - time (sec): 36.55 - samples/sec: 2749.66 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:15:42,169 epoch 1 - iter 693/992 - loss 0.53186441 - time (sec): 42.89 - samples/sec: 2742.71 - lr: 0.000035 - momentum: 0.000000
2023-10-17 13:15:48,050 epoch 1 - iter 792/992 - loss 0.48827616 - time (sec): 48.78 - samples/sec: 2725.13 - lr: 0.000040 - momentum: 0.000000
2023-10-17 13:15:53,920 epoch 1 - iter 891/992 - loss 0.45262959 - time (sec): 54.65 - samples/sec: 2708.63 - lr: 0.000045 - momentum: 0.000000
2023-10-17 13:15:59,754 epoch 1 - iter 990/992 - loss 0.42092898 - time (sec): 60.48 - samples/sec: 2706.81 - lr: 0.000050 - momentum: 0.000000
2023-10-17 13:15:59,862 ----------------------------------------------------------------------------------------------------
2023-10-17 13:15:59,862 EPOCH 1 done: loss 0.4209 - lr: 0.000050
2023-10-17 13:16:03,260 DEV : loss 0.08556017279624939 - f1-score (micro avg) 0.7181
2023-10-17 13:16:03,284 saving best model
2023-10-17 13:16:03,706 ----------------------------------------------------------------------------------------------------
2023-10-17 13:16:09,578 epoch 2 - iter 99/992 - loss 0.11844195 - time (sec): 5.87 - samples/sec: 2569.51 - lr: 0.000049 - momentum: 0.000000
2023-10-17 13:16:15,696 epoch 2 - iter 198/992 - loss 0.11939327 - time (sec): 11.99 - samples/sec: 2634.08 - lr: 0.000049 - momentum: 0.000000
2023-10-17 13:16:21,468 epoch 2 - iter 297/992 - loss 0.12037513 - time (sec): 17.76 - samples/sec: 2643.91 - lr: 0.000048 - momentum: 0.000000
2023-10-17 13:16:27,737 epoch 2 - iter 396/992 - loss 0.11645305 - time (sec): 24.03 - samples/sec: 2642.00 - lr: 0.000048 - momentum: 0.000000
2023-10-17 13:16:33,971 epoch 2 - iter 495/992 - loss 0.11602227 - time (sec): 30.26 - samples/sec: 2652.74 - lr: 0.000047 - momentum: 0.000000
2023-10-17 13:16:39,758 epoch 2 - iter 594/992 - loss 0.11405300 - time (sec): 36.05 - samples/sec: 2682.51 - lr: 0.000047 - momentum: 0.000000
2023-10-17 13:16:45,730 epoch 2 - iter 693/992 - loss 0.11194828 - time (sec): 42.02 - samples/sec: 2680.69 - lr: 0.000046 - momentum: 0.000000
2023-10-17 13:16:52,651 epoch 2 - iter 792/992 - loss 0.11137892 - time (sec): 48.94 - samples/sec: 2643.62 - lr: 0.000046 - momentum: 0.000000
2023-10-17 13:16:58,820 epoch 2 - iter 891/992 - loss 0.11016016 - time (sec): 55.11 - samples/sec: 2663.66 - lr: 0.000045 - momentum: 0.000000
2023-10-17 13:17:04,760 epoch 2 - iter 990/992 - loss 0.10803629 - time (sec): 61.05 - samples/sec: 2681.10 - lr: 0.000044 - momentum: 0.000000
2023-10-17 13:17:04,874 ----------------------------------------------------------------------------------------------------
2023-10-17 13:17:04,874 EPOCH 2 done: loss 0.1079 - lr: 0.000044
2023-10-17 13:17:08,500 DEV : loss 0.09252572059631348 - f1-score (micro avg) 0.7468
2023-10-17 13:17:08,522 saving best model
2023-10-17 13:17:09,023 ----------------------------------------------------------------------------------------------------
2023-10-17 13:17:14,747 epoch 3 - iter 99/992 - loss 0.07613487 - time (sec): 5.72 - samples/sec: 2785.56 - lr: 0.000044 - momentum: 0.000000
2023-10-17 13:17:20,906 epoch 3 - iter 198/992 - loss 0.07588141 - time (sec): 11.88 - samples/sec: 2751.47 - lr: 0.000043 - momentum: 0.000000
2023-10-17 13:17:26,809 epoch 3 - iter 297/992 - loss 0.07548816 - time (sec): 17.78 - samples/sec: 2755.77 - lr: 0.000043 - momentum: 0.000000
2023-10-17 13:17:32,924 epoch 3 - iter 396/992 - loss 0.07568375 - time (sec): 23.90 - samples/sec: 2731.38 - lr: 0.000042 - momentum: 0.000000
2023-10-17 13:17:38,727 epoch 3 - iter 495/992 - loss 0.07493804 - time (sec): 29.70 - samples/sec: 2720.87 - lr: 0.000042 - momentum: 0.000000
2023-10-17 13:17:44,768 epoch 3 - iter 594/992 - loss 0.07584050 - time (sec): 35.74 - samples/sec: 2711.17 - lr: 0.000041 - momentum: 0.000000
2023-10-17 13:17:50,762 epoch 3 - iter 693/992 - loss 0.07567410 - time (sec): 41.73 - samples/sec: 2722.75 - lr: 0.000041 - momentum: 0.000000
2023-10-17 13:17:57,136 epoch 3 - iter 792/992 - loss 0.07595643 - time (sec): 48.11 - samples/sec: 2735.76 - lr: 0.000040 - momentum: 0.000000
2023-10-17 13:18:02,952 epoch 3 - iter 891/992 - loss 0.07610041 - time (sec): 53.92 - samples/sec: 2740.07 - lr: 0.000039 - momentum: 0.000000
2023-10-17 13:18:08,940 epoch 3 - iter 990/992 - loss 0.07608824 - time (sec): 59.91 - samples/sec: 2732.03 - lr: 0.000039 - momentum: 0.000000
2023-10-17 13:18:09,067 ----------------------------------------------------------------------------------------------------
2023-10-17 13:18:09,067 EPOCH 3 done: loss 0.0760 - lr: 0.000039
2023-10-17 13:18:12,705 DEV : loss 0.09796484559774399 - f1-score (micro avg) 0.743
2023-10-17 13:18:12,728 ----------------------------------------------------------------------------------------------------
2023-10-17 13:18:18,838 epoch 4 - iter 99/992 - loss 0.05370749 - time (sec): 6.11 - samples/sec: 2803.21 - lr: 0.000038 - momentum: 0.000000
2023-10-17 13:18:24,824 epoch 4 - iter 198/992 - loss 0.05126212 - time (sec): 12.10 - samples/sec: 2722.25 - lr: 0.000038 - momentum: 0.000000
2023-10-17 13:18:30,600 epoch 4 - iter 297/992 - loss 0.04823145 - time (sec): 17.87 - samples/sec: 2740.59 - lr: 0.000037 - momentum: 0.000000
2023-10-17 13:18:36,555 epoch 4 - iter 396/992 - loss 0.05004986 - time (sec): 23.83 - samples/sec: 2729.66 - lr: 0.000037 - momentum: 0.000000
2023-10-17 13:18:42,757 epoch 4 - iter 495/992 - loss 0.05199699 - time (sec): 30.03 - samples/sec: 2726.05 - lr: 0.000036 - momentum: 0.000000
2023-10-17 13:18:48,709 epoch 4 - iter 594/992 - loss 0.05345247 - time (sec): 35.98 - samples/sec: 2711.17 - lr: 0.000036 - momentum: 0.000000
2023-10-17 13:18:55,097 epoch 4 - iter 693/992 - loss 0.05435726 - time (sec): 42.37 - samples/sec: 2701.73 - lr: 0.000035 - momentum: 0.000000
2023-10-17 13:19:01,303 epoch 4 - iter 792/992 - loss 0.05525748 - time (sec): 48.57 - samples/sec: 2696.82 - lr: 0.000034 - momentum: 0.000000
2023-10-17 13:19:07,402 epoch 4 - iter 891/992 - loss 0.05470790 - time (sec): 54.67 - samples/sec: 2696.65 - lr: 0.000034 - momentum: 0.000000
2023-10-17 13:19:13,257 epoch 4 - iter 990/992 - loss 0.05624270 - time (sec): 60.53 - samples/sec: 2703.77 - lr: 0.000033 - momentum: 0.000000
2023-10-17 13:19:13,374 ----------------------------------------------------------------------------------------------------
2023-10-17 13:19:13,375 EPOCH 4 done: loss 0.0563 - lr: 0.000033
2023-10-17 13:19:17,015 DEV : loss 0.13502389192581177 - f1-score (micro avg) 0.7351
2023-10-17 13:19:17,039 ----------------------------------------------------------------------------------------------------
2023-10-17 13:19:23,271 epoch 5 - iter 99/992 - loss 0.04548482 - time (sec): 6.23 - samples/sec: 2647.00 - lr: 0.000033 - momentum: 0.000000
2023-10-17 13:19:29,232 epoch 5 - iter 198/992 - loss 0.03764309 - time (sec): 12.19 - samples/sec: 2709.56 - lr: 0.000032 - momentum: 0.000000
2023-10-17 13:19:34,819 epoch 5 - iter 297/992 - loss 0.04076199 - time (sec): 17.78 - samples/sec: 2739.34 - lr: 0.000032 - momentum: 0.000000
2023-10-17 13:19:40,687 epoch 5 - iter 396/992 - loss 0.04113845 - time (sec): 23.65 - samples/sec: 2743.63 - lr: 0.000031 - momentum: 0.000000
2023-10-17 13:19:46,682 epoch 5 - iter 495/992 - loss 0.04216041 - time (sec): 29.64 - samples/sec: 2746.99 - lr: 0.000031 - momentum: 0.000000
2023-10-17 13:19:52,359 epoch 5 - iter 594/992 - loss 0.04162863 - time (sec): 35.32 - samples/sec: 2738.04 - lr: 0.000030 - momentum: 0.000000
2023-10-17 13:19:58,468 epoch 5 - iter 693/992 - loss 0.04196048 - time (sec): 41.43 - samples/sec: 2747.11 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:20:04,661 epoch 5 - iter 792/992 - loss 0.04200273 - time (sec): 47.62 - samples/sec: 2733.11 - lr: 0.000029 - momentum: 0.000000
2023-10-17 13:20:10,825 epoch 5 - iter 891/992 - loss 0.04111233 - time (sec): 53.78 - samples/sec: 2728.76 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:20:17,147 epoch 5 - iter 990/992 - loss 0.04120519 - time (sec): 60.11 - samples/sec: 2722.68 - lr: 0.000028 - momentum: 0.000000
2023-10-17 13:20:17,252 ----------------------------------------------------------------------------------------------------
2023-10-17 13:20:17,252 EPOCH 5 done: loss 0.0413 - lr: 0.000028
2023-10-17 13:20:21,297 DEV : loss 0.16693313419818878 - f1-score (micro avg) 0.7468
2023-10-17 13:20:21,318 saving best model
2023-10-17 13:20:21,823 ----------------------------------------------------------------------------------------------------
2023-10-17 13:20:27,842 epoch 6 - iter 99/992 - loss 0.03143511 - time (sec): 6.01 - samples/sec: 2723.17 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:20:33,977 epoch 6 - iter 198/992 - loss 0.03222403 - time (sec): 12.14 - samples/sec: 2689.22 - lr: 0.000027 - momentum: 0.000000
2023-10-17 13:20:39,814 epoch 6 - iter 297/992 - loss 0.03239315 - time (sec): 17.98 - samples/sec: 2738.44 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:20:46,020 epoch 6 - iter 396/992 - loss 0.03107293 - time (sec): 24.18 - samples/sec: 2718.08 - lr: 0.000026 - momentum: 0.000000
2023-10-17 13:20:51,966 epoch 6 - iter 495/992 - loss 0.03192172 - time (sec): 30.13 - samples/sec: 2704.51 - lr: 0.000025 - momentum: 0.000000
2023-10-17 13:20:57,687 epoch 6 - iter 594/992 - loss 0.03209592 - time (sec): 35.85 - samples/sec: 2691.46 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:21:03,674 epoch 6 - iter 693/992 - loss 0.03197057 - time (sec): 41.84 - samples/sec: 2698.71 - lr: 0.000024 - momentum: 0.000000
2023-10-17 13:21:09,768 epoch 6 - iter 792/992 - loss 0.03184049 - time (sec): 47.93 - samples/sec: 2719.96 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:21:15,647 epoch 6 - iter 891/992 - loss 0.03193054 - time (sec): 53.81 - samples/sec: 2718.30 - lr: 0.000023 - momentum: 0.000000
2023-10-17 13:21:21,836 epoch 6 - iter 990/992 - loss 0.03148429 - time (sec): 60.00 - samples/sec: 2727.29 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:21:21,978 ----------------------------------------------------------------------------------------------------
2023-10-17 13:21:21,979 EPOCH 6 done: loss 0.0314 - lr: 0.000022
2023-10-17 13:21:25,589 DEV : loss 0.20110860466957092 - f1-score (micro avg) 0.7578
2023-10-17 13:21:25,610 saving best model
2023-10-17 13:21:26,155 ----------------------------------------------------------------------------------------------------
2023-10-17 13:21:31,969 epoch 7 - iter 99/992 - loss 0.02399857 - time (sec): 5.81 - samples/sec: 2688.29 - lr: 0.000022 - momentum: 0.000000
2023-10-17 13:21:38,459 epoch 7 - iter 198/992 - loss 0.02332968 - time (sec): 12.30 - samples/sec: 2696.87 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:21:44,257 epoch 7 - iter 297/992 - loss 0.02310479 - time (sec): 18.10 - samples/sec: 2722.47 - lr: 0.000021 - momentum: 0.000000
2023-10-17 13:21:50,142 epoch 7 - iter 396/992 - loss 0.02381511 - time (sec): 23.98 - samples/sec: 2716.04 - lr: 0.000020 - momentum: 0.000000
2023-10-17 13:21:56,060 epoch 7 - iter 495/992 - loss 0.02318505 - time (sec): 29.90 - samples/sec: 2723.81 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:22:02,276 epoch 7 - iter 594/992 - loss 0.02242971 - time (sec): 36.12 - samples/sec: 2720.51 - lr: 0.000019 - momentum: 0.000000
2023-10-17 13:22:08,501 epoch 7 - iter 693/992 - loss 0.02235423 - time (sec): 42.34 - samples/sec: 2721.65 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:22:14,887 epoch 7 - iter 792/992 - loss 0.02243338 - time (sec): 48.73 - samples/sec: 2696.54 - lr: 0.000018 - momentum: 0.000000
2023-10-17 13:22:20,929 epoch 7 - iter 891/992 - loss 0.02280661 - time (sec): 54.77 - samples/sec: 2692.86 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:22:26,941 epoch 7 - iter 990/992 - loss 0.02265879 - time (sec): 60.78 - samples/sec: 2693.46 - lr: 0.000017 - momentum: 0.000000
2023-10-17 13:22:27,063 ----------------------------------------------------------------------------------------------------
2023-10-17 13:22:27,063 EPOCH 7 done: loss 0.0226 - lr: 0.000017
2023-10-17 13:22:30,642 DEV : loss 0.2132951319217682 - f1-score (micro avg) 0.7572
2023-10-17 13:22:30,665 ----------------------------------------------------------------------------------------------------
2023-10-17 13:22:36,466 epoch 8 - iter 99/992 - loss 0.01675590 - time (sec): 5.80 - samples/sec: 2772.29 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:22:42,460 epoch 8 - iter 198/992 - loss 0.01668702 - time (sec): 11.79 - samples/sec: 2722.69 - lr: 0.000016 - momentum: 0.000000
2023-10-17 13:22:48,546 epoch 8 - iter 297/992 - loss 0.01664626 - time (sec): 17.88 - samples/sec: 2703.17 - lr: 0.000015 - momentum: 0.000000
2023-10-17 13:22:54,477 epoch 8 - iter 396/992 - loss 0.01673593 - time (sec): 23.81 - samples/sec: 2725.71 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:23:00,709 epoch 8 - iter 495/992 - loss 0.01742427 - time (sec): 30.04 - samples/sec: 2724.28 - lr: 0.000014 - momentum: 0.000000
2023-10-17 13:23:06,647 epoch 8 - iter 594/992 - loss 0.01770378 - time (sec): 35.98 - samples/sec: 2703.27 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:23:13,013 epoch 8 - iter 693/992 - loss 0.01737786 - time (sec): 42.35 - samples/sec: 2700.66 - lr: 0.000013 - momentum: 0.000000
2023-10-17 13:23:19,165 epoch 8 - iter 792/992 - loss 0.01656451 - time (sec): 48.50 - samples/sec: 2710.43 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:23:25,236 epoch 8 - iter 891/992 - loss 0.01664994 - time (sec): 54.57 - samples/sec: 2708.08 - lr: 0.000012 - momentum: 0.000000
2023-10-17 13:23:31,064 epoch 8 - iter 990/992 - loss 0.01632570 - time (sec): 60.40 - samples/sec: 2709.31 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:23:31,205 ----------------------------------------------------------------------------------------------------
2023-10-17 13:23:31,205 EPOCH 8 done: loss 0.0163 - lr: 0.000011
2023-10-17 13:23:34,873 DEV : loss 0.23159745335578918 - f1-score (micro avg) 0.759
2023-10-17 13:23:34,896 saving best model
2023-10-17 13:23:35,440 ----------------------------------------------------------------------------------------------------
2023-10-17 13:23:41,557 epoch 9 - iter 99/992 - loss 0.00768430 - time (sec): 6.11 - samples/sec: 2721.88 - lr: 0.000011 - momentum: 0.000000
2023-10-17 13:23:47,443 epoch 9 - iter 198/992 - loss 0.00767863 - time (sec): 12.00 - samples/sec: 2792.73 - lr: 0.000010 - momentum: 0.000000
2023-10-17 13:23:53,085 epoch 9 - iter 297/992 - loss 0.00921750 - time (sec): 17.64 - samples/sec: 2774.70 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:23:58,924 epoch 9 - iter 396/992 - loss 0.00997090 - time (sec): 23.48 - samples/sec: 2787.79 - lr: 0.000009 - momentum: 0.000000
2023-10-17 13:24:04,964 epoch 9 - iter 495/992 - loss 0.01094716 - time (sec): 29.52 - samples/sec: 2776.73 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:24:11,173 epoch 9 - iter 594/992 - loss 0.01122230 - time (sec): 35.73 - samples/sec: 2773.13 - lr: 0.000008 - momentum: 0.000000
2023-10-17 13:24:16,929 epoch 9 - iter 693/992 - loss 0.01035585 - time (sec): 41.49 - samples/sec: 2762.10 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:24:23,057 epoch 9 - iter 792/992 - loss 0.01060860 - time (sec): 47.61 - samples/sec: 2749.82 - lr: 0.000007 - momentum: 0.000000
2023-10-17 13:24:29,027 epoch 9 - iter 891/992 - loss 0.01051512 - time (sec): 53.58 - samples/sec: 2744.37 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:24:35,147 epoch 9 - iter 990/992 - loss 0.01045569 - time (sec): 59.70 - samples/sec: 2738.58 - lr: 0.000006 - momentum: 0.000000
2023-10-17 13:24:35,288 ----------------------------------------------------------------------------------------------------
2023-10-17 13:24:35,288 EPOCH 9 done: loss 0.0106 - lr: 0.000006
2023-10-17 13:24:39,270 DEV : loss 0.25076672434806824 - f1-score (micro avg) 0.7642
2023-10-17 13:24:39,295 saving best model
2023-10-17 13:24:39,818 ----------------------------------------------------------------------------------------------------
2023-10-17 13:24:45,647 epoch 10 - iter 99/992 - loss 0.00464891 - time (sec): 5.83 - samples/sec: 2759.88 - lr: 0.000005 - momentum: 0.000000
2023-10-17 13:24:51,594 epoch 10 - iter 198/992 - loss 0.00654277 - time (sec): 11.77 - samples/sec: 2755.86 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:24:57,739 epoch 10 - iter 297/992 - loss 0.00598182 - time (sec): 17.92 - samples/sec: 2784.81 - lr: 0.000004 - momentum: 0.000000
2023-10-17 13:25:03,780 epoch 10 - iter 396/992 - loss 0.00599650 - time (sec): 23.96 - samples/sec: 2760.42 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:25:09,987 epoch 10 - iter 495/992 - loss 0.00613028 - time (sec): 30.17 - samples/sec: 2724.26 - lr: 0.000003 - momentum: 0.000000
2023-10-17 13:25:16,059 epoch 10 - iter 594/992 - loss 0.00732373 - time (sec): 36.24 - samples/sec: 2717.20 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:25:21,937 epoch 10 - iter 693/992 - loss 0.00697744 - time (sec): 42.12 - samples/sec: 2720.67 - lr: 0.000002 - momentum: 0.000000
2023-10-17 13:25:27,971 epoch 10 - iter 792/992 - loss 0.00723983 - time (sec): 48.15 - samples/sec: 2736.85 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:25:33,810 epoch 10 - iter 891/992 - loss 0.00699368 - time (sec): 53.99 - samples/sec: 2739.57 - lr: 0.000001 - momentum: 0.000000
2023-10-17 13:25:39,728 epoch 10 - iter 990/992 - loss 0.00715494 - time (sec): 59.91 - samples/sec: 2733.12 - lr: 0.000000 - momentum: 0.000000
2023-10-17 13:25:39,836 ----------------------------------------------------------------------------------------------------
2023-10-17 13:25:39,836 EPOCH 10 done: loss 0.0072 - lr: 0.000000
2023-10-17 13:25:43,265 DEV : loss 0.25413334369659424 - f1-score (micro avg) 0.7654
2023-10-17 13:25:43,286 saving best model
2023-10-17 13:25:44,230 ----------------------------------------------------------------------------------------------------
2023-10-17 13:25:44,231 Loading model from best epoch ...
2023-10-17 13:25:45,694 SequenceTagger predicts: Dictionary with 13 tags: O, S-PER, B-PER, E-PER, I-PER, S-LOC, B-LOC, E-LOC, I-LOC, S-ORG, B-ORG, E-ORG, I-ORG
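The 13 tags above, which also explain `Linear(out_features=13)` in the model summary, follow a BIOES scheme over the three entity types plus the outside tag. A minimal sketch of that arithmetic (variable names are illustrative):

```python
# BIOES tagging over 3 entity types: 4 positional prefixes per type
# (Single, Begin, End, Inside) plus one shared "O" (outside) tag.
PREFIXES = ["S", "B", "E", "I"]
TYPES = ["PER", "LOC", "ORG"]

tag_dictionary = ["O"] + [f"{p}-{t}" for t in TYPES for p in PREFIXES]
print(len(tag_dictionary))  # 13 = 4 * 3 + 1
```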
2023-10-17 13:25:49,114
Results:
- F-score (micro) 0.7775
- F-score (macro) 0.6973
- Accuracy 0.6586
By class:
              precision    recall  f1-score   support

         LOC     0.8221    0.8534    0.8375       655
         PER     0.6783    0.7848    0.7277       223
         ORG     0.6082    0.4646    0.5268       127

   micro avg     0.7662    0.7891    0.7775      1005
   macro avg     0.7029    0.7009    0.6973      1005
weighted avg     0.7631    0.7891    0.7738      1005
2023-10-17 13:25:49,114 ----------------------------------------------------------------------------------------------------