2023-10-18 14:45:41,385 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:41,386 Model: "SequenceTagger(
  (embeddings): TransformerWordEmbeddings(
    (model): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(32001, 128)
        (position_embeddings): Embedding(512, 128)
        (token_type_embeddings): Embedding(2, 128)
        (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-1): 2 x BertLayer(
            (attention): BertAttention(
              (self): BertSelfAttention(
                (query): Linear(in_features=128, out_features=128, bias=True)
                (key): Linear(in_features=128, out_features=128, bias=True)
                (value): Linear(in_features=128, out_features=128, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=128, out_features=128, bias=True)
                (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
            )
            (intermediate): BertIntermediate(
              (dense): Linear(in_features=128, out_features=512, bias=True)
              (intermediate_act_fn): GELUActivation()
            )
            (output): BertOutput(
              (dense): Linear(in_features=512, out_features=128, bias=True)
              (LayerNorm): LayerNorm((128,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
        )
      )
      (pooler): BertPooler(
        (dense): Linear(in_features=128, out_features=128, bias=True)
        (activation): Tanh()
      )
    )
  )
  (locked_dropout): LockedDropout(p=0.5)
  (linear): Linear(in_features=128, out_features=25, bias=True)
  (loss_function): CrossEntropyLoss()
)"
2023-10-18 14:45:41,386 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:41,386 MultiCorpus: 1100 train + 206 dev + 240 test sentences
 - NER_HIPE_2022 Corpus: 1100 train + 206 dev + 240 test sentences - /root/.flair/datasets/ner_hipe_2022/v2.1/ajmc/de/with_doc_seperator
2023-10-18
14:45:41,386 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:41,386 Train: 1100 sentences
2023-10-18 14:45:41,386 (train_with_dev=False, train_with_test=False)
2023-10-18 14:45:41,386 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:41,386 Training Params:
2023-10-18 14:45:41,386 - learning_rate: "3e-05"
2023-10-18 14:45:41,386 - mini_batch_size: "4"
2023-10-18 14:45:41,386 - max_epochs: "10"
2023-10-18 14:45:41,386 - shuffle: "True"
2023-10-18 14:45:41,386 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:41,386 Plugins:
2023-10-18 14:45:41,386 - TensorboardLogger
2023-10-18 14:45:41,386 - LinearScheduler | warmup_fraction: '0.1'
2023-10-18 14:45:41,386 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:41,386 Final evaluation on model from best epoch (best-model.pt)
2023-10-18 14:45:41,386 - metric: "('micro avg', 'f1-score')"
2023-10-18 14:45:41,386 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:41,386 Computation:
2023-10-18 14:45:41,386 - compute on device: cuda:0
2023-10-18 14:45:41,386 - embedding storage: none
2023-10-18 14:45:41,387 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:41,387 Model training base path: "hmbench-ajmc/de-dbmdz/bert-tiny-historic-multilingual-cased-bs4-wsFalse-e10-lr3e-05-poolingfirst-layers-1-crfFalse-5"
2023-10-18 14:45:41,387 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:41,387 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:41,387 Logging anything other than scalars to TensorBoard
is currently not supported.
2023-10-18 14:45:41,761 epoch 1 - iter 27/275 - loss 3.61403715 - time (sec): 0.37 - samples/sec: 6364.24 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:45:42,153 epoch 1 - iter 54/275 - loss 3.57884665 - time (sec): 0.77 - samples/sec: 5917.59 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:45:42,599 epoch 1 - iter 81/275 - loss 3.53550525 - time (sec): 1.21 - samples/sec: 5660.43 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:45:43,020 epoch 1 - iter 108/275 - loss 3.40180194 - time (sec): 1.63 - samples/sec: 5562.29 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:45:43,430 epoch 1 - iter 135/275 - loss 3.27013803 - time (sec): 2.04 - samples/sec: 5513.12 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:45:43,838 epoch 1 - iter 162/275 - loss 3.08856242 - time (sec): 2.45 - samples/sec: 5553.48 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:45:44,254 epoch 1 - iter 189/275 - loss 2.89899616 - time (sec): 2.87 - samples/sec: 5602.75 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:45:44,632 epoch 1 - iter 216/275 - loss 2.70701162 - time (sec): 3.24 - samples/sec: 5680.17 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:45:44,996 epoch 1 - iter 243/275 - loss 2.55846395 - time (sec): 3.61 - samples/sec: 5682.03 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:45:45,361 epoch 1 - iter 270/275 - loss 2.43448977 - time (sec): 3.97 - samples/sec: 5632.59 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:45:45,428 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:45,429 EPOCH 1 done: loss 2.4149 - lr: 0.000029
2023-10-18 14:45:45,680 DEV : loss 0.8993664383888245 - f1-score (micro avg) 0.0
2023-10-18 14:45:45,684 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:46,061 epoch 2 - iter 27/275 - loss 0.95057934 - time (sec): 0.38 - samples/sec: 6622.68 - lr: 0.000030 - momentum: 0.000000
2023-10-18 14:45:46,428 epoch 2 - iter 54/275 - loss 0.90831395 - time (sec): 0.74 - samples/sec: 6227.28 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:45:46,802 epoch 2 - iter 81/275 - loss 0.91840237 - time (sec): 1.12 - samples/sec: 6095.08 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:45:47,190 epoch 2 - iter 108/275 - loss 0.91071127 - time (sec): 1.51 - samples/sec: 6076.49 - lr: 0.000029 - momentum: 0.000000
2023-10-18 14:45:47,593 epoch 2 - iter 135/275 - loss 0.89192523 - time (sec): 1.91 - samples/sec: 6061.87 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:45:47,995 epoch 2 - iter 162/275 - loss 0.87794770 - time (sec): 2.31 - samples/sec: 5995.31 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:45:48,395 epoch 2 - iter 189/275 - loss 0.87310900 - time (sec): 2.71 - samples/sec: 5862.76 - lr: 0.000028 - momentum: 0.000000
2023-10-18 14:45:48,803 epoch 2 - iter 216/275 - loss 0.87265140 - time (sec): 3.12 - samples/sec: 5753.45 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:45:49,216 epoch 2 - iter 243/275 - loss 0.85712264 - time (sec): 3.53 - samples/sec: 5743.99 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:45:49,630 epoch 2 - iter 270/275 - loss 0.84688442 - time (sec): 3.95 - samples/sec: 5671.26 - lr: 0.000027 - momentum: 0.000000
2023-10-18 14:45:49,707 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:49,708 EPOCH 2 done: loss 0.8487 - lr: 0.000027
2023-10-18 14:45:50,069 DEV : loss 0.595435619354248 - f1-score (micro avg) 0.0727
2023-10-18 14:45:50,075 saving best model
2023-10-18 14:45:50,107 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:50,515 epoch 3 - iter 27/275 - loss 0.71606056 - time (sec): 0.41 - samples/sec: 5213.30 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:45:50,930 epoch 3 - iter 54/275 - loss 0.68351099 - time (sec): 0.82 - samples/sec: 5154.79 - lr: 0.000026 - momentum:
0.000000
2023-10-18 14:45:51,339 epoch 3 - iter 81/275 - loss 0.69432067 - time (sec): 1.23 - samples/sec: 5483.49 - lr: 0.000026 - momentum: 0.000000
2023-10-18 14:45:51,763 epoch 3 - iter 108/275 - loss 0.66327644 - time (sec): 1.66 - samples/sec: 5521.62 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:45:52,168 epoch 3 - iter 135/275 - loss 0.66298617 - time (sec): 2.06 - samples/sec: 5555.34 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:45:52,568 epoch 3 - iter 162/275 - loss 0.66224183 - time (sec): 2.46 - samples/sec: 5499.09 - lr: 0.000025 - momentum: 0.000000
2023-10-18 14:45:52,973 epoch 3 - iter 189/275 - loss 0.66933211 - time (sec): 2.87 - samples/sec: 5404.28 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:45:53,389 epoch 3 - iter 216/275 - loss 0.66801062 - time (sec): 3.28 - samples/sec: 5448.95 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:45:53,795 epoch 3 - iter 243/275 - loss 0.67011315 - time (sec): 3.69 - samples/sec: 5414.79 - lr: 0.000024 - momentum: 0.000000
2023-10-18 14:45:54,203 epoch 3 - iter 270/275 - loss 0.65537307 - time (sec): 4.10 - samples/sec: 5478.07 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:45:54,276 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:54,276 EPOCH 3 done: loss 0.6574 - lr: 0.000023
2023-10-18 14:45:54,640 DEV : loss 0.5139510631561279 - f1-score (micro avg) 0.2552
2023-10-18 14:45:54,644 saving best model
2023-10-18 14:45:54,677 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:55,075 epoch 4 - iter 27/275 - loss 0.52447434 - time (sec): 0.40 - samples/sec: 5997.93 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:45:55,474 epoch 4 - iter 54/275 - loss 0.50935710 - time (sec): 0.80 - samples/sec: 5853.84 - lr: 0.000023 - momentum: 0.000000
2023-10-18 14:45:55,860 epoch 4 - iter 81/275 - loss 0.52624243 - time (sec): 1.18 - samples/sec: 5639.94 - lr: 0.000022 -
momentum: 0.000000
2023-10-18 14:45:56,256 epoch 4 - iter 108/275 - loss 0.51373632 - time (sec): 1.58 - samples/sec: 5631.58 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:45:56,662 epoch 4 - iter 135/275 - loss 0.52554536 - time (sec): 1.98 - samples/sec: 5712.26 - lr: 0.000022 - momentum: 0.000000
2023-10-18 14:45:57,058 epoch 4 - iter 162/275 - loss 0.52357952 - time (sec): 2.38 - samples/sec: 5624.73 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:45:57,456 epoch 4 - iter 189/275 - loss 0.53016382 - time (sec): 2.78 - samples/sec: 5563.56 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:45:57,847 epoch 4 - iter 216/275 - loss 0.53699475 - time (sec): 3.17 - samples/sec: 5546.23 - lr: 0.000021 - momentum: 0.000000
2023-10-18 14:45:58,254 epoch 4 - iter 243/275 - loss 0.54915683 - time (sec): 3.58 - samples/sec: 5594.44 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:45:58,658 epoch 4 - iter 270/275 - loss 0.54349471 - time (sec): 3.98 - samples/sec: 5615.28 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:45:58,739 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:58,739 EPOCH 4 done: loss 0.5449 - lr: 0.000020
2023-10-18 14:45:59,222 DEV : loss 0.4314418435096741 - f1-score (micro avg) 0.3719
2023-10-18 14:45:59,226 saving best model
2023-10-18 14:45:59,262 ----------------------------------------------------------------------------------------------------
2023-10-18 14:45:59,682 epoch 5 - iter 27/275 - loss 0.43609765 - time (sec): 0.42 - samples/sec: 5169.58 - lr: 0.000020 - momentum: 0.000000
2023-10-18 14:46:00,081 epoch 5 - iter 54/275 - loss 0.49476005 - time (sec): 0.82 - samples/sec: 5294.92 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:46:00,508 epoch 5 - iter 81/275 - loss 0.47844502 - time (sec): 1.24 - samples/sec: 5351.46 - lr: 0.000019 - momentum: 0.000000
2023-10-18 14:46:00,910 epoch 5 - iter 108/275 - loss 0.46736626 - time (sec): 1.65 - samples/sec: 5401.83 - lr:
0.000019 - momentum: 0.000000
2023-10-18 14:46:01,313 epoch 5 - iter 135/275 - loss 0.48918677 - time (sec): 2.05 - samples/sec: 5318.47 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:46:01,730 epoch 5 - iter 162/275 - loss 0.48711969 - time (sec): 2.47 - samples/sec: 5288.98 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:46:02,118 epoch 5 - iter 189/275 - loss 0.48936608 - time (sec): 2.85 - samples/sec: 5458.63 - lr: 0.000018 - momentum: 0.000000
2023-10-18 14:46:02,484 epoch 5 - iter 216/275 - loss 0.47343115 - time (sec): 3.22 - samples/sec: 5445.99 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:46:02,859 epoch 5 - iter 243/275 - loss 0.48904474 - time (sec): 3.60 - samples/sec: 5553.97 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:46:03,229 epoch 5 - iter 270/275 - loss 0.49007239 - time (sec): 3.97 - samples/sec: 5614.83 - lr: 0.000017 - momentum: 0.000000
2023-10-18 14:46:03,300 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:03,300 EPOCH 5 done: loss 0.4899 - lr: 0.000017
2023-10-18 14:46:03,666 DEV : loss 0.3777647614479065 - f1-score (micro avg) 0.4661
2023-10-18 14:46:03,670 saving best model
2023-10-18 14:46:03,703 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:04,070 epoch 6 - iter 27/275 - loss 0.44371819 - time (sec): 0.37 - samples/sec: 6268.96 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:46:04,437 epoch 6 - iter 54/275 - loss 0.43581155 - time (sec): 0.73 - samples/sec: 5861.33 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:46:04,809 epoch 6 - iter 81/275 - loss 0.45182491 - time (sec): 1.11 - samples/sec: 5828.12 - lr: 0.000016 - momentum: 0.000000
2023-10-18 14:46:05,181 epoch 6 - iter 108/275 - loss 0.46675065 - time (sec): 1.48 - samples/sec: 5913.51 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:46:05,549 epoch 6 - iter 135/275 - loss 0.46393032 - time (sec): 1.85 - samples/sec:
5961.68 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:46:05,921 epoch 6 - iter 162/275 - loss 0.45780821 - time (sec): 2.22 - samples/sec: 6003.60 - lr: 0.000015 - momentum: 0.000000
2023-10-18 14:46:06,296 epoch 6 - iter 189/275 - loss 0.45814821 - time (sec): 2.59 - samples/sec: 6038.43 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:46:06,674 epoch 6 - iter 216/275 - loss 0.45599556 - time (sec): 2.97 - samples/sec: 6013.33 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:46:07,045 epoch 6 - iter 243/275 - loss 0.45797244 - time (sec): 3.34 - samples/sec: 6017.24 - lr: 0.000014 - momentum: 0.000000
2023-10-18 14:46:07,425 epoch 6 - iter 270/275 - loss 0.46349411 - time (sec): 3.72 - samples/sec: 6022.72 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:46:07,492 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:07,492 EPOCH 6 done: loss 0.4611 - lr: 0.000013
2023-10-18 14:46:07,861 DEV : loss 0.3510936200618744 - f1-score (micro avg) 0.5469
2023-10-18 14:46:07,865 saving best model
2023-10-18 14:46:07,898 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:08,273 epoch 7 - iter 27/275 - loss 0.45503883 - time (sec): 0.37 - samples/sec: 6178.99 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:46:08,671 epoch 7 - iter 54/275 - loss 0.45594630 - time (sec): 0.77 - samples/sec: 5826.42 - lr: 0.000013 - momentum: 0.000000
2023-10-18 14:46:09,077 epoch 7 - iter 81/275 - loss 0.45305617 - time (sec): 1.18 - samples/sec: 5725.18 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:46:09,481 epoch 7 - iter 108/275 - loss 0.44772519 - time (sec): 1.58 - samples/sec: 5781.69 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:46:09,889 epoch 7 - iter 135/275 - loss 0.43549455 - time (sec): 1.99 - samples/sec: 5696.92 - lr: 0.000012 - momentum: 0.000000
2023-10-18 14:46:10,296 epoch 7 - iter 162/275 - loss 0.41836240 - time (sec): 2.40 -
samples/sec: 5677.77 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:46:10,697 epoch 7 - iter 189/275 - loss 0.41445629 - time (sec): 2.80 - samples/sec: 5597.23 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:46:11,109 epoch 7 - iter 216/275 - loss 0.42313533 - time (sec): 3.21 - samples/sec: 5600.42 - lr: 0.000011 - momentum: 0.000000
2023-10-18 14:46:11,514 epoch 7 - iter 243/275 - loss 0.42253073 - time (sec): 3.62 - samples/sec: 5603.63 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:46:11,910 epoch 7 - iter 270/275 - loss 0.42477315 - time (sec): 4.01 - samples/sec: 5578.56 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:46:11,986 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:11,986 EPOCH 7 done: loss 0.4222 - lr: 0.000010
2023-10-18 14:46:12,351 DEV : loss 0.3423939347267151 - f1-score (micro avg) 0.5689
2023-10-18 14:46:12,355 saving best model
2023-10-18 14:46:12,388 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:12,794 epoch 8 - iter 27/275 - loss 0.43736643 - time (sec): 0.41 - samples/sec: 4953.84 - lr: 0.000010 - momentum: 0.000000
2023-10-18 14:46:13,194 epoch 8 - iter 54/275 - loss 0.41255335 - time (sec): 0.80 - samples/sec: 5180.26 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:46:13,598 epoch 8 - iter 81/275 - loss 0.42081822 - time (sec): 1.21 - samples/sec: 5248.17 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:46:14,016 epoch 8 - iter 108/275 - loss 0.43083601 - time (sec): 1.63 - samples/sec: 5343.69 - lr: 0.000009 - momentum: 0.000000
2023-10-18 14:46:14,443 epoch 8 - iter 135/275 - loss 0.42333866 - time (sec): 2.05 - samples/sec: 5469.25 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:46:14,863 epoch 8 - iter 162/275 - loss 0.41692525 - time (sec): 2.47 - samples/sec: 5513.19 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:46:15,334 epoch 8 - iter 189/275 - loss 0.41477261 - time
(sec): 2.95 - samples/sec: 5392.00 - lr: 0.000008 - momentum: 0.000000
2023-10-18 14:46:15,798 epoch 8 - iter 216/275 - loss 0.40583899 - time (sec): 3.41 - samples/sec: 5316.38 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:46:16,241 epoch 8 - iter 243/275 - loss 0.40464465 - time (sec): 3.85 - samples/sec: 5257.30 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:46:16,697 epoch 8 - iter 270/275 - loss 0.40523073 - time (sec): 4.31 - samples/sec: 5200.80 - lr: 0.000007 - momentum: 0.000000
2023-10-18 14:46:16,781 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:16,781 EPOCH 8 done: loss 0.4047 - lr: 0.000007
2023-10-18 14:46:17,145 DEV : loss 0.3340139091014862 - f1-score (micro avg) 0.5558
2023-10-18 14:46:17,149 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:17,579 epoch 9 - iter 27/275 - loss 0.49204204 - time (sec): 0.43 - samples/sec: 5099.28 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:46:17,976 epoch 9 - iter 54/275 - loss 0.44236043 - time (sec): 0.83 - samples/sec: 5440.93 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:46:18,384 epoch 9 - iter 81/275 - loss 0.44917740 - time (sec): 1.23 - samples/sec: 5500.62 - lr: 0.000006 - momentum: 0.000000
2023-10-18 14:46:18,791 epoch 9 - iter 108/275 - loss 0.43185725 - time (sec): 1.64 - samples/sec: 5455.75 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:46:19,208 epoch 9 - iter 135/275 - loss 0.42651176 - time (sec): 2.06 - samples/sec: 5434.57 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:46:19,613 epoch 9 - iter 162/275 - loss 0.41439848 - time (sec): 2.46 - samples/sec: 5385.55 - lr: 0.000005 - momentum: 0.000000
2023-10-18 14:46:20,026 epoch 9 - iter 189/275 - loss 0.39496290 - time (sec): 2.88 - samples/sec: 5404.40 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:46:20,445 epoch 9 - iter 216/275 - loss 0.39308957 - time (sec): 3.29 - samples/sec:
5385.27 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:46:20,878 epoch 9 - iter 243/275 - loss 0.39362998 - time (sec): 3.73 - samples/sec: 5393.44 - lr: 0.000004 - momentum: 0.000000
2023-10-18 14:46:21,286 epoch 9 - iter 270/275 - loss 0.39707453 - time (sec): 4.14 - samples/sec: 5411.97 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:46:21,363 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:21,363 EPOCH 9 done: loss 0.3986 - lr: 0.000003
2023-10-18 14:46:21,738 DEV : loss 0.3237362802028656 - f1-score (micro avg) 0.5626
2023-10-18 14:46:21,742 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:22,147 epoch 10 - iter 27/275 - loss 0.40589974 - time (sec): 0.40 - samples/sec: 5849.70 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:46:22,550 epoch 10 - iter 54/275 - loss 0.37318012 - time (sec): 0.81 - samples/sec: 5557.90 - lr: 0.000003 - momentum: 0.000000
2023-10-18 14:46:22,963 epoch 10 - iter 81/275 - loss 0.37948564 - time (sec): 1.22 - samples/sec: 5488.52 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:46:23,381 epoch 10 - iter 108/275 - loss 0.38605282 - time (sec): 1.64 - samples/sec: 5614.79 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:46:23,791 epoch 10 - iter 135/275 - loss 0.38252040 - time (sec): 2.05 - samples/sec: 5595.64 - lr: 0.000002 - momentum: 0.000000
2023-10-18 14:46:24,214 epoch 10 - iter 162/275 - loss 0.39219651 - time (sec): 2.47 - samples/sec: 5577.71 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:46:24,626 epoch 10 - iter 189/275 - loss 0.38941559 - time (sec): 2.88 - samples/sec: 5546.46 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:46:25,039 epoch 10 - iter 216/275 - loss 0.39700080 - time (sec): 3.30 - samples/sec: 5507.05 - lr: 0.000001 - momentum: 0.000000
2023-10-18 14:46:25,442 epoch 10 - iter 243/275 - loss 0.39378096 - time (sec): 3.70 - samples/sec: 5457.74 - lr:
0.000000 - momentum: 0.000000
2023-10-18 14:46:25,846 epoch 10 - iter 270/275 - loss 0.39039419 - time (sec): 4.10 - samples/sec: 5445.62 - lr: 0.000000 - momentum: 0.000000
2023-10-18 14:46:25,921 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:25,921 EPOCH 10 done: loss 0.3916 - lr: 0.000000
2023-10-18 14:46:26,297 DEV : loss 0.32053643465042114 - f1-score (micro avg) 0.5644
2023-10-18 14:46:26,331 ----------------------------------------------------------------------------------------------------
2023-10-18 14:46:26,331 Loading model from best epoch ...
2023-10-18 14:46:26,412 SequenceTagger predicts: Dictionary with 25 tags: O, S-scope, B-scope, E-scope, I-scope, S-pers, B-pers, E-pers, I-pers, S-work, B-work, E-work, I-work, S-loc, B-loc, E-loc, I-loc, S-object, B-object, E-object, I-object, S-date, B-date, E-date, I-date
2023-10-18 14:46:26,712
Results:
- F-score (micro) 0.5722
- F-score (macro) 0.3371
- Accuracy 0.4106

By class:
              precision    recall  f1-score   support

       scope     0.5722    0.6080    0.5895       176
        pers     0.8000    0.5625    0.6606       128
        work     0.3854    0.5000    0.4353        74
      object     0.0000    0.0000    0.0000         2
         loc     0.0000    0.0000    0.0000         2

   micro avg     0.5791    0.5654    0.5722       382
   macro avg     0.3515    0.3341    0.3371       382
weighted avg     0.6064    0.5654    0.5773       382

2023-10-18 14:46:26,712 ----------------------------------------------------------------------------------------------------