paride92/IdeoBERT-it-v2

IdeoBERT-it-v2 / trainig.txt


	config.json: 100%
	433/433 [00:00<00:00, 53.7kB/s]
	model.safetensors: 100%
	445M/445M [00:04<00:00, 205MB/s]
	Loading weights: 100%
	199/199 [00:00<00:00, 974.29it/s, Materializing param=bert.pooler.dense.weight]
	BertForSequenceClassification LOAD REPORT from: dbmdz/bert-base-italian-xxl-cased
	Key                                        | Status     |
	-------------------------------------------+------------+-
	cls.seq_relationship.weight                | UNEXPECTED |
	cls.predictions.bias                       | UNEXPECTED |
	cls.predictions.transform.dense.bias       | UNEXPECTED |
	cls.seq_relationship.bias                  | UNEXPECTED |
	cls.predictions.transform.LayerNorm.bias   | UNEXPECTED |
	cls.predictions.transform.LayerNorm.weight | UNEXPECTED |
	cls.predictions.transform.dense.weight     | UNEXPECTED |
	classifier.bias                            | MISSING    |
	classifier.weight                          | MISSING    |

	Notes:
	- UNEXPECTED :can be ignored when loading from different task/architecture; not ok if you expect identical arch.
	- MISSING :those params were newly initialized because missing from the checkpoint. Consider training on your downstream task.
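
The two statuses in the load report can be read as a set difference between the parameter names stored in the checkpoint and the names the classification model expects. The sketch below illustrates that reading only; it is not the library's loading code. The key lists are a hypothetical, truncated subset chosen for illustration, and `load_report` is an assumed helper name.

```python
# Hypothetical, truncated key lists for illustration only.
# A real BERT state dict holds roughly 200 tensors.
checkpoint_keys = {
    "bert.pooler.dense.weight",      # shared encoder weight, present on both sides
    "cls.seq_relationship.weight",   # pretraining head (next-sentence prediction)
    "cls.seq_relationship.bias",
    "cls.predictions.bias",          # pretraining head (masked language modelling)
}
model_keys = {
    "bert.pooler.dense.weight",
    "classifier.weight",             # new classification head
    "classifier.bias",
}


def load_report(checkpoint_keys, model_keys):
    """Label keys the way the report above does (assumed helper, not a library API)."""
    unexpected = sorted(checkpoint_keys - model_keys)  # in the checkpoint, unused by the model
    missing = sorted(model_keys - checkpoint_keys)     # in the model, absent from the checkpoint
    return {"UNEXPECTED": unexpected, "MISSING": missing}


report = load_report(checkpoint_keys, model_keys)
for status, keys in report.items():
    for key in keys:
        print(f"{key:<32} | {status}")
```

Under this reading, the `cls.*` pretraining heads are dropped and the `classifier.*` head starts from a fresh initialization, which is why the run goes on to fine-tune it.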

	======== Epoch 1 / 3 ========
	Training...
	Batch 40 of 378. Elapsed: 0:00:19.
	Batch 80 of 378. Elapsed: 0:00:38.
	Batch 120 of 378. Elapsed: 0:00:56.
	Batch 160 of 378. Elapsed: 0:01:14.
	Batch 200 of 378. Elapsed: 0:01:33.
	Batch 240 of 378. Elapsed: 0:01:51.
	Batch 280 of 378. Elapsed: 0:02:09.
	Batch 320 of 378. Elapsed: 0:02:28.
	Batch 360 of 378. Elapsed: 0:02:46.

	Average training loss: 0.39
	Training took: 0:02:54
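
The `Elapsed` and `Training took` values use an `h:mm:ss` layout. The script that produced this log is not shown, so the following is a minimal sketch of one common way to produce such strings, plus a throughput estimate derived from the epoch 1 figures above. The names `format_time` and `parse_time` are assumptions.

```python
import datetime


def format_time(elapsed_seconds):
    """Render a duration in seconds as h:mm:ss, rounded to the nearest second."""
    return str(datetime.timedelta(seconds=int(round(elapsed_seconds))))


def parse_time(text):
    """Convert an h:mm:ss string from the log back into seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Figures copied from the epoch 1 block of the log.
batches = 378
epoch_seconds = parse_time("0:02:54")
seconds_per_batch = epoch_seconds / batches

print(format_time(19.4))
print(f"{seconds_per_batch:.2f} s/batch")
```

With 378 batches in 174 seconds, the run averages about 0.46 seconds per batch, which is consistent with the roughly 18 to 19 seconds logged per 40 batches.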

	Running Validation...

	Average test loss: 0.36
	Validation took: 0:00:15
	              precision    recall  f1-score   support

	           0       0.80      0.93      0.86      2823
	           1       0.90      0.71      0.79      2351

	    accuracy                           0.83      5174
	   macro avg       0.85      0.82      0.83      5174
	weighted avg       0.84      0.83      0.83      5174
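
The summary rows of the report follow from the per-class rows. As a sanity check, the sketch below recomputes F1, the macro and weighted averages, and accuracy from the epoch 1 values. The inputs are the rounded two-decimal figures from the log, so results agree with the report only to about two decimals.

```python
# Per-class values copied from the epoch 1 report (already rounded to 2 decimals).
rows = {
    "0": {"precision": 0.80, "recall": 0.93, "support": 2823},
    "1": {"precision": 0.90, "recall": 0.71, "support": 2351},
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total = sum(r["support"] for r in rows.values())
f1_scores = {label: f1(r["precision"], r["recall"]) for label, r in rows.items()}

# Macro average: plain mean over classes.
macro_f1 = sum(f1_scores.values()) / len(f1_scores)
# Weighted average: mean over classes weighted by support.
weighted_f1 = sum(f1_scores[k] * rows[k]["support"] for k in rows) / total
# Accuracy equals the support-weighted recall in single-label classification.
accuracy = sum(r["recall"] * r["support"] for r in rows.values()) / total

print(f"f1: {f1_scores}")
print(f"macro f1: {macro_f1:.2f}, weighted f1: {weighted_f1:.2f}, accuracy: {accuracy:.2f}")
```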


	======== Epoch 2 / 3 ========
	Training...
	Batch 40 of 378. Elapsed: 0:00:18.
	Batch 80 of 378. Elapsed: 0:00:36.
	Batch 120 of 378. Elapsed: 0:00:55.
	Batch 160 of 378. Elapsed: 0:01:13.
	Batch 200 of 378. Elapsed: 0:01:31.
	Batch 240 of 378. Elapsed: 0:01:50.
	Batch 280 of 378. Elapsed: 0:02:08.
	Batch 320 of 378. Elapsed: 0:02:26.
	Batch 360 of 378. Elapsed: 0:02:45.

	Average training loss: 0.20
	Training took: 0:02:53

	Running Validation...

	Average test loss: 0.41
	Validation took: 0:00:15
	              precision    recall  f1-score   support

	           0       0.82      0.91      0.87      2823
	           1       0.88      0.77      0.82      2351

	    accuracy                           0.85      5174
	   macro avg       0.85      0.84      0.84      5174
	weighted avg       0.85      0.85      0.85      5174


	======== Epoch 3 / 3 ========
	Training...
	Batch 40 of 378. Elapsed: 0:00:18.
	Batch 80 of 378. Elapsed: 0:00:36.
	Batch 120 of 378. Elapsed: 0:00:55.
	Batch 160 of 378. Elapsed: 0:01:13.
	Batch 200 of 378. Elapsed: 0:01:31.
	Batch 240 of 378. Elapsed: 0:01:50.
	Batch 280 of 378. Elapsed: 0:02:08.
	Batch 320 of 378. Elapsed: 0:02:26.
	Batch 360 of 378. Elapsed: 0:02:45.

	Average training loss: 0.07
	Training took: 0:02:53

	Running Validation...

	Average test loss: 0.60
	Validation took: 0:00:15
	              precision    recall  f1-score   support

	           0       0.86      0.89      0.88      2823
	           1       0.87      0.83      0.85      2351

	    accuracy                           0.86      5174
	   macro avg       0.86      0.86      0.86      5174
	weighted avg       0.86      0.86      0.86      5174


	Training complete!
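
Across the three epochs the training loss falls (0.39, 0.20, 0.07) while the loss on the held-out set, which the log labels "Average test loss", rises (0.36, 0.41, 0.60), even though macro F1 keeps improving. The sketch below tabulates the logged values and shows how the preferred checkpoint depends on the selection criterion. It is only an illustration of that comparison; the log does not say which epoch's weights were uploaded.

```python
# Values copied from the three epoch blocks of the log.
history = [
    {"epoch": 1, "train_loss": 0.39, "val_loss": 0.36, "macro_f1": 0.83},
    {"epoch": 2, "train_loss": 0.20, "val_loss": 0.41, "macro_f1": 0.84},
    {"epoch": 3, "train_loss": 0.07, "val_loss": 0.60, "macro_f1": 0.86},
]

# Two possible selection criteria, which disagree on this run.
best_by_loss = min(history, key=lambda r: r["val_loss"])
best_by_f1 = max(history, key=lambda r: r["macro_f1"])

# A widening gap between held-out and training loss is a common overfitting signal.
gaps = [round(r["val_loss"] - r["train_loss"], 2) for r in history]

print(f"lowest held-out loss: epoch {best_by_loss['epoch']}")
print(f"highest macro F1:     epoch {best_by_f1['epoch']}")
print(f"loss gap per epoch:   {gaps}")
```

A rising loss alongside rising F1 usually means the model is growing more confident on the examples it gets wrong, so the choice between the two criteria depends on whether calibrated probabilities or hard labels matter more downstream.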