| config.json: 100% | |
| 433/433 [00:00<00:00, 53.7kB/s] | |
| model.safetensors: 100% | |
| 445M/445M [00:04<00:00, 205MB/s] | |
| Loading weights: 100% | |
| 199/199 [00:00<00:00, 974.29it/s, Materializing param=bert.pooler.dense.weight] | |
| BertForSequenceClassification LOAD REPORT from: dbmdz/bert-base-italian-xxl-cased | |
| Key | Status | | |
| -------------------------------------------+------------+- | |
| cls.seq_relationship.weight | UNEXPECTED | | |
| cls.predictions.bias | UNEXPECTED | | |
| cls.predictions.transform.dense.bias | UNEXPECTED | | |
| cls.seq_relationship.bias | UNEXPECTED | | |
| cls.predictions.transform.LayerNorm.bias | UNEXPECTED | | |
| cls.predictions.transform.LayerNorm.weight | UNEXPECTED | | |
| cls.predictions.transform.dense.weight | UNEXPECTED | | |
| classifier.bias | MISSING | | |
| classifier.weight | MISSING | | |
| Notes: | |
| - UNEXPECTED :can be ignored when loading from different task/architecture; not ok if you expect identical arch. | |
| - MISSING :those params were newly initialized because missing from the checkpoint. Consider training on your downstream task. | |
| ======== Epoch 1 / 3 ======== | |
| Training... | |
| Batch 40 of 378. Elapsed: 0:00:19. | |
| Batch 80 of 378. Elapsed: 0:00:38. | |
| Batch 120 of 378. Elapsed: 0:00:56. | |
| Batch 160 of 378. Elapsed: 0:01:14. | |
| Batch 200 of 378. Elapsed: 0:01:33. | |
| Batch 240 of 378. Elapsed: 0:01:51. | |
| Batch 280 of 378. Elapsed: 0:02:09. | |
| Batch 320 of 378. Elapsed: 0:02:28. | |
| Batch 360 of 378. Elapsed: 0:02:46. | |
| Average training loss: 0.39 | |
| Training took: 0:02:54 | |
| Running Validation... | |
| Average test loss: 0.36 | |
| Validation took: 0:00:15 | |
| | precision | recall | f1-score | support | |
| 0 | 0.80 | 0.93 | 0.86 | 2823 | |
| 1 | 0.90 | 0.71 | 0.79 | 2351 | |
| accuracy | | | 0.83 | 5174 | |
| macro avg | 0.85 | 0.82 | 0.83 | 5174 | |
| weighted avg | 0.84 | 0.83 | 0.83 | 5174 | |
| ======== Epoch 2 / 3 ======== | |
| Training... | |
| Batch 40 of 378. Elapsed: 0:00:18. | |
| Batch 80 of 378. Elapsed: 0:00:36. | |
| Batch 120 of 378. Elapsed: 0:00:55. | |
| Batch 160 of 378. Elapsed: 0:01:13. | |
| Batch 200 of 378. Elapsed: 0:01:31. | |
| Batch 240 of 378. Elapsed: 0:01:50. | |
| Batch 280 of 378. Elapsed: 0:02:08. | |
| Batch 320 of 378. Elapsed: 0:02:26. | |
| Batch 360 of 378. Elapsed: 0:02:45. | |
| Average training loss: 0.20 | |
| Training took: 0:02:53 | |
| Running Validation... | |
| Average test loss: 0.41 | |
| Validation took: 0:00:15 | |
| | precision | recall | f1-score | support | |
| 0 | 0.82 | 0.91 | 0.87 | 2823 | |
| 1 | 0.88 | 0.77 | 0.82 | 2351 | |
| accuracy | | | 0.85 | 5174 | |
| macro avg | 0.85 | 0.84 | 0.84 | 5174 | |
| weighted avg | 0.85 | 0.85 | 0.85 | 5174 | |
| ======== Epoch 3 / 3 ======== | |
| Training... | |
| Batch 40 of 378. Elapsed: 0:00:18. | |
| Batch 80 of 378. Elapsed: 0:00:36. | |
| Batch 120 of 378. Elapsed: 0:00:55. | |
| Batch 160 of 378. Elapsed: 0:01:13. | |
| Batch 200 of 378. Elapsed: 0:01:31. | |
| Batch 240 of 378. Elapsed: 0:01:50. | |
| Batch 280 of 378. Elapsed: 0:02:08. | |
| Batch 320 of 378. Elapsed: 0:02:26. | |
| Batch 360 of 378. Elapsed: 0:02:45. | |
| Average training loss: 0.07 | |
| Training took: 0:02:53 | |
| Running Validation... | |
| Average test loss: 0.60 | |
| Validation took: 0:00:15 | |
| | precision | recall | f1-score | support | |
| 0 | 0.86 | 0.89 | 0.88 | 2823 | |
| 1 | 0.87 | 0.83 | 0.85 | 2351 | |
| accuracy | | | 0.86 | 5174 | |
| macro avg | 0.86 | 0.86 | 0.86 | 5174 | |
| weighted avg | 0.86 | 0.86 | 0.86 | 5174 | |
| Training complete! |
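
The three epochs in the log can be compared side by side. Training loss falls from 0.39 to 0.07, while the loss reported during validation (labelled "Average test loss" in the log) rises from 0.36 to 0.60, even though accuracy improves from 0.83 to 0.86. The following is a minimal sketch of that comparison, assuming the figures are copied by hand from the log. The helper names `best_epoch` and `generalization_gaps` are hypothetical and are not part of the training script that produced this output.

```python
# Per-epoch figures copied by hand from the log above (an assumption of this
# sketch; the training script itself is not shown in the source).
epochs = [
    {"epoch": 1, "train_loss": 0.39, "val_loss": 0.36, "accuracy": 0.83},
    {"epoch": 2, "train_loss": 0.20, "val_loss": 0.41, "accuracy": 0.85},
    {"epoch": 3, "train_loss": 0.07, "val_loss": 0.60, "accuracy": 0.86},
]


def best_epoch(history, key, higher_is_better=False):
    """Return the epoch number with the best value for `key` (hypothetical helper)."""
    pick = max if higher_is_better else min
    return pick(history, key=lambda row: row[key])["epoch"]


def generalization_gaps(history):
    """Validation loss minus training loss for each epoch, rounded to 2 decimals."""
    return [round(row["val_loss"] - row["train_loss"], 2) for row in history]


best_by_loss = best_epoch(epochs, "val_loss")
best_by_acc = best_epoch(epochs, "accuracy", higher_is_better=True)
gaps = generalization_gaps(epochs)

print("lowest validation loss at epoch:", best_by_loss)
print("highest accuracy at epoch:", best_by_acc)
print("val_loss - train_loss per epoch:", gaps)
```

The two criteria disagree for this run: validation loss is lowest after epoch 1, while accuracy is highest after epoch 3. A widening gap between validation and training loss is commonly read as a sign of overfitting, so which checkpoint to keep depends on whether loss or accuracy matters more for the downstream use.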