IdeoBERT-it-v2 / trainig.txt
config.json: 100% 433/433 [00:00<00:00, 53.7kB/s]
model.safetensors: 100% 445M/445M [00:04<00:00, 205MB/s]
Loading weights: 100% 199/199 [00:00<00:00, 974.29it/s, Materializing param=bert.pooler.dense.weight]
BertForSequenceClassification LOAD REPORT from: dbmdz/bert-base-italian-xxl-cased
Key                                        | Status     |
-------------------------------------------+------------+-
cls.seq_relationship.weight                | UNEXPECTED |
cls.predictions.bias                       | UNEXPECTED |
cls.predictions.transform.dense.bias       | UNEXPECTED |
cls.seq_relationship.bias                  | UNEXPECTED |
cls.predictions.transform.LayerNorm.bias   | UNEXPECTED |
cls.predictions.transform.LayerNorm.weight | UNEXPECTED |
cls.predictions.transform.dense.weight     | UNEXPECTED |
classifier.bias                            | MISSING    |
classifier.weight                          | MISSING    |
Notes:
- UNEXPECTED: can be ignored when loading from different task/architecture; not ok if you expect identical arch.
- MISSING: those params were newly initialized because missing from the checkpoint. Consider training on your downstream task.
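The two statuses in the report above can be reproduced with plain set arithmetic on parameter names. The sketch below is only an illustration: `classify_keys` is a hypothetical helper, not part of the transformers library, and the key lists are abbreviated from the report.

```python
def classify_keys(checkpoint_keys, model_keys):
    """Compare parameter names in a checkpoint with those a model expects.

    Hypothetical helper for illustration; not a transformers API.
    """
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    return {
        # Present in the checkpoint but unused by the model (pretraining heads).
        "UNEXPECTED": sorted(checkpoint_keys - model_keys),
        # Required by the model but absent from the checkpoint (newly initialized).
        "MISSING": sorted(model_keys - checkpoint_keys),
    }


# Abbreviated key lists, taken from the load report above.
checkpoint = [
    "bert.pooler.dense.weight",
    "cls.predictions.bias",
    "cls.seq_relationship.weight",
    "cls.seq_relationship.bias",
]
model = [
    "bert.pooler.dense.weight",
    "classifier.weight",
    "classifier.bias",
]

report = classify_keys(checkpoint, model)
for status, keys in report.items():
    for key in keys:
        print(f"{key:<30} | {status}")
```

The `cls.*` keys belong to the pretraining heads of the base checkpoint, while `classifier.*` is the new classification head, which is why the report suggests training on the downstream task.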
======== Epoch 1 / 3 ========
Training...
Batch 40 of 378. Elapsed: 0:00:19.
Batch 80 of 378. Elapsed: 0:00:38.
Batch 120 of 378. Elapsed: 0:00:56.
Batch 160 of 378. Elapsed: 0:01:14.
Batch 200 of 378. Elapsed: 0:01:33.
Batch 240 of 378. Elapsed: 0:01:51.
Batch 280 of 378. Elapsed: 0:02:09.
Batch 320 of 378. Elapsed: 0:02:28.
Batch 360 of 378. Elapsed: 0:02:46.
Average training loss: 0.39
Training took: 0:02:54
Running Validation...
Average test loss: 0.36
Validation took: 0:00:15
              precision    recall  f1-score   support
           0       0.80      0.93      0.86      2823
           1       0.90      0.71      0.79      2351
    accuracy                           0.83      5174
   macro avg       0.85      0.82      0.83      5174
weighted avg       0.84      0.83      0.83      5174
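The f1-score and average rows of the epoch 1 report can be checked by hand. This is a rough sketch that starts from the rounded precision and recall printed above, so the results are approximate; the variable names are illustrative and not taken from the training script.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, support) per class, copied from the epoch 1 report.
rows = {0: (0.80, 0.93, 2823), 1: (0.90, 0.71, 2351)}

scores = {label: f1(p, r) for label, (p, r, _) in rows.items()}
total = sum(n for _, _, n in rows.values())

# Macro average: unweighted mean over classes.
macro_f1 = sum(scores.values()) / len(scores)
# Weighted average: each class weighted by its support.
weighted_f1 = sum(scores[label] * n for label, (_, _, n) in rows.items()) / total

for label, score in scores.items():
    print(f"class {label}: f1 = {score:.2f}")
print(f"macro f1 = {macro_f1:.2f}, weighted f1 = {weighted_f1:.2f}, support = {total}")
```

To two decimals, these values agree with the report: 0.86 and 0.79 per class, 0.83 for both averages, and a total support of 5174.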
======== Epoch 2 / 3 ========
Training...
Batch 40 of 378. Elapsed: 0:00:18.
Batch 80 of 378. Elapsed: 0:00:36.
Batch 120 of 378. Elapsed: 0:00:55.
Batch 160 of 378. Elapsed: 0:01:13.
Batch 200 of 378. Elapsed: 0:01:31.
Batch 240 of 378. Elapsed: 0:01:50.
Batch 280 of 378. Elapsed: 0:02:08.
Batch 320 of 378. Elapsed: 0:02:26.
Batch 360 of 378. Elapsed: 0:02:45.
Average training loss: 0.20
Training took: 0:02:53
Running Validation...
Average test loss: 0.41
Validation took: 0:00:15
              precision    recall  f1-score   support
           0       0.82      0.91      0.87      2823
           1       0.88      0.77      0.82      2351
    accuracy                           0.85      5174
   macro avg       0.85      0.84      0.84      5174
weighted avg       0.85      0.85      0.85      5174
======== Epoch 3 / 3 ========
Training...
Batch 40 of 378. Elapsed: 0:00:18.
Batch 80 of 378. Elapsed: 0:00:36.
Batch 120 of 378. Elapsed: 0:00:55.
Batch 160 of 378. Elapsed: 0:01:13.
Batch 200 of 378. Elapsed: 0:01:31.
Batch 240 of 378. Elapsed: 0:01:50.
Batch 280 of 378. Elapsed: 0:02:08.
Batch 320 of 378. Elapsed: 0:02:26.
Batch 360 of 378. Elapsed: 0:02:45.
Average training loss: 0.07
Training took: 0:02:53
Running Validation...
Average test loss: 0.60
Validation took: 0:00:15
              precision    recall  f1-score   support
           0       0.86      0.89      0.88      2823
           1       0.87      0.83      0.85      2351
    accuracy                           0.86      5174
   macro avg       0.86      0.86      0.86      5174
weighted avg       0.86      0.86      0.86      5174
Training complete!
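Taken together, the three epoch summaries show training loss falling (0.39, 0.20, 0.07) while the loss from the validation pass rises (0.36, 0.41, 0.60), even as macro F1 improves (0.83, 0.84, 0.86). The sketch below lines the figures up and shows which epoch each of two common selection criteria would pick. Both criteria are assumptions made for illustration; the log does not say which checkpoint was kept.

```python
# Values copied from the three epoch summaries in this log.
# "val_loss" is the figure the log prints as "Average test loss".
epochs = [
    {"epoch": 1, "train_loss": 0.39, "val_loss": 0.36, "macro_f1": 0.83},
    {"epoch": 2, "train_loss": 0.20, "val_loss": 0.41, "macro_f1": 0.84},
    {"epoch": 3, "train_loss": 0.07, "val_loss": 0.60, "macro_f1": 0.86},
]

# Two possible checkpoint-selection criteria (assumed, not from the log).
best_by_loss = min(epochs, key=lambda e: e["val_loss"])
best_by_f1 = max(epochs, key=lambda e: e["macro_f1"])

for e in epochs:
    gap = e["val_loss"] - e["train_loss"]
    print(
        f"epoch {e['epoch']}: train={e['train_loss']:.2f} "
        f"val={e['val_loss']:.2f} gap={gap:+.2f} macro_f1={e['macro_f1']:.2f}"
    )
print(f"lowest validation loss: epoch {best_by_loss['epoch']}")
print(f"highest macro F1: epoch {best_by_f1['epoch']}")
```

The two criteria disagree here: validation loss is lowest at epoch 1, while macro F1 peaks at epoch 3. A widening gap between training and validation loss is often read as a sign of overfitting, so which epoch to keep depends on which criterion matters for the task.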