config.json: 100% 433/433 [00:00<00:00, 53.7kB/s]
model.safetensors: 100% 445M/445M [00:04<00:00, 205MB/s]
Loading weights: 100% 199/199 [00:00<00:00, 974.29it/s, Materializing param=bert.pooler.dense.weight]
BertForSequenceClassification LOAD REPORT from: dbmdz/bert-base-italian-xxl-cased
Key                                        | Status     | 
-------------------------------------------+------------+-
cls.seq_relationship.weight                | UNEXPECTED | 
cls.predictions.bias                       | UNEXPECTED | 
cls.predictions.transform.dense.bias       | UNEXPECTED | 
cls.seq_relationship.bias                  | UNEXPECTED | 
cls.predictions.transform.LayerNorm.bias   | UNEXPECTED | 
cls.predictions.transform.LayerNorm.weight | UNEXPECTED | 
cls.predictions.transform.dense.weight     | UNEXPECTED | 
classifier.bias                            | MISSING    | 
classifier.weight                          | MISSING    | 

Notes:
- UNEXPECTED: can be ignored when loading from different task/architecture; not ok if you expect identical arch.
- MISSING: those params were newly initialized because missing from the checkpoint. Consider training on your downstream task.
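
The UNEXPECTED and MISSING statuses in the report above can be read as a set difference between the parameter names stored in the checkpoint and the names the target model expects. The following is a minimal sketch of that idea, using only key names that appear in this log. `classify_keys` is a hypothetical helper written for illustration, not the routine the library runs, and the single shared key stands in for the full encoder.

```python
def classify_keys(checkpoint_keys, model_keys):
    """Return (unexpected, missing) parameter names. Illustrative helper only."""
    checkpoint_keys = set(checkpoint_keys)
    model_keys = set(model_keys)
    # In the checkpoint but not in the model: reported as UNEXPECTED.
    unexpected = sorted(checkpoint_keys - model_keys)
    # In the model but not in the checkpoint: reported as MISSING (newly initialized).
    missing = sorted(model_keys - checkpoint_keys)
    return unexpected, missing


# One encoder key taken from the progress line; assumed to be shared by both sides.
shared = ["bert.pooler.dense.weight"]

# Pretraining-head keys listed as UNEXPECTED in the report.
pretraining_head = [
    "cls.seq_relationship.weight",
    "cls.predictions.bias",
    "cls.predictions.transform.dense.bias",
    "cls.seq_relationship.bias",
    "cls.predictions.transform.LayerNorm.bias",
    "cls.predictions.transform.LayerNorm.weight",
    "cls.predictions.transform.dense.weight",
]

# Classification-head keys listed as MISSING in the report.
classification_head = ["classifier.bias", "classifier.weight"]

unexpected, missing = classify_keys(
    shared + pretraining_head,
    shared + classification_head,
)
print("UNEXPECTED:", unexpected)
print("MISSING:", missing)
```

Under these assumptions, the seven pretraining-head keys land in the unexpected list and the two classifier keys in the missing list, which matches the statuses printed in the report.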

======== Epoch 1 / 3 ========
Training...
  Batch    40  of    378.    Elapsed: 0:00:19.
  Batch    80  of    378.    Elapsed: 0:00:38.
  Batch   120  of    378.    Elapsed: 0:00:56.
  Batch   160  of    378.    Elapsed: 0:01:14.
  Batch   200  of    378.    Elapsed: 0:01:33.
  Batch   240  of    378.    Elapsed: 0:01:51.
  Batch   280  of    378.    Elapsed: 0:02:09.
  Batch   320  of    378.    Elapsed: 0:02:28.
  Batch   360  of    378.    Elapsed: 0:02:46.

  Average training loss: 0.39
  Training took: 0:02:54

Running Validation...

  Average test loss: 0.36
  Validation took: 0:00:15
              precision    recall  f1-score   support

           0       0.80      0.93      0.86      2823
           1       0.90      0.71      0.79      2351

    accuracy                           0.83      5174
   macro avg       0.85      0.82      0.83      5174
weighted avg       0.84      0.83      0.83      5174
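
The macro and weighted averages in the report above follow from the per-class rows: the macro average is the unweighted mean over classes, and the weighted average scales each class by its support. A small sketch, assuming the rounded two-decimal values printed for epoch 1 as inputs (so a recomputed figure may differ from the report in the last digit); `macro_avg` and `weighted_avg` are hypothetical helper names.

```python
def macro_avg(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)


def weighted_avg(values, supports):
    """Mean over classes, weighted by the number of samples in each class."""
    total = sum(supports)
    return sum(v * s for v, s in zip(values, supports)) / total


# Per-class values for classes 0 and 1, copied from the epoch 1 report.
precision = [0.80, 0.90]
recall = [0.93, 0.71]
support = [2823, 2351]

macro_precision = macro_avg(precision)
macro_recall = macro_avg(recall)
weighted_recall = weighted_avg(recall, support)

print(f"macro precision: {macro_precision:.2f}")
print(f"macro recall:    {macro_recall:.2f}")
print(f"weighted recall: {weighted_recall:.2f}")
```

The support-weighted recall equals overall accuracy, which is why the `accuracy` row and the `weighted avg` recall column agree.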


======== Epoch 2 / 3 ========
Training...
  Batch    40  of    378.    Elapsed: 0:00:18.
  Batch    80  of    378.    Elapsed: 0:00:36.
  Batch   120  of    378.    Elapsed: 0:00:55.
  Batch   160  of    378.    Elapsed: 0:01:13.
  Batch   200  of    378.    Elapsed: 0:01:31.
  Batch   240  of    378.    Elapsed: 0:01:50.
  Batch   280  of    378.    Elapsed: 0:02:08.
  Batch   320  of    378.    Elapsed: 0:02:26.
  Batch   360  of    378.    Elapsed: 0:02:45.

  Average training loss: 0.20
  Training took: 0:02:53

Running Validation...

  Average test loss: 0.41
  Validation took: 0:00:15
              precision    recall  f1-score   support

           0       0.82      0.91      0.87      2823
           1       0.88      0.77      0.82      2351

    accuracy                           0.85      5174
   macro avg       0.85      0.84      0.84      5174
weighted avg       0.85      0.85      0.85      5174


======== Epoch 3 / 3 ========
Training...
  Batch    40  of    378.    Elapsed: 0:00:18.
  Batch    80  of    378.    Elapsed: 0:00:36.
  Batch   120  of    378.    Elapsed: 0:00:55.
  Batch   160  of    378.    Elapsed: 0:01:13.
  Batch   200  of    378.    Elapsed: 0:01:31.
  Batch   240  of    378.    Elapsed: 0:01:50.
  Batch   280  of    378.    Elapsed: 0:02:08.
  Batch   320  of    378.    Elapsed: 0:02:26.
  Batch   360  of    378.    Elapsed: 0:02:45.

  Average training loss: 0.07
  Training took: 0:02:53

Running Validation...

  Average test loss: 0.60
  Validation took: 0:00:15
              precision    recall  f1-score   support

           0       0.86      0.89      0.88      2823
           1       0.87      0.83      0.85      2351

    accuracy                           0.86      5174
   macro avg       0.86      0.86      0.86      5174
weighted avg       0.86      0.86      0.86      5174


Training complete!
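
Across the three epochs, the average training loss falls (0.39, 0.20, 0.07) while the average test loss rises (0.36, 0.41, 0.60) and accuracy improves slightly (0.83, 0.85, 0.86). The sketch below tabulates those logged figures and shows that the preferred epoch depends on the selection criterion. The dictionary layout and variable names are assumptions made for illustration, not part of the training script.

```python
# Per-epoch figures copied from the log above.
epochs = [
    {"epoch": 1, "train_loss": 0.39, "val_loss": 0.36, "accuracy": 0.83},
    {"epoch": 2, "train_loss": 0.20, "val_loss": 0.41, "accuracy": 0.85},
    {"epoch": 3, "train_loss": 0.07, "val_loss": 0.60, "accuracy": 0.86},
]

# Two common checkpoint-selection criteria; they disagree on this run.
best_by_loss = min(epochs, key=lambda e: e["val_loss"])
best_by_accuracy = max(epochs, key=lambda e: e["accuracy"])

for e in epochs:
    gap = e["val_loss"] - e["train_loss"]
    print(f"epoch {e['epoch']}: val - train loss gap = {gap:+.2f}")

print("lowest validation loss: epoch", best_by_loss["epoch"])
print("highest accuracy:       epoch", best_by_accuracy["epoch"])
```

A widening gap between validation and training loss is often read as a sign of overfitting, though accuracy and per-class F1 still improve here, so which checkpoint to keep is a judgment call.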