convnext-Mistral-SYDNEY-without-captioning

This model is a fine-tuned version of an unspecified base model on an unknown dataset. Its evaluation results are reported per epoch in the results table further down this card.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 64
eval_batch_size: 64
seed: 50
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1024
num_epochs: 128
mixed_precision_training: Native AMP
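
To make the interaction between lr_scheduler_type, lr_scheduler_warmup_steps, and num_epochs concrete, here is a minimal sketch of a linear warmup followed by linear decay, using the values listed above. It rests on assumptions the card does not state: that there are 44 optimizer steps per epoch (inferred from the results table, where epoch 1.0 ends at step 44), that the schedule was planned over all 128 epochs, and that the scheduler follows the usual warmup-then-decay-to-zero shape. The names `linear_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are illustrative only.

```python
# Sketch of a linear warmup/decay schedule built from the card's hyperparameters.
BASE_LR = 1e-4          # learning_rate
WARMUP_STEPS = 1024     # lr_scheduler_warmup_steps
NUM_EPOCHS = 128        # num_epochs
STEPS_PER_EPOCH = 44    # assumption: results table shows step 44 at epoch 1.0
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # assumption: 5632 planned steps


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        # Ramp up proportionally from 0 to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Decay proportionally from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (44, 836, 1024, TOTAL_STEPS):
        print(step, f"{linear_lr(step):.3e}")
```

Under these assumptions, the last logged step in the results table (836) is still inside the 1024-step warmup, so the learning rate there would be about 82% of its peak value.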

Training Loss	Epoch	Step	Validation Loss	Accuracy	Bleu-1	Bleu-2	Bleu-3	Bleu-4	Meteor	Rouge-l	Cider
No log	1.0	44	4.1274	15.85	0.1592	0.0702	0.0187	0.0046	0.1675	0.1995	0.0297
No log	2.0	88	3.6176	44.55	0.1722	0.0991	0.0482	0.0234	0.1199	0.1919	0.0449
No log	3.0	132	2.6945	54.06	0.4796	0.4131	0.3526	0.3018	0.4825	0.4832	0.7144
No log	4.0	176	1.1003	63.69	0.6803	0.5727	0.4940	0.4260	0.6185	0.6096	1.9285
No log	5.0	220	0.8785	64.59	0.6991	0.5960	0.5174	0.4568	0.6156	0.6107	1.8999
No log	6.0	264	0.8430	66.56	0.7087	0.6125	0.5405	0.4797	0.6845	0.6561	2.1839
No log	7.0	308	0.8344	65.76	0.7480	0.6570	0.5820	0.5169	0.6814	0.6759	2.2340
No log	8.0	352	0.8744	64.16	0.6873	0.5899	0.5208	0.4661	0.6778	0.6405	2.1731
No log	9.0	396	0.8152	65.58	0.7681	0.6929	0.6344	0.5855	0.7326	0.7228	2.7204
No log	10.0	440	0.8463	65.68	0.7484	0.6613	0.5913	0.5331	0.6891	0.6779	2.3994
No log	11.0	484	0.8295	66.19	0.7308	0.6472	0.5765	0.5175	0.6854	0.6762	2.3252
No log	12.0	528	0.8563	66.76	0.7071	0.6033	0.5286	0.4649	0.6508	0.6274	1.9048
No log	13.0	572	0.9066	65.26	0.7745	0.6850	0.6148	0.5563	0.7058	0.6980	2.3512
No log	14.0	616	0.9738	66.73	0.6833	0.5907	0.5193	0.4645	0.6206	0.6089	2.0059
No log	15.0	660	0.9778	65.21	0.7471	0.6589	0.5880	0.5276	0.6800	0.6704	2.3928
No log	16.0	704	1.0099	67.30	0.7493	0.6671	0.6059	0.5569	0.7204	0.6960	2.3103
No log	17.0	748	1.0429	67.33	0.6889	0.5977	0.5291	0.4738	0.6658	0.6307	2.1452
No log	18.0	792	1.0302	67.17	0.7137	0.6411	0.5831	0.5334	0.6533	0.6467	2.0804
No log	19.0	836	1.1330	66.89	0.7077	0.6268	0.5589	0.5054	0.6430	0.6403	2.1213
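
Because the metrics move in different directions from epoch to epoch, it can help to locate the best epoch per metric programmatically. The sketch below copies three columns from the table above (validation loss, accuracy, CIDEr) and picks the best row for each. The helper name `best_epoch` and the tuple layout are illustrative choices, not part of the original training code.

```python
# Rows copied from the results table: (epoch, validation loss, accuracy, CIDEr).
ROWS = [
    (1, 4.1274, 15.85, 0.0297),
    (2, 3.6176, 44.55, 0.0449),
    (3, 2.6945, 54.06, 0.7144),
    (4, 1.1003, 63.69, 1.9285),
    (5, 0.8785, 64.59, 1.8999),
    (6, 0.8430, 66.56, 2.1839),
    (7, 0.8344, 65.76, 2.2340),
    (8, 0.8744, 64.16, 2.1731),
    (9, 0.8152, 65.58, 2.7204),
    (10, 0.8463, 65.68, 2.3994),
    (11, 0.8295, 66.19, 2.3252),
    (12, 0.8563, 66.76, 1.9048),
    (13, 0.9066, 65.26, 2.3512),
    (14, 0.9738, 66.73, 2.0059),
    (15, 0.9778, 65.21, 2.3928),
    (16, 1.0099, 67.30, 2.3103),
    (17, 1.0429, 67.33, 2.1452),
    (18, 1.0302, 67.17, 2.0804),
    (19, 1.1330, 66.89, 2.1213),
]

LOSS, ACCURACY, CIDER = 1, 2, 3  # column indices within each row


def best_epoch(rows, column, highest=True):
    """Return the epoch whose value in `column` is the highest (or lowest)."""
    pick = max if highest else min
    return pick(rows, key=lambda row: row[column])[0]


if __name__ == "__main__":
    print("lowest validation loss:", best_epoch(ROWS, LOSS, highest=False))
    print("highest accuracy:", best_epoch(ROWS, ACCURACY))
    print("highest CIDEr:", best_epoch(ROWS, CIDER))
```

On these rows, validation loss is lowest and CIDEr is highest at epoch 9, while accuracy peaks at epoch 17, so the choice of checkpoint depends on which metric matters for the intended use.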

Model size: 0.3B params
Tensor type: F32
Weights format: Safetensors
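
As a rough guide to what the parameter count and tensor type imply for storage, the sketch below estimates the size of the weights. It assumes the rounded figure of 0.3B parameters and 4 bytes per F32 value, and it ignores file metadata, activations, and optimizer state, so treat the result as an approximation only. The function name `weights_size_gb` is illustrative.

```python
PARAMS = 0.3e9        # "0.3B params" (a rounded figure)
BYTES_PER_PARAM = 4   # F32 stores each value in 32 bits = 4 bytes


def weights_size_gb(params: float, bytes_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    print(f"{weights_size_gb(PARAMS, BYTES_PER_PARAM):.1f} GB")
```

With these inputs the estimate is about 1.2 GB for the weights alone.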
