swadhindas324/swin-resnet-mistral-SYDNEY-with-all-captioning

Files changed (5)

README.md +95 -195
config.json +86 -0
generation_config.json +39 -0
model.safetensors +3 -0
training_args.bin +3 -0

README.md CHANGED

@@ -1,199 +1,99 @@
 ---
 library_name: transformers
-tags: []
 ---
-# Model Card for Model ID
-<!-- Provide a quick summary of what the model is/does. -->
-## Model Details
-### Model Description
-<!-- Provide a longer summary of what this model is. -->
-This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
-- **Developed by:** [More Information Needed]
-- **Funded by [optional]:** [More Information Needed]
-- **Shared by [optional]:** [More Information Needed]
-- **Model type:** [More Information Needed]
-- **Language(s) (NLP):** [More Information Needed]
-- **License:** [More Information Needed]
-- **Finetuned from model [optional]:** [More Information Needed]
-### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** [More Information Needed]
-- **Paper [optional]:** [More Information Needed]
-- **Demo [optional]:** [More Information Needed]
-## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-[More Information Needed]
-### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-[More Information Needed]
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-[More Information Needed]
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-[More Information Needed]
-### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
-## How to Get Started with the Model
-Use the code below to get started with the model.
-[More Information Needed]
-## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-[More Information Needed]
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-[More Information Needed]
-#### Training Hyperparameters
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-[More Information Needed]
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-[More Information Needed]
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-[More Information Needed]
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-[More Information Needed]
-### Results
-[More Information Needed]
-#### Summary
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-[More Information Needed]
-## Environmental Impact
-<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-- **Hardware Type:** [More Information Needed]
-- **Hours used:** [More Information Needed]
-- **Cloud Provider:** [More Information Needed]
-- **Compute Region:** [More Information Needed]
-- **Carbon Emitted:** [More Information Needed]
-## Technical Specifications [optional]
-### Model Architecture and Objective
-[More Information Needed]
-### Compute Infrastructure
-[More Information Needed]
-#### Hardware
-[More Information Needed]
-#### Software
-[More Information Needed]
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-[More Information Needed]
-**APA:**
-[More Information Needed]
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-[More Information Needed]
-## More Information [optional]
-[More Information Needed]
-## Model Card Authors [optional]
-[More Information Needed]
-## Model Card Contact
-[More Information Needed]

 ---
 library_name: transformers
+tags:
+- generated_from_trainer
+metrics:
+- accuracy
+model-index:
+- name: swin-resnet-mistral-SYDNEY-with-all-captioning
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# swin-resnet-mistral-SYDNEY-with-all-captioning
+This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.3232
+- Accuracy: 65.7
+- Bleu-1: 0.5949
+- Bleu-2: 0.5339
+- Bleu-3: 0.4914
+- Bleu-4: 0.4557
+- Meteor: 0.5436
+- Rouge-l: 0.6098
+- Cider: 1.6017
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 64
+- eval_batch_size: 64
+- seed: 50
+- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 1024
+- num_epochs: 128
+- mixed_precision_training: Native AMP
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Accuracy | Bleu-1 | Bleu-2 | Bleu-3 | Bleu-4 | Meteor | Rouge-l | Cider  |
+|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:------:|:------:|:------:|:------:|:-------:|:------:|
+| No log        | 1.0   | 44   | 1.2803          | 40.4     | 0.3525 | 0.2768 | 0.2257 | 0.1881 | 0.5016 | 0.4202  | 0.5907 |
+| No log        | 2.0   | 88   | 1.0514          | 61.77    | 0.4774 | 0.3355 | 0.2522 | 0.1907 | 0.4166 | 0.4002  | 0.4287 |
+| No log        | 3.0   | 132  | 0.9772          | 63.06    | 0.4879 | 0.3441 | 0.2625 | 0.2029 | 0.4153 | 0.4023  | 0.4082 |
+| No log        | 4.0   | 176  | 0.9050          | 63.2     | 0.4597 | 0.3120 | 0.2307 | 0.1715 | 0.3604 | 0.3639  | 0.3364 |
+| No log        | 5.0   | 220  | 0.8187          | 63.37    | 0.5164 | 0.3760 | 0.2936 | 0.2326 | 0.4161 | 0.4183  | 0.5877 |
+| No log        | 6.0   | 264  | 0.7221          | 64.99    | 0.4858 | 0.3419 | 0.2627 | 0.2076 | 0.3554 | 0.3831  | 0.6604 |
+| No log        | 7.0   | 308  | 0.6234          | 65.16    | 0.5295 | 0.3903 | 0.3082 | 0.2448 | 0.4427 | 0.4342  | 0.7510 |
+| No log        | 8.0   | 352  | 0.5437          | 65.47    | 0.5267 | 0.3811 | 0.2972 | 0.2365 | 0.4509 | 0.4387  | 0.7940 |
+| No log        | 9.0   | 396  | 0.5210          | 66.25    | 0.5267 | 0.4112 | 0.3427 | 0.2956 | 0.4322 | 0.4488  | 1.1293 |
+| No log        | 10.0  | 440  | 0.5277          | 66.31    | 0.6364 | 0.5407 | 0.4771 | 0.4297 | 0.5541 | 0.5557  | 1.8319 |
+| No log        | 11.0  | 484  | 0.5085          | 65.06    | 0.6104 | 0.5088 | 0.4397 | 0.3882 | 0.5494 | 0.5520  | 1.7389 |
+| No log        | 12.0  | 528  | 0.5123          | 66.97    | 0.6496 | 0.5495 | 0.4797 | 0.4301 | 0.5773 | 0.5768  | 1.7341 |
+| No log        | 13.0  | 572  | 0.5340          | 66.23    | 0.4950 | 0.3817 | 0.3181 | 0.2718 | 0.4101 | 0.4507  | 1.2149 |
+| No log        | 14.0  | 616  | 0.5329          | 65.39    | 0.6253 | 0.5224 | 0.4452 | 0.3868 | 0.5576 | 0.5502  | 1.5926 |
+| No log        | 15.0  | 660  | 0.5461          | 65.92    | 0.6656 | 0.5754 | 0.5075 | 0.4546 | 0.5780 | 0.5894  | 1.8762 |
+| No log        | 16.0  | 704  | 0.5435          | 64.68    | 0.6365 | 0.5344 | 0.4565 | 0.3999 | 0.5685 | 0.5655  | 1.8068 |
+| No log        | 17.0  | 748  | 0.5619          | 66.19    | 0.6833 | 0.5911 | 0.5134 | 0.4465 | 0.5917 | 0.6082  | 1.7530 |
+| No log        | 18.0  | 792  | 0.5653          | 67.21    | 0.6432 | 0.5915 | 0.5493 | 0.5167 | 0.6103 | 0.6437  | 2.0025 |
+| No log        | 19.0  | 836  | 0.5855          | 63.68    | 0.6954 | 0.5975 | 0.5215 | 0.4622 | 0.6169 | 0.6120  | 1.9900 |
+| No log        | 20.0  | 880  | 0.6408          | 65.66    | 0.6691 | 0.5775 | 0.5106 | 0.4595 | 0.6005 | 0.6201  | 1.6515 |
+| No log        | 21.0  | 924  | 0.6872          | 67.74    | 0.6715 | 0.5886 | 0.5357 | 0.4988 | 0.5834 | 0.6151  | 1.9363 |
+| No log        | 22.0  | 968  | 0.6886          | 67.71    | 0.6965 | 0.6232 | 0.5719 | 0.5328 | 0.6193 | 0.6512  | 1.9162 |
+| No log        | 23.0  | 1012 | 0.7542          | 68.1     | 0.6502 | 0.5819 | 0.5336 | 0.4944 | 0.5734 | 0.6080  | 1.7041 |
+| 0.6311        | 24.0  | 1056 | 0.8377          | 68.45    | 0.6886 | 0.6151 | 0.5662 | 0.5214 | 0.5968 | 0.6452  | 1.8615 |
+| 0.6311        | 25.0  | 1100 | 1.1727          | 66.68    | 0.6665 | 0.5867 | 0.5296 | 0.4833 | 0.5548 | 0.6124  | 1.4923 |
+| 0.6311        | 26.0  | 1144 | 1.2276          | 65.85    | 0.6264 | 0.5559 | 0.5134 | 0.4719 | 0.5668 | 0.6141  | 1.6275 |
+| 0.6311        | 27.0  | 1188 | 1.3551          | 66.24    | 0.5980 | 0.5307 | 0.4856 | 0.4470 | 0.5345 | 0.6031  | 1.4574 |
+| 0.6311        | 28.0  | 1232 | 1.2643          | 67.33    | 0.6410 | 0.5789 | 0.5327 | 0.4950 | 0.5766 | 0.6416  | 1.6530 |
+| 0.6311        | 29.0  | 1276 | 1.4213          | 65.98    | 0.4811 | 0.3962 | 0.3426 | 0.2991 | 0.4590 | 0.5586  | 0.9854 |
+| 0.6311        | 30.0  | 1320 | 1.3364          | 65.73    | 0.5691 | 0.4999 | 0.4555 | 0.4207 | 0.5231 | 0.5991  | 1.3969 |
+| 0.6311        | 31.0  | 1364 | 1.3737          | 65.49    | 0.5799 | 0.5158 | 0.4759 | 0.4416 | 0.5276 | 0.6097  | 1.4832 |
+| 0.6311        | 32.0  | 1408 | 1.3232          | 65.7     | 0.5949 | 0.5339 | 0.4914 | 0.4557 | 0.5436 | 0.6098  | 1.6017 |
+### Framework versions
+- Transformers 5.12.1
+- Pytorch 2.12.1+cu130
+- Datasets 5.0.0
+- Tokenizers 0.22.2
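
The training results table in the README diff logs 32 epochs, and the headline evaluation figures match the final row (epoch 32) rather than the strongest one. A minimal sketch of how a checkpoint could be picked from such a log follows. It copies four rows from the table; the `rows` structure and the `best_by` helper are hypothetical names used for illustration and are not part of this repository.

```python
# Subset of the "Training results" table (epoch, validation loss, Bleu-4, Cider).
# Values are copied from the README diff; the remaining rows are omitted for brevity.
rows = [
    {"epoch": 11, "val_loss": 0.5085, "bleu4": 0.3882, "cider": 1.7389},
    {"epoch": 18, "val_loss": 0.5653, "bleu4": 0.5167, "cider": 2.0025},
    {"epoch": 22, "val_loss": 0.6886, "bleu4": 0.5328, "cider": 1.9162},
    {"epoch": 32, "val_loss": 1.3232, "bleu4": 0.4557, "cider": 1.6017},
]


def best_by(rows, key, higher_is_better=True):
    """Return the row with the best value for `key` (hypothetical helper)."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda r: r[key])


lowest_loss = best_by(rows, "val_loss", higher_is_better=False)
highest_cider = best_by(rows, "cider")
final = rows[-1]

print("lowest validation loss at epoch", lowest_loss["epoch"])
print("highest Cider at epoch", highest_cider["epoch"])
print("final epoch", final["epoch"], "Cider", final["cider"])
```

In the full table, epoch 11 has the lowest validation loss and epoch 18 the highest Cider, so the four rows above include both extremes. If the run used the `Trainer`, its `load_best_model_at_end` and `metric_for_best_model` arguments are the usual way to automate this choice; whether this run set them is not visible in the diff.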

config.json ADDED

	@@ -0,0 +1,86 @@

+{
+  "architectures": [
+    "MultiTaskVED"
+  ],
+  "decoder": {
+    "_name_or_path": "swadhindas324/mistral-SYDNEY",
+    "add_cross_attention": true,
+    "architectures": [
+      "MistralForCausalLM"
+    ],
+    "attention_dropout": 0.0,
+    "bos_token_id": 480,
+    "chunk_size_feed_forward": 0,
+    "dtype": "float32",
+    "eos_token_id": 481,
+    "head_dim": 64,
+    "hidden_act": "silu",
+    "hidden_size": 768,
+    "id2label": {
+      "0": "LABEL_0",
+      "1": "LABEL_1"
+    },
+    "initializer_range": 0.02,
+    "intermediate_size": 3072,
+    "is_decoder": true,
+    "is_encoder_decoder": false,
+    "label2id": {
+      "LABEL_0": 0,
+      "LABEL_1": 1
+    },
+    "max_position_embeddings": 45,
+    "model_type": "mistral",
+    "num_attention_heads": 12,
+    "num_hidden_layers": 12,
+    "num_key_value_heads": 4,
+    "output_attentions": false,
+    "output_hidden_states": false,
+    "pad_token_id": 483,
+    "problem_type": null,
+    "return_dict": true,
+    "rms_norm_eps": 1e-06,
+    "rope_parameters": {
+      "rope_theta": 10000.0,
+      "rope_type": "default"
+    },
+    "sliding_window": 4096,
+    "tie_word_embeddings": false,
+    "type_vocab_size": 1,
+    "use_cache": false,
+    "vocab_size": 484
+  },
+  "decoder_start_token_id": 480,
+  "dtype": "float32",
+  "encoder": {
+    "_name_or_path": "swadhindas324/swin-resnet-vit",
+    "add_cross_attention": false,
+    "architectures": [
+      "Swin_Backbone"
+    ],
+    "chunk_size_feed_forward": 0,
+    "dtype": "float32",
+    "hidden_size": 1024,
+    "id2label": {
+      "0": "LABEL_0",
+      "1": "LABEL_1"
+    },
+    "is_encoder_decoder": false,
+    "label2id": {
+      "LABEL_0": 0,
+      "LABEL_1": 1
+    },
+    "model_type": "swin_vit",
+    "output_attentions": false,
+    "output_hidden_states": false,
+    "pooler_output": 1024,
+    "problem_type": null,
+    "return_dict": true
+  },
+  "is_encoder_decoder": true,
+  "model_type": "vision-encoder-decoder",
+  "pad_token_id": 483,
+  "tie_word_embeddings": false,
+  "transformers_version": "5.12.1",
+  "use_cache": false,
+  "vocab_size": 484
+}
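
The decoder block of config.json sets `num_attention_heads` to 12 and `num_key_value_heads` to 4, which is a grouped-query attention layout. The sketch below works out the projection shapes these values imply, assuming the decoder self-attention follows the standard Mistral layout. The custom `MultiTaskVED` wrapper and any cross-attention layers are not shown in this diff, so they are not covered here; `attention_projection_shapes` is a hypothetical helper.

```python
import json

# Fields copied from the "decoder" block of config.json in this diff.
decoder_cfg = json.loads("""
{
  "hidden_size": 768,
  "head_dim": 64,
  "num_attention_heads": 12,
  "num_key_value_heads": 4
}
""")


def attention_projection_shapes(cfg):
    """Return (in_features, out_features) per projection (hypothetical helper).

    Assumes the usual grouped-query layout: queries use all attention heads,
    while keys and values use the smaller number of key/value heads.
    """
    hidden = cfg["hidden_size"]
    q_out = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_out = cfg["num_key_value_heads"] * cfg["head_dim"]
    return {
        "q_proj": (hidden, q_out),
        "k_proj": (hidden, kv_out),
        "v_proj": (hidden, kv_out),
        "o_proj": (q_out, hidden),
    }


shapes = attention_projection_shapes(decoder_cfg)
# Number of query heads that share each key/value head.
queries_per_kv_head = (
    decoder_cfg["num_attention_heads"] // decoder_cfg["num_key_value_heads"]
)

for name, shape in shapes.items():
    print(name, shape)
print("query heads per key/value head:", queries_per_kv_head)
```

Under these assumptions the key and value projections are a third the width of the query projection, which shrinks the key/value tensors by the same factor.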

generation_config.json ADDED

	@@ -0,0 +1,39 @@

+{
+  "_from_model_config": false,
+  "assistant_confidence_threshold": 0.4,
+  "assistant_lookbehind": 10,
+  "bos_token_id": 480,
+  "diversity_penalty": 0.0,
+  "do_sample": false,
+  "early_stopping": true,
+  "encoder_no_repeat_ngram_size": 0,
+  "encoder_repetition_penalty": 1.0,
+  "eos_token_id": 481,
+  "epsilon_cutoff": 0.0,
+  "eta_cutoff": 0.0,
+  "length_penalty": 2.0,
+  "max_length": 40,
+  "max_new_tokens": 40,
+  "min_length": 0,
+  "model_max_length": 40,
+  "no_repeat_ngram_size": 3,
+  "num_assistant_tokens": 20,
+  "num_assistant_tokens_schedule": "constant",
+  "num_beam_groups": 1,
+  "num_beams": 5,
+  "num_return_sequences": 1,
+  "output_attentions": false,
+  "output_hidden_states": false,
+  "output_scores": false,
+  "pad_token_id": 483,
+  "remove_invalid_values": false,
+  "repetition_penalty": 1.0,
+  "return_dict_in_generate": false,
+  "target_lookbehind": 10,
+  "temperature": 1.0,
+  "top_k": 50,
+  "top_p": 1.0,
+  "transformers_version": "5.12.1",
+  "typical_p": 1.0,
+  "use_cache": false
+}
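
generation_config.json selects beam search (`num_beams` 5, `do_sample` false) with `length_penalty` 2.0. The transformers documentation describes the length penalty as an exponent applied to the sequence length, which then divides the hypothesis score. Since the score is a summed log-probability and therefore negative, a positive penalty favors longer captions. The sketch below illustrates that formula in isolation. The two candidate hypotheses are made-up numbers, `beam_score` is a hypothetical helper, and how the library counts length internally is not modeled.

```python
def beam_score(sum_logprobs: float, length: int, length_penalty: float) -> float:
    """Length-normalized beam score: summed log-probability / length ** penalty."""
    return sum_logprobs / (length ** length_penalty)


# Two illustrative (invented) finished hypotheses.
short = {"sum_logprobs": -4.0, "length": 8}
long = {"sum_logprobs": -6.0, "length": 12}

for penalty in (0.0, 1.0, 2.0):
    s = beam_score(short["sum_logprobs"], short["length"], penalty)
    l = beam_score(long["sum_logprobs"], long["length"], penalty)
    if s > l:
        winner = "short"
    elif l > s:
        winner = "long"
    else:
        winner = "tie"
    print(f"length_penalty={penalty}: short={s:.4f} long={l:.4f} -> {winner}")
```

With a penalty of 0.0 the shorter hypothesis wins on raw log-probability, at 1.0 the two tie, and at 2.0 (the value in this config) the longer one ranks higher.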

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:04f834ef5ba6132efb3d8e6507600642c1fb572faefc28861ae90f1161c13d3e
+size 1686203912
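
The three added lines are a Git LFS pointer, not the weights themselves: a spec version, a SHA-256 object id, and the size in bytes. As a rough sketch, the pointer can be parsed and the size turned into an upper bound on the parameter count, assuming every tensor is stored as float32 (as the `"dtype": "float32"` entries in config.json suggest). A safetensors file also carries a JSON header, so the true count is slightly lower. `parse_lfs_pointer` is a hypothetical helper.

```python
# Pointer contents copied from the model.safetensors diff.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:04f834ef5ba6132efb3d8e6507600642c1fb572faefc28861ae90f1161c13d3e
size 1686203912
"""


def parse_lfs_pointer(text):
    """Parse "key value" lines of a Git LFS pointer into a dict (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])
    return fields


pointer = parse_lfs_pointer(pointer_text)

BYTES_PER_FLOAT32 = 4  # assumption: all weights stored as float32
approx_params = pointer["size"] // BYTES_PER_FLOAT32

print(f"file size: {pointer['size'] / 1e9:.2f} GB")
print(f"upper bound on float32 parameters: {approx_params:,}")
```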

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5ea919a36683c529726008f0b790dea4fcddfd92cf6cd36e6c9e868ef7100f01
+size 5393