lbourdois committed · Commit 8975882 · verified · 1 Parent(s): 2b2c834

Improve language tag

Hi! Since the model is multilingual, this PR adds languages other than English to the `language` tag to improve referencing. Note that the README announces 29 languages but explicitly lists only 13, so I was only able to add those 13.
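
For anyone who wants to apply the same kind of change to another model card, here is a minimal sketch of the metadata edit. It is not the tooling used for this PR: the helper name `set_language_tags` is hypothetical, and it assumes the front matter uses a flat `language:` list like the one in this README. It does a plain line-based rewrite rather than a full YAML parse.

```python
# ISO 639-3 codes for the 13 languages added in this PR.
LANGUAGES = ["zho", "eng", "fra", "spa", "por", "deu", "ita",
             "rus", "jpn", "kor", "vie", "tha", "ara"]


def set_language_tags(readme_text, languages):
    """Replace the `language:` list in a model card's YAML front matter.

    Hypothetical helper: assumes the front matter is delimited by two
    `---` lines and that `language:` is followed by `- item` lines.
    """
    lines = readme_text.split("\n")
    if not lines or lines[0].strip() != "---":
        raise ValueError("README has no YAML front matter")

    # Locate the closing `---` of the front matter.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("front matter is not closed")

    out = [lines[0]]
    i = 1
    replaced = False
    while i < end:
        if lines[i].strip() == "language:":
            out.append("language:")
            out.extend(f"- {code}" for code in languages)
            i += 1
            # Skip the old list items that belonged to this key.
            while i < end and lines[i].lstrip().startswith("- "):
                i += 1
            replaced = True
        else:
            out.append(lines[i])
            i += 1

    if not replaced:
        # No existing key: append one at the end of the front matter.
        out.append("language:")
        out.extend(f"- {code}" for code in languages)

    # Keep the closing `---` and the README body untouched.
    return "\n".join(out + lines[end:])


old_card = (
    "---\n"
    "license: apache-2.0\n"
    "language:\n"
    "- en\n"
    "datasets:\n"
    "- rubenroy/GammaCorpus-v2-100k\n"
    "---\n"
    "\n"
    "# Zurich 1.5B GammaCorpus v2-100k\n"
)
new_card = set_language_tags(old_card, LANGUAGES)
```

Only the `language:` block changes; every other key and the README body pass through unmodified.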

Files changed (1)
  1. README.md +150 -138
README.md CHANGED
@@ -1,139 +1,151 @@
- ---
- base_model: Qwen/Qwen2.5-1.5B-Instruct
- tags:
- - text-generation-inference
- - transformers
- - unsloth
- - qwen2
- - trl
- - gammacorpus
- - zurich
- - chat
- - conversational
- license: apache-2.0
- language:
- - en
- datasets:
- - rubenroy/GammaCorpus-v2-100k
- pipeline_tag: text-generation
- library_name: transformers
- ---
-
- ![Zunich Banner](https://cdn.ruben-roy.com/AI/Zurich/img/banner-1.5B-100k.png)
-
- # Zurich 1.5B GammaCorpus v2-100k
- *A Qwen 2.5 model fine-tuned on the GammaCorpus dataset*
-
- ## Overview
- Zurich 1.5B GammaCorpus v2-100k is a fine-tune of Alibaba's **Qwen 2.5 1.5B Instruct** model. Zurich is designed to outperform other models that have a similar size while also showcasing [GammaCorpus v2-100k](https://huggingface.co/datasets/rubenroy/GammaCorpus-v2-100k).
-
- ## Model Details
- - **Base Model:** [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)
- - **Type:** Causal Language Models
- - **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- - **Number of Parameters:** 1.54B
- - **Number of Paramaters (Non-Embedding)**: 1.31B
- - **Number of Layers:** 28
- - **Number of Attention Heads (GQA):** 12 for Q and 2 for KV
-
-
- ## Training Details
-
- Zurich-1.5B-GCv2-100k underwent fine-tuning with 1 A100 GPU for ~10 minutes and trained with the [Unsloth](https://unsloth.ai/) framework. Zurich-1.5B-GCv2-100k was trained for **60 Epochs**.
-
- ## Usage
-
- ### Requirements
-
- We **strongly** recommend you use the latest version of the `transformers` package. You may install it via `pip` as follows:
-
- ```
- pip install transformers
- ```
-
- ### Quickstart
-
- Here is a code snippet with `apply_chat_template` to show you how to load the tokenizer and model and how to generate contents;
-
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- model_name = "rubenroy/Zurich-1.5B-GCv2-100k"
-
- model = AutoModelForCausalLM.from_pretrained(
-     model_name,
-     torch_dtype="auto",
-     device_map="auto"
- )
- tokenizer = AutoTokenizer.from_pretrained(model_name)
-
- prompt = "How tall is the Eiffel tower?"
- messages = [
-     {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 1.5B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
-     {"role": "user", "content": prompt}
- ]
- text = tokenizer.apply_chat_template(
-     messages,
-     tokenize=False,
-     add_generation_prompt=True
- )
- model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
-
- generated_ids = model.generate(
-     **model_inputs,
-     max_new_tokens=512
- )
- generated_ids = [
-     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
- ]
-
- response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
- ```
-
- ## About GammaCorpus
-
- This model, and all Zurich models, are trained with GammaCorpus. GammaCorpus is a dataset on HuggingFace that is filled with structured and filtered multi-turn conversations.
- GammaCorpus has 4 version with different sizes in each. These are the following versions and sizes:
-
- ### GammaCorpus v1
- - 10k UNFILTERED
- - 50k UNFILTERED
- - 70k UNFILTERED
-
- Here is a link to the GCv1 dataset collection:<br>
- https://huggingface.co/collections/rubenroy/gammacorpus-v1-67935e4e52a04215f15a7a60
-
- ### GammaCorpus v2
- - 10k
- - 500
- - **100k <-- This is the version of GammaCorpus v2 that the Zurich model you are using was trained on.**
- - 500k
- - 1m
- - 5m
-
- Here is a link to the GCv2 dataset collection:<br>
- https://huggingface.co/collections/rubenroy/gammacorpus-v2-67935e895e1259c404a579df
-
- ### GammaCorpus CoT
- - Math 170k
-
- Here is a link to the GC-CoT dataset collection:<br>
- https://huggingface.co/collections/rubenroy/gammacorpus-cot-6795bbc950b62b1ced41d14f
-
- ### GammaCorpus QA
- - Fact 450k
-
- Here is a link to the GC-QA dataset collection:<br>
- https://huggingface.co/collections/rubenroy/gammacorpus-qa-679857017bb3855234c1d8c7
-
- ### The link to the full GammaCorpus dataset collection can be found [here](https://huggingface.co/collections/rubenroy/gammacorpus-67765abf607615a0eb6d61ac).
-
- ## Known Limitations
-
- - **Bias:** We have tried our best to mitigate as much bias we can, but please be aware of the possibility that the model might generate some biased answers.
-
- ## Additional Information
-
- ### Licensing Information
-
+ ---
+ base_model: Qwen/Qwen2.5-1.5B-Instruct
+ tags:
+ - text-generation-inference
+ - transformers
+ - unsloth
+ - qwen2
+ - trl
+ - gammacorpus
+ - zurich
+ - chat
+ - conversational
+ license: apache-2.0
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ datasets:
+ - rubenroy/GammaCorpus-v2-100k
+ pipeline_tag: text-generation
+ library_name: transformers
+ ---
+
+ ![Zunich Banner](https://cdn.ruben-roy.com/AI/Zurich/img/banner-1.5B-100k.png)
+
+ # Zurich 1.5B GammaCorpus v2-100k
+ *A Qwen 2.5 model fine-tuned on the GammaCorpus dataset*
+
+ ## Overview
+ Zurich 1.5B GammaCorpus v2-100k is a fine-tune of Alibaba's **Qwen 2.5 1.5B Instruct** model. Zurich is designed to outperform other models that have a similar size while also showcasing [GammaCorpus v2-100k](https://huggingface.co/datasets/rubenroy/GammaCorpus-v2-100k).
+
+ ## Model Details
+ - **Base Model:** [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)
+ - **Type:** Causal Language Models
+ - **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
+ - **Number of Parameters:** 1.54B
+ - **Number of Paramaters (Non-Embedding)**: 1.31B
+ - **Number of Layers:** 28
+ - **Number of Attention Heads (GQA):** 12 for Q and 2 for KV
+
+
+ ## Training Details
+
+ Zurich-1.5B-GCv2-100k underwent fine-tuning with 1 A100 GPU for ~10 minutes and trained with the [Unsloth](https://unsloth.ai/) framework. Zurich-1.5B-GCv2-100k was trained for **60 Epochs**.
+
+ ## Usage
+
+ ### Requirements
+
+ We **strongly** recommend you use the latest version of the `transformers` package. You may install it via `pip` as follows:
+
+ ```
+ pip install transformers
+ ```
+
+ ### Quickstart
+
+ Here is a code snippet with `apply_chat_template` to show you how to load the tokenizer and model and how to generate contents;
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = "rubenroy/Zurich-1.5B-GCv2-100k"
+
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ prompt = "How tall is the Eiffel tower?"
+ messages = [
+     {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 1.5B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
+     {"role": "user", "content": prompt}
+ ]
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
+
+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=512
+ )
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]
+
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ ```
+
+ ## About GammaCorpus
+
+ This model, and all Zurich models, are trained with GammaCorpus. GammaCorpus is a dataset on HuggingFace that is filled with structured and filtered multi-turn conversations.
+ GammaCorpus has 4 version with different sizes in each. These are the following versions and sizes:
+
+ ### GammaCorpus v1
+ - 10k UNFILTERED
+ - 50k UNFILTERED
+ - 70k UNFILTERED
+
+ Here is a link to the GCv1 dataset collection:<br>
+ https://huggingface.co/collections/rubenroy/gammacorpus-v1-67935e4e52a04215f15a7a60
+
+ ### GammaCorpus v2
+ - 10k
+ - 500
+ - **100k <-- This is the version of GammaCorpus v2 that the Zurich model you are using was trained on.**
+ - 500k
+ - 1m
+ - 5m
+
+ Here is a link to the GCv2 dataset collection:<br>
+ https://huggingface.co/collections/rubenroy/gammacorpus-v2-67935e895e1259c404a579df
+
+ ### GammaCorpus CoT
+ - Math 170k
+
+ Here is a link to the GC-CoT dataset collection:<br>
+ https://huggingface.co/collections/rubenroy/gammacorpus-cot-6795bbc950b62b1ced41d14f
+
+ ### GammaCorpus QA
+ - Fact 450k
+
+ Here is a link to the GC-QA dataset collection:<br>
+ https://huggingface.co/collections/rubenroy/gammacorpus-qa-679857017bb3855234c1d8c7
+
+ ### The link to the full GammaCorpus dataset collection can be found [here](https://huggingface.co/collections/rubenroy/gammacorpus-67765abf607615a0eb6d61ac).
+
+ ## Known Limitations
+
+ - **Bias:** We have tried our best to mitigate as much bias we can, but please be aware of the possibility that the model might generate some biased answers.
+
+ ## Additional Information
+
+ ### Licensing Information
+
  The model is released under the **[Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)**. Please refer to the license for usage rights and restrictions.