---
base_model: Qwen/Qwen2.5-14B-Instruct
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- gammacorpus
- zurich
- chat
- conversational
license: apache-2.0
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
datasets:
- rubenroy/GammaCorpus-v2-100k
pipeline_tag: text-generation
library_name: transformers
---

# Zurich 14B GammaCorpus v2-100k
*A Qwen 2.5 model fine-tuned on the GammaCorpus dataset*
## Overview
Zurich 14B GammaCorpus v2-100k is a fine-tune of Alibaba's **Qwen 2.5 14B Instruct** model. Zurich is designed to outperform other models of similar size while also showcasing the [GammaCorpus v2-100k](https://huggingface.co/datasets/rubenroy/GammaCorpus-v2-100k) dataset.
## Model Details
- **Base Model:** [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
- **Type:** Causal Language Models
- **Architecture:** Transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- **Number of Parameters:** 14.7B
- **Number of Parameters (Non-Embedding):** 13.1B
- **Number of Layers:** 48
- **Number of Attention Heads (GQA):** 40 for Q and 8 for KV
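To illustrate what the GQA head counts above mean in practice, here is a rough sketch of the KV-cache arithmetic. The layer and head counts come from the list above; the head dimension of 128 and the 2-byte cache entries are assumptions not stated in this card, so treat the byte figures as an estimate only.

```python
# Values taken from the model details above.
NUM_LAYERS = 48
NUM_Q_HEADS = 40
NUM_KV_HEADS = 8

# Assumptions (not stated in this card): head dimension of 128 and
# 2-byte (bf16/fp16) cache entries.
HEAD_DIM = 128
BYTES_PER_VALUE = 2


def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value):
    # Each layer stores one key vector and one value vector per KV head.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


# With GQA, several query heads share a single key/value head.
queries_per_kv_head = NUM_Q_HEADS // NUM_KV_HEADS

gqa_bytes = kv_cache_bytes_per_token(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, BYTES_PER_VALUE)
# Hypothetical comparison: every query head has its own KV head.
mha_bytes = kv_cache_bytes_per_token(NUM_LAYERS, NUM_Q_HEADS, HEAD_DIM, BYTES_PER_VALUE)

print(f"Query heads sharing each KV head: {queries_per_kv_head}")
print(f"KV cache per token with GQA: {gqa_bytes / 1024:.0f} KiB")
print(f"KV cache per token without KV sharing: {mha_bytes / 1024:.0f} KiB")
```

Under these assumptions, sharing each KV head across 5 query heads makes the cache 5 times smaller than it would be with one KV head per query head.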
## Training Details
Zurich-14B-GCv2-100k was fine-tuned on a single A100 GPU for ~30 minutes using the [Unsloth](https://unsloth.ai/) framework. It was trained for **60 Epochs**.
## Usage
### Requirements
We **strongly** recommend you use the latest version of the `transformers` package. You may install it via `pip` as follows:
```
pip install transformers
```
### Quickstart
Here is a code snippet that uses `apply_chat_template` to show how to load the tokenizer and model and how to generate content:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "rubenroy/Zurich-14B-GCv2-100k"

# Load the model, letting transformers pick the dtype and device placement
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "How tall is the Eiffel tower?"
messages = [
    {"role": "system", "content": "You are Zurich, an AI assistant built on the Qwen 2.5 14B model developed by Alibaba Cloud, and fine-tuned by Ruben Roy. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Render the conversation into the chat format the model expects
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)

# Drop the prompt tokens so only the newly generated tokens are decoded
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
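For multi-turn use, the same `messages` format can be extended with the model's reply before asking a follow-up question. The sketch below is only an illustration: `extend_history` is a hypothetical helper (not part of `transformers`), the system prompt is shortened, and the assistant reply is a placeholder string standing in for the `response` produced by the quickstart.

```python
def extend_history(messages, assistant_reply, next_user_prompt):
    """Return a new message list with the model's reply and the next user turn appended."""
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "user", "content": next_user_prompt},
    ]


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How tall is the Eiffel tower?"},
]

# Placeholder reply for illustration only; in practice, pass the decoded
# `response` string from the quickstart here.
history = extend_history(history, "<model reply goes here>", "And when was it built?")

print([m["role"] for m in history])
```

The extended list can then be passed to `tokenizer.apply_chat_template` in the same way as `messages` in the quickstart.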
## About GammaCorpus
This model, like all Zurich models, is trained on GammaCorpus. GammaCorpus is a dataset on Hugging Face that consists of structured and filtered multi-turn conversations.
GammaCorpus has 4 versions, each available in different sizes. The versions and sizes are as follows:
### GammaCorpus v1
- 10k UNFILTERED
- 50k UNFILTERED
- 70k UNFILTERED
Here is a link to the GCv1 dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-v1-67935e4e52a04215f15a7a60
### GammaCorpus v2
- 10k
- 50k
- **100k <-- This is the version of GammaCorpus v2 that the Zurich model you are using was trained on.**
- 500k
- 1m
- 5m
Here is a link to the GCv2 dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-v2-67935e895e1259c404a579df
### GammaCorpus CoT
- Math 170k
Here is a link to the GC-CoT dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-cot-6795bbc950b62b1ced41d14f
### GammaCorpus QA
- Fact 450k
Here is a link to the GC-QA dataset collection:<br>
https://huggingface.co/collections/rubenroy/gammacorpus-qa-679857017bb3855234c1d8c7
### The link to the full GammaCorpus dataset collection can be found [here](https://huggingface.co/collections/rubenroy/gammacorpus-67765abf607615a0eb6d61ac).
## Known Limitations
- **Bias:** We have tried our best to mitigate as much bias as we can, but please be aware that the model might still generate some biased answers.
## Additional Information
### Licensing Information
The model is released under the **[Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)**. Please refer to the license for usage rights and restrictions.