HY-MT1.5-1.8B-Trad-Chinese-ORPO
This model is a fine-tuned version of tencent/HY-MT1.5-1.8B specialized for English to Traditional Chinese translation.
It was trained using ORPO (Odds Ratio Preference Optimization) via the Unsloth library to enhance its ability to generate high-quality Traditional Chinese while explicitly rejecting Simplified Chinese characters and phrasing.
Model Details
- Base Model: tencent/HY-MT1.5-1.8B
- Training Method: ORPO (Odds Ratio Preference Optimization)
- Quantization: 4-bit (bitsandbytes)
- Language Pair: English -> Traditional Chinese
- Fine-tuning Tools: Unsloth and TRL
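As a rough illustration of the training method listed above, ORPO adds an odds-ratio preference term to the usual supervised loss on the chosen response. The sketch below follows the published ORPO formulation on scalar inputs. The function names and the weight `lam` are hypothetical, and the value actually used to train this model is not stated in this card.

```python
import math

def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp); avg_logp must be < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))

def odds_ratio_loss(chosen_logp: float, rejected_logp: float) -> float:
    """Preference term: -log(sigmoid(log-odds(chosen) - log-odds(rejected)))."""
    ratio = log_odds(chosen_logp) - log_odds(rejected_logp)
    return math.log1p(math.exp(-ratio))  # equals -log(sigmoid(ratio))

def orpo_loss(nll: float, chosen_logp: float, rejected_logp: float,
              lam: float = 0.1) -> float:
    """Supervised loss on the chosen response plus the weighted preference term.

    lam is a placeholder weight, not the setting used for this model.
    """
    return nll + lam * odds_ratio_loss(chosen_logp, rejected_logp)
```

The preference term shrinks as the chosen (Traditional Chinese) response becomes more likely than the rejected one, which is what pushes the model away from Simplified Chinese output.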
Training Data
The model was trained on 10,000 samples drawn from the Mandarin subset of HuggingFaceFW/finetranslations. Each sample forms a preference pair:
- Chosen: Traditional Chinese translations (converted and verified using OpenCC).
- Rejected: Original translations containing Simplified Chinese or mixed scripts.
- Prompt:
Translate the following segment into Traditional Chinese, without additional explanation.\n\n{English Text}
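The pair construction described above can be sketched as follows. This is a minimal illustration, not the author's pipeline: `TOY_S2T` is a tiny hypothetical character table standing in for OpenCC, `build_pair` is a hypothetical helper, and skipping samples whose conversion changes nothing is an assumption. The `prompt`, `chosen`, and `rejected` field names follow the preference-dataset layout used by TRL.

```python
PROMPT_TEMPLATE = (
    "Translate the following segment into Traditional Chinese, "
    "without additional explanation.\n\n{text}"
)

# Toy stand-in for OpenCC: a handful of Simplified -> Traditional characters.
# Illustrative only; the real data was converted and verified with OpenCC.
TOY_S2T = {"这": "這", "个": "個", "国": "國", "语": "語", "东": "東"}

def to_traditional(text: str) -> str:
    """Convert character by character using the toy table."""
    return "".join(TOY_S2T.get(ch, ch) for ch in text)

def build_pair(english: str, original_translation: str, convert=to_traditional):
    """Return a prompt/chosen/rejected record, or None if there is no contrast."""
    chosen = convert(original_translation)
    if chosen == original_translation:
        # Assumption: a sample that is already Traditional offers no
        # chosen/rejected contrast, so it is skipped.
        return None
    return {
        "prompt": PROMPT_TEMPLATE.format(text=english),
        "chosen": chosen,
        "rejected": original_translation,
    }

pair = build_pair("This is a country.", "这是一个国家。")
print(pair["chosen"])    # 這是一個國家。
print(pair["rejected"])  # 这是一个国家。
```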
Performance Improvements
ORPO training strengthened the model's preference for Traditional Chinese over Simplified Chinese output:
- Rewards Margin: Increased from ~0.07 to ~0.35 during training.
- Accuracy: Stayed near 100%, meaning the chosen Traditional Chinese response was almost always scored above the rejected one.
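To make these two metrics concrete, the sketch below computes a margin and an accuracy from per-example average log-probabilities, assuming they correspond to the reward margin and reward accuracy that TRL logs during ORPO training. The helper name, the `beta` scale, and the input values are hypothetical and are not taken from this model's training run.

```python
def reward_metrics(chosen_logps, rejected_logps, beta=0.1):
    """Return (mean reward margin, reward accuracy) for a batch.

    Rewards are the log-probabilities scaled by beta (a placeholder value).
    """
    margins = [beta * c - beta * r for c, r in zip(chosen_logps, rejected_logps)]
    mean_margin = sum(margins) / len(margins)
    # Accuracy: share of pairs where the chosen response earns the higher reward.
    accuracy = sum(m > 0 for m in margins) / len(margins)
    return mean_margin, accuracy

# Made-up log-probabilities for two preference pairs.
margin, accuracy = reward_metrics([-0.5, -1.0], [-1.5, -3.0])
```

A growing margin with accuracy near 1.0 indicates that the model assigns steadily more probability to the Traditional Chinese response than to its Simplified counterpart.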
Usage
You can load this model using the unsloth library for fast inference:
from unsloth import FastLanguageModel
import torch

# Load the checkpoint in 4-bit together with its tokenizer.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "yongjer/HY-MT1.5-1.8B-Trad-Chinese-ORPO",
    max_seq_length = 2048,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's inference mode

# Use the same instruction format the model saw during training.
messages = [
    {"role": "user", "content": "Translate the following segment into Traditional Chinese, without additional explanation.\n\nArtificial Intelligence is transforming the world at an unprecedented pace."},
]
# Apply the chat template, tokenize, and move the input IDs to the GPU.
inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to("cuda")
outputs = model.generate(input_ids=inputs, max_new_tokens=128)
# The decoded text includes the prompt and special tokens as well as the translation.
print(tokenizer.batch_decode(outputs))
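Because `generate` returns the prompt tokens followed by the new tokens, the printed text above contains the instruction as well as the translation. One way to keep only the translation is to drop the prompt-length prefix before decoding. The sketch below shows the idea on plain lists; `new_tokens_only` is a hypothetical helper and the token IDs are made up.

```python
def new_tokens_only(output_ids, prompt_length):
    """Drop the first prompt_length token IDs from every generated sequence."""
    return [list(seq[prompt_length:]) for seq in output_ids]

# Hypothetical IDs: a 3-token prompt followed by 2 generated tokens.
fake_outputs = [[101, 7, 8, 42, 43]]
trimmed = new_tokens_only(fake_outputs, prompt_length=3)
print(trimmed)  # [[42, 43]]
```

With the tensors from the snippet above, the equivalent would be `tokenizer.batch_decode(outputs[:, inputs.shape[1]:], skip_special_tokens=True)`, assuming `inputs` is the tensor of input IDs.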
Acknowledgments
- Thanks to the Unsloth AI team for their efficient fine-tuning library.
- Original model by Tencent Hunyuan.