Model Card for 522H0134-NguyenNhatHuy/Qwen-1.5-1.8B-Chat-DPO

This model is a DPO-aligned version of Qwen/Qwen1.5-1.8B-Chat, trained to produce safer, more helpful responses in Vietnamese through Direct Preference Optimization (DPO). It builds on the previously fine-tuned SFT model, using preference pairs (chosen vs rejected completions) to optimize for alignment and instruction-following.


🧠 Model Details

  • Base Model: Qwen/Qwen1.5-1.8B-Chat
  • Model Type: Causal Language Model (Chat)
  • Languages: Vietnamese
  • License: Apache 2.0
  • Fine-tuning Method: Direct Preference Optimization (DPO)
  • Training Framework: HuggingFace TRL (transformers + accelerate)
  • Preference Dataset: 10,000 Vietnamese prompt pairs (chosen/rejected)

βœ… Intended Uses

Direct Use

  • Aligned, polite, and safe Vietnamese conversation
  • General instruction following and open-domain QA
  • Scenarios requiring sensitive language generation

Out-of-Scope Use

  • Safety-critical systems (e.g. legal, medical, financial advice)
  • English-language generation (model is tuned for Vietnamese)
  • Jailbreak testing without appropriate safeguards

πŸ“¦ Dataset

  • Source: Translated + curated prompts from HarmBench, OpenAssistant, and JailbreakEval
  • Format: JSONL pairs of prompts with chosen and rejected completions
  • Composition: 60% safe / 40% unsafe adversarial instructions
  • Filtering: Detoxify used to score and select completions with varied toxicity

πŸ§ͺ Evaluation

Metrics

  • Helpfulness (human-rated or automated ranking)
  • Toxicity (via Detoxify threshold > 0.5)
  • Preference Win Rate vs. base SFT model

Summary

  • 85% rejection on unsafe or adversarial inputs

  • 80% preference over SFT responses on standard Vietnamese prompts

  • Maintains helpfulness while increasing safety and appropriateness

πŸš€ How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load model
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-1.8B-Chat")
model = AutoModelForCausalLM.from_pretrained("522H0134-NguyenNhatHuy/Qwen-1.5-1.8B-Chat-DPO")

# Chat with the model
prompt = "TΓ΄i cαΊ§n lời khuyΓͺn để giαΊ£m stress cΓ΄ng việc."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Downloads last month
-
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for 522H0134-NguyenNhatHuy/Qwen-1.5-1.8B-Chat-DPO

Adapter
(335)
this model