---
language:
- mn
license: apache-2.0
base_model: mistralai/Mistral-7B-Instruct-v0.2
tags:
- mongolian
- fine-tuned
- lora
- chatbot
datasets:
- custom
---

# mongolian-mistral-7b-chatbot

## Description

Mistral 7B Instruct v0.2 fine-tuned with LoRA on Mongolian news data for chatbot use.

## Model Details

- **Base Model:** mistralai/Mistral-7B-Instruct-v0.2
- **Language:** Mongolian (mn)
- **Fine-tuning Method:** LoRA (Low-Rank Adaptation)
- **Training Data:** Eduge Mongolian News Dataset (75,000+ articles)

## Training Configuration

- **LoRA Rank:** 32
- **LoRA Alpha:** 64
- **Epochs:** 3
- **Learning Rate:** 2e-4
- **Batch Size:** 4
- **Max Sequence Length:** 1024

## Mongolian Tokens Added

- Total new tokens: ~9,500
- Sources: Mongolian-NLP repository
  - Most frequent words
  - Abbreviations
  - District/place names
  - Country names
  - Named entities (NER)

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the tokenizer from this repository (it includes the added Mongolian tokens)
tokenizer = AutoTokenizer.from_pretrained("ErkaMarka/mongolian-mistral-7b-chatbot")

# Load the base model in half precision
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Resize embeddings to match the extended vocabulary before attaching the adapter
base_model.resize_token_embeddings(len(tokenizer))

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "ErkaMarka/mongolian-mistral-7b-chatbot")

# Generate a reply; the prompt means "What is the capital city of Mongolia?"
messages = [{"role": "user", "content": "Монгол улсын нийслэл хот юу вэ?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Evaluation Results

Evaluated on 100 Mongolian Q&A pairs using the BLEU score.

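This card does not state which BLEU implementation, tokenization, or smoothing was used. As a rough illustration only, the sketch below computes an unsmoothed corpus-level BLEU (up to 4-grams, one reference per answer, whitespace tokenization) in plain Python. The function names `ngrams` and `corpus_bleu` and the sample strings are hypothetical and are not part of this repository.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus-level BLEU (0-100), one reference per hypothesis.

    Assumption: whitespace tokenization, which may differ from the
    tokenization used for the evaluation reported in this card.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = 0
    ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens = hyp.split()
        ref_tokens = ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            # Clipped counts: an n-gram is credited at most as often
            # as it appears in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any n-gram order with zero matches gives BLEU = 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes hypotheses shorter than their references.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1.0 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


# Hypothetical model answer and gold answer, for illustration only.
predictions = ["Монгол улсын нийслэл нь Улаанбаатар хот"]
gold = ["Монгол улсын нийслэл нь Улаанбаатар хот юм"]
print(f"BLEU: {corpus_bleu(predictions, gold):.2f}")
```

Because this sketch applies no smoothing, a set of short answers with no matching 4-gram scores 0. An established implementation with documented tokenization and smoothing settings would be the safer choice for reported numbers.
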
## License

Apache 2.0

## Citation

```
@misc{mongolian_mistral_7b_chatbot,
  author = {Your Name},
  title = {mongolian-mistral-7b-chatbot},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/ErkaMarka/mongolian-mistral-7b-chatbot}
}
```