Instructions to use charlottepuopolo/sealion-3v-9b-it-taglish with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use charlottepuopolo/sealion-3v-9b-it-taglish with Transformers:
# Use a pipeline as a high-level helper # Warning: Pipeline type "translation" is no longer supported in transformers v5. # You must load the model directly (see below) or downgrade to v4.x with: # 'pip install "transformers<5.0.0' from transformers import pipeline pipe = pipeline("translation", model="charlottepuopolo/sealion-3v-9b-it-taglish")# Load model directly from transformers import AutoTokenizer, AutoModelForMultimodalLM tokenizer = AutoTokenizer.from_pretrained("charlottepuopolo/sealion-3v-9b-it-taglish") model = AutoModelForMultimodalLM.from_pretrained("charlottepuopolo/sealion-3v-9b-it-taglish") - Notebooks
- Google Colab
- Kaggle
Sea-Lion Taglish Translation Model
Model Summary
This model is a fine-tuned version of Sea Lion v3 9B IT, adapted for English-to-Taglish machine translation. Taglish is a code-switched variety of English and Tagalog commonly used in the Philippines. The model was trained to generate fluent, naturalistic Taglish output from English input, with a focus on informal and social media domains.
The fine-tuning process involved lightweight QLoRA-based training on synthetic parallel examples from the Tweet Taglish dataset, following structured chat-style instruction tuning.
Intended Use
This model is intended for research, experimentation, and development of machine translation systems that support bilingual or code-switched output. It is particularly suited for:
- Translating English to Taglish
- Studying code-switching behavior in LLMs
- Applications in multilingual NLP for Southeast Asian languages
Not recommended for high-stakes or formal use cases such as medical, legal, or governmental translation.
Model Specs
- Developed by: Charlotte Puopolo
- Model type: Machine Translation trained on English-Taglish parallel Tweets
- Language(s) (NLP): Taglish (English-Tagalog code-switching)
- License: [More Information Needed]
- Finetuned from model [optional]: Tweet Taglish dataset (Herrera et al., 2022)
Model Sources
- Repository: https://github.com/puopolo/Taglish-Translation/tree/main/prompts
- Paper: [Coming soon]
How to Use
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("your-hf-username/sea-lion-taglish")
model = AutoModelForCausalLM.from_pretrained("your-hf-username/sea-lion-taglish")
prompt = "Translate to Tagalog-English code-switching: I need to go shopping later."
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
How to Cite
If you use this model, please cite:
@misc{puopolo2025taglish, author = {Charlotte Puopolo}, title = {Analyzing LLM Performance on Taglish Translation}, year = {2025}, note = {Hugging Face Model Repository}, url = {https://huggingface.co/charlottepuopolo/sealion-3v-9b-it-taglish} }
- Downloads last month
- 2
Model tree for charlottepuopolo/sealion-3v-9b-it-taglish
Base model
google/gemma-2-9b