TranslateGemma-4B SFT for Spanish-Valencian

Model Summary

guerreropaula/translategemma4b-sft-es-va is a supervised fine-tuning (SFT) adaptation of google/translategemma-4b-it for Spanish-to-Valencian translation. It is part of the spanish-valencian-mt-rl EAMT 2026 submission and serves as the starting checkpoint for both GRPO variants in this project.

The model is trained on public Spanish-Valencian parallel data with a QLoRA setup designed for low-resource adaptation. In this project, Valencian is generated through the model's Catalan target channel because the base system does not expose a dedicated Valencian language code.
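As a minimal sketch of what routing Valencian through the Catalan channel can look like, the snippet below maps a project-level label onto the code the base model is prompted with. The alias table and the resolve_lang_code helper are hypothetical and are not taken from the project code; only the es and ca codes come from this card.

```python
# Hypothetical alias table: project-level labels -> language codes used in the prompt.
# "va" is an illustrative label only; the card states that output goes through "ca".
TARGET_CODE_ALIASES = {"va": "ca"}


def resolve_lang_code(label: str) -> str:
    """Normalize a label and fall back to the Catalan channel for Valencian."""
    code = label.strip().lower()
    return TARGET_CODE_ALIASES.get(code, code)


source_code = resolve_lang_code("ES")  # Spanish keeps its own code
target_code = resolve_lang_code("VA")  # Valencian is routed to Catalan
print(source_code, target_code)
```

Keeping the alias in one place makes it easy to switch to a dedicated Valencian code if a future base model exposes one.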

Model Details

  • Model ID: guerreropaula/translategemma4b-sft-es-va
  • Collection: guerreropaula/spanish-valencian-mt-rl
  • Developed by: Paula Guerrero Castello
  • Base model: google/translategemma-4b-it
  • Task: Spanish to Valencian machine translation
  • Languages: Spanish (es) input, Valencian/Catalan (ca) output
  • License for model weights: Gemma license
  • Repository text and paper license: CC BY-ND 4.0

Intended Use

This model is intended for:

  • research on low-resource dialectal MT
  • Spanish to Valencian translation experiments
  • reproducing the SFT baseline reported in the EAMT submission
  • initialization for GRPO-based post-training

It is not intended for:

  • fully automatic high-stakes translation without human review
  • general multilingual generation outside the ES-VA setting
  • use as evidence of normative Valencian in legal, medical, or educational settings without expert validation

Training Data

The model is trained on gplsi/amic_parallel, using up to 50,000 Spanish-Valencian sentence pairs from the training split.

  • Training split: 50,000 examples
  • Validation split: 2%
  • Validation samples used during checkpoint selection: 200
  • Source column: ES
  • Target column: VA
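
The figures above can be read in more than one way. The sketch below shows one possible reading: cap the data at 50,000 pairs, hold out 2% for validation, and score only 200 validation pairs during checkpoint selection. The order of capping and splitting, the shuffling, the seed, and the cap_and_split helper are all assumptions, not details confirmed by this card.

```python
import random


def cap_and_split(pairs, max_pairs=50_000, val_fraction=0.02, val_eval_cap=200, seed=42):
    """Cap the corpus, hold out a validation slice, and take a smaller eval subset.

    Assumption: the validation fraction is taken out of the capped set.
    """
    rng = random.Random(seed)  # seed is an arbitrary illustrative choice
    pairs = list(pairs)
    rng.shuffle(pairs)
    pairs = pairs[:max_pairs]
    n_val = round(len(pairs) * val_fraction)
    val, train = pairs[:n_val], pairs[n_val:]
    return train, val, val[:val_eval_cap]


# Toy corpus using the ES / VA column names listed above.
toy = [{"ES": f"frase {i}", "VA": f"frase {i}"} for i in range(60_000)]
train, val, val_eval = cap_and_split(toy)
print(len(train), len(val), len(val_eval))
```

Under this reading the training portion holds 49,000 pairs. If the 2% is held out before the cap is applied, the training portion would hold the full 50,000.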

Training Procedure

Training uses 4-bit QLoRA over the instruction-tuned TranslateGemma base model.

  • LoRA rank: 16
  • LoRA alpha: 32
  • LoRA dropout: 0.05
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Max steps: 2,000
  • Learning rate: 2e-4
  • Batch size: 1
  • Gradient accumulation: 32
  • Warmup steps: 25
  • Max sequence length: 256
  • Optimizer: paged_adamw_8bit
  • Scheduler: cosine
  • Precision: bf16 when supported, otherwise fp16
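
A few quantities follow directly from the hyperparameters above. The sketch below derives them and shows one common shape for a cosine schedule with warmup. It assumes a single device, the standard LoRA scaling of alpha divided by rank, and linear warmup followed by cosine decay to zero; the card itself only states "cosine" and 25 warmup steps, so the exact schedule used in training may differ.

```python
import math

# Values copied from the list above.
hparams = {
    "max_steps": 2000,
    "learning_rate": 2e-4,
    "batch_size": 1,
    "grad_accum": 32,
    "warmup_steps": 25,
    "lora_rank": 16,
    "lora_alpha": 32,
}

# Sequences per optimizer step (assumes one device).
effective_batch = hparams["batch_size"] * hparams["grad_accum"]  # 32
# Total sequences processed over the whole run.
sequences_seen = effective_batch * hparams["max_steps"]  # 64000
# Standard LoRA update scaling, alpha / rank.
lora_scale = hparams["lora_alpha"] / hparams["lora_rank"]  # 2.0


def lr_at(step, base_lr=hparams["learning_rate"],
          warmup=hparams["warmup_steps"], total=hparams["max_steps"]):
    """Linear warmup, then cosine decay to zero (an assumed schedule shape)."""
    if step < warmup:
        return base_lr * step / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, sequences_seen, lora_scale)
print(lr_at(0), lr_at(25), lr_at(2000))
```

With 64,000 sequences processed, the run corresponds to roughly 1.28 passes over 50,000 pairs, if all of them are used for training.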

Evaluation

The model was evaluated on 1,000 sentences from gplsi/ES-VA_translation_test.

Metric                     Score
chrF                       83.16
BLEU                       60.16
TER                        22.80
BLEURT                     0.524
COMET                      0.934
Dialectal Valencian Score  41.0%

This checkpoint substantially improves over the zero-shot baseline and obtains the strongest dialectal Valencian usage rate among the systems evaluated in the repository.
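
To give an intuition for the character-level metric reported above, here is a simplified, sentence-level character n-gram F-score. It is an illustrative sketch only: it is not the implementation used for the reported scores, and standard chrF tools aggregate statistics at corpus level and differ in details. The simple_chrf helper and the example sentences are made up for this illustration.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams after removing spaces."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Average n-gram precision and recall for n = 1..max_n, combined as F-beta."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # string too short for this n-gram order
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


# A standard Catalan form ("meva") scored against a Valencian reference ("meua").
reference = "La meua casa és gran."
print(simple_chrf("La meua casa és gran.", reference))
print(simple_chrf("La meva casa és gran.", reference))
```

A one-letter dialectal difference lowers a surface score like this only slightly, which is one reason a separate dialectal usage measure is reported alongside the overlap metrics.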

How To Use

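In the project repository, inference loads the quantized base TranslateGemma model and attaches the SFT adapter on top. Config, build_bnb_config, load_base_tokenizer, and make_inference_prompt are helpers from that repository, not part of transformers or peft: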

# Project-local helpers from the spanish-valencian-mt-rl repository
from config import Config
from utils.model import build_bnb_config, load_base_tokenizer
from utils.data import make_inference_prompt
from transformers import AutoModelForCausalLM
from peft import PeftModel

cfg = Config()
bnb = build_bnb_config(cfg)            # quantization settings for the base model
tokenizer = load_base_tokenizer(cfg)

# Load the quantized base model
base_model = AutoModelForCausalLM.from_pretrained(
    cfg.base_model_id,
    quantization_config=bnb,
    device_map="auto",
    use_safetensors=True,
)

# Attach the SFT LoRA adapter to the base model
model = PeftModel.from_pretrained(base_model, cfg.sft_model_id)
model.eval()

# Build the translation prompt and tokenize it
prompt = make_inference_prompt("Buenos días a todas y todos.", tokenizer, cfg)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Drop the prompt tokens so that only the generated translation is decoded
prompt_length = inputs["input_ids"].shape[1]
print(tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True))

Limitations

  • The target is Valencian, but the base model is prompted with the Catalan language channel.
  • Performance is measured on a single 1,000-sentence public test set.
  • The model can still produce standard Catalan forms instead of Valencian-preferred variants.
  • The model has not been evaluated for domain robustness, toxicity, or factual faithfulness beyond translation quality metrics.

License

This model is distributed under the Gemma license inherited from the base model google/translategemma-4b-it. Users should also review the licenses and terms of the training and evaluation datasets before redistribution or deployment.

Citation

@inproceedings{guerrero-2026-enhancing,
  title     = {Enhancing LLM Translation Performance for Spanish-Valencian through Supervised Fine-tuning and Reinforcement Learning},
  author    = {Guerrero Castello, Paula},
  booktitle = {Proceedings of the 25th Annual Conference of the European Association for Machine Translation},
  year      = {2026}
}