TranslateGemma-4B SFT for Spanish-Valencian

Model Summary

guerreropaula/translategemma4b-sft-es-va is a supervised fine-tuned adaptation of google/translategemma-4b-it for Spanish to Valencian translation. It is part of the spanish-valencian-mt-rl EAMT 2026 submission and serves as the foundation checkpoint for both GRPO variants in this project.

The model is trained on public Spanish-Valencian parallel data with a QLoRA setup designed for low-resource adaptation. In this project, Valencian is generated through the model's Catalan target channel because the base system does not expose a dedicated Valencian language code.
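The language-code fallback described above can be pictured with a small sketch. The helper name `resolve_target_code`, the `va` label, the fallback table, and the request layout are illustrative assumptions rather than names from the project's code; only the `es` and `ca` codes come from this card.

```python
# Hypothetical helper: every name here is an illustrative assumption.
# Only the "es" (input) and "ca" (output) codes are taken from this model card.
FALLBACK_TARGET_CODES = {"va": "ca"}  # Valencian is routed to the Catalan channel


def resolve_target_code(requested: str) -> str:
    """Return the language code that would actually be sent to the base model."""
    return FALLBACK_TARGET_CODES.get(requested, requested)


# Illustrative request record, not the base model's real prompt format.
request = {
    "source_lang_code": "es",
    "target_lang_code": resolve_target_code("va"),
    "text": "Buenos días a todas y todos.",
}
print(request["target_lang_code"])  # ca
```

Keeping the fallback in one table makes the substitution explicit, so the mapping can be removed if a dedicated Valencian code becomes available.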

Model Details

Model ID: guerreropaula/translategemma4b-sft-es-va
Collection: guerreropaula/spanish-valencian-mt-rl
Developed by: Paula Guerrero Castello
Base model: google/translategemma-4b-it
Task: Spanish to Valencian machine translation
Languages: Spanish (es) input, Valencian/Catalan (ca) output
License for model weights: Gemma license
Repository text and paper license: CC BY-ND 4.0

Intended Use

This model is intended for:

research on low-resource dialectal MT
Spanish to Valencian translation experiments
reproducing the SFT baseline reported in the EAMT submission
initialization for GRPO-based post-training

It is not intended for:

fully automatic high-stakes translation without human review
general multilingual generation outside the ES-VA setting
use as evidence of normative Valencian in legal, medical, or educational settings without expert validation

Training Data

The model is trained on gplsi/amic_parallel, using up to 50,000 Spanish-Valencian sentence pairs from the training split.

Training split: 50,000 examples
Validation split: 2%
Validation samples used during checkpoint selection: 200
Source column: ES
Target column: VA
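The figures above imply some simple arithmetic, sketched below. The sketch assumes the 2% validation share is carved out of the 50,000-pair cap and that the 200 checkpoint-selection samples are drawn from that held-out portion. The card states neither detail, and the function name, shuffling, and seed are illustrative.

```python
import random


def split_pairs(pairs, cap=50_000, val_fraction=0.02, val_eval_size=200, seed=0):
    """Cap the corpus, hold out a validation share, and subsample it for checkpoint selection."""
    rng = random.Random(seed)
    capped = list(pairs[:cap])      # "up to 50,000" sentence pairs
    rng.shuffle(capped)             # assumption: the split is random
    n_val = round(len(capped) * val_fraction)
    val, train = capped[:n_val], capped[n_val:]
    val_eval = val[:val_eval_size]  # subset scored when comparing checkpoints
    return train, val, val_eval


# Toy stand-in for the ES/VA columns; real data would come from gplsi/amic_parallel.
toy = [{"ES": f"frase {i}", "VA": f"frase {i}"} for i in range(60_000)]
train, val, val_eval = split_pairs(toy)
print(len(train), len(val), len(val_eval))  # 49000 1000 200
```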

Training Procedure

Training uses 4-bit QLoRA over the instruction-tuned TranslateGemma base model.

LoRA rank: 16
LoRA alpha: 32
LoRA dropout: 0.05
Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Max steps: 2,000
Learning rate: 2e-4
Batch size: 1
Gradient accumulation: 32
Warmup steps: 25
Max sequence length: 256
Optimizer: paged_adamw_8bit
Scheduler: cosine
Precision: bf16 when supported, otherwise fp16
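As a quick check on the schedule, the batch size and gradient accumulation listed above combine into an effective batch of 32 sequences per optimizer step. The sketch below works through that arithmetic. The class and field names are illustrative, a single device is assumed, and the epoch estimate assumes all 50,000 pairs are used for training, which the card does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SftSchedule:
    # Values copied from the hyperparameter list in this card.
    per_device_batch_size: int = 1
    gradient_accumulation_steps: int = 32
    max_steps: int = 2_000
    warmup_steps: int = 25

    @property
    def effective_batch_size(self) -> int:
        # Sequences contributing to each optimizer update (single device assumed).
        return self.per_device_batch_size * self.gradient_accumulation_steps

    @property
    def sequences_seen(self) -> int:
        # Total sequences consumed over the whole run.
        return self.effective_batch_size * self.max_steps


schedule = SftSchedule()
train_pairs = 50_000  # assumption: the full cap is used for training

print(schedule.effective_batch_size)                 # 32
print(schedule.sequences_seen)                       # 64000
print(schedule.sequences_seen / train_pairs)         # 1.28
print(schedule.warmup_steps / schedule.max_steps)    # 0.0125
```

Under these assumptions the run covers a little over one pass through the data, with warmup taking up 1.25% of the steps.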

Evaluation

The model was evaluated on 1,000 sentences from gplsi/ES-VA_translation_test.

Metric	Score
chrF	83.16
BLEU	60.16
TER	22.80
BLEURT	0.524
COMET	0.934
Dialectal Valencian Score	41.0%

This checkpoint substantially improves over the zero-shot baseline and obtains the strongest dialectal Valencian usage rate (41.0%) among the systems evaluated in the repository.

How To Use

Inference in this repo is performed with the base TranslateGemma model plus the SFT adapter:

# config and utils are project-local modules from this repository, not PyPI packages.
from config import Config
from utils.model import build_bnb_config, load_base_tokenizer
from utils.data import make_inference_prompt
from transformers import AutoModelForCausalLM
from peft import PeftModel

cfg = Config()
bnb = build_bnb_config(cfg)          # 4-bit quantization settings
tokenizer = load_base_tokenizer(cfg)

# Load the quantized base model first.
base_model = AutoModelForCausalLM.from_pretrained(
    cfg.base_model_id,
    quantization_config=bnb,
    device_map="auto",
    use_safetensors=True,
)

# Attach the SFT LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(base_model, cfg.sft_model_id)

# Build the translation prompt and decode greedily.
prompt = make_inference_prompt("Buenos días a todas y todos.", tokenizer, cfg)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Skip the prompt tokens so only the generated translation is printed.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

Limitations

The target is Valencian, but the base model is prompted with the Catalan language channel.
Performance is measured on a single 1,000-sentence public test set.
The model can still produce standard Catalan forms instead of Valencian-preferred variants.
The model has not been evaluated for domain robustness, toxicity, or factual faithfulness beyond translation quality metrics.

License

This model is distributed under the Gemma license inherited from the base model google/translategemma-4b-it. Users should also review the licenses and terms of the training and evaluation datasets before redistribution or deployment.

Citation

@inproceedings{guerrero-2026-enhancing,
  title     = {Enhancing LLM Translation Performance for Spanish-Valencian through Supervised Fine-tuning and Reinforcement Learning},
  author    = {Guerrero Castello, Paula},
  booktitle = {Proceedings of the 25th Annual Conference of the European Association for Machine Translation},
  year      = {2026}
}
