Instructions to use Fernandosr85/vozbr-brandvoice-adapter with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use Fernandosr85/vozbr-brandvoice-adapter with PEFT:
Task type is invalid.
- Notebooks
- Google Colab
- Kaggle
VozBR-BrandVoice — Brazilian Portuguese Institutional Brand-Voice Adapter
LoRA adapter fine-tuned on Llama-3.3-70B-Instruct (70B) for Brazilian Portuguese institutional brand-voice compliance, via Adaption's AutoScientist platform.
The problem this adapter addresses
Corporate and institutional Portuguese-language assistants routinely drift away from brand/communication guidelines — wrong register, missing required structure, banned informal markers. Given a raw citizen request like "a aposentadoria não foi paga em março, preciso de uma explicação urgente", a base model typically responds informally and without the structure an institution requires:
"Foi um erro, vamos verificar e te aviso depois."
The brand-voice-compliant response follows an explicit structure — formal opening, objective context, explanatory body with protocol/deadline references, closing with a follow-up channel, and institutional identification:
"Prezado(a) cidadão(a), em atenção à manifestação registrada sob o protocolo nº [...], informamos que sua solicitação foi encaminhada ao setor competente. O processo encontra-se em andamento, com prazo estimado de 15 dias úteis para conclusão. Você poderá acompanhar a tramitação por meio do canal da ouvidoria..."
This adapter teaches the model to apply an explicit, written Brand Voice Guide (structure, tone, required vocabulary, banned terms) to a raw input, and to comply with a deterministic 10-point conformance rubric.
Adaptive Data results
| Metric | Before | After |
|---|---|---|
| Quality score | 8.0 | 9.4 |
| Quality grade | B | A |
| Relative improvement | — | +17.5% |
| Percentile (Governance domain) | 16.7 | 57.7 |
Training metrics
| Metric | Value |
|---|---|
| Base model | meta-llama/Llama-3.3-70B-Instruct (70B) |
| Trained model name | adaption_pt_br_formal_gov_complaints |
| Training method | SFT + LoRA |
| LoRA rank (r) | 64 |
| LoRA alpha | 128 |
| LoRA dropout | 0 |
| Trainable modules | all-linear |
| Epochs | 1 |
| Training steps | 113 |
| Learning rate | 1e-4 (cosine scheduler) |
| Warmup ratio | 0.03 |
| Weight decay | 0.01 |
| Max grad norm | 1 |
| Dataset size | 20,204 examples (Grade A) |
| Adapted model win rate | 79% (vs 21% base) |
Dataset
| Platform | Link |
|---|---|
| HuggingFace Dataset (base, 6,505 examples) | Fernandosr85/vozbr-brandvoice |
| HuggingFace Dataset (expanded, 20,204 examples, used for training) | Fernandosr85/adaption-pt-br-formal-gov-complaints |
| Kaggle Dataset | VozBR-BrandVoice Dataset |
| Source dataset | FalaBR-SynthLetters |
6,505 base instruction-tuning examples (expanded to 20,204 via Adaption Adaptive Data augmentation — 8,000 domain-specific + 5,700 general-purpose data points), each pairing:
prompt: an explicit Brand Voice Guide plus a reframed raw citizen requestcompletion: a formal institutional response, pre-filtered to score ≥ 7/10 on the conformance rubric below
Brand Voice conformance rubric (10 checks)
| Check | Description |
|---|---|
formal_opener |
Formal opening salutation (e.g. "Prezado(a) cidadão(a),") |
institutional_voice |
Impersonal institutional voice ("Informa-se que...", "Cumpre informar...") |
process_vocab |
Reference to protocol / process / request |
progress_vocab |
Progress/deadline terms ("prazo", "andamento", "concluído") |
followup_vocab |
Follow-up/escalation channel ("ouvidoria", "canal", "recurso") |
formal_closing |
Formal closing |
no_banned_terms |
No slang, internet language, or emojis |
no_excess_caps |
No excessive capitalization |
min_length |
At least 40 words |
no_first_person_singular |
No informal first-person singular ("eu acho") |
Source data & provenance
- CGU / Fala.BR — Brazil's federal ombudsman open data (dados.gov.br), CC BY 4.0
- FalaBR-SynthLetters — 8,203 instruction-completion pairs of formal pt-BR letters, remastered via Adaption Adaptive Data (Grade A, 9/10 quality, Governance domain), CC BY-SA 4.0
- FalaBR-GovBench — 11-year Brazilian ombudsman benchmark, the original source corpus
All personal identifiers in training examples are templated placeholders (e.g. [Nome do Requerente], [CPF]), not real citizen data.
Credits
- Fine-tuning platform: Adaption — AutoScientist & Adaptive Data
- Challenge: AutoScientist Challenge 2026 — Marketing category
- Training infrastructure: Adaption compute credits
- Dataset remastering: Adaption Adaptive Data pipeline (Grade A, +17.5% quality improvement)
- Author: Fernando Rodrigues · Kaggle: fernandosr85 · HuggingFace: Fernandosr85
Disclaimer
Experimental research artifact submitted to AutoScientist Challenge 2026 (Marketing category). This adapter is derived from public-sector ombudsman correspondence. The Brand Voice Guide and conformance rubric reflect a formal government-correspondence register; applying them to other corporate brand voices may require adjusting the guide's required vocabulary and tone rules. Not a substitute for legal or compliance review of institutional communications.
- Downloads last month
- -
Model tree for Fernandosr85/vozbr-brandvoice-adapter
Base model
meta-llama/Llama-3.1-70B