VozBR-BrandVoice — Brazilian Portuguese Institutional Brand-Voice Adapter

LoRA adapter fine-tuned on Llama-3.3-70B-Instruct (70B) for Brazilian Portuguese institutional brand-voice compliance, via Adaption's AutoScientist platform.

The problem this adapter addresses

Corporate and institutional Portuguese-language assistants routinely drift away from brand/communication guidelines — wrong register, missing required structure, banned informal markers. Given a raw citizen request like "a aposentadoria não foi paga em março, preciso de uma explicação urgente", a base model typically responds informally and without the structure an institution requires:

"Foi um erro, vamos verificar e te aviso depois."

The brand-voice-compliant response follows an explicit structure — formal opening, objective context, explanatory body with protocol/deadline references, closing with a follow-up channel, and institutional identification:

"Prezado(a) cidadão(a), em atenção à manifestação registrada sob o protocolo nº [...], informamos que sua solicitação foi encaminhada ao setor competente. O processo encontra-se em andamento, com prazo estimado de 15 dias úteis para conclusão. Você poderá acompanhar a tramitação por meio do canal da ouvidoria..."

This adapter teaches the model to apply an explicit, written Brand Voice Guide (structure, tone, required vocabulary, banned terms) to a raw input, and to comply with a deterministic 10-point conformance rubric.

Adaptive Data results

Metric	Before	After
Quality score	8.0	9.4
Quality grade	B	A
Relative improvement	—	+17.5%
Percentile (Governance domain)	16.7	57.7

Training metrics

Metric	Value
Base model	`meta-llama/Llama-3.3-70B-Instruct` (70B)
Trained model name	`adaption_pt_br_formal_gov_complaints`
Training method	SFT + LoRA
LoRA rank (r)	64
LoRA alpha	128
LoRA dropout	0
Trainable modules	all-linear
Epochs	1
Training steps	113
Learning rate	1e-4 (cosine scheduler)
Warmup ratio	0.03
Weight decay	0.01
Max grad norm	1
Dataset size	20,204 examples (Grade A)
Adapted model win rate	79% (vs 21% base)

Dataset

Platform	Link
HuggingFace Dataset (base, 6,505 examples)	Fernandosr85/vozbr-brandvoice
HuggingFace Dataset (expanded, 20,204 examples, used for training)	Fernandosr85/adaption-pt-br-formal-gov-complaints
Kaggle Dataset	VozBR-BrandVoice Dataset
Source dataset	FalaBR-SynthLetters

6,505 base instruction-tuning examples (expanded to 20,204 via Adaption Adaptive Data augmentation — 8,000 domain-specific + 5,700 general-purpose data points), each pairing:

prompt: an explicit Brand Voice Guide plus a reframed raw citizen request
completion: a formal institutional response, pre-filtered to score ≥ 7/10 on the conformance rubric below

Brand Voice conformance rubric (10 checks)

Check	Description
`formal_opener`	Formal opening salutation (e.g. "Prezado(a) cidadão(a),")
`institutional_voice`	Impersonal institutional voice ("Informa-se que...", "Cumpre informar...")
`process_vocab`	Reference to protocol / process / request
`progress_vocab`	Progress/deadline terms ("prazo", "andamento", "concluído")
`followup_vocab`	Follow-up/escalation channel ("ouvidoria", "canal", "recurso")
`formal_closing`	Formal closing
`no_banned_terms`	No slang, internet language, or emojis
`no_excess_caps`	No excessive capitalization
`min_length`	At least 40 words
`no_first_person_singular`	No informal first-person singular ("eu acho")

Source data & provenance

CGU / Fala.BR — Brazil's federal ombudsman open data (dados.gov.br), CC BY 4.0
FalaBR-SynthLetters — 8,203 instruction-completion pairs of formal pt-BR letters, remastered via Adaption Adaptive Data (Grade A, 9/10 quality, Governance domain), CC BY-SA 4.0
FalaBR-GovBench — 11-year Brazilian ombudsman benchmark, the original source corpus

All personal identifiers in training examples are templated placeholders (e.g. [Nome do Requerente], [CPF]), not real citizen data.

Credits

Fine-tuning platform: Adaption — AutoScientist & Adaptive Data
Challenge: AutoScientist Challenge 2026 — Marketing category
Training infrastructure: Adaption compute credits
Dataset remastering: Adaption Adaptive Data pipeline (Grade A, +17.5% quality improvement)
Author: Fernando Rodrigues · Kaggle: fernandosr85 · HuggingFace: Fernandosr85

Disclaimer

Experimental research artifact submitted to AutoScientist Challenge 2026 (Marketing category). This adapter is derived from public-sector ombudsman correspondence. The Brand Voice Guide and conformance rubric reflect a formal government-correspondence register; applying them to other corporate brand voices may require adjusting the guide's required vocabulary and tone rules. Not a substitute for legal or compliance review of institutional communications.

Downloads last month: -

Model tree for Fernandosr85/vozbr-brandvoice-adapter

Base model

meta-llama/Llama-3.1-70B

Finetuned

meta-llama/Llama-3.3-70B-Instruct

Adapter

(300)

this model