Instructions to use SINAI/ALIA-es-legal-administrative-7B-Instruct with libraries and local apps.

Libraries

How to use SINAI/ALIA-es-legal-administrative-7B-Instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="SINAI/ALIA-es-legal-administrative-7B-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
# This is a text-only causal language model, so load it with AutoModelForCausalLM
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("SINAI/ALIA-es-legal-administrative-7B-Instruct")
model = AutoModelForCausalLM.from_pretrained("SINAI/ALIA-es-legal-administrative-7B-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use SINAI/ALIA-es-legal-administrative-7B-Instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "SINAI/ALIA-es-legal-administrative-7B-Instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "SINAI/ALIA-es-legal-administrative-7B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/SINAI/ALIA-es-legal-administrative-7B-Instruct

SGLang

How to use SINAI/ALIA-es-legal-administrative-7B-Instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "SINAI/ALIA-es-legal-administrative-7B-Instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "SINAI/ALIA-es-legal-administrative-7B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "SINAI/ALIA-es-legal-administrative-7B-Instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "SINAI/ALIA-es-legal-administrative-7B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use SINAI/ALIA-es-legal-administrative-7B-Instruct with Docker Model Runner:
```
docker model run hf.co/SINAI/ALIA-es-legal-administrative-7B-Instruct
```

Tokenizer mismatch

by enriquezaf - opened 22 days ago

Discussion

enriquezaf

22 days ago

Hi,

The chat template is missing three \n.

Was the finetune done with the wrong template?

Also, why not trim whitespace?

lmolino changed discussion status to closed 19 days ago

lmolino changed discussion status to open 19 days ago

lmolino

Grupo de investigación en Sistemas Inteligentes de Acceso a la Información (SINAI) de la Universidad de Jaén org 19 days ago

Thanks for the feedback! The model was fine-tuned using exactly this chat template, so the template reflects the actual format used during training and is internally consistent. Using a different template at inference time (e.g. one with extra \n) may lead to slightly inconsistent behavior at turn boundaries. Regarding whitespace, "add_prefix_space: true" is inherited from the LLaMA SentencePiece tokenizer; it is a tokenizer-level setting and does not directly affect the chat template output. Have you actually observed leading spaces in the decoded outputs?

enriquezaf

19 days ago

Thanks for answering my question,

Good to know that chat_template.jinja is correct and tokenizer_config.json is what is wrong.

"chat_template": "{{- bos_token }}{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",

By 'trim whitespace' I don't mean the prefix space; sorry for the unclear explanation. I'm referring to the chat template.
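To make the newline placement in the quoted "chat_template" string easier to inspect, here is a minimal sketch that mirrors that template by hand in plain Python. It is not the tokenizer's own rendering path (that goes through Jinja). The function name `render_chatml` and the default `bos_token="<s>"` are assumptions made for illustration.

```python
def render_chatml(messages, bos_token="<s>", add_generation_prompt=True):
    """Hand-written mirror of the chat_template string quoted above.

    Assumption: bos_token defaults to "<s>" purely for illustration; the
    real value comes from the tokenizer configuration.
    """
    parts = [bos_token]
    for message in messages:
        # '<|im_start|>' + role + '\n' + content + '<|im_end|>' + '\n'
        parts.append(
            "<|im_start|>" + message["role"] + "\n"
            + message["content"] + "<|im_end|>" + "\n"
        )
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chatml(
    [{"role": "user", "content": "¿Cuál es la constitución española?"}]
)
print(repr(prompt))        # repr() makes every newline visible as \n
print(prompt.count("\n"))  # number of newlines the template emits
```

For a single user turn with the generation prompt enabled, this rendering contains three newline characters: one after the role, one after `<|im_end|>`, and one after `assistant`.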

Regarding leading spaces in the output, yes, that's what prompted me to check the tokenizer.

You can reproduce it by running a few generations with this prompt:

¿Cuál es la constitución española?

I checked the datasets and found some examples, but I'm not sure if there are enough to bias the model like that.
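One way to quantify the leading-space behavior over several generations is to count how many decoded outputs start with whitespace. The sketch below is only a helper for that count: the function name `leading_whitespace_report` is hypothetical, and the sample strings are made-up placeholders, not real model outputs.

```python
def leading_whitespace_report(outputs):
    """Return (flagged_outputs, rate) for decoded generations.

    An output is flagged when its first character is whitespace
    (space, tab, newline). Empty strings are never flagged.
    """
    flagged = [text for text in outputs if text and text[0].isspace()]
    rate = len(flagged) / len(outputs) if outputs else 0.0
    return flagged, rate


# Placeholder strings standing in for decoded generations (assumption:
# in practice these would come from repeated sampling with the prompt above).
samples = [" Texto de ejemplo", "Texto de ejemplo", " Otro ejemplo", "Otro ejemplo"]
flagged, rate = leading_whitespace_report(samples)
print(f"{len(flagged)} of {len(samples)} outputs start with whitespace ({rate:.0%})")
```

Decoding only the newly generated tokens, as in the Transformers snippet on this page, keeps the prompt from hiding a leading space.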

lmolino

Grupo de investigación en Sistemas Inteligentes de Acceso a la Información (SINAI) de la Universidad de Jaén org 16 days ago

Hi again! Thanks for the clarification about whitespace trimming.

We have now updated the chat_template in tokenizer_config.json to match the chat_template.jinja used during fine-tuning. This should fix the leading spaces in the output.

Regarding whitespace trimming in the template, the chat_template.jinja already uses {%- and -%} tags where appropriate to avoid unwanted whitespace. Now that both templates are in sync, this should be consistent.
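Whether the two templates are in sync can be checked locally with a short script. The sketch below assumes both files have been downloaded into the same folder and that the template in tokenizer_config.json is stored under the "chat_template" key, as in the excerpt quoted earlier in this thread. The function name `templates_in_sync` is hypothetical.

```python
import json
from pathlib import Path


def templates_in_sync(config_path, jinja_path):
    """Compare the chat template stored in two local files.

    Returns True only when the "chat_template" value in the JSON config is
    character-for-character identical to the contents of the .jinja file.
    A trailing newline in the .jinja file therefore counts as a difference.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    from_config = config.get("chat_template")
    from_file = Path(jinja_path).read_text(encoding="utf-8")
    return from_config == from_file
```

A call such as `templates_in_sync("tokenizer_config.json", "chat_template.jinja")` returns False if either side differs by even one newline, which is the kind of mismatch discussed here.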

Thanks again for taking the time to investigate and report this so thoroughly!

scarrasc changed discussion status to closed 6 days ago
