How to use Polygl0t/Tucano2-qwen-0.5B-Instruct with libraries and local apps.

Libraries

How to use Polygl0t/Tucano2-qwen-0.5B-Instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Polygl0t/Tucano2-qwen-0.5B-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Polygl0t/Tucano2-qwen-0.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Polygl0t/Tucano2-qwen-0.5B-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
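
The slice in the final `print` call relies on `generate` returning the prompt tokens followed by the newly generated ones, which is how causal language models behave in Transformers. A minimal sketch of that indexing, using plain Python lists and made-up token ids in place of real tensors (`strip_prompt` is a hypothetical helper, not part of Transformers):

```python
# Hypothetical token ids; real values come from the tokenizer and the model.
prompt_ids = [101, 7592, 2040, 102]         # stands in for inputs["input_ids"][0]
generated = prompt_ids + [2023, 2003, 999]  # stands in for outputs[0]: prompt + new tokens


def strip_prompt(output_ids, prompt_length):
    """Return only the tokens that were produced after the prompt."""
    return output_ids[prompt_length:]


new_tokens = strip_prompt(generated, len(prompt_ids))
print(new_tokens)  # [2023, 2003, 999]
```

Decoding only `new_tokens` is what keeps the user's question out of the printed reply.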

Local Apps

vLLM

How to use Polygl0t/Tucano2-qwen-0.5B-Instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Polygl0t/Tucano2-qwen-0.5B-Instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Polygl0t/Tucano2-qwen-0.5B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
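
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; `build_chat_request` and `send_chat` are hypothetical helper names, and the response parsing assumes the server returns the usual OpenAI-style `choices[0].message.content` field.

```python
import json
import urllib.request

MODEL = "Polygl0t/Tucano2-qwen-0.5B-Instruct"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(request, timeout=60):
    """Send the request and return the assistant's reply text (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the response follows the OpenAI chat completions schema.
    return body["choices"][0]["message"]["content"]


request = build_chat_request("What is the capital of France?")
print(request.full_url)
print(request.data.decode("utf-8"))
```

With the vLLM server running, `send_chat(request)` should return the model's answer. Passing `base_url="http://localhost:30000"` targets an SGLang server started on port 30000 instead.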

Use Docker

docker model run hf.co/Polygl0t/Tucano2-qwen-0.5B-Instruct

SGLang

How to use Polygl0t/Tucano2-qwen-0.5B-Instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Polygl0t/Tucano2-qwen-0.5B-Instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Polygl0t/Tucano2-qwen-0.5B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Polygl0t/Tucano2-qwen-0.5B-Instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Polygl0t/Tucano2-qwen-0.5B-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use Polygl0t/Tucano2-qwen-0.5B-Instruct with Docker Model Runner:

```
docker model run hf.co/Polygl0t/Tucano2-qwen-0.5B-Instruct
```

Tucano2-qwen-0.5B-Instruct / ruler.yaml

nicholasKluge

Upload ruler.yaml with huggingface_hub

e630ba5 verified 3 months ago

3.29 kB

model_name: Tucano2-qwen-0.5B-Instruct
results:
  niah_pt_multikey_1_1024: 0.658
  niah_pt_multikey_1_1024_stderr: 0.021236147199899316
  niah_pt_multikey_1_2048: 0.556
  niah_pt_multikey_1_2048_stderr: 0.022242244375731048
  niah_pt_multikey_1_4096: 0.42
  niah_pt_multikey_1_4096_stderr: N/A
  niah_pt_multikey_1_alias: " - niah_pt_multikey_1"
  niah_pt_multikey_2_1024: 0.596
  niah_pt_multikey_2_1024_stderr: 0.021966635293832883
  niah_pt_multikey_2_2048: 0.366
  niah_pt_multikey_2_2048_stderr: 0.021564276850201684
  niah_pt_multikey_2_4096: 0.184
  niah_pt_multikey_2_4096_stderr: N/A
  niah_pt_multikey_2_alias: " - niah_pt_multikey_2"
  niah_pt_multikey_3_1024: 0.406
  niah_pt_multikey_3_1024_stderr: 0.021983962090086417
  niah_pt_multikey_3_2048: 0.11
  niah_pt_multikey_3_2048_stderr: 0.01400686919941566
  niah_pt_multikey_3_4096: 0.038
  niah_pt_multikey_3_4096_stderr: N/A
  niah_pt_multikey_3_alias: " - niah_pt_multikey_3"
  niah_pt_multiquery_1024: 0.554
  niah_pt_multiquery_1024_stderr: 0.014700346948313894
  niah_pt_multiquery_2048: 0.4545
  niah_pt_multiquery_2048_stderr: 0.014300997764986478
  niah_pt_multiquery_4096: 0.395
  niah_pt_multiquery_4096_stderr: N/A
  niah_pt_multiquery_alias: " - niah_pt_multiquery"
  niah_pt_multivalue_1024: 0.4885
  niah_pt_multivalue_1024_stderr: 0.014608638699389432
  niah_pt_multivalue_2048: 0.4675
  niah_pt_multivalue_2048_stderr: 0.014090229563008424
  niah_pt_multivalue_4096: 0.4145
  niah_pt_multivalue_4096_stderr: N/A
  niah_pt_multivalue_alias: " - niah_pt_multivalue"
  niah_pt_single_1_1024: 0.602
  niah_pt_single_1_1024_stderr: 0.021912377885779953
  niah_pt_single_1_2048: 0.608
  niah_pt_single_1_2048_stderr: 0.02185468495561119
  niah_pt_single_1_4096: 0.522
  niah_pt_single_1_4096_stderr: N/A
  niah_pt_single_1_alias: " - niah_pt_single_1"
  niah_pt_single_2_1024: 0.518
  niah_pt_single_2_1024_stderr: 0.022368565117387874
  niah_pt_single_2_2048: 0.4
  niah_pt_single_2_2048_stderr: 0.02193084412072858
  niah_pt_single_2_4096: 0.316
  niah_pt_single_2_4096_stderr: N/A
  niah_pt_single_2_alias: " - niah_pt_single_2"
  niah_pt_single_3_1024: 0.63
  niah_pt_single_3_1024_stderr: 0.021613289165165816
  niah_pt_single_3_2048: 0.596
  niah_pt_single_3_2048_stderr: 0.021966635293832883
  niah_pt_single_3_4096: 0.522
  niah_pt_single_3_4096_stderr: N/A
  niah_pt_single_3_alias: " - niah_pt_single_3"
  ruler_pt_4096: 0.38164545454545457
  ruler_pt_4096_stderr: N/A
  ruler_pt_alias: ruler_pt
  ruler_pt_cwe_1024: 0.4992
  ruler_pt_cwe_1024_stderr: 0.016325801161570425
  ruler_pt_cwe_2048: 0.32839999999999997
  ruler_pt_cwe_2048_stderr: 0.013636671059873462
  ruler_pt_cwe_4096: 0.1778
  ruler_pt_cwe_4096_stderr: N/A
  ruler_pt_cwe_alias: " - ruler_pt_cwe"
  ruler_pt_fwe_1024: 0.8353333333333334
  ruler_pt_fwe_1024_stderr: 0.009076286695702566
  ruler_pt_fwe_2048: 0.6906666666666667
  ruler_pt_fwe_2048_stderr: 0.010496640893696112
  ruler_pt_fwe_4096: 0.594
  ruler_pt_fwe_4096_stderr: N/A
  ruler_pt_fwe_alias: " - ruler_pt_fwe"
  ruler_pt_vt_1024: 0.8847999999999999
  ruler_pt_vt_1024_stderr: 0.009025566003490679
  ruler_pt_vt_2048: 0.7112
  ruler_pt_vt_2048_stderr: 0.013468181161820449
  ruler_pt_vt_4096: 0.6147999999999999
  ruler_pt_vt_4096_stderr: N/A
  ruler_pt_vt_alias: " - ruler_pt_vt"
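
The aggregate `ruler_pt_4096` value appears to be the unweighted mean of the eleven per-task scores at 4096 tokens. The sketch below checks that reading with the standard library only, on an excerpt of the lines above; `parse_scores` is a hypothetical helper that reads the flat `name_<length>: value` lines directly instead of using a YAML parser.

```python
from collections import defaultdict
from statistics import mean

# Excerpt of the results listed above (4096-token scores only).
RESULTS_TEXT = """
niah_pt_multikey_1_4096: 0.42
niah_pt_multikey_1_4096_stderr: N/A
niah_pt_multikey_2_4096: 0.184
niah_pt_multikey_3_4096: 0.038
niah_pt_multiquery_4096: 0.395
niah_pt_multivalue_4096: 0.4145
niah_pt_single_1_4096: 0.522
niah_pt_single_2_4096: 0.316
niah_pt_single_3_4096: 0.522
ruler_pt_4096: 0.38164545454545457
ruler_pt_cwe_4096: 0.1778
ruler_pt_fwe_4096: 0.594
ruler_pt_vt_4096: 0.6147999999999999
"""


def parse_scores(text):
    """Return {task: {context_length: score}} from flat 'name_<length>: value' lines."""
    scores = defaultdict(dict)
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Skip blank lines, standard errors, and display aliases.
        if not sep or key.endswith(("_stderr", "_alias")):
            continue
        task, _, length = key.rpartition("_")
        if not length.isdigit():
            continue  # e.g. "model_name" or "results"
        try:
            scores[task][int(length)] = float(value)
        except ValueError:
            continue  # non-numeric values such as "N/A"
    return dict(scores)


scores = parse_scores(RESULTS_TEXT)
aggregate = scores.pop("ruler_pt")[4096]
task_mean = mean(by_length[4096] for by_length in scores.values())

print(f"tasks at 4096 tokens: {len(scores)}")
print(f"mean of task scores:  {task_mean:.6f}")
print(f"reported aggregate:   {aggregate:.6f}")
```

The same function accepts the full file contents, so per-task scores can be compared across the 1024, 2048, and 4096 context lengths.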