Instructions for using Sashvat/HQQ-270M with libraries and local apps.

Libraries

How to use Sashvat/HQQ-270M with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Sashvat/HQQ-270M")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the output is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Sashvat/HQQ-270M")
model = AutoModelForCausalLM.from_pretrained("Sashvat/HQQ-270M")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
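
The final line above decodes only the newly generated tokens: for a causal language model, `generate` returns the prompt tokens followed by the continuation, so slicing from the prompt length onward drops the prompt. A minimal sketch of that slicing with plain Python lists follows. The token IDs and the `strip_prompt` helper are made up for illustration and are not real tokenizer output or part of Transformers.

```python
# Hypothetical token IDs, for illustration only (not real tokenizer output).
prompt_ids = [2, 105, 2364, 107]                # stands in for inputs["input_ids"][0]
generated_ids = prompt_ids + [4521, 236888, 1]  # stands in for outputs[0]


def strip_prompt(output_ids, prompt_len):
    """Return only the tokens that come after the first prompt_len tokens."""
    return output_ids[prompt_len:]


# Same idea as outputs[0][inputs["input_ids"].shape[-1]:] in the snippet above.
new_tokens = strip_prompt(generated_ids, len(prompt_ids))
print(new_tokens)
```

Passing `new_tokens` (rather than the full sequence) to `tokenizer.decode` is what keeps the prompt text out of the printed reply.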

Local Apps

vLLM

How to use Sashvat/HQQ-270M with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Sashvat/HQQ-270M"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Sashvat/HQQ-270M",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
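
The same request can be sent from Python instead of curl. The sketch below uses only the standard library and assumes the server is reachable at `http://localhost:8000`, as in the curl command above. The helper names `build_chat_request` and `send_chat_request` are hypothetical, chosen for this example.

```python
import json
import urllib.request


def build_chat_request(prompt, model="Sashvat/HQQ-270M",
                       base_url="http://localhost:8000"):
    """Build a POST request equivalent to the curl call above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; sending it does.
request = build_chat_request("What is the capital of France?")
print(request.full_url)
```

Once the server is running, `send_chat_request(request)` performs the call. For a server on another port, such as the SGLang examples on port 30000, pass a different `base_url`.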

Use Docker

docker model run hf.co/Sashvat/HQQ-270M

SGLang

How to use Sashvat/HQQ-270M with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Sashvat/HQQ-270M" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Sashvat/HQQ-270M",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Sashvat/HQQ-270M" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Sashvat/HQQ-270M",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
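
The curl calls above return a JSON body in the OpenAI chat-completions format, where the assistant's text sits under `choices[0].message.content`. A minimal sketch of extracting it is shown below. The `extract_reply` helper is hypothetical, and the sample response is hand-written for illustration rather than captured from a real server.

```python
def extract_reply(response):
    """Return the assistant text from a chat-completions style response dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written sample for illustration only, not real server output.
sample_response = {
    "object": "chat.completion",
    "model": "Sashvat/HQQ-270M",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ],
}

print(extract_reply(sample_response))
```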

Docker Model Runner
How to use Sashvat/HQQ-270M with Docker Model Runner:
```
docker model run hf.co/Sashvat/HQQ-270M
```

HQQ-270M

Commit History

Update LICENSE 38ef32f verified, Akshat-Dwivedi committed on Mar 29

Update README.md 9367810 verified, Akshat-Dwivedi committed on Mar 29

Update README.md 1ccf9de verified, Akshat Diwedi committed on Aug 19, 2025

Push-to-HUB ✨ ae18efa verified, Akshat Diwedi committed on Aug 19, 2025

Update README.md 152694d verified, Akshat Diwedi committed on Aug 19, 2025

Upload Metric.png d8c6824 verified, Akshat Diwedi committed on Aug 19, 2025

Create README.md e9b9d8b verified, Akshat Diwedi committed on Aug 19, 2025

initial commit 3ceacba verified, Akshat Diwedi committed on Aug 19, 2025
