Instructions to use Nexless/dental-research-slm-1.5b-v2-p5qcva with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use Nexless/dental-research-slm-1.5b-v2-p5qcva with llama-cpp-python:
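# !pip install llama-cpp-python
from llama_cpp import Llama

# Download the Q4_K_M GGUF from the Hub (cached after the first call) and load it
llm = Llama.from_pretrained(
    repo_id="Nexless/dental-research-slm-1.5b-v2-p5qcva",
    filename="gguf/model-Q4_K_M.gguf",
)
# messages must be a list of role/content dicts; the prompt is only a placeholder
llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the main causes of dental caries."}]
)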
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- llama.cpp
How to use Nexless/dental-research-slm-1.5b-v2-p5qcva with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
Use Docker
docker model run hf.co/Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
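Once llama-server is running, it can also be queried from Python over its OpenAI-compatible HTTP API. The following is a minimal standard-library sketch, not an official client. It assumes the server listens on http://localhost:8080/v1 (the base URL used in the Pi configuration on this page), and the helper names build_chat_payload and ask are hypothetical.

```python
import json
import urllib.request

# Assumption: llama-server is running locally and reachable at this base URL.
BASE_URL = "http://localhost:8080/v1"
MODEL_ID = "Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M"


def build_chat_payload(prompt, max_tokens=320, temperature=0.5):
    """Build an OpenAI-style chat completion request body (hypothetical helper)."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def ask(prompt, base_url=BASE_URL):
    """POST the prompt to /chat/completions and return the reply text."""
    request = urllib.request.Request(
        base_url + "/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling ask("...") with any prompt requires the server to be up. build_chat_payload can be inspected on its own to check what would be sent.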
- LM Studio
- Jan
- Ollama
How to use Nexless/dental-research-slm-1.5b-v2-p5qcva with Ollama:
ollama run hf.co/Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
- Unsloth Studio
How to use Nexless/dental-research-slm-1.5b-v2-p5qcva with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Nexless/dental-research-slm-1.5b-v2-p5qcva to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Nexless/dental-research-slm-1.5b-v2-p5qcva to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Nexless/dental-research-slm-1.5b-v2-p5qcva to start chatting
- Pi
How to use Nexless/dental-research-slm-1.5b-v2-p5qcva with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M" }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory:
pi
- Hermes Agent
How to use Nexless/dental-research-slm-1.5b-v2-p5qcva with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
Run Hermes
hermes
- Atomic Chat
- Docker Model Runner
How to use Nexless/dental-research-slm-1.5b-v2-p5qcva with Docker Model Runner:
docker model run hf.co/Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
- Lemonade
How to use Nexless/dental-research-slm-1.5b-v2-p5qcva with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull Nexless/dental-research-slm-1.5b-v2-p5qcva:Q4_K_M
Run and chat with the model
lemonade run user.dental-research-slm-1.5b-v2-p5qcva-Q4_K_M
List all available models
lemonade list
general-slm-300m-20260424-p5qcva
Trained on ? documents (~? tokens) from the domain.
Model Details
- Parameters: 300000000
- Base model: Qwen/Qwen2.5-1.5B
- Training regime: lora-sft
- Training framework: huggingface-trainer
- Chat template: qwen2
- License: apache-2.0
Intended Use
Training Data
Curated corpus (forge forge-2026-04-24-publications-v2-p5qcva)
Domain: ``. Corpus shape: ? tokens, ? documents.
Evaluation
Full eval reports: comparison-vs-baseline.md.
Domain-specific performance
(see eval/comparison-vs-baseline.md)
Generic performance
(M5 v1: generic eval deferred to hardening)
Baseline comparison
(see eval/comparison-vs-baseline.md)
Limitations
- Small model — capabilities are proportionally limited
- Domain specialization may cause out-of-domain degradation vs baseline
- Not a general-purpose assistant; best at domain-specific tasks
Recommended sampling settings
This is a small model (≤300M params) fine-tuned on a narrow domain. Without
the parameters below, it can collapse into repetition or emit garbled GGUF
detokenization artifacts (e.g. the literal strings tti or ttiassistant, left
over from Qwen ChatML markers). Always use:
| param | value | reason |
|---|---|---|
| temperature | 0.5 | 0.7 is too hot for a small base on narrow corpus |
| top_p | 0.9 | constrains tail |
| top_k | 40 | constrains tail |
| repeat_penalty | 1.18 | kills paragraph loops (the main failure mode) |
| repeat_last_n | 256 | window for the penalty |
| max_tokens | 320 | keep responses short |
LM Studio: open the right-hand "Advanced configuration" panel and set
the values above. Also add these stop strings under "Stop strings":
<|im_end|>, <|im_start|>, <|endoftext|>, ttiuser, ttiassistant.
Ollama: the published Modelfile has these defaults baked
in, so just run ollama create ... -f Modelfile.
llama.cpp / llama-cpp-python: see the Space app.py for a known-working configuration.
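For llama-cpp-python, the settings in the table above might be applied as in the sketch below. This is not the Space's app.py, only an illustration under stated assumptions: that last_n_tokens_size on the Llama constructor plays the role of repeat_last_n, and that the stop strings listed for LM Studio are equally appropriate here. The helper names load_model and chat are hypothetical.

```python
# Values copied from the recommended-settings table.
SAMPLING = {
    "temperature": 0.5,
    "top_p": 0.9,
    "top_k": 40,
    "repeat_penalty": 1.18,
    "max_tokens": 320,
}
REPEAT_LAST_N = 256
STOP_STRINGS = ["<|im_end|>", "<|im_start|>", "<|endoftext|>", "ttiuser", "ttiassistant"]


def load_model():
    """Load the Q4_K_M GGUF (hypothetical helper)."""
    # Imported lazily so the constants above are usable without llama-cpp-python.
    from llama_cpp import Llama

    return Llama.from_pretrained(
        repo_id="Nexless/dental-research-slm-1.5b-v2-p5qcva",
        filename="gguf/model-Q4_K_M.gguf",
        # Assumption: this constructor argument is the repeat-penalty window.
        last_n_tokens_size=REPEAT_LAST_N,
    )


def chat(llm, prompt):
    """Run one chat turn with the recommended sampling settings and stop strings."""
    result = llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        stop=STOP_STRINGS,
        **SAMPLING,
    )
    return result["choices"][0]["message"]["content"]
```

Typical use would be llm = load_model() followed by chat(llm, "your question"). Keeping the settings in one dictionary makes it easy to compare them against the table when tuning.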
How to Use
Download the GGUF for local inference (LM Studio, llama.cpp, text-generation-webui)
Quantizations available:
- gguf/model-Q4_K_M.gguf: ~986.0 MB, best size/quality trade-off
- gguf/model-Q8_0.gguf: ~1646.6 MB, near-lossless
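A GGUF file can also be fetched programmatically. The sketch below picks the largest listed quantization that fits a size budget and downloads it with huggingface_hub. It assumes the files live in the Nexless/dental-research-slm-1.5b-v2-p5qcva repository, as in the llama-cpp-python snippet at the top of this page. The helper names pick_quant and download are hypothetical.

```python
# Approximate file sizes in MB, as listed above.
QUANT_FILES = {
    "gguf/model-Q4_K_M.gguf": 986.0,
    "gguf/model-Q8_0.gguf": 1646.6,
}


def pick_quant(budget_mb):
    """Return the largest listed GGUF that fits in budget_mb, or None if none fits."""
    fitting = [(size, name) for name, size in QUANT_FILES.items() if size <= budget_mb]
    return max(fitting)[1] if fitting else None


def download(filename, repo_id="Nexless/dental-research-slm-1.5b-v2-p5qcva"):
    """Fetch one file from the Hub and return its local cache path."""
    # Requires: pip install huggingface_hub
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)
```

For example, download(pick_quant(1200)) would fetch the Q4_K_M file. The returned path can then be opened in LM Studio or passed to llama.cpp.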
Load via HuggingFace Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("Nexless/general-slm-300m-20260424-p5qcva")
model = AutoModelForCausalLM.from_pretrained("Nexless/general-slm-300m-20260424-p5qcva")
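With the model and tokenizer loaded, a single chat turn could be generated roughly as follows. This is a sketch, assuming the tokenizer ships the qwen2 chat template noted under Model Details. The keyword names are the Transformers counterparts of the recommended sampling settings, and repetition_penalty only approximates llama.cpp's repeat_penalty because it is not limited to a 256-token window. GENERATION_KWARGS and generate_reply are hypothetical names.

```python
# Approximate Transformers equivalents of the recommended sampling settings.
GENERATION_KWARGS = {
    "do_sample": True,
    "temperature": 0.5,
    "top_p": 0.9,
    "top_k": 40,
    "repetition_penalty": 1.18,
    "max_new_tokens": 320,
}


def generate_reply(model, tokenizer, prompt):
    """Format one user turn with the chat template and return the decoded reply."""
    messages = [{"role": "user", "content": prompt}]
    # Render the conversation to text, ending with the assistant turn marker.
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    # Slice off the prompt tokens so only the newly generated text is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Usage would be generate_reply(model, tokenizer, "your question") with the objects created above.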
Run in a terminal with Ollama
curl https://huggingface.co/Nexless/general-slm-300m-20260424-p5qcva/raw/main/Modelfile -o Modelfile
ollama create general-slm-300m-20260424-p5qcva -f Modelfile
ollama run general-slm-300m-20260424-p5qcva
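After ollama create, the model can also be called from Python through Ollama's local REST API. The sketch below uses only the standard library. It assumes Ollama is serving on its default local address (http://localhost:11434) and that the model was created under the name used in the commands above. No sampling options are sent, so the defaults from the Modelfile apply. The helper names build_request_body and ask_ollama are hypothetical.

```python
import json
import urllib.request

# Assumptions: default Ollama address, model name as created above.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL_NAME = "general-slm-300m-20260424-p5qcva"


def build_request_body(prompt):
    """Build a non-streaming /api/chat request body (hypothetical helper)."""
    return {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def ask_ollama(prompt, url=OLLAMA_URL):
    """Send one user message to the local Ollama server and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_request_body(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]
```

ask_ollama("...") needs the Ollama server running locally. build_request_body can be checked without it.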
Try it in a browser
See the live demo Space.
Training Details
- Training hardware: g5.xlarge
- Training duration: 63 minutes
- Training tokens: ?
- Training cost: $0 on AWS (NVIDIA Inception credits)
- Reproducibility hash: forge-2026-04-24-publications-v2-p5qcva
Citation
If you use this model, please cite:
@misc{general-slm-300m-20260424-p5qcva_2026,
  title={general-slm-300m-20260424-p5qcva: a 300M-parameter SLM},
  author={Nexless},
  year={2026},
  url={https://huggingface.co/Nexless/general-slm-300m-20260424-p5qcva}
}
Forged with the SLM-Forge skill tree.
Model tree for Nexless/dental-research-slm-1.5b-v2-p5qcva
Base model
Qwen/Qwen2.5-1.5B