Instructions for using QuixiAI/TinyDolphin-2.8.2-1.1b-laser with libraries, inference servers, and local apps.

Libraries

How to use QuixiAI/TinyDolphin-2.8.2-1.1b-laser with Transformers:

```
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="QuixiAI/TinyDolphin-2.8.2-1.1b-laser")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("QuixiAI/TinyDolphin-2.8.2-1.1b-laser")
model = AutoModelForCausalLM.from_pretrained("QuixiAI/TinyDolphin-2.8.2-1.1b-laser")
```


vLLM

How to use QuixiAI/TinyDolphin-2.8.2-1.1b-laser with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "QuixiAI/TinyDolphin-2.8.2-1.1b-laser"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "QuixiAI/TinyDolphin-2.8.2-1.1b-laser",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```
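
The curl call above can also be made from Python. The following is a minimal sketch using only the standard library; it assumes a server started as shown above is listening on `http://localhost:8000`, and the helper names `build_completion_request` and `complete` are hypothetical, chosen for illustration. Passing `base_url="http://localhost:30000"` should work the same way for the SGLang server described further down, since both are called through the same OpenAI-compatible completions route.

```python
import json
import urllib.request

MODEL_ID = "QuixiAI/TinyDolphin-2.8.2-1.1b-laser"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    # Same JSON fields as the curl example.
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style completion responses place the text under choices[0].text.
    return body["choices"][0]["text"]
```

With a server running, `complete("Once upon a time,")` returns the generated continuation as a string. Building the request is kept separate from sending it so the payload can be inspected without a live server.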

Use Docker

```
docker model run hf.co/QuixiAI/TinyDolphin-2.8.2-1.1b-laser
```

SGLang

How to use QuixiAI/TinyDolphin-2.8.2-1.1b-laser with SGLang:

Install from pip and serve model

```
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "QuixiAI/TinyDolphin-2.8.2-1.1b-laser" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "QuixiAI/TinyDolphin-2.8.2-1.1b-laser",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Use Docker images

```
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "QuixiAI/TinyDolphin-2.8.2-1.1b-laser" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "QuixiAI/TinyDolphin-2.8.2-1.1b-laser",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Docker Model Runner
How to use QuixiAI/TinyDolphin-2.8.2-1.1b-laser with Docker Model Runner:
```
docker model run hf.co/QuixiAI/TinyDolphin-2.8.2-1.1b-laser
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

TinyDolphin-2.8.2-1.1b-laser

Join Our Discord! https://discord.gg/cognitivecomputations

This is version 3 of a model trained on three 3090s by Kearm on the new Dolphin 2.8 dataset by Eric Hartford (https://erichartford.com/dolphin) 🐬

This model uses our laser technique from https://github.com/cognitivecomputations/laserRMT to denoise the model!

For this version, we increased the number of epochs and refined the datasets used.

Example Outputs

TBD

Support my efforts! https://ko-fi.com/kearm

Original Model Card Below

TinyLlama-1.1B

https://github.com/jzhang38/TinyLlama

The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens. With proper optimization, we can achieve this within a span of "just" 90 days using 16 A100-40G GPUs 🚀🚀. Training started on 2023-09-01.
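
To give a sense of scale, the stated budget (3 trillion tokens, 90 days, 16 GPUs) implies a sustained throughput that can be worked out directly. The sketch below is plain arithmetic on those three figures; it assumes uninterrupted training for the full 90 days, which the text does not claim, and the function name `required_throughput` is hypothetical.

```python
TOTAL_TOKENS = 3e12   # 3 trillion tokens
TRAINING_DAYS = 90    # target duration in days
NUM_GPUS = 16         # A100-40G GPUs

SECONDS_PER_DAY = 24 * 60 * 60


def required_throughput(total_tokens, days, num_gpus):
    """Return (cluster tokens/s, per-GPU tokens/s) needed to finish on time."""
    seconds = days * SECONDS_PER_DAY
    cluster = total_tokens / seconds
    return cluster, cluster / num_gpus


cluster_tps, per_gpu_tps = required_throughput(TOTAL_TOKENS, TRAINING_DAYS, NUM_GPUS)
print(f"cluster: {cluster_tps:,.0f} tokens/s")   # cluster: 385,802 tokens/s
print(f"per GPU: {per_gpu_tps:,.0f} tokens/s")   # per GPU: 24,113 tokens/s
```

In other words, each GPU would need to process roughly 24 thousand tokens per second, around the clock, to meet the 90-day target.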

We adopted exactly the same architecture and tokenizer as Llama 2. This means TinyLlama can be used as a plug-and-play component in many open-source projects built upon Llama. TinyLlama is also compact, with only 1.1B parameters, which makes it suitable for the many applications that have a restricted computation and memory budget.

This Collection

This collection contains all checkpoints after the 1T fix. Branch name indicates the step and number of tokens seen.
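
As an illustration of how a step count and token count can be read from such a name, here is a small sketch. It assumes names follow the pattern of the checkpoint names listed in the Eval table below (for example `TinyLlama-1.1B-intermediate-step-480k-1007B`); actual branch names may be formatted differently, and the helper `parse_checkpoint_name` is hypothetical.

```python
import re

# Matches a trailing "step-<N>k-<amount><b|t>" suffix (assumed naming pattern).
_NAME_PATTERN = re.compile(r"step-(\d+)[kK]-(\d+(?:\.\d+)?)([bBtT])$")
_TOKEN_SCALE = {"b": 10**9, "t": 10**12}  # billions / trillions of tokens


def parse_checkpoint_name(name):
    """Return (training_steps, tokens_seen) parsed from a checkpoint name."""
    match = _NAME_PATTERN.search(name)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {name!r}")
    steps_k, amount, unit = match.groups()
    steps = int(steps_k) * 1000
    tokens = round(float(amount) * _TOKEN_SCALE[unit.lower()])
    return steps, tokens


print(parse_checkpoint_name("TinyLlama-1.1B-intermediate-step-480k-1007B"))
print(parse_checkpoint_name("TinyLlama-1.1B-intermediate-step-715k-1.5T"))
```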

Eval

| Model | Pretrain Tokens | HellaSwag | Obqa | WinoGrande | ARC_c | ARC_e | boolq | piqa | avg |
|---|---|---|---|---|---|---|---|---|---|
| Pythia-1.0B | 300B | 47.16 | 31.40 | 53.43 | 27.05 | 48.99 | 60.83 | 69.21 | 48.30 |
| TinyLlama-1.1B-intermediate-step-50K-104b | 103B | 43.50 | 29.80 | 53.28 | 24.32 | 44.91 | 59.66 | 67.30 | 46.11 |
| TinyLlama-1.1B-intermediate-step-240k-503b | 503B | 49.56 | 31.40 | 55.80 | 26.54 | 48.32 | 56.91 | 69.42 | 48.28 |
| TinyLlama-1.1B-intermediate-step-480k-1007B | 1007B | 52.54 | 33.40 | 55.96 | 27.82 | 52.36 | 59.54 | 69.91 | 50.22 |
| TinyLlama-1.1B-intermediate-step-715k-1.5T | 1.5T | 53.68 | 35.20 | 58.33 | 29.18 | 51.89 | 59.08 | 71.65 | 51.29 |
| TinyLlama-1.1B-intermediate-step-955k-2T | 2T | 54.63 | 33.40 | 56.83 | 28.07 | 54.67 | 63.21 | 70.67 | 51.64 |
| TinyLlama-1.1B-intermediate-step-1195k-2.5T | 2.5T | 58.96 | 34.40 | 58.72 | 31.91 | 56.78 | 63.21 | 73.07 | 53.86 |
| TinyLlama-1.1B-intermediate-step-1431k-3T | 3T | 59.20 | 36.00 | 59.12 | 30.12 | 55.25 | 57.83 | 73.29 | 52.99 |
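
The avg column appears to be the unweighted mean of the seven benchmark scores in each row. This is an inference from the numbers, not something the card states. The sketch below recomputes it for three rows copied from the table above; the helper `recompute_avg` is hypothetical.

```python
from statistics import mean

# model name -> (seven benchmark scores in table order, reported avg)
ROWS = {
    "Pythia-1.0B": (
        [47.16, 31.40, 53.43, 27.05, 48.99, 60.83, 69.21], 48.30),
    "TinyLlama-1.1B-intermediate-step-1195k-2.5T": (
        [58.96, 34.40, 58.72, 31.91, 56.78, 63.21, 73.07], 53.86),
    "TinyLlama-1.1B-intermediate-step-1431k-3T": (
        [59.20, 36.00, 59.12, 30.12, 55.25, 57.83, 73.29], 52.99),
}


def recompute_avg(scores):
    """Unweighted mean of the benchmark scores, rounded to two decimals."""
    return round(mean(scores), 2)


for name, (scores, reported) in ROWS.items():
    computed = recompute_avg(scores)
    note = "" if abs(computed - reported) < 0.005 else "  <- differs from table"
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}{note}")
```

The first two rows reproduce the reported value. For the 3T row, the mean of the listed scores comes to 52.97 rather than the reported 52.99; the source gives no explanation for the difference, so the table values are left as published.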


Safetensors
Model size: 1B params
Tensor type: BF16
