Instructions to use AnatoliiPotapov/T-lite-instruct-0.1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use AnatoliiPotapov/T-lite-instruct-0.1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="AnatoliiPotapov/T-lite-instruct-0.1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("AnatoliiPotapov/T-lite-instruct-0.1")
model = AutoModelForCausalLM.from_pretrained("AnatoliiPotapov/T-lite-instruct-0.1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use AnatoliiPotapov/T-lite-instruct-0.1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "AnatoliiPotapov/T-lite-instruct-0.1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AnatoliiPotapov/T-lite-instruct-0.1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
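
The same request can be issued from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on `localhost:8000`, and the helper names `build_chat_request` and `chat` are illustrative, not part of vLLM.

```python
import json
import urllib.request

# Assumed defaults, copied from the curl example (port 8000, same model ID).
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "AnatoliiPotapov/T-lite-instruct-0.1"


def build_chat_request(prompt, url=BASE_URL, model=MODEL):
    """Build (but do not send) a POST request with the same JSON body as the curl call."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, url=BASE_URL, model=MODEL, timeout=60):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, url=url, model=model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("What is the capital of France?")` should return the reply as a string. Pointing `url` at port 30000 would target an SGLang server instead.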

Use Docker

docker model run hf.co/AnatoliiPotapov/T-lite-instruct-0.1

SGLang

How to use AnatoliiPotapov/T-lite-instruct-0.1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "AnatoliiPotapov/T-lite-instruct-0.1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AnatoliiPotapov/T-lite-instruct-0.1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "AnatoliiPotapov/T-lite-instruct-0.1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AnatoliiPotapov/T-lite-instruct-0.1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use AnatoliiPotapov/T-lite-instruct-0.1 with Docker Model Runner:
```
docker model run hf.co/AnatoliiPotapov/T-lite-instruct-0.1
```

T-lite-instruct-0.1

🚨 T-lite is designed for further fine-tuning and is not intended as a ready-to-use conversational assistant. Users are advised to exercise caution and are responsible for any additional training and oversight required to ensure the model's responses meet acceptable ethical and safety standards. The responsibility for incorporating this model into industrial or commercial solutions lies entirely with those who choose to deploy it.

Description

T-lite-instruct-0.1 is an instruct version of the T-lite-0.1 model.

T-lite-instruct-0.1 was trained in bf16.

📚 Dataset

Contexts

For the instruction dataset, the contexts are obtained from:

Open-source English-language datasets (such as UltraFeedback, HelpSteer, and SHP)
Machine translations of English-language datasets
Synthetic grounded QA contexts generated from pre-training datasets

The translated contexts are filtered using classifiers.
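
The card does not say which classifiers are used or what they score. As a rough sketch of classifier-based filtering in general, the function below keeps texts whose score clears a threshold. The scoring function `cyrillic_ratio` is a toy stand-in, assuming the translation target is Russian, and the threshold is a hypothetical value.

```python
from typing import Callable, Iterable, List


def filter_translations(
    texts: Iterable[str],
    quality_score: Callable[[str], float],
    threshold: float = 0.5,  # hypothetical cut-off, not from the model card
) -> List[str]:
    """Keep only the texts whose classifier score reaches the threshold."""
    return [text for text in texts if quality_score(text) >= threshold]


def cyrillic_ratio(text: str) -> float:
    """Toy stand-in for a real classifier: share of Cyrillic letters among all letters."""
    letters = [ch.lower() for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cyrillic = sum(1 for ch in letters if "а" <= ch <= "я" or ch == "ё")
    return cyrillic / len(letters)


candidates = ["Привет, мир", "Hello world", "123"]
kept = filter_translations(candidates, cyrillic_ratio, threshold=0.8)
```

A production filter would swap `cyrillic_ratio` for a trained quality or language-identification model; only the thresholding pattern is meant to carry over.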

SFT

The responses to the contexts are generated by a strong model, and training is carried out exclusively on these responses. This avoids training the model on poor-quality translations.

Reward Modeling

The reward model (RM) is trained on the following preference pairs (preferred > rejected):

Strong Model > Our Model
Stronger Model > Weaker Model
Chosen Translated Response > Rejected Translated Response
Pairs from original English datasets
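
The card does not state the training objective for the RM. A common choice for pairs of this form is the Bradley-Terry pairwise loss, `-log(sigmoid(r_chosen - r_rejected))`; the sketch below assumes that objective and works on plain scalar rewards for illustration.

```python
import math


def pairwise_rm_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry loss -log(sigmoid(r_chosen - r_rejected)), computed stably."""
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss is `log(2)` when both responses receive the same reward and approaches zero as the chosen response is scored increasingly higher than the rejected one.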

The translated preference data are pre-filtered by an ensemble of RMs.
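
How the ensemble decides which pairs to keep is not described. One plausible scheme, shown here purely as an assumption, keeps a pair only when a sufficient share of ensemble members rank the chosen response above the rejected one. The names `filter_pairs` and `min_agreement` are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str, str]             # (context, chosen, rejected)
RewardFn = Callable[[str, str], float]  # (context, response) -> scalar reward


def filter_pairs(
    pairs: Sequence[Pair],
    ensemble: Sequence[RewardFn],
    min_agreement: float = 1.0,  # hypothetical: 1.0 means every member must agree
) -> List[Pair]:
    """Keep pairs where enough ensemble members score chosen above rejected."""
    if not ensemble:
        raise ValueError("ensemble must contain at least one reward function")
    kept = []
    for context, chosen, rejected in pairs:
        votes = sum(
            1 for rm in ensemble if rm(context, chosen) > rm(context, rejected)
        )
        if votes / len(ensemble) >= min_agreement:
            kept.append((context, chosen, rejected))
    return kept
```

Lowering `min_agreement` to 0.5 turns the rule into a majority vote, trading stricter filtering for more retained data.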

Preference tuning

Two stages were used in preference tuning:

Stage 1: SPiN on the responses of the teacher model (Strong Model > Our Model)
Stage 2: SLiC-HF using our RM
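
For orientation, the per-example objectives of the two methods can be sketched on scalar sequence log-probabilities. This follows the loss forms given in the SPIN and SLiC-HF papers, not the authors' training code; the default values of `beta`, `delta`, and `reg_weight` are hypothetical.

```python
import math


def spin_loss(logp_teacher, ref_logp_teacher, logp_self, ref_logp_self, beta=0.1):
    """SPIN-style logistic loss: prefer the teacher response over the model's own.

    logp_* are log-probabilities under the policy being trained;
    ref_logp_* are log-probabilities under the frozen previous-iteration model.
    """
    margin = beta * (
        (logp_teacher - ref_logp_teacher) - (logp_self - ref_logp_self)
    )
    # log(1 + exp(-margin)), written to avoid overflow for large |margin|.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def slic_loss(logp_chosen, logp_rejected, logp_ref_target, delta=1.0, reg_weight=0.5):
    """SLiC-HF-style loss: hinge on the log-prob gap plus a likelihood regularizer."""
    calibration = max(0.0, delta - logp_chosen + logp_rejected)
    regularization = -reg_weight * logp_ref_target
    return calibration + regularization
```

In Stage 1 the preferred response comes from the teacher model and the rejected one from the model itself; in Stage 2 the chosen and rejected labels would come from the RM's ranking of sampled responses.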

📊 Benchmarks

Here we present the results of T-lite-instruct-0.1 on automatic benchmarks.

🏆 MT-Bench

This benchmark was carefully translated into Russian and evaluated with the LLM Judge codebase, using gpt-4-1106-preview as the judge.

MT-Bench	Total	Turn_1	Turn_2	coding	humanities	math	reasoning	roleplay	stem	writing
T-lite-instruct-0.1	6.458	6.833	6.078	4.136	8.45	4.25	4.5	7.667	7.7	7.706
gpt3.5-turbo-0125	6.373	6.423	6.320	6.519	7.474	4.75	4.15	6.333	6.7	7.588
suzume-llama-3-8B-multilingual-orpo-borda-half	6.051	6.577	5.526	4.318	8.0	4.0	3.6	7.056	6.7	7.889
Qwen2-7b-Instruct	6.026	6.449	5.603	5.0	6.95	5.8	4.15	7.167	5.85	7.278
Llama-3-8b-Instruct	5.948	6.662	5.224	4.727	7.8	3.9	2.8	7.333	6.053	7.0
suzume-llama-3-8B-multilingual	5.808	6.167	5.449	5.409	6.4	5.05	3.8	6.556	5.0	7.056
saiga_llama3_8b	5.471	5.896	5.039	3.0	7.4	3.55	3.5	6.444	5.15	7.812
Mistral-7B-Instruct-v0.3	5.135	5.679	4.584	4.045	6.35	3.15	3.2	5.765	5.2	7.333

🏟️ Arena

We used the Russian version of the Arena benchmark from Vikhrmodels and the Arena Hard Auto codebase for evaluation. The baseline model was gpt3.5-turbo-0125, and the judge was gpt-4-1106-preview.

Arena General	Score	95% CI	Average Tokens
T-lite-instruct-0.1	57.26	-2.9/2	870
gpt3.5-turbo-0125	50	0/0	254
suzume-llama-3-8B-multilingual-orpo-borda-half	47.17	-2.6/2.4	735
Llama-3-8b-Instruct	42.16	-2.1/2.1	455
saiga_llama3_8b	39.88	-2.3/2.5	616
suzume-llama-3-8B-multilingual	38.25	-1.7/1.7	625
Qwen2-7b-Instruct	33.42	-1.9/2.2	365
Mistral-7B-Instruct-v0.3	28.11	-2/2.2	570

👨‍💻 Examples of usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

torch.manual_seed(42)  # fix the seed so sampled generations are repeatable

model_name = "t-bank-ai/T-lite-instruct-0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Prompt (Russian): "Write a recipe for a great pizza!"
messages = [
    {"role": "user", "content": "Напиши рецепт классной пиццы!"},
]

# Render the chat template and append the assistant header so the model starts its reply
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop at either the regular EOS token or the end-of-turn token
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))

Output:

Конечно, вот рецепт для вкусной домашней пиццы, который можно адаптировать под разные вкусы и предпочтения. Важно, чтобы тесто было мягким и воздушным, а начинка — сочной и ароматной.

### Ингредиенты для теста:
- 500 г муки (лучше использовать смесь пшеничной и цельнозерновой)
- 1 ч. л. сухих дрожжей (или 7 г свежих)
- 1 ч. л. сахара
- 1 ч. л. соли
- 1 ст. л. оливкового масла
- 300 мл тёплой воды
- 1 яйцо (для смазки)

### Ингредиенты для начинки (примерный набор):
- 200 г томатного соуса (можно сделать самому из свежих помидоров или использовать готовый)
- 200 г моцареллы, нарезанной ломтиками
- 100 г сыра пармезан (тертый)
- 100 г ветчины или колбасы
- 100 г грибов (шампин

Safetensors

Model size: 8B params
Tensor type: BF16