Trendyol LLM v1.0
How to use Trendyol/Trendyol-LLM-7b-base-v1.0 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="Trendyol/Trendyol-LLM-7b-base-v1.0")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Trendyol/Trendyol-LLM-7b-base-v1.0")
model = AutoModelForCausalLM.from_pretrained("Trendyol/Trendyol-LLM-7b-base-v1.0")
messages = [
{"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
How to use Trendyol/Trendyol-LLM-7b-base-v1.0 with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Trendyol/Trendyol-LLM-7b-base-v1.0"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Trendyol/Trendyol-LLM-7b-base-v1.0",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Or run the model with Docker:
docker model run hf.co/Trendyol/Trendyol-LLM-7b-base-v1.0
How to use Trendyol/Trendyol-LLM-7b-base-v1.0 with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "Trendyol/Trendyol-LLM-7b-base-v1.0" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Trendyol/Trendyol-LLM-7b-base-v1.0",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Or start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "Trendyol/Trendyol-LLM-7b-base-v1.0" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Trendyol/Trendyol-LLM-7b-base-v1.0",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
How to use Trendyol/Trendyol-LLM-7b-base-v1.0 with Docker Model Runner:
docker model run hf.co/Trendyol/Trendyol-LLM-7b-base-v1.0
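The curl calls shown for the vLLM and SGLang servers can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_payload` and `chat`, the `max_tokens` default, and the 60-second timeout are illustrative choices rather than part of the model card, and the default `base_url` assumes the vLLM server from above is listening on port 8000 (use port 30000 for the SGLang examples).

```python
import json
import urllib.request

MODEL_ID = "Trendyol/Trendyol-LLM-7b-base-v1.0"


def build_chat_payload(prompt, model=MODEL_ID, max_tokens=64):
    """Build the JSON body for POST /v1/chat/completions (hypothetical helper)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt, base_url="http://localhost:8000"):
    """Send one user message to a running server and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]


# Usage, once a server is running:
#   print(chat("What is the capital of France?"))
```

Keeping payload construction separate from the network call makes the request body easy to inspect or log before anything is sent.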

Trendyol LLM v1.0 is a generative model based on the Mistral 7B model. This is the repository for the base model.
Model Developers: Trendyol
Variations: base, chat, and DPO.
Input: Models input text only.
Output: Models generate text only.
Model Architecture: Trendyol LLM v1.0 is an auto-regressive language model (based on Mistral 7B) that uses an optimized transformer architecture. The base version is fine-tuned on 10 billion tokens using LoRA.
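To give a rough sense of why LoRA fine-tuning is cheap, the sketch below counts the parameters LoRA trains for a single weight matrix. LoRA freezes the original matrix W (d_out x d_in) and learns two small matrices B (d_out x r) and A (r x d_in). The rank `r = 16` is a hypothetical value chosen for illustration, not the setting used for this model, and the 4096 x 4096 shape is simply one square projection at Mistral 7B's hidden size.

```python
def lora_param_count(d_in, d_out, rank):
    """Parameters LoRA trains for one matrix: B is d_out x rank, A is rank x d_in."""
    return rank * (d_in + d_out)


def full_param_count(d_in, d_out):
    """Parameters in the frozen weight matrix itself."""
    return d_in * d_out


hidden = 4096  # one square projection at Mistral 7B's hidden size (illustrative)
rank = 16      # hypothetical LoRA rank, NOT the value used for Trendyol LLM

lora = lora_param_count(hidden, hidden, rank)
full = full_param_count(hidden, hidden)
print(f"LoRA trains {lora:,} of {full:,} parameters ({lora / full:.2%})")
# prints: LoRA trains 131,072 of 16,777,216 parameters (0.78%)
```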

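from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

model_id = "Trendyol/Trendyol-LLM-7b-base-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load the weights in 8-bit (requires the bitsandbytes package) and place them automatically.
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             device_map="auto",
                                             quantization_config=BitsAndBytesConfig(load_in_8bit=True))
sampling_params = dict(do_sample=True, temperature=0.3, top_k=50, top_p=0.9)
# The model is already placed on its devices above, so no device_map is passed here.
pipe = pipeline("text-generation",
                model=model,
                tokenizer=tokenizer,
                max_new_tokens=1024,
                return_full_text=True,
                repetition_penalty=1.1
                )

def generate_output(user_query):
    # With return_full_text=True the result contains the prompt followed by the completion.
    outputs = pipe(user_query,
                   **sampling_params
                   )
    return outputs[0]["generated_text"]

user_query = "Ders çalışmanın en iyi 5 yolu:"  # Turkish for "The 5 best ways to study:"
response = generate_output(user_query)
print(response)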
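Because the pipeline above is created with `return_full_text=True`, the returned string begins with the prompt itself. If only the continuation is wanted, one option is to pass `return_full_text=False` to the pipeline; another is to trim the prompt afterwards. The helper below is a small sketch of the second approach; `strip_prompt` is a hypothetical name, and the example text is a placeholder, not real model output.

```python
def strip_prompt(prompt, generated_text):
    """Return only the continuation when the output starts with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    # Leave the text untouched if the prompt is not a literal prefix.
    return generated_text


user_prompt = "Ders çalışmanın en iyi 5 yolu:"
full_text = user_prompt + " <model continuation>"  # placeholder, not real output
print(strip_prompt(user_prompt, full_text))
# prints: <model continuation>
```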
Base model
mistralai/Mistral-7B-v0.1