You need to agree to share your contact information to access this model

This repository is publicly accessible, but you have to accept the conditions to access its files and content.

Mistral-7B-WikiFineTuned

This project fine-tunes the Mistral-7B-Instruct model on the WikiText-103 dataset (the wikitext-103-raw-v1 configuration), which is drawn from Wikipedia articles. The goal is a model that generates accurate, informative text in coherent, well-structured language.

Model Description

Base Model: Mistral-7B
Fine-Tuned on: Wikitext-103-raw-v1
Purpose: To get as much benefit as possible from a short training run (125 steps in this demonstration) while keeping the generated text accurate, informative, coherent, and well structured.
License: MIT

How to Use

To use this model, you can load it with the Hugging Face transformers library. Below is a basic example of how to use the model for text generation:

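from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline

# The repository is gated: accept the access conditions on the model page and
# authenticate (for example with `huggingface-cli login`) before loading.
model_id = "Mesutby/mistral-7B-wikitext-finetuned"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load the model in 4-bit precision (requires bitsandbytes and a CUDA GPU).
# Passing load_in_4bit directly to from_pretrained is deprecated in recent
# transformers releases, so the setting goes through BitsAndBytesConfig.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)

# Create the pipeline
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Generate text
prompt = "The future of AI is"
output = generator(prompt, max_new_tokens=50)
print(output[0]["generated_text"])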

Inference API

You can also use the model directly via the Hugging Face Inference API:

import requests

API_URL = "https://api-inference.huggingface.co/models/Mesutby/mistral-7B-wikitext-finetuned"
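headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}  # replace with your own token

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    return response.json()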

output = query({"inputs": "The future of AI is"})
print(output)
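The request above sends only the prompt and prints the raw JSON. The sketch below shows one way to pass generation parameters and pull the text out of the response. It assumes the usual text-generation response shape (a list of objects with a generated_text field, or an object with an error field); build_payload and extract_text are hypothetical helper names, and the sample responses are made up for illustration.

```python
def build_payload(prompt, max_new_tokens=50, temperature=0.7):
    """Build a request body with generation parameters (values are illustrative)."""
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens, "temperature": temperature},
    }


def extract_text(result):
    """Return the generated text, or raise if the API reported an error."""
    if isinstance(result, dict) and "error" in result:
        raise RuntimeError(f"Inference API error: {result['error']}")
    if isinstance(result, list) and result and "generated_text" in result[0]:
        return result[0]["generated_text"]
    raise ValueError(f"Unexpected response shape: {result!r}")


# Made-up responses that mimic the assumed shapes; not real model output.
sample_ok = [{"generated_text": "The future of AI is <model continuation>"}]
sample_error = {"error": "Model is currently loading"}

print(build_payload("The future of AI is"))
print(extract_text(sample_ok))
try:
    extract_text(sample_error)
except RuntimeError as exc:
    print(exc)
```

With the query function defined earlier, a call would look like extract_text(query(build_payload("The future of AI is"))).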

Training Details

Framework Used: PyTorch
Optimization Techniques:
- 4-bit quantization using bitsandbytes to reduce memory usage.
- Parameter-efficient fine-tuning (LoRA) with peft, run with accelerate.
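To give a sense of why 4-bit quantization matters here, the sketch below estimates the memory needed just to hold the model weights at different precisions. The parameter count of about 7.24 billion is an approximation for Mistral-7B, not a figure from this card, and the estimate ignores layers that stay unquantized, quantization constants, activations, and optimizer state.

```python
N_PARAMS = 7.24e9  # approximate parameter count of Mistral-7B (assumption)


def weight_memory_gib(n_params, bits_per_param):
    """Memory needed to store the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 1024**3


for name, bits in [("float16", 16), ("4-bit", 4)]:
    print(f"{name}: {weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

On this rough estimate the 4-bit weights take about a quarter of the float16 footprint, which is what lets a 7B model fit on a single consumer GPU for LoRA training.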

Dataset

The model was fine-tuned on the Wikitext-103-raw-v1 dataset, split into training and evaluation subsets.
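The card does not say how the split was made, so the following is only a sketch of one common approach: drop blank rows, shuffle with a fixed seed, and hold out a fraction for evaluation. The function name, the 25% evaluation fraction, and the sample rows are all assumptions for illustration. In practice the rows would come from the datasets library, for example load_dataset("wikitext", "wikitext-103-raw-v1"), which also ships its own train, validation, and test splits.

```python
import random


def split_train_eval(texts, eval_fraction=0.1, seed=42):
    """Drop blank rows, shuffle deterministically, and split into (train, eval)."""
    cleaned = [t for t in texts if t.strip()]
    rng = random.Random(seed)
    rng.shuffle(cleaned)
    n_eval = max(1, int(len(cleaned) * eval_fraction))
    return cleaned[n_eval:], cleaned[:n_eval]


# Made-up rows standing in for raw dataset lines (headings, blanks, paragraphs).
rows = [" = Example Heading = \n", "", "First paragraph of text .\n", "",
        "Second paragraph of text .\n", " = = Subsection = = \n",
        "Third paragraph of text .\n", "Fourth paragraph of text .\n"]

train_texts, eval_texts = split_train_eval(rows, eval_fraction=0.25)
print(len(train_texts), "training rows,", len(eval_texts), "evaluation rows")
```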

Training Configuration

Learning Rate: 2e-4
Batch Size: 4 (with gradient accumulation)
Max Steps: 125 (for demonstration; should ideally be higher, e.g., 1000)
Optimizer: Paged AdamW (32-bit)
Evaluation Strategy: Evaluation every 25 steps
PEFT Configuration: LoRA with rank 8 (r=8) and dropout of 0.1
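Two numbers follow from this configuration: how many adapter weights LoRA actually trains, and how many examples the run sees. The sketch below works both out under stated assumptions. The card does not list the LoRA target modules or the gradient accumulation factor, so the choice of q_proj and v_proj and the accumulation value of 4 are hypothetical; the layer dimensions are those of the Mistral-7B architecture.

```python
def lora_params(d_in, d_out, rank):
    """Trainable weights LoRA adds to one d_out x d_in matrix: A (rank x d_in) plus B (d_out x rank)."""
    return rank * (d_in + d_out)


def examples_seen(batch_size, grad_accum_steps, max_steps):
    """Total training examples consumed over the run."""
    return batch_size * grad_accum_steps * max_steps


RANK = 8          # from the configuration above
HIDDEN = 4096     # Mistral-7B hidden size
KV_DIM = 1024     # 8 key/value heads x 128 head dimension
N_LAYERS = 32     # Mistral-7B decoder layers

# Assumption: adapters on q_proj (4096 -> 4096) and v_proj (4096 -> 1024) only.
per_layer = lora_params(HIDDEN, HIDDEN, RANK) + lora_params(HIDDEN, KV_DIM, RANK)
print(per_layer * N_LAYERS)        # 3407872 trainable adapter weights

# Assumption: gradient accumulation of 4 steps (the card gives no value).
print(examples_seen(4, 4, 125))    # 2000 examples
```

Under these assumptions the adapters hold about 3.4 million weights, a small fraction of the base model, and the 125-step run touches only a couple of thousand examples, which is consistent with the card describing it as a demonstration.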

Evaluation

The model was evaluated on a held-out subset of the Wikitext dataset every 25 steps during training. This card does not report the resulting metrics; they are logged during the training run.
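The card does not name the metrics that were logged. For language models trained on WikiText, a common choice is perplexity, which is the exponential of the mean cross-entropy loss per token. The sketch below shows that conversion; the loss value is a placeholder, not a measured result for this model.

```python
import math


def perplexity(mean_loss):
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(mean_loss)


placeholder_eval_loss = 2.0  # placeholder value, NOT a result from this model
print(f"loss {placeholder_eval_loss:.2f} -> perplexity {perplexity(placeholder_eval_loss):.2f}")
```

Lower is better: a perplexity of 1 would mean the model predicts every token with certainty.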

Limitations and Biases

While the model performs well on a variety of text generation tasks, it may still exhibit biases present in the training data. Users should be cautious when deploying this model in sensitive or high-stakes applications.

License

This model is licensed under the MIT License. See the LICENSE file for more details.

Contact

For any questions or issues, please contact bymuhammedmesut@gmail.com.
