
Qwen2.5-7B Finetuned on Argus Dataset

This model is Qwen2.5-7B fine-tuned with LoRA (rank 128) on the Argus dataset.

Training Details

  • Base Model: Qwen/Qwen2.5-7B
  • Training Method: LoRA (rank=128, alpha=256)
  • Dataset: 27,997 text samples
  • Epochs: 2 (best checkpoint from epoch 1)
  • Batch Size: 16 (effective)
  • Learning Rate: 5e-5
  • Hardware: A100 GPU
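
To make the LoRA settings above concrete, the following sketch computes the scaling factor and the adapter size for a single weight matrix. It assumes the standard LoRA scaling of alpha / rank. The 3584 x 3584 matrix shape is an illustrative assumption: this card does not state the hidden size or which modules were adapted.

```python
# LoRA hyperparameters copied from the Training Details list
rank = 128
alpha = 256

# Assumption: standard LoRA scales the low-rank update B @ A by alpha / rank
scaling = alpha / rank


def lora_param_count(d_out: int, d_in: int, r: int) -> int:
    """Parameters in one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Assumption: one square 3584 x 3584 projection, chosen only for illustration
hidden = 3584
adapter_params = lora_param_count(hidden, hidden, rank)
full_params = hidden * hidden

print(f"scaling = {scaling}")
print(f"adapter params = {adapter_params:,}")
print(f"fraction of full matrix = {adapter_params / full_params:.3f}")
```

Under these assumptions the adapter trains a small fraction of the weights in each adapted matrix, while alpha = 2 x rank doubles the contribution of the low-rank update.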

Training Results

  • Epoch 1: Training Loss: 1.301, Validation Loss: 1.589 (best)
  • Epoch 2: Training Loss: 1.699, Validation Loss: 1.826
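
The "best" label follows from picking the epoch with the lowest validation loss. A minimal sketch of that selection, using only the numbers listed above (the dictionary layout is a hypothetical choice, not the author's training code):

```python
# Per-epoch losses copied from the Training Results list
results = [
    {"epoch": 1, "train_loss": 1.301, "val_loss": 1.589},
    {"epoch": 2, "train_loss": 1.699, "val_loss": 1.826},
]

# Keep the checkpoint whose validation loss is lowest
best = min(results, key=lambda r: r["val_loss"])

print(f"best epoch: {best['epoch']} (val_loss={best['val_loss']})")
```

Both losses rise in epoch 2, which is why the epoch 1 checkpoint is the one kept.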

Available Formats

  • PyTorch: Original model weights
  • GGUF: Multiple quantization levels available
    • Q8_0: Highest quality (7.5GB)
    • Q6_K: Very high quality (5.5GB)
    • Q5_K_M: High quality (4.8GB)
    • Q4_K_M: Good quality (3.8GB)
    • Q4_0: Acceptable quality (3.5GB)
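
One way to choose among these files is to take the largest quantization that fits the available memory. The sketch below uses the file sizes listed above; the `pick_quant` helper and its 1 GB headroom for context and runtime buffers are hypothetical assumptions, not measured requirements.

```python
from typing import Optional

# File sizes in GB, copied from the Available Formats list
GGUF_SIZES_GB = {
    "Q8_0": 7.5,
    "Q6_K": 5.5,
    "Q5_K_M": 4.8,
    "Q4_K_M": 3.8,
    "Q4_0": 3.5,
}


def pick_quant(budget_gb: float, headroom_gb: float = 1.0) -> Optional[str]:
    """Return the largest quantization that fits, or None if none fit.

    headroom_gb is an assumed allowance for context and runtime buffers.
    """
    fitting = {
        name: size
        for name, size in GGUF_SIZES_GB.items()
        if size + headroom_gb <= budget_gb
    }
    if not fitting:
        return None
    # Larger files correspond to higher-quality quantizations in this list
    return max(fitting, key=fitting.get)


for budget in (4.0, 6.0, 8.0, 16.0):
    print(budget, "GB ->", pick_quant(budget))
```

Actual memory use depends on context length and backend, so the headroom value should be adjusted to the target machine.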

Usage

With Transformers

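from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "UdayGattu23/qwen2.5-7b-finetuned-argus"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Tokenize a prompt, generate a completion, and decode it back to text
inputs = tokenizer("Your prompt here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))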

With llama.cpp (GGUF)

# Recent llama.cpp builds name this binary llama-cli (older builds: ./main)
./llama-cli -m qwen2.5-7b-finetuned-Q4_K_M.gguf -p "Your prompt here"

License

Apache 2.0

Model size: 8B params (Safetensors, tensor type F16)