
Llama 3.1 8B Q4_K_M GGUF Model

Overview

This is a Q4_K_M quantization of the Llama 3.1 8B model, intended for efficient inference on both CPU and GPU. The model was quantized with llama.cpp, which lets it run in resource-constrained environments. Quantization reduces the model's memory footprint while preserving much of the base model's language generation capability.

The model was originally trained by Meta AI and has been adapted to the GGUF format for compatibility with llama.cpp.

Model Details

  • Base Model: meta-llama/Llama-3.1-8B
  • Quantization Type: Q4_K_M (4-bit k-quant, medium variant)
  • Model Size: 8B parameters
  • Format: GGUF (used for efficient loading in llama.cpp)
  • Intended Use: Text generation, inference on CPUs/GPUs with reduced memory constraints
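
The parameter count and quantization type listed above allow a rough estimate of how much weight storage the quantization saves. The sketch below is back-of-the-envelope arithmetic only: the figure of about 4.85 bits per weight for Q4_K_M is an assumed average (the real value depends on the tensor mix), and the helper name `estimate_weight_size_gb` is hypothetical.

```python
def estimate_weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 8.0e9     # "8B parameters" from the model details
FP16_BPW = 16.0      # unquantized half-precision weights
Q4_K_M_BPW = 4.85    # assumption: approximate average for Q4_K_M

fp16_gb = estimate_weight_size_gb(N_PARAMS, FP16_BPW)
q4_gb = estimate_weight_size_gb(N_PARAMS, Q4_K_M_BPW)

print(f"FP16 weights:   ~{fp16_gb:.1f} GB")
print(f"Q4_K_M weights: ~{q4_gb:.2f} GB")
print(f"Reduction:      ~{fp16_gb / q4_gb:.1f}x")
```

Under these assumptions the weights shrink from roughly 16 GB to under 5 GB. The estimate covers weights only; memory used at runtime also includes the KV cache and buffers, which grow with context length.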

Intended Use

The model is intended for text generation tasks and is optimized for efficient inference on both CPUs and GPUs, making it suitable for use in resource-constrained environments.
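
As one possible way to run text generation with this file, the sketch below uses the `llama-cpp-python` bindings for llama.cpp (installed separately, together with `huggingface_hub`). The glob pattern `*.gguf` is an assumption about the file name in this repository, and the helper names `load_kwargs` and `generate` are hypothetical. Because the base model is not instruction-tuned, the example uses plain completion, not a chat template.

```python
REPO_ID = "vhab10/llama_3.1_8b_Q4_K_M-gguf"
GGUF_PATTERN = "*.gguf"  # assumption: the repo holds a single .gguf file


def load_kwargs(n_ctx: int = 4096, n_gpu_layers: int = 0) -> dict:
    """Build keyword arguments for Llama.from_pretrained.

    n_gpu_layers=0 keeps all layers on the CPU; -1 offloads every layer
    to the GPU when llama-cpp-python was built with GPU support.
    """
    return {
        "repo_id": REPO_ID,
        "filename": GGUF_PATTERN,
        "n_ctx": n_ctx,
        "n_gpu_layers": n_gpu_layers,
        "verbose": False,
    }


def generate(prompt: str, max_tokens: int = 64, **overrides) -> str:
    """Download the model if needed, then return a text completion."""
    # Imported lazily so the module loads without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama.from_pretrained(**load_kwargs(**overrides))
    result = llm(prompt, max_tokens=max_tokens)
    return result["choices"][0]["text"]


# Example call (downloads several GB on first use):
# print(generate("The capital of France is", n_gpu_layers=-1))
```

Keeping the loader arguments in a small function makes it easy to switch between CPU-only and GPU-offloaded inference without touching the generation code.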

License

This model is licensed under the Apache 2.0 License.

