Transformers
GGUF
Not-For-All-Audiences
conversational

QuantFactory/L3.1-8B-komorebi-GGUF

This is a quantized version of crestf411/L3.1-8B-komorebi, created using llama.cpp.

Original Model Card

This model is the product of a multi-phase process: KTO fine-tuning, following the jondurbin gutenberg approach, yields 3 separate LoRAs, which are merged in sequence.
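As a rough illustration of what "merged in sequence" can mean at the weight level, here is a minimal sketch in plain Python. It is not the author's training or merge script: the matrices, ranks, and scale values are made-up toy numbers, and the helper names (`merge_lora`, `merge_in_sequence`) are hypothetical. It also assumes the usual LoRA formulation, where each adapter contributes a low-rank update `scale * (B @ A)` to a weight matrix.

```python
# Toy sketch: fold several LoRA updates into one weight matrix, in order.
# All shapes, values, and function names are illustrative assumptions.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(weight, lora_b, lora_a, scale=1.0):
    """Return weight + scale * (lora_b @ lora_a) without mutating the input."""
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

def merge_in_sequence(weight, loras):
    """Apply each (B, A, scale) adapter in order, starting from the previous result."""
    for lora_b, lora_a, scale in loras:
        weight = merge_lora(weight, lora_b, lora_a, scale)
    return weight

base = [[1.0, 0.0], [0.0, 1.0]]
loras = [
    ([[1.0], [0.0]], [[0.0, 1.0]], 0.5),   # phase 1: rank-1 adapter
    ([[0.0], [1.0]], [[1.0, 0.0]], 0.5),   # phase 2: rank-1 adapter
    ([[1.0], [1.0]], [[1.0, 1.0]], 0.25),  # phase 3: rank-1 adapter
]
merged = merge_in_sequence(base, loras)
```

In a multi-phase setup, each later adapter would presumably be trained on top of the model that already includes the earlier merges, which is why the order matters in practice even though the additions themselves commute. That reading is an assumption; the card does not spell it out.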

The resulting model exhibits a significant decrease in Llama 3.1 slop outputs.

Experimental. Please give feedback. Begone if you demand perfection.

I did most of my testing with temp 1.4, min-p 0.15, DRY 0.8. I also experimented with enabling XTC at threshold 0.1, probability 0.50.
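If you drive the model through an HTTP backend, the settings above could be collected into a request body along these lines. This is a sketch under assumptions: the field names follow what I understand the llama.cpp server's completion endpoint to accept (`temperature`, `min_p`, `dry_multiplier`, `xtc_threshold`, `xtc_probability`), so check them against your backend's documentation. Reading "DRY 0.8" as the DRY multiplier is also an assumption, and `build_completion_payload` is a hypothetical helper.

```python
import json

def build_completion_payload(prompt, use_xtc=False):
    """Collect the sampler settings from the card into a request body (field names assumed)."""
    payload = {
        "prompt": prompt,
        "temperature": 1.4,
        "min_p": 0.15,
        "dry_multiplier": 0.8,  # assumption: "DRY 0.8" refers to the DRY multiplier
    }
    if use_xtc:
        # XTC was only tried optionally, so it stays off unless requested.
        payload["xtc_threshold"] = 0.1
        payload["xtc_probability"] = 0.5
    return payload

# Serialize for whatever HTTP client you use; no request is sent here.
body = json.dumps(build_completion_payload("Once upon a time", use_xtc=True))
```

Keeping XTC behind a flag mirrors the card, which treats it as an experiment and not as part of the main test settings.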

As context grows, you may want to bump temp and min-p and maybe even DRY.
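To get a feel for how temperature and min-p pull against each other, here is a small self-contained sketch. It assumes temperature is applied before the min-p filter; sampler order differs between backends, and some apply min-p first, in which case temperature does not change which tokens survive. The logits are toy values and `surviving_tokens` is a hypothetical helper, not part of any library.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def surviving_tokens(logits, temperature, min_p):
    """Indices kept after temperature scaling, then min-p filtering (order assumed)."""
    probs = softmax([x / temperature for x in logits])
    cutoff = min_p * max(probs)  # min-p keeps tokens at least min_p times as likely as the top one
    return [i for i, p in enumerate(probs) if p >= cutoff]

toy_logits = [5.0, 4.0, 2.0, 0.0]
kept_default = surviving_tokens(toy_logits, temperature=1.4, min_p=0.15)
kept_hotter = surviving_tokens(toy_logits, temperature=2.0, min_p=0.15)
kept_hotter_stricter = surviving_tokens(toy_logits, temperature=2.0, min_p=0.25)
```

With these toy logits, raising the temperature lets a third token through the filter, and raising min-p alongside it prunes that token again. That is one reason to adjust the two together.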

Format: GGUF
Model size: 8B params
Architecture: llama

Available quantizations: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
