---
license: apache-2.0
library_name: llama.cpp
pipeline_tag: text-generation
tags:
- gguf
- llama.cpp
- hermes-agent
- qwen3.5
- qwen
- tool-use
- local-agent
base_model:
- kai-os/carnice-v1-9b-hermes-agent-stage2-merged
---

![banner](./banner.png)

# Carnice-9b-GGUF

GGUF builds of `Carnice-9b`, a Hermes-Agent-specialized model built from `Qwen/Qwen3.5-9B` and trained specifically for the Hermes-Agent harness.

This repo contains three quantized variants:

- `Carnice-9b-Q4_K_M.gguf`
- `Carnice-9b-Q6_K.gguf`
- `Carnice-9b-Q8_0.gguf`

## Quantizations

| File | Quant | Size | Recommended use |
|---|---:|---:|---|
| `Carnice-9b-Q4_K_M.gguf` | 4-bit | 5.3 GB | fastest local testing |
| `Carnice-9b-Q6_K.gguf` | 6-bit | 6.9 GB | best quality/size balance |
| `Carnice-9b-Q8_0.gguf` | 8-bit | 8.9 GB | highest quality GGUF option |

## Source model

Merged source model:

- [`kai-os/carnice-v1-9b-hermes-agent-stage2-merged`](https://huggingface.co/kai-os/carnice-v1-9b-hermes-agent-stage2-merged)

## What it was trained for

Carnice-9b was trained specifically around Hermes-Agent behavior rather than generic chat polish. The training mixture emphasized:

- Hermes-native terminal/file/browser trajectories
- tool-oriented multi-turn agent behavior
- reasoning-repair data to recover general reasoning after the first Hermes-specific tuning pass
- a second Hermes refresh stage to pull the model back toward harness-native action formatting and tool usage

## llama.cpp

```bash
llama-cli -m Carnice-9b-Q6_K.gguf -p "Reply with exactly READY." -n 16
```

## Notes

These are GGUF exports of the merged standalone Carnice model, not PEFT adapters.
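
The file sizes listed under Quantizations can be turned into a simple selection rule. The sketch below picks the largest of the three files that fits a given memory budget. The helper name `pick_quant` and the 1500 MB default headroom (for KV cache and runtime overhead) are assumptions for illustration, not measured requirements; adjust the headroom for your context size.

```bash
#!/usr/bin/env bash
# Hypothetical helper: choose a Carnice-9b GGUF file for a memory budget in MB.
# File sizes (5.3 / 6.9 / 8.9 GB) are taken from the quantization table.
# The headroom default is an assumed value, not a measured one.
pick_quant() {
  local budget_mb=$1
  local headroom_mb=${2:-1500}
  local usable_mb=$(( budget_mb - headroom_mb ))

  if   (( usable_mb >= 8900 )); then echo "Carnice-9b-Q8_0.gguf"
  elif (( usable_mb >= 6900 )); then echo "Carnice-9b-Q6_K.gguf"
  elif (( usable_mb >= 5300 )); then echo "Carnice-9b-Q4_K_M.gguf"
  else
    # Nothing fits: report failure through the exit status.
    return 1
  fi
}
```

The result can be passed straight to the same `llama-cli` invocation, for example `llama-cli -m "$(pick_quant 9000)" -p "Reply with exactly READY." -n 16`.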