	---
	title: Gemma4 Coder GGUF Chat
	emoji: "💬"
	colorFrom: blue
	colorTo: green
	sdk: docker
	app_file: app.py
	app_port: 7860
	pinned: false
	models:
	- yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF
	tags:
	- llama.cpp
	- gguf
	- gemma4
	- coding
	- cpu
	---

	# Gemma4 12B Coder GGUF Chat

	A Docker-based Hugging Face Spaces chatbot for:

	- Model: `yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF`
	- Default quant: `gemma4-coding-Q4_K_M.gguf`
	- Backend: prebuilt `llama.cpp` `llama-server`
	- UI: native `llama.cpp` web UI
	- Target: testing Gemma4 Coder on HF Spaces CPU

	## Why Q4 by default?

	`gemma4-coding-Q2_K.gguf` is smaller and faster, but on CPU it can produce garbled output that reads like a made-up language. This Space therefore defaults to `gemma4-coding-Q4_K_M.gguf`, which is slower than Q2 but more coherent, making it the safer choice when the goal is a usable chatbot.

	## Default settings

	```text
	MODEL_REPO=yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF
	MODEL_FILE=gemma4-coding-Q4_K_M.gguf
	LLAMA_VERSION=b9592
	THREADS=4
	CTX_SIZE=2048
	BATCH_SIZE=default
	UBATCH_SIZE=default
	FLASH_ATTN=default
	CACHE_TYPE_K=default
	CACHE_TYPE_V=default
	TEMPERATURE=0.2
	TOP_P=0.95
	TOP_K=64
	REPEAT_PENALTY=1.08
	```

	The launcher downloads the GGUF into `/data`, fetches the model chat template from Hugging Face metadata, then hands the process over to `llama-server` on port `7860`.
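
As a rough illustration of that startup sequence, here is a minimal sketch rather than the actual `app.py`. It assumes the `huggingface_hub` package is installed and that `llama-server` is on `PATH`. The helper names (`model_path`, `ensure_model`, `server_argv`, `main`) are hypothetical, and the chat-template step is left out because this README does not describe how it is done.

```python
import os


def model_path(model_file: str, data_dir: str = "/data") -> str:
    """Return where the GGUF is expected to live inside the data directory."""
    return os.path.join(data_dir, model_file)


def ensure_model(repo: str, model_file: str, data_dir: str = "/data") -> str:
    """Download the GGUF if it is missing; reuse the existing copy otherwise."""
    target = model_path(model_file, data_dir)
    if not os.path.exists(target):
        # Imported lazily so the pure helpers work without the package installed.
        from huggingface_hub import hf_hub_download

        target = hf_hub_download(repo_id=repo, filename=model_file, local_dir=data_dir)
    return target


def server_argv(model: str, port: int = 7860) -> list[str]:
    """Build a minimal llama-server command line for the given model file."""
    return ["llama-server", "-m", model, "--host", "0.0.0.0", "--port", str(port)]


def main() -> None:
    repo = os.environ.get(
        "MODEL_REPO", "yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF"
    )
    model_file = os.environ.get("MODEL_FILE", "gemma4-coding-Q4_K_M.gguf")
    argv = server_argv(ensure_model(repo, model_file))
    # Replace the Python process so llama-server owns the port and receives signals.
    os.execvp(argv[0], argv)
```

Calling `main()` at the end of the script would start the server. Using `os.execvp` instead of a subprocess is one way to "hand the process over", since no Python parent stays alive alongside the server.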

	`default` means the launcher does not pass that flag, so native `llama.cpp` picks its own optimized default. This is closer to the fast reference Space and avoids CPU overhead from experimental KV-cache quantization or tiny batch settings.
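
The `default` convention could be implemented along these lines. This is a sketch under assumptions, not the Space's real code: the `OPTIONAL_FLAGS` mapping and the `optional_args` helper are hypothetical names, and the value-taking form of `--flash-attn` is assumed to match the pinned `llama.cpp` build.

```python
# Environment variable name -> llama-server flag (mapping assumed for illustration).
OPTIONAL_FLAGS = {
    "THREADS": "--threads",
    "CTX_SIZE": "--ctx-size",
    "BATCH_SIZE": "--batch-size",
    "UBATCH_SIZE": "--ubatch-size",
    # Recent llama.cpp builds accept a value here; older ones used a bare switch.
    "FLASH_ATTN": "--flash-attn",
    "CACHE_TYPE_K": "--cache-type-k",
    "CACHE_TYPE_V": "--cache-type-v",
}


def optional_args(env: dict[str, str]) -> list[str]:
    """Return flag/value pairs, skipping every setting left at 'default'."""
    args: list[str] = []
    for name, flag in OPTIONAL_FLAGS.items():
        value = env.get(name, "default")
        if value == "default":
            # Omit the flag entirely so llama-server applies its own default.
            continue
        args += [flag, value]
    return args
```

With the default settings listed above, only `THREADS` and `CTX_SIZE` would reach the command line; passing `os.environ` as `env` would read the live configuration.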

	## If you want to compare Q2

	Change this environment variable back:

	```text
	MODEL_FILE=gemma4-coding-Q2_K.gguf
	```

	Q2 starts and responds faster, but the output may be incoherent.

	## Upload

	Upload these files to the root of a Docker Space:

	- `Dockerfile`
	- `app.py`
	- `README.md`