---
license: apache-2.0
language:
- en
base_model: Pageshift-Entertainment/pagestorm-research-preview-14b-full-book
base_model_relation: quantized
tags:
- gguf
- llama.cpp
- story-generation
- staged-generation
- full-book
- ministral3
pipeline_tag: text-generation
---

# PageStorm Research Preview 14B Full Book: Q8_0 GGUF

Q8_0 GGUF quantization of [Pageshift-Entertainment/pagestorm-research-preview-14b-full-book](https://huggingface.co/Pageshift-Entertainment/pagestorm-research-preview-14b-full-book), a `ministral3` model trained to produce a full novel from a single prompt via a staged generation pipeline.

## Files

- `pagestorm-research-preview-14b-full-book-Q8_0.gguf` (~14 GB)

## Requirements

- A llama.cpp build whose runtime supports the **`mistral3`** architecture (`llm_build_mistral3` / `LLM_ARCH_MISTRAL3`). Older builds will fail to load it.

## Notes

- Converted with `convert_hf_to_gguf.py --outtype q8_0`. The source `config.json` needed `original_max_position_embeddings` changed from `16384.0` to integer `16384` so the converter could write the int rope KV field.
- The model uses a **staged** protocol with custom role headers (`<|start_header_id|>…<|stop_header_id|>`) and `<|eot_id|>` as the stage stop token; it is not a plain chat model. See the base model card and its `story_stage_generation.py` for the prompt protocol.
- Native context is 262144; KV at that length is large, so quantize the KV cache (`--cache-type-k q8_0 --cache-type-v q8_0`) and/or cap `--ctx-size` to fit VRAM.

## Attribution

Base model © Pageshift Entertainment, Apache-2.0. This repo only redistributes a quantized copy of those weights.
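
## KV cache sizing sketch

The Notes section recommends quantizing the KV cache or capping `--ctx-size` because the cache grows linearly with context length. The sketch below is a back-of-the-envelope estimate, not a measurement. The layer count, KV head count, and head dimension in it are hypothetical placeholders and are not taken from this model, so read the real values from the GGUF metadata before relying on the numbers. The bytes-per-element figures follow the ggml storage formats: 2 bytes for `f16`, and 34 bytes per 32-element block for `q8_0`. Runtime overhead and compute buffers are ignored.

```python
# Rough KV cache size estimate for a transformer served by llama.cpp.
# ASSUMPTION: the shape values used in the demo are hypothetical placeholders,
# not values read from this model's GGUF metadata.

BYTES_PER_ELEM = {
    "f16": 2.0,       # one 16-bit float per element
    "q8_0": 34 / 32,  # 32 int8 values plus one f16 scale per block
}


def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, cache_type="f16"):
    """Return the approximate size in bytes of the K and V caches combined."""
    elems_per_tensor = n_ctx * n_kv_heads * head_dim  # one layer, K or V
    return 2 * n_layers * elems_per_tensor * BYTES_PER_ELEM[cache_type]


if __name__ == "__main__":
    # Hypothetical shape; replace with the values from the GGUF metadata.
    shape = dict(n_layers=40, n_kv_heads=8, head_dim=128)
    for n_ctx in (32768, 262144):
        for cache_type in ("f16", "q8_0"):
            gib = kv_cache_bytes(n_ctx, cache_type=cache_type, **shape) / 2**30
            print(f"ctx={n_ctx:>6} cache={cache_type:<4} ~{gib:.2f} GiB")
```

Under those placeholder values, the estimate shows why both levers matter: switching the cache type from `f16` to `q8_0` cuts the size by a little under half, while reducing `--ctx-size` scales it down proportionally.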