---
license: apache-2.0
language:
- en
base_model: Pageshift-Entertainment/pagestorm-research-preview-14b-full-book
base_model_relation: quantized
tags:
- gguf
- llama.cpp
- story-generation
- staged-generation
- full-book
- ministral3
pipeline_tag: text-generation
---

# PageStorm Research Preview 14B Full Book — Q8_0 GGUF

Q8_0 GGUF quantization of
[Pageshift-Entertainment/pagestorm-research-preview-14b-full-book](https://huggingface.co/Pageshift-Entertainment/pagestorm-research-preview-14b-full-book),
a `ministral3` model trained to produce a full novel from a single prompt via a
staged generation pipeline.

## Files
- `pagestorm-research-preview-14b-full-book-Q8_0.gguf` (~14 GB)

## Requirements
- A llama.cpp build whose runtime supports the **`mistral3`** architecture
  (`llm_build_mistral3` / `LLM_ARCH_MISTRAL3`). Older builds will fail to load it.
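
To see which architecture a GGUF file declares before trying to load it, the `general.architecture` metadata key can be read straight from the file header. The following is a minimal sketch, assuming a little-endian GGUF file of format version 2 or later; `read_gguf_architecture` and its helpers are illustrative names, not part of llama.cpp.

```python
import struct

# Byte widths of GGUF scalar value types (type id -> size in bytes).
_SCALAR_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9


def _read_string(f):
    """Read a GGUF string: uint64 length followed by UTF-8 bytes."""
    (length,) = struct.unpack("<Q", f.read(8))
    return f.read(length).decode("utf-8")


def _skip_value(f, vtype):
    """Advance the file position past one metadata value of the given type."""
    if vtype in _SCALAR_SIZES:
        f.seek(_SCALAR_SIZES[vtype], 1)
    elif vtype == _STRING:
        (length,) = struct.unpack("<Q", f.read(8))
        f.seek(length, 1)
    elif vtype == _ARRAY:
        # Array header: uint32 element type, uint64 element count.
        elem_type, count = struct.unpack("<IQ", f.read(12))
        for _ in range(count):
            _skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


def read_gguf_architecture(path):
    """Return the `general.architecture` string, or None if the key is absent."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        # Assumes GGUF v2+: uint32 version, uint64 tensor count, uint64 KV count.
        _version, _tensor_count, kv_count = struct.unpack("<IQQ", f.read(20))
        for _ in range(kv_count):
            key = _read_string(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            if key == "general.architecture" and vtype == _STRING:
                return _read_string(f)
            _skip_value(f, vtype)
    return None
```

Calling `read_gguf_architecture("pagestorm-research-preview-14b-full-book-Q8_0.gguf")` should return the architecture name the file was written with, which per this card is expected to be `mistral3`. This only reports what the file declares; whether a given llama.cpp build can load it still depends on that build.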

## Notes
- Converted with `convert_hf_to_gguf.py --outtype q8_0`. In the source
  `config.json`, `original_max_position_embeddings` had to be changed from the
  float `16384.0` to the integer `16384` so the converter could write it as an
  integer rope KV field.
- The model uses a **staged** protocol with custom role headers
  (`<|start_header_id|>…<|stop_header_id|>`) and `<|eot_id|>` as the stage stop
  token, so it is not a plain chat model. See the base model card and its
  `story_stage_generation.py` for the prompt protocol.
- Native context is 262,144 tokens. The KV cache at that length is large, so
  quantize it (`--cache-type-k q8_0 --cache-type-v q8_0`) and/or cap `--ctx-size`
  to fit VRAM.
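
The `config.json` change described in the first note can be scripted rather than made by hand. The following is a minimal sketch, assuming the key may sit at any nesting depth in the config (its exact location is not stated here); `coerce_integral_float` and `patch_config` are hypothetical helper names, not part of the converter.

```python
import json

KEY = "original_max_position_embeddings"


def coerce_integral_float(node, key=KEY):
    """Recursively turn whole-number floats stored under `key` into ints.

    Returns the number of values changed. Other fields are left untouched.
    """
    changed = 0
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key and isinstance(v, float) and v.is_integer():
                node[k] = int(v)  # e.g. 16384.0 -> 16384
                changed += 1
            else:
                changed += coerce_integral_float(v, key)
    elif isinstance(node, list):
        for item in node:
            changed += coerce_integral_float(item, key)
    return changed


def patch_config(path):
    """Rewrite the JSON file at `path` in place if any value was coerced."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)
    changed = coerce_integral_float(config)
    if changed:
        with open(path, "w", encoding="utf-8") as f:
            json.dump(config, f, indent=2)
            f.write("\n")
    return changed
```

Running `patch_config("config.json")` in a local copy of the base model directory before conversion would apply the fix. Note that the file is rewritten with two-space indentation, so keep a backup if the original formatting matters.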

## Attribution
Base model © Pageshift Entertainment, Apache-2.0. This repo only redistributes a
quantized copy of those weights.