markldn committed on
Commit 0fe6443 · verified · 1 Parent(s): 68a52fc

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +44 -0
README.md ADDED
@@ -0,0 +1,44 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ base_model: Pageshift-Entertainment/pagestorm-research-preview-14b-full-book
+ base_model_relation: quantized
+ tags:
+ - gguf
+ - llama.cpp
+ - story-generation
+ - staged-generation
+ - full-book
+ - ministral3
+ pipeline_tag: text-generation
+ ---
+
+ # PageStorm Research Preview 14B Full Book — Q8_0 GGUF
+
+ Q8_0 GGUF quantization of
+ [Pageshift-Entertainment/pagestorm-research-preview-14b-full-book](https://huggingface.co/Pageshift-Entertainment/pagestorm-research-preview-14b-full-book),
+ a `ministral3` model trained to produce a full novel from a single prompt via a
+ staged generation pipeline.
+
+ ## Files
+ - `pagestorm-research-preview-14b-full-book-Q8_0.gguf` (~14 GB)
+
+ ## Requirements
+ - A llama.cpp build that supports the **`mistral3`** architecture
+ (`llm_build_mistral3` / `LLM_ARCH_MISTRAL3`). Older builds will fail to load it.
+
+ ## Notes
+ - Converted with `convert_hf_to_gguf.py --outtype q8_0`. The source `config.json`
+ needed `original_max_position_embeddings` changed from the float `16384.0` to the
+ integer `16384` so that the converter could write the integer RoPE metadata (key-value) field.
+ - The model uses a **staged** protocol with custom role headers
+ (`<|start_header_id|>…<|stop_header_id|>`) and `<|eot_id|>` as the stage stop
+ token, so it is not a plain chat model. See the base model card and its
+ `story_stage_generation.py` for the prompt protocol.
+ - Native context is 262144 tokens. The KV cache at that length is large, so quantize it
+ (`--cache-type-k q8_0 --cache-type-v q8_0`) and/or cap `--ctx-size` to fit VRAM.
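The `config.json` fix described in the first note can be scripted before running the converter. The following is a minimal sketch, not the uploader's actual procedure: the helper names `coerce_integral_float` and `patch_config_file` are hypothetical, and because the README does not say where `original_max_position_embeddings` sits in the file, the sketch searches nested blocks as well as the top level.

```python
import json
from pathlib import Path


def coerce_integral_float(config: dict, key: str = "original_max_position_embeddings") -> bool:
    """Rewrite `key` from an integral float (16384.0) to an int (16384).

    Searches nested dictionaries too, since the key's location is an assumption.
    Returns True if at least one value was changed.
    """
    changed = False
    for name, value in config.items():
        if isinstance(value, dict):
            # Recurse first so a nested hit is never skipped by short-circuiting.
            changed = coerce_integral_float(value, key) or changed
        elif name == key and isinstance(value, float) and value.is_integer():
            config[name] = int(value)
            changed = True
    return changed


def patch_config_file(path: str) -> bool:
    """Patch a config.json in place; returns True if the file was rewritten."""
    config_path = Path(path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    changed = coerce_integral_float(config)
    if changed:
        config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return changed
```

Calling `patch_config_file("config.json")` inside a local copy of the source model directory leaves every other field untouched. Only values that are already whole numbers are converted, so a genuinely fractional setting would be left alone rather than silently truncated.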
+
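To make the KV-cache note concrete, the cache size can be estimated from the attention shape. This is a rough sketch under stated assumptions: every layer keeps a full-length cache (no sliding window), K and V use the same cache type, and the layer count, KV-head count, and head dimension below are placeholder values rather than figures taken from this model. Read the real ones from the base model's `config.json`. The function name `kv_cache_bytes` is hypothetical.

```python
# Bytes per cached element for two llama.cpp cache types.
# f16 stores 2 bytes per value; q8_0 stores blocks of 32 int8 values plus one
# f16 scale, which is 34 bytes per 32 values.
BYTES_PER_ELEMENT = {"f16": 2.0, "q8_0": 34 / 32}


def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, cache_type="f16"):
    """Estimate KV-cache size in bytes for a standard attention stack."""
    # One K vector and one V vector per layer and KV head for every token.
    elements_per_token = 2 * n_layers * n_kv_heads * head_dim
    return int(n_ctx * elements_per_token * BYTES_PER_ELEMENT[cache_type])


GIB = 1024 ** 3
# Placeholder shape for illustration only; not read from this model.
shape = dict(n_layers=40, n_kv_heads=8, head_dim=128)

for n_ctx in (262144, 32768):
    for cache_type in ("f16", "q8_0"):
        size = kv_cache_bytes(n_ctx, cache_type=cache_type, **shape)
        print(f"ctx={n_ctx:>6} {cache_type:>4}: {size / GIB:.2f} GiB")
```

With the placeholder shape, the full 262144-token context comes to 40 GiB at `f16` and 21.25 GiB at `q8_0`, which is why capping `--ctx-size` usually matters more than the cache type alone.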
+ ## Attribution
+ Base model © Pageshift Entertainment, Apache-2.0. This repo only redistributes a
+ quantized copy of those weights.