metadata
license: apache-2.0
language:
  - en
base_model: Pageshift-Entertainment/pagestorm-research-preview-14b-full-book
base_model_relation: quantized
tags:
  - gguf
  - llama.cpp
  - story-generation
  - staged-generation
  - full-book
  - ministral3
pipeline_tag: text-generation

PageStorm Research Preview 14B Full Book — Q8_0 GGUF

Q8_0 GGUF quantization of Pageshift-Entertainment/pagestorm-research-preview-14b-full-book, a ministral3 model trained to produce a full novel from a single prompt via a staged generation pipeline.

Files

  • pagestorm-research-preview-14b-full-book-Q8_0.gguf (~14 GB)

Requirements

  • A llama.cpp build that supports the mistral3 architecture (llm_build_mistral3 / LLM_ARCH_MISTRAL3). Older builds fail to load the file.
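
Before loading, it can help to confirm which architecture the file itself declares. The following is a minimal sketch, not part of llama.cpp: it reads the `general.architecture` metadata key directly from the GGUF header using only the standard library. It assumes a little-endian GGUF v2 or v3 file; the helper names (`gguf_architecture`, `_skip_value`, `_read_string`) are hypothetical, and the expected value `mistral3` is inferred from the requirement above rather than verified here.

```python
import struct

# Byte sizes of GGUF's fixed-width metadata value types (type id -> size).
_FIXED = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
_STRING, _ARRAY = 8, 9


def _read_string(f):
    # GGUF strings are a uint64 length followed by UTF-8 bytes.
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8")


def _skip_value(f, vtype):
    # Advance the file position past one metadata value of the given type.
    if vtype in _FIXED:
        f.seek(_FIXED[vtype], 1)
    elif vtype == _STRING:
        (n,) = struct.unpack("<Q", f.read(8))
        f.seek(n, 1)
    elif vtype == _ARRAY:
        etype, count = struct.unpack("<IQ", f.read(12))
        if etype in _FIXED:
            f.seek(_FIXED[etype] * count, 1)
        else:
            for _ in range(count):
                _skip_value(f, etype)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


def gguf_architecture(path):
    """Return the general.architecture string from a GGUF file, or None."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        # Header: uint32 version, uint64 tensor count, uint64 metadata KV count.
        _version, _n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
        for _ in range(n_kv):
            key = _read_string(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            if key == "general.architecture" and vtype == _STRING:
                return _read_string(f)
            _skip_value(f, vtype)
    return None
```

Calling `gguf_architecture("pagestorm-research-preview-14b-full-book-Q8_0.gguf")` returns the declared architecture string. If a llama.cpp build rejects the file, comparing that string with the architectures the build knows about is a quick first check.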

Notes

  • Converted with convert_hf_to_gguf.py --outtype q8_0. In the source config.json, original_max_position_embeddings had to be changed from the float 16384.0 to the integer 16384 so that the converter could write the integer RoPE KV field.
  • The model uses a staged protocol with custom role headers (<|start_header_id|>…<|stop_header_id|>) and <|eot_id|> as the stage stop token, so it is not a plain chat model. See the base model card and its story_stage_generation.py for the prompt protocol.
  • Native context is 262144 tokens. The KV cache at that length is large, so quantize it (--cache-type-k q8_0 --cache-type-v q8_0) and/or cap --ctx-size to fit VRAM.
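
The config.json change described in the conversion note can be scripted. This is a minimal sketch under stated assumptions: the card does not say whether original_max_position_embeddings sits at the top level or inside a nested RoPE block, so the hypothetical helper `coerce_integral_floats` searches the whole document, and `patch_config` rewrites the file in place (work on a copy of the source checkpoint).

```python
import json


def coerce_integral_floats(node, key="original_max_position_embeddings"):
    """Recursively rewrite `key` from an integral float (16384.0) to an int (16384)."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key and isinstance(v, float) and v.is_integer():
                node[k] = int(v)  # replaces a value only, so the dict size is unchanged
            else:
                coerce_integral_floats(v, key)
    elif isinstance(node, list):
        for item in node:
            coerce_integral_floats(item, key)
    return node


def patch_config(path):
    """Load config.json, coerce the field wherever it occurs, and write it back."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)
    coerce_integral_floats(config)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
        f.write("\n")
```

Other float fields, such as a RoPE scaling factor, are left untouched because only the named key is coerced.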
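
To judge how far --ctx-size needs to be capped, a rough KV cache estimate is useful. The sketch below is a back-of-the-envelope calculation, not llama.cpp's own accounting: it assumes every layer caches full-length K and V for a single sequence and ignores compute buffers. The layer count, KV head count, and head dimension in the demo are hypothetical placeholders, not values from this card; read the real ones from the model's config.json or GGUF metadata.

```python
# Bytes per cached element for two llama.cpp cache types.
# q8_0 stores blocks of 32 int8 values plus one fp16 scale: 34 bytes per 32 values.
BYTES_PER_ELEM = {"f16": 2.0, "q8_0": 34 / 32}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, cache_type="f16"):
    """Approximate K+V cache size in bytes for one sequence of n_ctx tokens."""
    per_token = 2 * n_layers * n_kv_heads * head_dim  # K and V elements per token
    return per_token * n_ctx * BYTES_PER_ELEM[cache_type]


if __name__ == "__main__":
    # HYPOTHETICAL shape values for illustration only.
    shape = dict(n_layers=40, n_kv_heads=8, head_dim=128)
    for n_ctx in (32768, 262144):
        for cache_type in ("f16", "q8_0"):
            gib = kv_cache_bytes(n_ctx=n_ctx, cache_type=cache_type, **shape) / 2**30
            print(f"ctx={n_ctx:>6} {cache_type:>4}: {gib:.2f} GiB")
```

With these placeholder values, the full 262144-token cache comes to 40 GiB at f16 and 21.25 GiB at q8_0, on top of the roughly 14 GB of weights, which is why both quantizing the cache and capping the context are suggested.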

Attribution

Base model © Pageshift Entertainment, Apache-2.0. This repo only redistributes a quantized copy of those weights.