gemma-4-26B-A4B-it-MXFP4_MOE StorageLLM MoE JUJU Runtime
- Original upstream model: https://huggingface.co/google/gemma-4-26B-A4B-it
- Quantized GGUF source repo: https://huggingface.co/unsloth/gemma-4-26B-A4B-it-GGUF
- Quantized GGUF source path: gemma-4-26B-A4B-it-MXFP4_MOE.gguf
- Runtime code: https://github.com/jujumelona/storage.llm
- Runtime/model-card license: MIT
This Hugging Face repo is a StorageLLM MoE/JUJU runtime model package. It is not a bucket that holds only perplexity (PPL) artifacts; PPL is run solely as a correctness validation step after the runtime package is generated.
What Users Must Download
Download the whole Hugging Face repo for normal use. Do not download only
*.juju; that gives the engine weights without the tokenizer, config, runtime
metadata, validation files, and performance planning sidecars.
hf download storagejuju/gemma-4-26b-a4b-it-mxfp4-moe-juju --local-dir <model_root>
If you must use include filters, include every pattern below:
hf download storagejuju/gemma-4-26b-a4b-it-mxfp4-moe-juju --local-dir <model_root> --include "*.juju" --include "*.juju.idx" --include "*.juju.verify.json" --include "verify/*.json" --include "runtime_assets_manifest.json" --include "storagellm_runtime_contract.json" --include "storagellm_performance_metadata_manifest.json" --include "metadata/**" --include "README.md" --include "config.json" --include "generation_config.json" --include "tokenizer.json" --include "tokenizer_config.json" --include "special_tokens_map.json" --include "added_tokens.json" --include "chat_template.jinja" --include "tokenizer.model" --include "sentencepiece.bpe.model" --include "tiktoken.model" --include "vocab.json" --include "merges.txt" --include "processor_config.json" --include "preprocessor_config.json" --include "image_processor_config.json" --include "feature_extractor.json" --include "video_preprocessor_config.json" --include "audio_config.json" --include "tokenization_*.py" --include "configuration_*.py" --include "modeling_*.py" --include "processing_*.py" --include "*_processor.py" --include "*_processing.py" --include "*_utils.py"
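Before running a filtered download, it can help to preview which files a set of include filters would keep. The sketch below assumes the filters are matched fnmatch-style against repo-relative paths, so `*` also spans `/`. `kept_files` and the sample listing are hypothetical names used only for illustration, and the pattern list is an abbreviated subset of the command above.

```python
from fnmatch import fnmatch

# Abbreviated subset of the include patterns from the command above;
# a real filtered download needs every pattern listed there.
INCLUDE_PATTERNS = [
    "*.juju", "*.juju.idx", "*.juju.verify.json", "verify/*.json",
    "runtime_assets_manifest.json", "storagellm_runtime_contract.json",
    "storagellm_performance_metadata_manifest.json", "metadata/**",
    "README.md", "config.json", "generation_config.json",
    "tokenizer.json", "tokenizer_config.json", "chat_template.jinja",
]


def kept_files(paths, patterns=INCLUDE_PATTERNS):
    """Return the repo-relative paths matched by at least one pattern."""
    return [p for p in paths if any(fnmatch(p, pat) for pat in patterns)]


# Hypothetical listing, for illustration only.
listing = [
    "model.juju",
    "model.juju.idx",
    "verify/model.juju.verify.json",
    "metadata/gguf/tensors.json",
    "notes/analysis.csv",
]
print(kept_files(listing))
```

A path that no pattern matches, such as the hypothetical `notes/analysis.csv`, is dropped. The same applies to any runtime asset whose pattern is left out of the command.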
The engine consumes structured sidecar JSON/YAML/TOML files during model_root load. They are
not decoration: config, runtime manifests, graph/priority/prefetch/residency,
QKV, offload policy, validation, and metadata JSON are merged into the runtime
metadata path so tokenizer, attention, router, rope, embedding, GraphIR, tensor
layout, KV, final norm, LM head, and planning code can see the same contract.
The runtime needs these groups:
JUJU package:
- <original_shard_stem>.juju
- <original_shard_stem>.juju.idx
- <original_shard_stem>.juju.verify.json
- runtime_assets_manifest.json
- storagellm_runtime_contract.json
Text/API assets:
- config.json
- generation_config.json
- tokenizer.json or tokenizer model file
- tokenizer_config.json
- chat_template.jinja
- special_tokens_map.json and added_tokens.json when present
Processor/custom-code assets when the model needs them:
- processor_config.json, preprocessor_config.json, image_processor_config.json
- feature_extractor.json, video_preprocessor_config.json, audio_config.json
- tokenization_*.py, configuration_*.py, modeling_*.py, processing_*.py, *_processor.py
Engine/runtime performance metadata:
- metadata/storagellm/*run_summary*.json
- metadata/gguf/*.json, metadata/safetensors/*.json, metadata/sidecar/*.json
- verify/*.juju.verify.json
- storagellm_performance_metadata_manifest.json
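After a download, the groups listed above can be checked against the local directory. This is a minimal sketch, not the engine's own validation: `missing_assets` and `REQUIRED_GLOBS` are hypothetical names, the choice of which entries count as mandatory is an assumption, and files are searched recursively because the exact location of each file under the model root is not specified here.

```python
from pathlib import Path

# Glob patterns taken from the groups listed above. Which entries are
# strictly mandatory is an assumption of this sketch, not a documented rule.
REQUIRED_GLOBS = {
    "JUJU package": [
        "*.juju", "*.juju.idx", "*.juju.verify.json",
        "runtime_assets_manifest.json", "storagellm_runtime_contract.json",
    ],
    "Text/API assets": ["config.json", "tokenizer_config.json"],
    "Performance metadata": ["storagellm_performance_metadata_manifest.json"],
}


def missing_assets(model_root):
    """Map each group name to the patterns with no match under model_root."""
    root = Path(model_root)
    missing = {}
    for group, patterns in REQUIRED_GLOBS.items():
        # rglob searches recursively, so files under verify/ also count.
        absent = [pat for pat in patterns if not any(root.rglob(pat))]
        if absent:
            missing[group] = absent
    return missing
```

An empty result means every pattern in the sketch matched at least one file. A non-empty result names the groups to re-download.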
Generated sidecar upload policy:
- Only README.md may be uploaded as Markdown; it is the Hugging Face model card. README.md is not runtime metadata and is never used as an engine contract.
- Runtime/performance sidecars are structured JSON/YAML/TOML only.
- Generated analysis/performance .md, .pdf, .txt, .csv, .html, and .ipynb files are blocked from upload.
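The upload policy above can be expressed as a small filter over generated sidecar paths. This is a sketch under stated assumptions rather than the uploader's actual code: `sidecar_upload_allowed` is a hypothetical name, `.yml` is assumed to count as YAML, and only a root-level README.md is treated as the model card. Weights, tokenizer files, and custom code are outside the scope of this check.

```python
from pathlib import PurePosixPath

# Structured sidecar formats named by the policy; ".yml" is an assumption.
STRUCTURED_SUFFIXES = {".json", ".yaml", ".yml", ".toml"}


def sidecar_upload_allowed(repo_path):
    """Apply the generated-sidecar upload policy to one repo-relative path."""
    path = PurePosixPath(repo_path)
    if str(path) == "README.md":
        return True  # the model card is the only Markdown file allowed
    # Everything else must be structured; .md, .pdf, .txt, .csv, .html,
    # and .ipynb fall through to False here.
    return path.suffix.lower() in STRUCTURED_SUFFIXES
```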
Runtime flags embedded in the JUJU contract: none.
Download note: download the full HF repo, not only *.juju, so the engine also
gets the tokenizer/config/chat-template/processor/custom-code assets and the
StorageLLM performance metadata sidecars. The notebook sends SOURCE_HF_TOKEN
when it is set; otherwise it uses the same HF_TOKEN used for upload. This keeps the
flow working if the source repo becomes gated, as long as the account behind the
token has been granted access.
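The token fallback described in the download note can be sketched as follows. `resolve_source_token` is a hypothetical helper, not the notebook's actual code, and treating an empty string as "not set" is an assumption.

```python
import os


def resolve_source_token(env=None):
    """Prefer SOURCE_HF_TOKEN; fall back to HF_TOKEN; None if neither is set."""
    env = os.environ if env is None else env
    # Empty strings are treated as "not set" (an assumption of this sketch).
    return env.get("SOURCE_HF_TOKEN") or env.get("HF_TOKEN") or None
```

Passing a plain dict instead of `os.environ` makes the fallback order easy to check without touching the real environment.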