OmniGene-4-CPT-v2-GGUF

GGUF builds of OmniGene-4-CPT-v2, the continued-pretraining (CPT) checkpoint of OmniGene-4.

The files here (an unquantized F16 file and a 4-bit Q4_K_M quantization) are intended for efficient inference on consumer GPUs and CPUs with llama.cpp, llama-cpp-python, Ollama, LM Studio, and other GGUF-compatible runtimes.

Available Quantizations

| Quantization | File | Size | RAM Required | Quality |
|---|---|---|---|---|
| F16 | OmniGene-4-CPT-v2-f16.gguf | 50.6 GB | ~52 GB | Best quality |
| Q4_K_M | OmniGene-4-CPT-v2-Q4_K_M.gguf | 16 GB | ~17 GB | Recommended balance |
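One way to fetch a single file rather than the whole repository is the `huggingface_hub` client. The sketch below is illustrative only: it assumes the repository ID is `dnagpt/OmniGene-4-CPT-v2-GGUF` and that the files sit at the repository root under the names listed above. The helper names `gguf_filename` and `download` are made up for this example.

```python
# Assumption: repository ID and root-level file layout; adjust if they differ.
REPO_ID = "dnagpt/OmniGene-4-CPT-v2-GGUF"

# File names as listed under "Available Quantizations".
FILES = {
    "F16": "OmniGene-4-CPT-v2-f16.gguf",
    "Q4_K_M": "OmniGene-4-CPT-v2-Q4_K_M.gguf",
}


def gguf_filename(quant: str) -> str:
    """Map a quantization name to its GGUF file name."""
    if quant not in FILES:
        raise ValueError(f"unknown quantization {quant!r}; choose from {sorted(FILES)}")
    return FILES[quant]


def download(quant: str = "Q4_K_M", local_dir: str = ".") -> str:
    """Download one GGUF file and return its local path (needs network access)."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import hf_hub_download  # pip install huggingface_hub

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=gguf_filename(quant),
        local_dir=local_dir,
    )
```

Calling `download("Q4_K_M")` should place the 16 GB file in the current directory, where the Quick Start commands expect to find it.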

Hardware Requirements

| Quantization | GPU (VRAM) | CPU-only (system RAM) |
|---|---|---|
| F16 | RTX A6000 (48 GB) | 64 GB+ |
| Q4_K_M | RTX 5090 (32 GB) / RTX 4090 (24 GB) / RTX 3090 (24 GB) | 32 GB+ |
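As a rough aid for choosing a file, the sketch below compares available memory against the "RAM Required" figures in the quantization table. It is a simplification: the figures are this card's estimates, extra memory for long contexts is not modeled, and the `pick_quantization` helper is hypothetical.

```python
from typing import Optional

# Approximate memory needs in GB, copied from the "RAM Required" column.
RAM_REQUIRED_GB = {
    "F16": 52,
    "Q4_K_M": 17,
}

# Ordered from best quality to smallest footprint.
PREFERENCE = ("F16", "Q4_K_M")


def pick_quantization(available_gb: float) -> Optional[str]:
    """Return the highest-quality quantization that fits, or None if none does."""
    for name in PREFERENCE:
        if RAM_REQUIRED_GB[name] <= available_gb:
            return name
    return None


if __name__ == "__main__":
    for mem in (64, 24, 8):
        print(f"{mem} GB available -> {pick_quantization(mem)}")
```

A 24 GB card maps to Q4_K_M under this rule, which matches the GPUs listed for that file.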

Quick Start

Option 1: llama-cpp-python

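Install the package from a shell. A default install may be CPU-only; GPU offload needs a build with GPU support (see the llama-cpp-python installation instructions).

pip install llama-cpp-python

Then load the model and generate from Python:

from llama_cpp import Llama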

llm = Llama(
    model_path="OmniGene-4-CPT-v2-Q4_K_M.gguf",
    n_ctx=4096,
    n_gpu_layers=-1,  # Offload all layers to GPU
)

output = llm("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEK", max_tokens=100)
print(output['choices'][0]['text'])
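Because the prompt is a raw amino acid sequence, it can help to normalize and validate input before calling the model. The sketch below assumes prompts are plain one-letter protein sequences using the 20 standard residues, as in the example above. The helpers `clean_protein` and `continue_sequence` and the temperature value are illustrative choices, not part of the model's documented interface.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_protein(seq: str) -> str:
    """Remove whitespace, uppercase, and reject non-standard residue letters."""
    cleaned = "".join(seq.split()).upper()
    if not cleaned:
        raise ValueError("empty sequence")
    unexpected = sorted(set(cleaned) - AMINO_ACIDS)
    if unexpected:
        raise ValueError(f"non-standard residue codes: {unexpected}")
    return cleaned


def continue_sequence(llm, seq: str, max_tokens: int = 100, temperature: float = 0.8) -> str:
    """Generate a continuation for a protein sequence.

    `llm` is expected to be a llama_cpp.Llama instance; any callable with the
    same call shape and return structure also works.
    """
    output = llm(clean_protein(seq), max_tokens=max_tokens, temperature=temperature)
    return output["choices"][0]["text"]
```

With the `llm` object from the snippet above, `continue_sequence(llm, "mktayiak qrqisfvk")` would send `MKTAYIAKQRQISFVK` as the prompt.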

Option 2: llama.cpp Command Line

# -ngl 99 requests GPU offload of all layers (values above the layer count are clamped)
./llama-cli -m OmniGene-4-CPT-v2-Q4_K_M.gguf -p "MKTAYIAKQRQISFVKSHFSRQLEERL" -n 100 -ngl 99

Option 3: Ollama

# Create Modelfile
cat > Modelfile <<EOF
FROM ./OmniGene-4-CPT-v2-Q4_K_M.gguf
EOF

ollama create omnigene-4-cpt -f Modelfile
ollama run omnigene-4-cpt
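Once the model has been created, it can also be queried programmatically through Ollama's local REST API. This is a minimal sketch, assuming a default Ollama install listening on `localhost:11434` and the model name `omnigene-4-cpt` from the commands above. Setting `raw` to true is a choice made here to pass the sequence without prompt templating. `build_request` and `generate` are hypothetical helper names.

```python
import json
import urllib.request

# Default local endpoint of the Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "omnigene-4-cpt", num_predict: int = 100):
    """Build a non-streaming generate request for the Ollama REST API."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # return a single JSON object instead of a stream
        "raw": True,      # send the prompt as-is, without a template
        "options": {"num_predict": num_predict},
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, **kwargs) -> str:
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read())["response"]
```

For example, `generate("MKTAYIAKQRQISFVKSHFSRQLEERL")` returns the continuation as a string when the Ollama server is running.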

Option 4: LM Studio

  1. Download OmniGene-4-CPT-v2-Q4_K_M.gguf
  2. Place it in the LM Studio models folder
  3. Load the model in LM Studio
  4. Start chatting

Model Description

OmniGene-4-CPT-v2 is a biological foundation model with:

  • Base: Gemma-4-26B-A4B-Instruct (MoE, 128 experts, top-8 routing)
  • Vocabulary: 290,048 tokens (262,020 original + 28,028 bio tokens)
  • CPT data: 32.5 GB mixed corpus (DNA, Protein, OpenWebText, Structure)
  • Training: 0.6 epochs (2,806 steps) on 8×H20 GPUs

Biological Tokens

The model includes 28,028 additional biological tokens:

  • DNA BPE: 20,000 tokens (optimized for genomic sequences)
  • Protein BPE: 8,000 tokens (optimized for amino acid sequences)
  • 3Di alphabet: 20 tokens (Foldseek structural alphabet)
  • DSSP: 8 tokens (secondary structure: H, E, C, etc.)
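The token counts above can be checked against the vocabulary size given in the model description. The short calculation below uses only figures stated in this card; the variable names are illustrative.

```python
# Added biological tokens, as listed above.
bio_tokens = {
    "DNA BPE": 20_000,
    "Protein BPE": 8_000,
    "3Di alphabet": 20,
    "DSSP": 8,
}

# Size of the original vocabulary, from the model description.
original_vocab = 262_020

bio_total = sum(bio_tokens.values())      # 28,028
total_vocab = original_vocab + bio_total  # 290,048
bio_share = bio_total / total_vocab       # fraction of the vocabulary that is biological

print(f"bio tokens: {bio_total:,}")
print(f"total vocabulary: {total_vocab:,}")
print(f"bio share: {bio_share:.1%}")
```

The four groups sum to 28,028, and together with the original vocabulary they give 290,048 tokens, so biological tokens make up about 9.7% of the vocabulary.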

Citation

@article{wang2026omnigene4,
  title={OmniGene-4: A Unified Bio-Language MoE Model with Router-Level Interpretability},
  author={Wang, Liang},
  journal={bioRxiv},
  year={2026}
}

Paper

Full paper: https://github.com/maris205/omnigene4

License

Apache 2.0

Contact

Liang Wang (wangliang.f@gmail.com)
School of Artificial Intelligence and Automation
Huazhong University of Science and Technology

GGUF Metadata

  • Model size: 25B params
  • Architecture: gemma4