GENERator-v2 Prokaryote 3B – Atlas Fine-tuned

This model is a fine-tuned version of:

GenerTeam/GENERator-v2-prokaryote-3b-base

Model Details

  • Finetuned from: GenerTeam/GENERator-v2-prokaryote-3b-base
  • Architecture: Decoder-only Transformer (Causal LM)
  • Tokenization: 6-mer DNA tokenizer (custom, requires trust_remote_code)
  • Domain: Prokaryotic genomic sequences
  • Fine-tuning steps: 36,000
  • Dataset: Atlas (custom)

Usage

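from transformers import AutoTokenizer, AutoModelForCausalLM

# Replace the placeholder with the repository ID of this fine-tuned model.
model_id = "<your-username>/generator-v2-prokaryote-3b-atlas-ft"

# The 6-mer DNA tokenizer is custom code shipped with the repository, so
# trust_remote_code=True is required. It runs Python code from that
# repository, so review the code before enabling it.
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True
)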
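
Because the tokenizer works on 6-mers, a prompt whose length is not a multiple of 6 leaves a partial trailing chunk. The sketch below is one possible way to prepare a prompt and generate a continuation; it is not taken from the model's own code. The helper names `trim_to_kmer_multiple` and `generate_continuation` are hypothetical, and the choices to uppercase the input and to drop trailing bases are assumptions; check the base model card for its recommended preprocessing.

```python
def trim_to_kmer_multiple(sequence: str, k: int = 6) -> str:
    """Strip whitespace, uppercase, and drop trailing bases so len % k == 0.

    Hypothetical helper: dropping the trailing remainder is an assumption,
    not documented behavior of the model's tokenizer.
    """
    cleaned = "".join(sequence.split()).upper()
    usable = len(cleaned) // k * k
    return cleaned[:usable]


def generate_continuation(model, tokenizer, prompt: str, max_new_tokens: int = 32) -> str:
    """Continue a DNA prompt using an already-loaded causal LM and tokenizer."""
    trimmed = trim_to_kmer_multiple(prompt)
    if not trimmed:
        raise ValueError("prompt must contain at least 6 nucleotides")
    # Standard transformers calls: tokenize to PyTorch tensors, then generate.
    inputs = tokenizer(trimmed, return_tensors="pt")
    output_ids = model.generate(inputs["input_ids"], max_new_tokens=max_new_tokens)
    # For a decoder-only model the output holds the prompt plus new tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Pass the `model` and `tokenizer` objects from the loading code to `generate_continuation`, together with a DNA string of your own. `max_new_tokens` counts tokens, so with a 6-mer tokenizer each generated token would correspond to roughly six nucleotides.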

Citation

If you use the base model GENERator in your research, please cite the original paper:

@misc{wu2025generator,
  title        = {GENERator: A Long-Context Generative Genomic Foundation Model},
  author       = {Wei Wu and Qiuyi Li and Mingyang Li and Kun Fu and Fuli Feng and Jieping Ye and Hui Xiong and Zheng Wang},
  year         = {2025},
  eprint       = {2502.07272},
  archivePrefix= {arXiv},
  primaryClass = {cs.CL},
  url          = {https://arxiv.org/abs/2502.07272}
}