Model Card for Z1 1B Hybrid RTX

We are excited to introduce the Z1 family of models! These models are based on the OLMo 2 1B architecture developed by the Allen Institute for AI. Starting from the OLMo 2 1B pre-training checkpoint, we performed continued pre-training (i.e., mid-training) to produce Z1 1B Hybrid, using the same dataset as OLMo 2 1B (dolmino-mix-1124).

What is unusual about the Z1 models is that the continued pre-training was performed on Zettafleet’s AI Training Platform on 8 NVIDIA GPUs in a fully decentralized way, without high-bandwidth, short-range interconnects (such as NVLink) between the accelerators. See our blog post for further details.

We release the following models as part of the Z1 family:

The post-training pipeline was reconstructed following instructions provided by engineers and researchers at the Allen Institute for AI.

The Z1 family of models shares the same architecture:

Size            Layers   Hidden Size   Attention Heads   Context Length
z1-1b-hybrid*   16       2048          16                4096
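The figures in the table can be captured in a small configuration object. The sketch below is illustrative only: the class name `Z1Architecture` is hypothetical, and the per-head dimension is derived under the assumption that the hidden size is split evenly across attention heads, as in standard multi-head attention (the card does not state this).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Z1Architecture:
    """Hypothetical container for the values listed in the architecture table."""

    num_layers: int = 16
    hidden_size: int = 2048
    num_attention_heads: int = 16
    context_length: int = 4096

    @property
    def head_dim(self) -> int:
        # Assumption: the hidden size is divided evenly across heads.
        if self.hidden_size % self.num_attention_heads != 0:
            raise ValueError("hidden_size must be divisible by num_attention_heads")
        return self.hidden_size // self.num_attention_heads


arch = Z1Architecture()
print(f"per-head dimension: {arch.head_dim}")
```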

Using the Model

Z1 1B Hybrid is supported in transformers v4.48 or higher:

pip install "transformers>=4.48"
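To confirm from Python that the installed version meets this requirement, a check along the following lines can help. It is a minimal sketch using only the standard library; `parse_release` and `meets_minimum` are hypothetical helper names, and the parser compares only the leading numeric components, so a pre-release such as 4.48.0.dev0 is treated like 4.48.0.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string: str) -> tuple:
    """Return the leading numeric components, e.g. '4.48.0.dev0' -> (4, 48, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"cannot parse version: {version_string!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str = "4.48") -> bool:
    """True if the installed release is at least the minimum release."""
    return parse_release(installed) >= parse_release(minimum)


try:
    installed = version("transformers")
    print(f"transformers {installed}: meets minimum = {meets_minimum(installed)}")
except PackageNotFoundError:
    print("transformers is not installed")
```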

You can use Z1 1B Hybrid in your Python code as follows:

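from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model weights and the matching tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("zettafleet/z1-1b-hybrid")
tokenizer = AutoTokenizer.from_pretrained("zettafleet/z1-1b-hybrid")

# Tokenize a batch containing a single prompt.
message = ["Language modeling is "]
inputs = tokenizer(message, return_tensors="pt", return_token_type_ids=False)

# Sample up to 100 new tokens with top-k and nucleus (top-p) sampling.
response = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)

# Decode the first (and only) sequence in the batch.
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
# Example output (sampling is random, so results vary):
# 'Language modeling is a key component of any text-based application, but its effectiveness...'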
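The `top_k=50` and `top_p=0.95` arguments above restrict which tokens can be sampled at each step. The sketch below illustrates the idea on a toy probability list in pure Python. It is a simplified illustration rather than the transformers implementation, and `top_k_top_p_filter` is a hypothetical helper name.

```python
def top_k_top_p_filter(probs, top_k=50, top_p=0.95):
    """Keep the top_k most likely tokens, then the smallest prefix of those
    whose renormalized cumulative probability reaches top_p.

    Returns a dict mapping token index -> renormalized probability.
    """
    # Rank token indices from most to least likely and apply the top-k cut.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    top_k_mass = sum(probs[i] for i in order)

    # Walk down the ranking until the cumulative mass reaches top_p.
    kept = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i] / top_k_mass
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


toy_probs = [0.5, 0.3, 0.15, 0.05]
print(top_k_top_p_filter(toy_probs, top_k=3, top_p=0.8))
```

Sampling then draws the next token from the returned distribution, which is why repeated runs of the generation example produce different continuations.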

Model Description

  • Developed by: Zettafleet Ltd.
  • Contact: research@zettafleet.com.
  • Model type: A transformer-style autoregressive language model.
  • Language(s) (NLP): English.
  • License: The code and model are released under the Zettafleet Open License, version 1.0 (ZOL-1.0-MIT).

Evaluation

Below is an evaluation comparison of the original OLMo 2 1B and the two Z1 base models.

Base Model         Avg    MMLU   ARC Challenge   HS     WG     NQ     DROP   AGI    GSM8K   MMLU Pro   TQA
OLMo 2 1B          42.8   44.7   50.4            68.3   65.8   19.1   35.3   34.5   37.7    16.0       56.2
Z1 1B Hybrid       44.6   46.2   52.9            68.7   65.4   20.8   36.5   37.2   47.6    16.3       54.4
Z1 1B Hybrid RTX   44.6   46.6   53.2            69.1   65.3   19.5   37.2   36.2   48.2    16.5       54.3

Abbreviations: HS = HellaSwag, WG = WinoGrande, NQ = Natural Questions, AGI = AGIEval, TQA = TriviaQA.
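The Avg column appears to be the unweighted mean of the ten benchmark scores in each row. The card does not state this explicitly, so the check below treats it as an assumption and simply recomputes the mean from the table values.

```python
# Per-benchmark scores copied from the table, in column order:
# MMLU, ARC Challenge, HS, WG, NQ, DROP, AGI, GSM8K, MMLU Pro, TQA.
scores = {
    "OLMo 2 1B": [44.7, 50.4, 68.3, 65.8, 19.1, 35.3, 34.5, 37.7, 16.0, 56.2],
    "Z1 1B Hybrid": [46.2, 52.9, 68.7, 65.4, 20.8, 36.5, 37.2, 47.6, 16.3, 54.4],
    "Z1 1B Hybrid RTX": [46.6, 53.2, 69.1, 65.3, 19.5, 37.2, 36.2, 48.2, 16.5, 54.3],
}

# Avg values as reported in the table.
reported_avg = {"OLMo 2 1B": 42.8, "Z1 1B Hybrid": 44.6, "Z1 1B Hybrid RTX": 44.6}

# Assumption: Avg is the plain mean of the ten columns, rounded to one decimal.
computed_avg = {name: round(sum(vals) / len(vals), 1) for name, vals in scores.items()}

for name, avg in computed_avg.items():
    print(f"{name}: computed {avg}, reported {reported_avg[name]}")
```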

Model Details

Data Processing

All datasets used for training were processed, tokenized, and partitioned using Zettafleet’s Data Platform.
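The card does not describe how the Data Platform partitions data. As a purely illustrative sketch of what partitioning a tokenized corpus can look like, the code below packs a flat token stream into fixed-length sequences and distributes them round-robin across workers. The function names, the drop-remainder policy, and the round-robin scheme are all assumptions and do not describe Zettafleet's implementation.

```python
def pack_sequences(token_ids, seq_len):
    """Split a flat token stream into full chunks of seq_len tokens.

    Assumption: a trailing partial chunk is dropped.
    """
    n_full = len(token_ids) // seq_len
    return [token_ids[i * seq_len:(i + 1) * seq_len] for i in range(n_full)]


def partition_round_robin(sequences, num_workers):
    """Assign sequences to workers in round-robin order (hypothetical scheme)."""
    shards = [[] for _ in range(num_workers)]
    for index, sequence in enumerate(sequences):
        shards[index % num_workers].append(sequence)
    return shards


# Toy example: 20 token ids, sequences of 4 tokens, 2 workers.
tokens = list(range(20))
sequences = pack_sequences(tokens, seq_len=4)
shards = partition_round_robin(sequences, num_workers=2)
for worker, shard in enumerate(shards):
    print(f"worker {worker}: {len(shard)} sequences")
```

In a setting like the one described in this card, `seq_len` would correspond to the 4096-token context length and `num_workers` to the 8 GPUs, though the actual scheme may differ.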

Training Stages of Z1 models

The training stages we carried out are as follows:

  1. Continued pre-training:
  2. Post-training (Z1 Hybrid Instruct):

Bias, Risks and Limitations

AI models can be prompted by users to generate harmful and sensitive content. Such content may also be produced unintentionally, especially in cases involving bias, so we recommend that users consider the risks when applying this technology. Additionally, statements from Z1, as from any LLM, are often inaccurate, so facts should be verified.
