PatrickHaller/ngme-llama-264M

Text Generation

NGME-LLama 264M

Trained on 4 A6000 GPUs for ~4 days
Trained on ~4.9 billion tokens (4 * 16 * 768 * 100_000)
Trained on the C4 corpus
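
The token budget can be checked by multiplying out the four factors given in parentheses above. The card does not say what each factor means; the variable names in this sketch (GPU count, per-GPU batch size, sequence length, training steps) are an assumed reading, not something the card confirms.

```python
# Factors copied from the model card. The meaning attached to each
# one is an assumption; the card only gives the bare product.
num_gpus = 4              # assumed: number of A6000 GPUs
per_device_batch = 16     # assumed: sequences per GPU per step
seq_len = 768             # assumed: tokens per sequence
steps = 100_000           # assumed: number of training steps

# Tokens consumed in one step across all devices.
tokens_per_step = num_gpus * per_device_batch * seq_len

# Total tokens seen over the whole run.
total_tokens = tokens_per_step * steps

print(f"tokens per step: {tokens_per_step:,}")
print(f"total tokens:    {total_tokens:,} (~{total_tokens / 1e9:.1f}B)")
```

Under this reading the product is 4,915,200,000, so the run covers closer to 4.9 billion tokens than 4 billion.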
