Upload README.md with huggingface_hub

README.md

@@ -115,50 +115,6 @@ print(f"Best answer: {best.strip()}")
 # → Best answer: biomass
 ```
-### Text generation
-```python
-from transformers import AutoTokenizer, AutoModelForCausalLM
-import torch
-model = AutoModelForCausalLM.from_pretrained(
-    "liodon-ai/slm-10m",
-    trust_remote_code=True,
-    dtype=torch.bfloat16,
-).to("cuda")
-tokenizer = AutoTokenizer.from_pretrained("liodon-ai/slm-10m", trust_remote_code=True)
-inputs = tokenizer("The quick brown fox", return_tensors="pt").to("cuda")
-output = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.8, top_k=50)
-print(tokenizer.decode(output[0], skip_special_tokens=True))
-```
-> **Note:** Free-text generation quality is limited at this scale. The model's strength is in relative likelihood scoring, as used by the benchmark evaluations above.
-## Reproduce
-```bash
-git clone https://github.com/liodon-ai/slm-pretrain
-pip install -r requirements.txt
-# Prepare data
-python prepare_data.py
-# Train (25B tokens)
-python train.py
-# Export to HF format
-python export.py --checkpoint checkpoints/step_0044000.pt --out hf_model
-# Evaluate (4 lm-eval benchmarks)
-PYTHONPATH=. lm_eval --model hf \
-  --model_args pretrained=hf_model,trust_remote_code=True \
-  --tasks hellaswag,arc_easy,arc_challenge,piqa \
-  --device cuda
-# ArithMark-2.0
-python eval_arithmark.py
-```
 ## Citation
