---
language:
- en
license: apache-2.0
base_model: HuggingFaceTB/SmolLM2-360M
tags:
- telecom
- 3gpp
- etsi
- standards
- domain-adaptation
- causal-lm
datasets:
- nareshmodina/TeleSpec-Data
metrics:
- perplexity
---

# SmolLM-TS-360M

A 360M-parameter language model specialised in 3GPP and ETSI telecommunications standards, trained via full fine-tuning on [TeleSpec-Data](https://huggingface.co/datasets/nareshmodina/TeleSpec-Data).

Part of the **SmolLM-TS** series: small language models adapted exclusively to telecommunications standards documents, with no arXiv or web content in the training corpus.

> **Looking for the instruction-tuned version?** See [nareshmodina/SmolLM-TS-360M-it](https://huggingface.co/nareshmodina/SmolLM-TS-360M-it).

---

## Model Details

| | |
|---|---|
| **Base model** | HuggingFaceTB/SmolLM2-360M |
| **Parameters** | 360M |
| **Training** | Full fine-tuning on TeleSpec-Data |
| **Training data** | TeleSpec-Data (1.87B tokens) |
| **Context length** | 4096 tokens |
| **Hardware** | 3× NVIDIA L40S (48 GB) |

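For a rough sense of what loading the model costs, the parameter count in the table can be turned into a weight-memory estimate. This is a back-of-the-envelope sketch: it uses the nominal 360M figure, counts weights only (no activations, KV cache, or optimizer state), and the helper name `weight_memory_gb` is illustrative rather than part of any library.

```python
# Rough weight-only memory estimate from the nominal parameter count.
# Assumption: 360M is a rounded figure; the exact count may differ slightly.
NOMINAL_PARAMS = 360_000_000

# Bytes per parameter for common storage dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gb(n_params: int, dtype: str) -> float:
    """Return the memory needed to hold the weights, in decimal gigabytes."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("float32", "bfloat16"):
    print(f"{dtype}: ~{weight_memory_gb(NOMINAL_PARAMS, dtype):.2f} GB")
```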

---

## Training

Full fine-tuning of all model weights on 457,160 packed 4096-token blocks (1.87B tokens) from 38,302 standards documents: 15,054 3GPP documents (Rel-8 to Rel-19) and 23,248 ETSI documents spanning 15 working groups (2000 to 2024). The corpus is 100% standards text, with no arXiv or web content.

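"Packed" blocks usually come from a preprocessing step in which tokenized documents are concatenated into one stream and cut into fixed-length blocks, so that no positions are spent on padding. The sketch below illustrates the idea on toy data. The function name `pack_into_blocks`, the EOS separator between documents, and the choice to drop the final partial block are assumptions for illustration, not details confirmed by this card.

```python
from typing import Iterable, Iterator, List


def pack_into_blocks(
    docs: Iterable[List[int]], block_size: int, eos_id: int
) -> Iterator[List[int]]:
    """Concatenate tokenized documents and yield fixed-length blocks.

    Assumptions (hypothetical, for illustration): each document is followed
    by an EOS token, and a trailing partial block is discarded.
    """
    buffer: List[int] = []
    for token_ids in docs:
        buffer.extend(token_ids)
        buffer.append(eos_id)  # mark the document boundary
        # Emit as many full blocks as the buffer currently holds.
        while len(buffer) >= block_size:
            yield buffer[:block_size]
            buffer = buffer[block_size:]
    # Whatever is left in `buffer` is shorter than block_size and is dropped.


# Toy example: three "documents" of token ids, block size 4, EOS id 0.
toy_docs = [[11, 12, 13], [21, 22], [31, 32, 33, 34, 35]]
blocks = list(pack_into_blocks(toy_docs, block_size=4, eos_id=0))
print(blocks)
```

With a real tokenizer, `block_size` would be 4096 and `eos_id` would typically come from `tokenizer.eos_token_id`.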

- Epochs: 2
- Effective batch size: 128
- Learning rate: 5e-5 (cosine schedule with warmup)
- Context length: 4096 tokens

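The figures in this section can be cross-checked with a little arithmetic, and the learning-rate schedule can be sketched in a few lines. The block count, block length, batch size, epoch count, and peak learning rate are taken from this card. The warmup length (`WARMUP_STEPS = 200`), the decay to zero, and counting the last partial batch of each epoch as a full step are placeholder assumptions, since the card does not state them. Note that 1.87B tokens corresponds to one pass over the data, so two epochs process about twice that.

```python
import math

# Values stated in this model card.
NUM_BLOCKS = 457_160
BLOCK_SIZE = 4096
EFFECTIVE_BATCH = 128
EPOCHS = 2
PEAK_LR = 5e-5

# Hypothetical value: the card does not give the warmup length.
WARMUP_STEPS = 200

tokens_per_epoch = NUM_BLOCKS * BLOCK_SIZE
# Assumes the final partial batch of each epoch still counts as one step.
steps_per_epoch = math.ceil(NUM_BLOCKS / EFFECTIVE_BATCH)
total_steps = steps_per_epoch * EPOCHS


def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to zero (a common recipe)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"tokens per epoch: {tokens_per_epoch:,}")
print(f"optimizer steps:  {total_steps:,}")
for step in (0, WARMUP_STEPS, total_steps // 2, total_steps):
    print(f"step {step:>5}: lr = {lr_at(step):.2e}")
```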

---

## Usage

This is a base model: it continues text rather than following instructions. For instruction following, use [SmolLM-TS-360M-it](https://huggingface.co/nareshmodina/SmolLM-TS-360M-it).

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "nareshmodina/SmolLM-TS-360M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load the weights in bfloat16; device_map="auto" requires the `accelerate` package.
# Older transformers releases use `torch_dtype=` for this argument.
model = AutoModelForCausalLM.from_pretrained(
    model_id, dtype=torch.bfloat16, device_map="auto"
)

# Base model: give it the start of a sentence and let it continue.
prompt = "The RRC Connection Establishment procedure in LTE is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Greedy decoding (do_sample=False) keeps the output deterministic.
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

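The card metadata lists perplexity as the evaluation metric. As a minimal sketch of how one might measure it on a single passage: perplexity is the exponential of the mean per-token negative log-likelihood. The helper names below are illustrative, and `text_perplexity` assumes a `model` and `tokenizer` loaded as in the usage snippet and a passage that fits within the 4096-token context. This is not the evaluation script behind any reported number.

```python
import math
from typing import Sequence


def perplexity_from_nlls(token_nlls: Sequence[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))


def text_perplexity(model, tokenizer, text: str) -> float:
    """Score one passage that fits in the context window with a causal LM."""
    import torch  # imported here so the helper above has no heavy dependencies

    enc = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        # Passing labels makes the model return the mean cross-entropy loss
        # over the predicted tokens (labels are shifted internally).
        out = model(**enc, labels=enc["input_ids"])
    return math.exp(out.loss.item())


# Sanity check: a model that assigns every token probability 1/4 has perplexity 4.
print(perplexity_from_nlls([math.log(4)] * 3))
```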

---

## Limitations

- **Base model only**: does not follow instructions; use SmolLM-TS-360M-it for Q&A.
- **Standards only**: strong 3GPP/ETSI knowledge, limited general telecom knowledge.
- **Not for production**: intended for research purposes only.

---

## Links

- Dataset: [nareshmodina/TeleSpec-Data](https://huggingface.co/datasets/nareshmodina/TeleSpec-Data)
- Instruct version: [nareshmodina/SmolLM-TS-360M-it](https://huggingface.co/nareshmodina/SmolLM-TS-360M-it)
- Benchmark: [AliMaatouk/Tele-Eval](https://huggingface.co/datasets/AliMaatouk/Tele-Eval)
- Collection: [nareshmodina/SmolLM-TS](https://huggingface.co/collections/nareshmodina/smollm-ts)

---

## Citation

```bibtex
@misc{modina2025smollmts,
  author    = {Naresh Modina},
  title     = {SmolLM-TS: Small Language Models for Telecommunications Standards},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/nareshmodina/SmolLM-TS-360M}
}
```