---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen3-Embedding-4B
tags:
- telecom
- telecommunications
- gsma
- fine-tuned
pipeline_tag: feature-extraction
---

# OTel-Embedding-4B

**OTel-Embedding-4B** is a telecom-specialized embedding model fine-tuned on telecommunications domain data. It is part of the [OTel Family of Models](https://huggingface.co/collections/farbodtavakkoli/otel-embedding), an open-source initiative to build industry-standard AI models for the global telecommunications sector.

## Model Details

| Attribute | Value |
|-----------|-------|
| **Base Model** | [Qwen/Qwen3-Embedding-4B](https://huggingface.co/Qwen/Qwen3-Embedding-4B) |
| **Parameters** | 4B |
| **Training Method** | Full parameter fine-tuning |
| **Language** | English |
| **License** | Apache 2.0 |

## Training Data

The model was trained on telecom-focused data curated by 100+ domain experts. Each source class was contributed by a specific institutional partner:

| Source | Contributor |
|---|---|
| arXiv telecom papers, 3GPP standards, telecom Wikipedia, telecom Common Crawl | Yale University |
| GSMA Permanent Reference Documents, Discover portal | GSMA |
| IETF RFC series | NetoAI |
| Industry whitepapers | Khalifa University |
| O-RAN specifications (working groups 1, 2, 4, 5, 6, 7, 8, 9, 10) | University of Leeds |
| O-RAN documents across working groups | The University of Texas at Dallas |

Released datasets: [OTel-LLM](https://huggingface.co/datasets/farbodtavakkoli/OTel-LLM), [OTel-Embedding](https://huggingface.co/datasets/farbodtavakkoli/OTel-Embedding), [OTel-Reranker](https://huggingface.co/datasets/farbodtavakkoli/OTel-Reranker), [OTel-Safety](https://huggingface.co/datasets/farbodtavakkoli/OTel-Safety).

## Intended Use

The OTel model family is designed to power end-to-end Retrieval-Augmented Generation (RAG) pipelines for telecommunications. The three model types serve complementary roles:

1. **Embedding**: Retrieve relevant chunks from telecom specifications, standards, and documentation.
2. **Reranker**: Re-score and prioritize the retrieved chunks for relevance.
3. **LLM**: Generate accurate responses grounded in the retrieved context.

Users can deploy the full pipeline or use individual models independently based on their needs.

**Note:** The LLMs include abstention training: if the model does not receive sufficient context, it will decline to answer rather than hallucinate. This means the models are optimized for context-grounded generation, not open-ended question answering.

## Related Models

### Language Models

- [OTel LLM Collection](https://huggingface.co/collections/farbodtavakkoli/otel-llm)

### Embedding Models

- [OTel Embedding Collection](https://huggingface.co/collections/farbodtavakkoli/otel-embedding)

### Reranker Models

- [OTel Reranker Collection](https://huggingface.co/collections/farbodtavakkoli/otel-reranker)

## Related Datasets

- [OTel-Embedding](https://huggingface.co/datasets/farbodtavakkoli/OTel-Embedding)
- [OTel-Safety](https://huggingface.co/datasets/farbodtavakkoli/OTel-Safety)
- [OTel-LLM](https://huggingface.co/datasets/farbodtavakkoli/OTel-LLM)
- [OTel-Reranker](https://huggingface.co/datasets/farbodtavakkoli/OTel-Reranker)

## Training Infrastructure

- **Framework**: ScalarLM (GPU-agnostic)
- **Compute**: AMD and NVIDIA GPUs.

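
## Usage Sketch

The embedding role described under Intended Use (retrieve relevant chunks for a query) can be illustrated with a minimal sketch. This card does not document a loading API, so the example is an assumption throughout: `toy_embed` is a hypothetical keyword-count stand-in for the model's encoder, and `retrieve`, `cosine_similarity`, `VOCAB`, and the sample chunks are illustrative names, not part of the OTel release. In practice, `embed` would be replaced by a call that returns the model's embedding vector for a text.

```python
import math
import re
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]

# Hypothetical vocabulary used only by the toy embedder below.
VOCAB = ("handover", "ran", "rfc", "routing", "roaming")


def toy_embed(text: str) -> List[float]:
    """Stand-in encoder: counts vocabulary terms. NOT the OTel model."""
    tokens = re.findall(r"[a-z0-9-]+", text.lower())
    return [float(tokens.count(term)) for term in VOCAB]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(
    query: str,
    chunks: Sequence[str],
    embed: Callable[[str], Vector],
    top_k: int = 2,
) -> List[Tuple[float, str]]:
    """Return the top_k (score, chunk) pairs ranked by cosine similarity."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    sample_chunks = [
        "RFC describing BGP routing behavior",
        "Handover procedure between RAN nodes",
        "GSMA roaming guidelines",
    ]
    for score, chunk in retrieve("handover between RAN nodes", sample_chunks, toy_embed):
        print(f"{score:.3f}  {chunk}")
```

Passing the encoder as a callable keeps the ranking logic independent of how the model is loaded, so the retrieved chunks can be handed to a reranker and then an LLM in the same way regardless of the serving stack.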
## Project Resources

- **Project page:** https://huggingface.co/farbodtavakkoli
- **Code:** https://github.com/farbodtavakkoli/OTel
- **Media coverage list:** https://github.com/farbodtavakkoli/OTel/blob/main/docs/media_coverage.md

## Citation

```bibtex
@misc{otel_models_2026,
  title  = {OTel: Open Telco AI Datasets, Benchmarks, and Models},
  author = {Tavakkoli, Farbod and others},
  year   = {2026},
  note   = {Open Telco (OTel) model release},
  url    = {https://huggingface.co/farbodtavakkoli}
}
```

## Contact

If you have any technical questions, please feel free to reach out to farbod.tavakkoli@att.com or farbodtavakoli@gmail.com.