---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen3-Embedding-4B
tags:
- telecom
- telecommunications
- gsma
- fine-tuned
pipeline_tag: feature-extraction
---

# OTel-Embedding-4B

**OTel-Embedding-4B** is a telecom-specialized embedding model fine-tuned on telecommunications domain data. It is part of the [OTel Family of Models](https://huggingface.co/collections/farbodtavakkoli/otel-embedding), an open-source initiative to build industry-standard AI models for the global telecommunications sector.

## Model Details

| Attribute | Value |
|-----------|-------|
| **Base Model** | [Qwen/Qwen3-Embedding-4B](https://huggingface.co/Qwen/Qwen3-Embedding-4B) |
| **Parameters** | 4B |
| **Training Method** | Full parameter fine-tuning |
| **Language** | English |
| **License** | Apache 2.0 |

## Training Data

The model was trained on telecom-focused data curated by 100+ domain experts. Each source class was contributed by a specific institutional partner:

| Source | Contributor |
|---|---|
| arXiv telecom papers, 3GPP standards, telecom Wikipedia, telecom Common Crawl | Yale University |
| GSMA Permanent Reference Documents, Discover portal | GSMA |
| IETF RFC series | NetoAI |
| Industry whitepapers | Khalifa University |
| O-RAN specifications (working groups 1, 2, 4, 5, 6, 7, 8, 9, 10) | University of Leeds |
| O-RAN documents across working groups | The University of Texas at Dallas |

Released datasets: [OTel-LLM](https://huggingface.co/datasets/farbodtavakkoli/OTel-LLM), [OTel-Embedding](https://huggingface.co/datasets/farbodtavakkoli/OTel-Embedding), [OTel-Reranker](https://huggingface.co/datasets/farbodtavakkoli/OTel-Reranker), [OTel-Safety](https://huggingface.co/datasets/farbodtavakkoli/OTel-Safety).

## Intended Use
 
The OTel model family is designed to power end-to-end Retrieval-Augmented Generation (RAG) pipelines for telecommunications. The three model types serve complementary roles:
 
1. **Embedding** — Retrieve relevant chunks from telecom specifications, standards, and documentation.
2. **Reranker** — Re-score and prioritize the retrieved chunks for relevance.
3. **LLM** — Generate accurate responses grounded in the retrieved context.
 
Users can deploy the full pipeline or use individual models independently based on their needs.
 
**Note:** The LLMs are trained to abstain: when the supplied context is insufficient, the model declines to answer rather than hallucinate. The LLMs are therefore optimized for context-grounded generation, not open-ended question answering.
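
The three stages listed above can be sketched as a small retrieve, rerank, and prompt-building flow. This is a minimal illustration, not the project's actual API: `toy_embed`, `toy_rerank_score`, `VOCAB`, and the helper names are hypothetical placeholders so the example runs with the Python standard library alone. In a real deployment, the `embed` and `score` callables would wrap an OTel embedding model and an OTel reranker, and the resulting prompt would be passed to an OTel LLM.

```python
import math
import re
from typing import Callable, List

# Hypothetical toy vocabulary; a real embedding model needs no such list.
VOCAB = ["handover", "gnb", "ue", "rrc", "billing", "invoice", "beam", "measurement"]


def tokens(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_embed(text: str) -> List[float]:
    """Placeholder embedder: counts of VOCAB terms. Swap in a real model here."""
    words = tokens(text)
    return [float(words.count(term)) for term in VOCAB]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str],
             embed: Callable[[str], List[float]], top_k: int) -> List[str]:
    """Stage 1: return the top_k chunks most similar to the query embedding."""
    q_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return ranked[:top_k]


def toy_rerank_score(query: str, chunk: str) -> float:
    """Placeholder reranker: fraction of query tokens that appear in the chunk."""
    q, c = set(tokens(query)), set(tokens(chunk))
    return len(q & c) / len(q) if q else 0.0


def rerank(query: str, chunks: List[str],
           score: Callable[[str, str], float]) -> List[str]:
    """Stage 2: reorder the retrieved chunks by a (query, chunk) relevance score."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)


def build_prompt(query: str, context: List[str]) -> str:
    """Stage 3 input: a context-grounded prompt for the generator."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


chunks = [
    "The UE sends a measurement report before handover.",
    "The gNB triggers handover based on the measurement report from the UE.",
    "Invoice and billing records are stored for audit.",
]
query = "When does the gNB trigger handover for the UE?"

candidates = retrieve(query, chunks, toy_embed, top_k=2)
ordered = rerank(query, candidates, toy_rerank_score)
prompt = build_prompt(query, ordered)
print(prompt)
```

Because each stage takes its model as a plain callable, any stage can be replaced or used on its own, which mirrors the option of deploying the full pipeline or individual models.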

## Related Models

### Language Models
- [OTel LLM Collection](https://huggingface.co/collections/farbodtavakkoli/otel-llm)

### Embedding Models
- [OTel Embedding Collection](https://huggingface.co/collections/farbodtavakkoli/otel-embedding)

### Reranker Models
- [OTel Reranker Collection](https://huggingface.co/collections/farbodtavakkoli/otel-reranker)

## Related Datasets

- [OTel-Embedding](https://huggingface.co/datasets/farbodtavakkoli/OTel-Embedding)
- [OTel-Safety](https://huggingface.co/datasets/farbodtavakkoli/OTel-Safety)
- [OTel-LLM](https://huggingface.co/datasets/farbodtavakkoli/OTel-LLM)
- [OTel-Reranker](https://huggingface.co/datasets/farbodtavakkoli/OTel-Reranker)

## Training Infrastructure

- **Framework**: ScalarLM (GPU-agnostic)
- **Compute**: AMD and NVIDIA GPUs

## Project Resources

- **Project page:** https://huggingface.co/farbodtavakkoli
- **Code:** https://github.com/farbodtavakkoli/OTel
- **Media coverage list:** https://github.com/farbodtavakkoli/OTel/blob/main/docs/media_coverage.md

## Citation


```bibtex
@misc{otel_models_2026,
  title  = {OTel: Open Telco AI Datasets, Benchmarks, and Models},
  author = {Tavakkoli, Farbod and others},
  year   = {2026},
  note   = {Open Telco (OTel) model release},
  url    = {https://huggingface.co/farbodtavakkoli}
}
```

## Contact

For technical questions, contact farbod.tavakkoli@att.com or farbodtavakoli@gmail.com.