---
language:
- ar
pipeline_tag: feature-extraction
---

We trained a [FastText](https://fasttext.cc/)-based Word2Vec model on our dataset, using an embedding size of 100 dimensions. The model generates vector representations for individual words and sub-words, which allows it to capture both semantic and morphological relationships in the text.

To obtain sentence-level representations, we compute the embeddings of all constituent words and sub-words in a given text and then average them. The resulting sentence embedding is intended to capture the overall meaning of the text while retaining some of its contextual nuance.
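The averaging step described above can be sketched as follows. This is a minimal illustration, not the exact code used to build the model: the function name `average_embedding`, the whitespace tokenization, and the toy 3-dimensional lookup table are all assumptions made for the example. The released model uses 100 dimensions.

```python
from typing import Callable, List, Sequence


def average_embedding(
    text: str,
    get_vector: Callable[[str], Sequence[float]],
    dim: int = 100,
) -> List[float]:
    """Average the word vectors of a text into one sentence vector.

    `get_vector` is any callable mapping a word to a vector of length `dim`.
    Whitespace tokenization is an assumption of this sketch.
    """
    tokens = text.split()
    if not tokens:
        # No words to average: return a zero vector.
        return [0.0] * dim

    total = [0.0] * dim
    for token in tokens:
        for i, value in enumerate(get_vector(token)):
            total[i] += float(value)
    return [component / len(tokens) for component in total]


# Toy 3-dimensional vectors standing in for the real model (hypothetical values).
toy_vectors = {
    "مرحبا": [1.0, 0.0, 2.0],
    "بالعالم": [3.0, 2.0, 0.0],
}


def toy_lookup(word: str) -> Sequence[float]:
    # Unknown words map to a zero vector in this toy table.
    return toy_vectors.get(word, [0.0, 0.0, 0.0])


sentence_vec = average_embedding("مرحبا بالعالم", toy_lookup, dim=3)
print(sentence_vec)
```

With the `fasttext` Python package, a loaded model's `get_word_vector` method could be passed as `get_vector`. In that case each word vector is already composed from the word's sub-word n-grams, so no separate sub-word handling would be needed in the averaging loop.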