Fastext_Custom / README.md
---
language:
  - ar
pipeline_tag: feature-extraction
---

We trained a FastText model (Word2Vec-style embeddings extended with sub-word information) on our dataset, using an embedding size of 100 dimensions.
This model is designed to generate vector representations for individual words and sub-words, allowing it to effectively capture semantic and morphological relationships within the text.
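
As a rough illustration of what "sub-words" means here, the sketch below extracts FastText-style character n-grams from a word. The function name `char_ngrams` is hypothetical, and the n-gram range of 3 to 6 is the common FastText default, assumed here because the range used for this model is not stated.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return FastText-style character n-grams for a single word.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    produce n-grams distinct from those inside the word.
    min_n and max_n are assumed defaults, not values from this model.
    """
    wrapped = f"<{word}>"
    ngrams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of length n across the wrapped word.
        for start in range(len(wrapped) - n + 1):
            ngrams.append(wrapped[start:start + n])
    return ngrams


print(char_ngrams("where", min_n=3, max_n=3))
# ['<wh', 'whe', 'her', 'ere', 're>']
```

Python strings are sequences of Unicode code points, so the same function applies to Arabic words without changes. In FastText, a word's vector is built from the vectors of these n-grams together with the vector of the whole word, which is how the model can represent words it did not see during training.
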
To obtain representations at the sentence level, we compute embeddings for all constituent words and sub-words in a given text and then apply averaging.
The resulting sentence embedding is intended to capture the overall meaning of the text while preserving contextual nuances.
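
The averaging step can be sketched as follows. This is a minimal illustration, not the code used to build this model: the names `average_vectors`, `sentence_embedding`, and `toy_vectors` are hypothetical, tokenization by whitespace is an assumption, and the toy vectors have 3 dimensions instead of 100 to keep the example readable.

```python
def average_vectors(vectors):
    """Return the element-wise mean of a list of equal-length vectors."""
    if not vectors:
        raise ValueError("cannot average an empty list of vectors")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def sentence_embedding(text, lookup):
    """Average the vectors of all tokens in `text` that `lookup` knows.

    `lookup` is any mapping from token to vector (assumption: a plain dict
    stands in for the trained model). Tokens are split on whitespace,
    which is a simplification.
    """
    tokens = text.split()
    vectors = [lookup[token] for token in tokens if token in lookup]
    return average_vectors(vectors)


# Toy 3-dimensional vectors standing in for the model's 100-dimensional ones.
toy_vectors = {
    "hello": [1.0, 0.0, 2.0],
    "world": [3.0, 2.0, 0.0],
}

print(sentence_embedding("hello world", toy_vectors))
# [2.0, 1.0, 1.0]
```

With a trained FastText model, the dictionary lookup would be replaced by the model's own word-vector lookup, which can also produce vectors for out-of-vocabulary words from their sub-words. Note that a plain average treats the text as a bag of words, so two sentences with the same words in a different order receive the same embedding.
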