---
base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
datasets:
- Omartificial-Intelligence-Space/Arabic-NLi-Triplet
language:
- ar
library_name: sentence-transformers
license: apache-2.0
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
pipeline_tag: feature-extraction
tags:
- mteb
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:557850
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: ذكر متوازن بعناية يقف على قدم واحدة بالقرب من منطقة شاطئ المحيط
    النظيفة
  sentences:
  - رجل يقدم عرضاً
  - هناك رجل بالخارج قرب الشاطئ
  - رجل يجلس على أريكه
- source_sentence: رجل يقفز إلى سريره القذر
  sentences:
  - السرير قذر.
  - رجل يضحك أثناء غسيل الملابس
  - الرجل على القمر
- source_sentence: الفتيات بالخارج
  sentences:
  - امرأة تلف الخيط إلى كرات بجانب كومة من الكرات
  - فتيان يركبان في جولة متعة
  - ثلاث فتيات يقفون سوية في غرفة واحدة تستمع وواحدة تكتب على الحائط والثالثة تتحدث
    إليهن
- source_sentence: الرجل يرتدي قميصاً أزرق.
  sentences:
  - رجل يرتدي قميصاً أزرق يميل إلى الجدار بجانب الطريق مع شاحنة زرقاء وسيارة حمراء
    مع الماء في الخلفية.
  - كتاب القصص مفتوح
  - رجل يرتدي قميص أسود يعزف على الجيتار.
- source_sentence: يجلس شاب ذو شعر أشقر على الحائط يقرأ جريدة بينما تمر امرأة وفتاة
    شابة.
  sentences:
  - ذكر شاب ينظر إلى جريدة بينما تمر إمرأتان بجانبه
  - رجل يستلقي على وجهه على مقعد في الحديقة.
  - الشاب نائم بينما الأم تقود ابنتها إلى الحديقة
model-index:
- name: SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
  results:
  - task:
      type: Retrieval
    dataset:
      name: MTEB MintakaRetrieval (ar)
      type: mintaka/mmteb-mintaka
      config: ar
      split: test
      revision: efa78cc2f74bbcd21eff2261f9e13aebe40b814e
    metrics:
    - type: main_score
      value: 12.493
    - type: map_at_1
      value: 5.719
    - type: map_at_3
      value: 8.269
    - type: map_at_5
      value: 9.172
    - type: map_at_10
      value: 9.894
    - type: ndcg_at_1
      value: 5.719
    - type: ndcg_at_3
      value: 9.128
    - type: ndcg_at_5
      value: 10.745
    - type: ndcg_at_10
      value: 12.493
    - type: recall_at_1
      value: 5.719
    - type: recall_at_3
      value: 11.621
    - type: recall_at_5
      value: 15.524
    - type: recall_at_10
      value: 20.926
    - type: precision_at_1
      value: 5.719
    - type: precision_at_3
      value: 3.874
    - type: precision_at_5
      value: 3.105
    - type: precision_at_10
      value: 2.093
    - type: mrr_at_1
      value: 5.7195
    - type: mrr_at_3
      value: 8.269
    - type: mrr_at_5
      value: 9.1723
    - type: mrr_at_10
      value: 9.8942
  - task:
      type: Retrieval
    dataset:
      name: MTEB MIRACLRetrievalHardNegatives (ar)
      type: miracl/mmteb-miracl-hardnegatives
      config: ar
      split: dev
      revision: 95c8db7d4a6e9c1d8a60601afd63d553ae20a2eb
    metrics:
    - type: main_score
      value: 22.396
    - type: map_at_1
      value: 8.866
    - type: map_at_3
      value: 13.905
    - type: map_at_5
      value: 15.326
    - type: map_at_10
      value: 16.851
    - type: ndcg_at_1
      value: 13.9
    - type: ndcg_at_3
      value: 17.309
    - type: ndcg_at_5
      value: 19.174
    - type: ndcg_at_10
      value: 22.396
    - type: recall_at_1
      value: 8.866
    - type: recall_at_3
      value: 19.177
    - type: recall_at_5
      value: 23.999
    - type: recall_at_10
      value: 32.421
    - type: precision_at_1
      value: 13.9
    - type: precision_at_3
      value: 10.933
    - type: precision_at_5
      value: 8.5
    - type: precision_at_10
      value: 5.96
    - type: mrr_at_1
      value: 13.9
    - type: mrr_at_3
      value: 20.0667
    - type: mrr_at_5
      value: 21.3617
    - type: mrr_at_10
      value: 22.7531
  - task:
      type: Retrieval
    dataset:
      name: MTEB MLQARetrieval (ar)
      type: mlqa/mmteb-mlqa
      config: ar
      split: validation
      revision: 397ed406c1a7902140303e7faf60fff35b58d285
    metrics:
    - type: main_score
      value: 57.312
    - type: map_at_1
      value: 44.487
    - type: map_at_3
      value: 50.516
    - type: map_at_5
      value: 51.715
    - type: map_at_10
      value: 52.778
    - type: ndcg_at_1
      value: 44.487
    - type: ndcg_at_3
      value: 52.586
    - type: ndcg_at_5
      value: 54.742
    - type: ndcg_at_10
      value: 57.312
    - type: recall_at_1
      value: 44.487
    - type: recall_at_3
      value: 58.607
    - type: recall_at_5
      value: 63.83
    - type: recall_at_10
      value: 71.76
    - type: precision_at_1
      value: 44.487
    - type: precision_at_3
      value: 19.536
    - type: precision_at_5
      value: 12.766
    - type: precision_at_10
      value: 7.176
    - type: mrr_at_1
      value: 44.4874
    - type: mrr_at_3
      value: 50.5158
    - type: mrr_at_5
      value: 51.715
    - type: mrr_at_10
      value: 52.7782
  - task:
      type: Retrieval
    dataset:
      name: MTEB SadeemQuestionRetrieval (ar)
      type: sadeem/mmteb-sadeem
      config: default
      split: test
      revision: 3cb0752b182e5d5d740df547748b06663c8e0bd9
    metrics:
    - type: main_score
      value: 52.976
    - type: map_at_1
      value: 22.307
    - type: map_at_3
      value: 41.727
    - type: map_at_5
      value: 43.052
    - type: map_at_10
      value: 43.844
    - type: ndcg_at_1
      value: 22.307
    - type: ndcg_at_3
      value: 48.7
    - type: ndcg_at_5
      value: 51.057
    - type: ndcg_at_10
      value: 52.976
    - type: recall_at_1
      value: 22.307
    - type: recall_at_3
      value: 69.076
    - type: recall_at_5
      value: 74.725
    - type: recall_at_10
      value: 80.661
    - type: precision_at_1
      value: 22.307
    - type: precision_at_3
      value: 23.025
    - type: precision_at_5
      value: 14.945
    - type: precision_at_10
      value: 8.066
    - type: mrr_at_1
      value: 21.0148
    - type: mrr_at_3
      value: 40.8808
    - type: mrr_at_5
      value: 42.1254
    - type: mrr_at_10
      value: 42.9125
  - task:
      type: STS
    dataset:
      name: MTEB BIOSSES (default)
      type: mteb/biosses-sts
      config: default
      split: test
      revision: d3fb88f8f02e40887cd149695127462bbcf29b4a
    metrics:
    - type: cosine_pearson
      value: 72.5081840952171
    - type: cosine_spearman
      value: 69.41362982941537
    - type: euclidean_pearson
      value: 67.45121490183709
    - type: euclidean_spearman
      value: 67.15273493989758
    - type: main_score
      value: 69.41362982941537
    - type: manhattan_pearson
      value: 67.6119022794479
    - type: manhattan_spearman
      value: 67.51659865246586
  - task:
      type: STS
    dataset:
      name: MTEB SICK-R (default)
      type: mteb/sickr-sts
      config: default
      split: test
      revision: 20a6d6f312dd54037fe07a32d58e5e168867909d
    metrics:
    - type: cosine_pearson
      value: 83.61591268324493
    - type: cosine_spearman
      value: 79.61914245705792
    - type: euclidean_pearson
      value: 81.32044881859483
    - type: euclidean_spearman
      value: 79.04866675279919
    - type: main_score
      value: 79.61914245705792
    - type: manhattan_pearson
      value: 81.09220518201322
    - type: manhattan_spearman
      value: 78.87590523907905
  - task:
      type: STS
    dataset:
      name: MTEB STS12 (default)
      type: mteb/sts12-sts
      config: default
      split: test
      revision: a0d554a64d88156834ff5ae9920b964011b16384
    metrics:
    - type: cosine_pearson
      value: 84.59807803376341
    - type: cosine_spearman
      value: 77.38689922564416
    - type: euclidean_pearson
      value: 83.92034850646732
    - type: euclidean_spearman
      value: 76.75857193093438
    - type: main_score
      value: 77.38689922564416
    - type: manhattan_pearson
      value: 83.97191863964667
    - type: manhattan_spearman
      value: 76.89790070725708
  - task:
      type: STS
    dataset:
      name: MTEB STS13 (default)
      type: mteb/sts13-sts
      config: default
      split: test
      revision: 7e90230a92c190f1bf69ae9002b8cea547a64cca
    metrics:
    - type: cosine_pearson
      value: 78.18664268536664
    - type: cosine_spearman
      value: 79.58989311630421
    - type: euclidean_pearson
      value: 79.25259731614729
    - type: euclidean_spearman
      value: 80.1701122827397
    - type: main_score
      value: 79.58989311630421
    - type: manhattan_pearson
      value: 79.12601451996869
    - type: manhattan_spearman
      value: 79.98999436073663
  - task:
      type: STS
    dataset:
      name: MTEB STS14 (default)
      type: mteb/sts14-sts
      config: default
      split: test
      revision: 6031580fec1f6af667f0bd2da0a551cf4f0b2375
    metrics:
    - type: cosine_pearson
      value: 80.97541876658141
    - type: cosine_spearman
      value: 79.78614320477877
    - type: euclidean_pearson
      value: 81.01514505747167
    - type: euclidean_spearman
      value: 80.73664735567839
    - type: main_score
      value: 79.78614320477877
    - type: manhattan_pearson
      value: 80.8746560526314
    - type: manhattan_spearman
      value: 80.67025673179079
  - task:
      type: STS
    dataset:
      name: MTEB STS15 (default)
      type: mteb/sts15-sts
      config: default
      split: test
      revision: ae752c7c21bf194d8b67fd573edf7ae58183cbe3
    metrics:
    - type: cosine_pearson
      value: 85.23661155813113
    - type: cosine_spearman
      value: 86.21134464371615
    - type: euclidean_pearson
      value: 85.82518684522182
    - type: euclidean_spearman
      value: 86.43600784349509
    - type: main_score
      value: 86.21134464371615
    - type: manhattan_pearson
      value: 85.83101152371589
    - type: manhattan_spearman
      value: 86.42228695679498
  - task:
      type: STS
    dataset:
      name: MTEB STS16 (default)
      type: mteb/sts16-sts
      config: default
      split: test
      revision: 4d8694f8f0e0100860b497b999b3dbed754a0513
    metrics:
    - type: cosine_pearson
      value: 79.20106689077852
    - type: cosine_spearman
      value: 81.39570893867825
    - type: euclidean_pearson
      value: 80.39578888768929
    - type: euclidean_spearman
      value: 81.19950443340412
    - type: main_score
      value: 81.39570893867825
    - type: manhattan_pearson
      value: 80.2226679341839
    - type: manhattan_spearman
      value: 80.99142422593823
  - task:
      type: STS
    dataset:
      name: MTEB STS17 (ar-ar)
      type: mteb/sts17-crosslingual-sts
      config: ar-ar
      split: test
      revision: faeb762787bd10488a50c8b5be4a3b82e411949c
    metrics:
    - type: cosine_pearson
      value: 81.05294851623468
    - type: cosine_spearman
      value: 81.10570655134113
    - type: euclidean_pearson
      value: 79.22292773537778
    - type: euclidean_spearman
      value: 78.84204232638425
    - type: main_score
      value: 81.10570655134113
    - type: manhattan_pearson
      value: 79.43750460320484
    - type: manhattan_spearman
      value: 79.33713593557482
  - task:
      type: STS
    dataset:
      name: MTEB STS22 (ar)
      type: mteb/sts22-crosslingual-sts
      config: ar
      split: test
      revision: de9d86b3b84231dc21f76c7b7af1f28e2f57f6e3
    metrics:
    - type: cosine_pearson
      value: 45.96875498680092
    - type: cosine_spearman
      value: 52.405509117149904
    - type: euclidean_pearson
      value: 42.097450896728226
    - type: euclidean_spearman
      value: 50.89022884113707
    - type: main_score
      value: 52.405509117149904
    - type: manhattan_pearson
      value: 42.22827727075534
    - type: manhattan_spearman
      value: 50.912841055442634
  - task:
      type: STS
    dataset:
      name: MTEB STSBenchmark (default)
      type: mteb/stsbenchmark-sts
      config: default
      split: test
      revision: b0fddb56ed78048fa8b90373c8a3cfc37b684831
    metrics:
    - type: cosine_pearson
      value: 83.13261516884116
    - type: cosine_spearman
      value: 84.3492527221498
    - type: euclidean_pearson
      value: 82.691603178401
    - type: euclidean_spearman
      value: 83.0499566200785
    - type: main_score
      value: 84.3492527221498
    - type: manhattan_pearson
      value: 82.68307441014618
    - type: manhattan_spearman
      value: 83.01315787964519
  - task:
      type: Summarization
    dataset:
      name: MTEB SummEval (default)
      type: mteb/summeval
      config: default
      split: test
      revision: cda12ad7615edc362dbf25a00fdd61d3b1eaf93c
    metrics:
    - type: cosine_pearson
      value: 31.149232235402845
    - type: cosine_spearman
      value: 30.685504130606255
    - type: dot_pearson
      value: 27.466307571160375
    - type: dot_spearman
      value: 28.93064261485915
    - type: main_score
      value: 30.685504130606255
    - type: pearson
      value: 31.149232235402845
    - type: spearman
      value: 30.685504130606255
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts test 256
      type: sts-test-256
    metrics:
    - type: pearson_cosine
      value: 0.8264447022356382
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.8386403752382455
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.8219134931449013
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.825509659109493
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.8223094468630248
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.8260503151751462
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.6375226884845725
      name: Pearson Dot
    - type: spearman_dot
      value: 0.6287228614640888
      name: Spearman Dot
    - type: pearson_max
      value: 0.8264447022356382
      name: Pearson Max
    - type: spearman_max
      value: 0.8386403752382455
      name: Spearman Max
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts test 128
      type: sts-test-128
    metrics:
    - type: pearson_cosine
      value: 0.8209661910768973
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.8347149482673766
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.8082811559854036
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.8148314269262763
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.8093138512113149
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.8156468458613929
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.5795109620454884
      name: Pearson Dot
    - type: spearman_dot
      value: 0.5760223026552876
      name: Spearman Dot
    - type: pearson_max
      value: 0.8209661910768973
      name: Pearson Max
    - type: spearman_max
      value: 0.8347149482673766
      name: Spearman Max
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts test 64
      type: sts-test-64
    metrics:
    - type: pearson_cosine
      value: 0.808708530451336
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.8217532539767914
      name: Spearman Cosine
    - type: pearson_manhattan
      value: 0.7876121380998453
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: 0.7969092304137347
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: 0.7902997966909958
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: 0.7987635968785215
      name: Spearman Euclidean
    - type: pearson_dot
      value: 0.495047136234386
      name: Pearson Dot
    - type: spearman_dot
      value: 0.49287000679901516
      name: Spearman Dot
    - type: pearson_max
      value: 0.808708530451336
      name: Pearson Max
    - type: spearman_max
      value: 0.8217532539767914
      name: Spearman Max
---

# SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) on the Omartificial-Intelligence-Space/Arabic-NLi-Triplet dataset. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. This model is part of the [Arabic Matryoshka Embedding Models collection](https://huggingface.co/collections/Omartificial-Intelligence-Space/arabic-matryoshka-embedding-models-666f764d3b570f44d7f77d4e). It was presented in the paper [GATE: General Arabic Text Embedding for Enhanced Semantic Textual Similarity with Matryoshka Representation Learning and Hybrid Loss Training](https://huggingface.co/papers/2505.24581).
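
The card metadata lists `MatryoshkaLoss` and reports STS results at 256, 128 and 64 dimensions, which suggests that the 384-dimensional output can be shortened by keeping only a prefix of each vector. The snippet below is a minimal sketch of that truncation step under this assumption; it is not the author's evaluation code. The helper names `l2_normalize` and `truncate_embedding` are hypothetical, and the random vector only stands in for a real model output.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vec]


def truncate_embedding(vec, dim):
    """Keep the first `dim` components, then re-normalise the prefix."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    return l2_normalize(vec[:dim])


# Stand-in for a 384-dimensional embedding (random values, illustration only).
rng = random.Random(0)
full = l2_normalize([rng.gauss(0.0, 1.0) for _ in range(384)])

# The three reduced sizes that appear in the card's STS evaluation entries.
truncated = {dim: truncate_embedding(full, dim) for dim in (256, 128, 64)}
for dim, vec in truncated.items():
    print(dim, len(vec))
```

Re-normalising after truncation keeps cosine and dot-product scores comparable across sizes. Recent versions of the sentence-transformers library appear to expose a `truncate_dim` argument for the same purpose; check the documentation of the installed version before relying on it.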

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) <!-- at revision bf3bf13ab40c3157080a7ab344c831b9ad18b5eb -->
- **Maximum Sequence Length:** 128 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - Omartificial-Intelligence-Space/Arabic-NLi-Triplet
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
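
Since the similarity function listed above is cosine similarity, comparing two embeddings reduces to the cosine of the angle between them. The following sketch ranks candidates against a query on toy 3-dimensional vectors; the function names `cosine_similarity` and `rank_by_cosine` are hypothetical, and the vectors are made up for illustration rather than produced by the model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_cosine(query, candidates):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for sentence embeddings.
query = [1.0, 0.0, 0.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 0.0],  # same direction, different length
    [1.0, 1.0, 0.0],  # 45 degrees from the query
]

ranking = rank_by_cosine(query, candidates)
for index, score in ranking:
    print(index, round(score, 4))
```

Cosine similarity ignores vector length, so the second candidate scores 1.0 despite being twice as long as the query. For embeddings that are already L2-normalised, the dot product gives the same ranking.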

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging