How to use from the sentence-transformers library
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Thermostatic/qwen3-4b-embeddings-akkadian")

sentences = [
    "a-na a-la-hi-im KIŠIB a-mur-IŠTAR",
    "From Ennānum to Idnaya and Aššur- <big_gap> : In accordance with what I wrote to you with my orders that <big_gap> leads - urgent, pay attention and guard my goods and my donkeys like your own life as you are a gentleman. When Aššur-imittī went to the City he brought some 2 or 3 minas of silver on his own, and in the City Aššur-imittī took 0.5 mina of silver, the working capital of halgiaššu. Clear what textiles and tin he brings on his own and let it remain with him. Send me word. Also, carry out <big_gap> in accordance with my instructions I sent to you. Also, with respect to the interest(?) of Aššur-imittī he must not lie and make me angry at you.",
    "To Ali-ahum; seal of Amur-Ištar.",
    "To Ennam-Aššur from Ali-ahum and Amur-Ištar: Sadly, our father has died. It is not Šalim-Aššur who is our father, it is you who are our father. Take care there of our father's instructions and clear up the affairs. You shall not transfer any consignment of our father's to this place. One or two of our investors are staying here. Our dear father and lord, clear it up."
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

SentenceTransformer based on unsloth/Qwen3-Embedding-4B

This model was finetuned with Unsloth.

This is a sentence-transformers model finetuned from unsloth/Qwen3-Embedding-4B. It maps sentences & paragraphs to a 2560-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: unsloth/Qwen3-Embedding-4B
  • Maximum Sequence Length: 1024 tokens
  • Output Dimensionality: 2560 dimensions
  • Similarity Function: Cosine Similarity
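As a rough illustration of the similarity function listed above, the sketch below computes cosine similarity in plain Python. The three-dimensional vectors are made-up stand-ins for the model's 2560-dimensional embeddings, and the helper name `cosine_similarity` is an assumption for this example, not part of the library.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy vectors (hypothetical values, not real model output).
a = [1.0, 2.0, 2.0]
b = [2.0, 4.0, 4.0]   # same direction as a, twice the length
c = [2.0, -1.0, 0.0]  # orthogonal to a

print(cosine_similarity(a, b))  # direction matches, so the score is maximal
print(cosine_similarity(a, c))  # orthogonal vectors score zero
```

Because the score depends only on direction, scaling an embedding does not change its similarity to other embeddings.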

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False, 'architecture': 'PeftModelForFeatureExtraction'})
  (1): Pooling({'word_embedding_dimension': 2560, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': True, 'include_prompt': True})
  (2): Normalize()
)
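The architecture printout indicates last-token pooling followed by normalization. A minimal sketch of those two steps, assuming the pooling layer selects the last non-padding token according to the attention mask, could look like this. The function names and the toy token vectors are hypothetical; the real modules operate on tensors produced by the Transformer layer.

```python
import math

def last_token_pool(token_embeddings, attention_mask):
    """Pick, per sequence, the embedding of the last non-padding token."""
    pooled = []
    for vectors, mask in zip(token_embeddings, attention_mask):
        last = max(i for i, m in enumerate(mask) if m == 1)
        pooled.append(vectors[last])
    return pooled

def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Two toy sequences of three 2-dimensional token vectors (hypothetical values).
token_embeddings = [
    [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]],  # right-padded: last real token at index 1
    [[9.0, 9.0], [0.0, 1.0], [0.0, 2.0]],  # left-padded: last real token at index 2
]
attention_mask = [
    [1, 1, 0],
    [0, 1, 1],
]

pooled = last_token_pool(token_embeddings, attention_mask)
sentence_embeddings = [l2_normalize(v) for v in pooled]
print(sentence_embeddings)
```

After normalization every sentence embedding has unit length, so a plain dot product between two embeddings equals their cosine similarity.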

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Thermostatic/qwen3-4b-embeddings-akkadian")
# Run inference
sentences = [
    '37 ku-ta-ni 34 {túg}šu-ru-tum a-na lá-qé-pí-im áp-qí-id IGI i-dí-{d}IŠKUR IGI i-dí-a-šur DUMU sú-e-ta-ta 7 TÚG.HI.A ša li-wi-tim IGI a-šùr-SIPA a-dí-šu-um',
    "37 -textiles (and) 34 dark textiles I entrusted to Lā-qēpum in the presence of Iddin-Adad and of Iddin-Aššur, son of Suettata. 7 textiles for wrapping I gave him in the presence of Aššur-rē'ī.",
    'To Ešarra and Ab-šalim from Ennam-Aššur: 10 shekels of silver and an undergarment sealed by me is for Ešarra. 10 shekels of silver and 2 sashes are for Ab-šalim and the girl. 2 shekels of silver is for <big_gap> sister Ištar-lamassī <big_gap>',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 2560]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7541, 0.0110],
#         [0.7541, 1.0000, 0.0221],
#         [0.0110, 0.0221, 1.0000]])
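The similarity matrix can be used directly for retrieval. The following sketch takes the scores from the printed tensor above and, for each sentence, looks up the best-matching other sentence. The helper `best_match` is a hypothetical name introduced for this example; it works on plain lists so it runs without loading the model.

```python
# Similarity scores copied from the printed tensor above.
similarities = [
    [1.0000, 0.7541, 0.0110],
    [0.7541, 1.0000, 0.0221],
    [0.0110, 0.0221, 1.0000],
]

def best_match(scores, query_index):
    """Index of the highest-scoring sentence other than the query itself."""
    candidates = [
        (score, j)
        for j, score in enumerate(scores[query_index])
        if j != query_index
    ]
    return max(candidates)[1]

for i in range(len(similarities)):
    print(i, "->", best_match(similarities, i))
```

Sentence 0 (the transliteration) and sentence 1 (its translation) select each other, while the unrelated letter in sentence 2 scores close to zero against both.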

Evaluation

Metrics

Semantic Similarity

Metric          | Value
pearson_cosine  | 0.9124
spearman_cosine | 0.8625
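Assuming these metric names carry their usual Sentence Transformers meaning, they are the Pearson and Spearman correlations between the model's cosine similarities and gold similarity scores on a validation set. The sketch below shows how such correlations can be computed in plain Python. The `gold` and `cosine` lists are made-up illustration values, not the model's evaluation data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up values for illustration only; NOT the model's evaluation data.
gold = [0.0, 0.2, 0.5, 0.9, 1.0]
cosine = [0.05, 0.10, 0.70, 0.60, 0.95]

print("pearson_cosine :", pearson(gold, cosine))
print("spearman_cosine:", spearman(gold, cosine))
```

Spearman only looks at the ordering of the scores, which is why it can differ from Pearson when the relationship is monotonic but not linear.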

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,137 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
            | anchor                                               | positive
    type    | string                                               | string
    details | min: 13 tokens; mean: 229.79 tokens; max: 579 tokens | min: 7 tokens; mean: 137.38 tokens; max: 442 tokens
  • Samples:
    Sample 1
      anchor: 1 ma-na KÙ.BABBAR big_gap 4 {túg}ku-ta-ni big_gap ni-ik-na-x- big_gap KÙ.BABBAR a-ha-ma big_gap 2 GÍN KÙ.BABBAR big_gap ša áb-na-tim kà-ú-nam big_gap áb-na-tim big_gap uk-ta-in big_gap KÙ.BABBAR i-za-az big_gap ŠU.NÍGIN 1 ma-na 2 GÍN KÙ.BABBAR i li-bi big_gap IGI pì-lá-ah- big_gap IGI a-šur-na-da šu-ma KÙ.BABBAR a-na big_gap lá iš-ta-qá-al big_gap iš-tù ha-mu-uš-tim ša a-šur-be-el-a-wa-tim 1 ma-na-um 3 GÍN.TA ṣí-ib-tám ú-ṣa-áb i-na ITU.KAM a-ma-nu-šu-um
      positive: 1 mina of silver 4 kutānu-textiles silver; further, 2 shekels of silver of the stones confirm He has confirmed the stones. The silver stands ready. In all: 1 mina 2 shekels of silver is owed by Witnessed by Pilah- , by Aššur-nādā. If he has not paid the silver in I shall count interest for him reckoned from the week of Aššur-bēl-awātim at the rate 3 shekels per mina per month.
    Sample 2
      anchor: ŠU.NÍGIN KÙ.BABBAR-pì-kà 15 ma-na 10 GÍN lu ša AN.NA ú ṣú-ba-tí-kà ku-nu-ki-ni ṣí-li-a na-áš-a-ku-um
      positive: Total of your silver: 15 minas 10 shekels, Ṣilliya brings you under our seal - both that from the tin and that from your textiles.
    Sample 3
      anchor: 1 ma-na 7.5 GÍN KÙ.BABBAR ṣa-ru-pá-am i-ṣé-er a-mur-IŠTAR DUMU da-da e-la-ma i-šu iš-tù ha-muš-tim ša a-la-hi-im ú {d}MAR.TU-ba-ni a-na 11 ha-am-ša-tim i-ša-qal šu-ma lá iš-qú-ul 1½ GÍN.TA ṣí-ib-tám a-na ma-na-im i-na ITU.1.KAM ú-ṣa-áb ITU.KAM ša sà-ra-tim li-mu-um ša qá-té DINGIR-šu-GAL DUMU ba-zi-a IGI im-dí-lim DUMU šu-lá-ba-an IGI e-me-me-i DUMU a-zu-ta-a
      positive: 1 mina 7.5 shekels of refined silver Āmur-Ištar, son of Dada, owes to Elamma. From the week of Ali-ahum and Amurrum-bāni he will pay in 11 weeks; if he does not pay he will add 1.5 shekel as interest per mina per month. Month II, eponymy of the successor of Ilšu-rabi, son of Baziya. In the presence of Imdī-ilum, son of Šu-Labān, of Ememe'i, son of Azutaya.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
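To make the loss parameters concrete, here is a minimal plain-Python sketch of the idea behind MultipleNegativesRankingLoss: each anchor is scored against every positive in the batch using cosine similarity multiplied by the scale, and cross-entropy rewards ranking the anchor's own positive first. This is an illustration under those assumptions, not the library's implementation; the function names and toy vectors are hypothetical.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mnrl_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where anchor i should rank positive i above all others."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        # Every positive in the batch acts as a candidate; the others are negatives.
        logits = [scale * cos_sim(anchor, p) for p in positives]
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

# Toy 2-dimensional embeddings (hypothetical values).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[2.0, 0.0], [0.0, 3.0]]  # positive i points the same way as anchor i
swapped = [[0.0, 3.0], [2.0, 0.0]]  # positives assigned to the wrong anchors

print(mnrl_loss(anchors, aligned))  # close to zero
print(mnrl_loss(anchors, swapped))  # close to the scale value
```

Since every other positive in a batch serves as a negative, a duplicated text within one batch would act as a false negative, which is one reason to pair this loss with the no_duplicates batch sampler.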
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • learning_rate: 2e-05
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • dataloader_pin_memory: False
  • batch_sampler: no_duplicates
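The step counts implied by these settings can be worked out by hand. The sketch below assumes a single device, no gradient accumulation, 3 epochs over the 1,137 training samples, and that the batch sampler does not change the number of batches per epoch. The `lr_at` function is a hypothetical helper that mimics linear warmup followed by cosine decay; it is not the Transformers scheduler itself.

```python
import math

def lr_at(step, base_lr, warmup_steps, total_steps):
    """Learning rate with linear warmup followed by cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

train_samples = 1137   # training set size from this card
batch_size = 32        # per_device_train_batch_size
epochs = 3             # num_train_epochs
base_lr = 2e-05        # learning_rate
warmup_ratio = 0.1     # warmup_ratio

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(warmup_ratio * total_steps)

print(steps_per_epoch, total_steps, warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, lr_at(step, base_lr, warmup_steps, total_steps))
```

Under these assumptions the learning rate rises linearly to 2e-05 during the first tenth of training and then decays along a half cosine towards zero.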

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: None
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: False
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch  | Step | Training Loss | akkadian_val_spearman_cosine
0.1389 | 5    | 2.402         | -
0.2778 | 10   | 2.3992        | -
0.4167 | 15   | 2.1648        | -
0.5556 | 20   | 1.8975        | -
0.6944 | 25   | 1.4115        | 0.7776
0.8333 | 30   | 1.0211        | -
0.9722 | 35   | 0.6742        | -
1.1111 | 40   | 0.4176        | -
1.25   | 45   | 0.2966        | -
1.3889 | 50   | 0.2419        | 0.8580
1.5278 | 55   | 0.2028        | -
1.6667 | 60   | 0.1523        | -
1.8056 | 65   | 0.1445        | -
1.9444 | 70   | 0.106         | -
2.0833 | 75   | 0.0906        | 0.8614
2.2222 | 80   | 0.1198        | -
2.3611 | 85   | 0.0625        | -
2.5    | 90   | 0.1019        | -
2.6389 | 95   | 0.0474        | -
2.7778 | 100  | 0.0945        | 0.8625
2.9167 | 105  | 0.1227        | -
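One way to read the log is to check how the Epoch column relates to the Step column and to pick out the best validation score. The short sketch below does both for the rows that carry a validation value. The value of `steps_per_epoch` is an assumption inferred from the first log row (step 5 at epoch 0.1389), not a figure stated in the card.

```python
# (step, epoch, validation Spearman) rows copied from the training log;
# only rows that report a validation score are listed.
eval_rows = [
    (25, 0.6944, 0.7776),
    (50, 1.3889, 0.8580),
    (75, 2.0833, 0.8614),
    (100, 2.7778, 0.8625),
]

steps_per_epoch = 36  # assumption: inferred from step 5 <-> epoch 0.1389

for step, epoch, _ in eval_rows:
    # The Epoch column should equal step / steps_per_epoch up to rounding.
    assert abs(step / steps_per_epoch - epoch) < 1e-4

best_step, best_epoch, best_score = max(eval_rows, key=lambda row: row[2])
print(best_step, best_epoch, best_score)
```

The highest logged validation score, at step 100, equals the spearman_cosine value reported in the Evaluation section.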

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.2.0
  • Transformers: 4.57.6
  • PyTorch: 2.9.0+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.3.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}