How to use sunjupskilling/sunj-bge-base-en-v1.5 with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("sunjupskilling/sunj-bge-base-en-v1.5")
sentences = [
"Can you tell me about the origin of the word 'Shehnai'?",
"Krishan Kant (28 February 1927 – 27 July 2002) was the tenth Vice President of India from 1997 until his death. Previously, he was Governor of Andhra Pradesh from 1990 to 1997.",
"Acherontia lachesis is a large (up to 13 cm wingspan) Sphingid moth found in India and much of the Oriental region, one of the three species of Death's-head Hawkmoth, also known as the \"Bee Robber\".",
"A Shehnai is a South Asian music instrument which is normally played at marriages and other ceremonies, rites and rituals. The word itself is of Muslim/Turkish origin, combining 'Sheh' (or 'Shah') 'Royal' and '-Nai' or 'Ney', a type of Flute. A version of the \"Shehnai\", the \"Surnai\", is also played in the Northern and North-western areas of India and Pakistan, in particular at traditional Polo matches."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
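The architecture above pools with the CLS token (`pooling_mode_cls_token: True`) and then applies `Normalize()`. As a rough illustration only, the two steps can be sketched in plain Python on toy data. The helper names `cls_pool` and `l2_normalize` are hypothetical and not part of the sentence-transformers API, and the 4-dimensional vectors stand in for the real 768-dimensional token outputs.

```python
import math

def cls_pool(token_embeddings):
    """Use the first token's vector (the [CLS] position) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]

# Toy transformer output: 3 tokens, 4 dimensions each (illustrative values).
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.0, 0.8, 0.0]
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.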
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sunjupskilling/sunj-bge-base-en-v1.5")
# Run inference
sentences = [
'Who was Lal Bahadur Shastri?',
'Lal Bahadur Shastri (, , 2 October 1904\xa0– 11 January 1966) was an Indian politician. He was the 2nd Prime Minister of India from 1964 to 1966. He was a senior leader of the Indian National Congress political party.',
'Rex Vernon Whitehead (26 October 1948 – 26 June 2014) was an Australian Test cricket match umpire and cricketer. He umpired four Test matches between 1981 and 1982. His first match was between Australia and India in Sydney on 2 January to 4 January 1981. Altogether, he umpired 15 first-class matches in his career between 1979 and 1983.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
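For semantic search, the similarity scores are typically used to rank candidate contexts against a question. The following is a minimal sketch of that ranking step, assuming unit-length embeddings (as produced by the `Normalize()` module). The function `rank_contexts` and the 2-dimensional vectors are hypothetical stand-ins; in practice the inputs would come from `model.encode(...)`.

```python
def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))

def rank_contexts(query_embedding, context_embeddings):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [dot(query_embedding, c) for c in context_embeddings]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]

# Toy unit-length vectors standing in for real 768-dimensional embeddings.
query = [1.0, 0.0]
contexts = [[0.0, 1.0], [0.6, 0.8], [0.8, 0.6]]

ranking = rank_contexts(query, contexts)
print([index for index, _ in ranking])  # [2, 1, 0]
```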
Columns: question and context
| | question | context |
|---|---|---|
| type | string | string |
| details | | |

| question | context |
|---|---|
| What is Basil commonly known as? | Basil ("Ocimum basilicum") ( or ) is a plant of the Family Lamiaceae. It is also known as Sweet Basil or Tulsi. It is a tender low-growing herb that is grown as a perennial in warm, tropical climates. Basil is originally native to India and other tropical regions of Asia. It has been cultivated there for more than 5,000 years. It is prominently featured in many cuisines throughout the world. Some of them are Italian, Thai, Vietnamese and Laotian cuisines. It grows to between 30–60 cm tall. It has light green, silky leaves 3–5 cm long and 1–3 cm broad. The leaves are opposite each other. The flowers are quite big. They are white in color and arranged as a spike. |
| Where is Basil originally native to? | Basil ("Ocimum basilicum") ( or ) is a plant of the Family Lamiaceae. It is also known as Sweet Basil or Tulsi. It is a tender low-growing herb that is grown as a perennial in warm, tropical climates. Basil is originally native to India and other tropical regions of Asia. It has been cultivated there for more than 5,000 years. It is prominently featured in many cuisines throughout the world. Some of them are Italian, Thai, Vietnamese and Laotian cuisines. It grows to between 30–60 cm tall. It has light green, silky leaves 3–5 cm long and 1–3 cm broad. The leaves are opposite each other. The flowers are quite big. They are white in color and arranged as a spike. |
| What is the significance of the Roerich Pact? | The Roerich Pact is a treaty on Protection of Artistic and Scientific Institutions and Historic Monuments, signed by the representatives of 21 states in the Oval Office of the White House on 15 April 1935. As of January 1, 1990, the Roerich Pact had been ratified by ten nations: Brazil, Chile, Colombia, Cuba, the Dominican Republic, El Salvador, Guatemala, Mexico, the United States, and Venezuela. It went into effect on 26 August 1935. The Government of India approved the Treaty in 1948, but did not take any further formal action. The Roerich Pact is also known as "Pax Cultura" ("Cultural Peace" or "Peace through Culture"). The most important part of the Roerich Pact is the legal recognition that the protection of culture is always more important than any military necessity. |
MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
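To make the two parameters concrete, here is a simplified sketch of how a multiple-negatives ranking loss is commonly computed: each question is scored against every context in the batch with cosine similarity, the scores are multiplied by `scale`, and cross-entropy is taken with the matching context as the correct class. This is an illustrative pure-Python version under those assumptions, not the library's implementation; the function names and toy vectors are hypothetical.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def multiple_negatives_ranking_loss(question_embs, context_embs, scale=20.0):
    """Mean cross-entropy where context i is the positive for question i
    and all other contexts in the batch act as negatives."""
    losses = []
    for i, question in enumerate(question_embs):
        logits = [scale * cos_sim(question, context) for context in context_embs]
        # Numerically stable log-sum-exp over the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(logit - max_logit) for logit in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch of two (question, context) pairs.
questions = [[1.0, 0.0], [0.0, 1.0]]
aligned_contexts = [[1.0, 0.0], [0.0, 1.0]]   # each question matches its own context
swapped_contexts = [[0.0, 1.0], [1.0, 0.0]]   # each question matches the other context

loss_aligned = multiple_negatives_ranking_loss(questions, aligned_contexts)
loss_swapped = multiple_negatives_ranking_loss(questions, swapped_contexts)
print(loss_aligned < loss_swapped)  # True
```

A larger `scale` sharpens the softmax, so well-separated pairs contribute almost no loss while misranked pairs are penalized heavily.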
Columns: question and context
| | question | context |
|---|---|---|
| type | string | string |
| details | | |

| question | context |
|---|---|
| What are the bases of political relations between India and Ireland? | Indo-Irish relations between the Republic of Ireland and the Republic of India picked up steam during the freedom struggles of the respective countries against a common imperial empire in the United Kingdom. Political relations between the two states have largely been based on socio-cultural ties, although political and economic ties have also helped build relations. Indians recognise Northern Ireland as part of its country. |
| When did Rex Whitehead umpire his first Test match? | Rex Vernon Whitehead (26 October 1948 – 26 June 2014) was an Australian Test cricket match umpire and cricketer. He umpired four Test matches between 1981 and 1982. His first match was between Australia and India in Sydney on 2 January to 4 January 1981. Altogether, he umpired 15 first-class matches in his career between 1979 and 1983. |
| What can you tell me about Nayaganj? | Nayaganj is a village in Vaishali District, Bihar, India. It is very close to the river Ganga. It is also a postal office of India |
MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
Non-default hyperparameters:
eval_strategy: steps
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
learning_rate: 3e-06
weight_decay: 0.03
max_steps: 332
warmup_ratio: 0.1
warmup_steps: 1
fp16: True
batch_sampler: no_duplicates

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 3e-06
weight_decay: 0.03
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 3.0
max_steps: 332
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 1
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0.1190 | 10 | 0.0998 | - |
| 0.2381 | 20 | 0.082 | 0.0253 |
| 0.3571 | 30 | 0.0843 | - |
| 0.4762 | 40 | 0.0496 | 0.0138 |
| 0.5952 | 50 | 0.0731 | - |
| 0.7143 | 60 | 0.0244 | 0.0093 |
| 0.8333 | 70 | 0.0338 | - |
| 0.9524 | 80 | 0.0484 | 0.0075 |
| 1.0714 | 90 | 0.0258 | - |
| 1.1905 | 100 | 0.0226 | 0.0067 |
| 1.3095 | 110 | 0.0331 | - |
| 1.4286 | 120 | 0.0193 | 0.0061 |
| 1.5476 | 130 | 0.0299 | - |
| 1.6667 | 140 | 0.0146 | 0.0055 |
| 1.7857 | 150 | 0.0228 | - |
| 1.9048 | 160 | 0.0543 | 0.0035 |
| 2.0238 | 170 | 0.0368 | - |
| 2.1429 | 180 | 0.025 | 0.0031 |
| 2.2619 | 190 | 0.0113 | - |
| 2.3810 | 200 | 0.0123 | 0.0029 |
| 2.5 | 210 | 0.0301 | - |
| 2.6190 | 220 | 0.0358 | 0.0027 |
| 2.7381 | 230 | 0.009 | - |
| 2.8571 | 240 | 0.01 | 0.0024 |
| 2.9762 | 250 | 0.0152 | - |
| 3.0952 | 260 | 0.013 | 0.0021 |
| 3.2143 | 270 | 0.0121 | - |
| 3.3333 | 280 | 0.012 | 0.0020 |
| 3.4524 | 290 | 0.0168 | - |
| 3.5714 | 300 | 0.0292 | 0.0019 |
| 3.6905 | 310 | 0.054 | - |
| 3.8095 | 320 | 0.0227 | 0.0019 |
| 3.9286 | 330 | 0.0144 | - |
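The log runs to roughly epoch 3.93 even though `num_train_epochs` is 3.0. That is consistent with `max_steps: 332` taking precedence over the epoch count, and with `warmup_steps: 1` taking precedence over `warmup_ratio: 0.1`, which is how the Hugging Face `TrainingArguments` resolve these settings as far as I understand them. The arithmetic below is a small sketch that derives the steps per epoch from the first logged row; the variable names are illustrative only.

```python
import math

# Values copied from the hyperparameters and the first row of the training log.
logged_step = 10
logged_epoch = 0.1190
max_steps = 332
warmup_ratio = 0.1
warmup_steps = 1

# Optimizer steps per epoch implied by the log (step / epoch fraction).
steps_per_epoch = round(logged_step / logged_epoch)

# Number of epochs actually covered when training stops at max_steps.
epochs_covered = max_steps / steps_per_epoch

# Assumption: a positive warmup_steps overrides warmup_ratio.
effective_warmup = (
    warmup_steps if warmup_steps > 0 else math.ceil(max_steps * warmup_ratio)
)

print(steps_per_epoch)           # 84
print(round(epochs_covered, 2))  # 3.95
print(effective_warmup)          # 1
```

With 84 steps per epoch, step 330 corresponds to epoch 330 / 84 ≈ 3.9286, which matches the last row of the table.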
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model: BAAI/bge-base-en-v1.5