How to use Nessrine9/finetuned-snli-MiniLM-L12-v2-100k-en-fr with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Nessrine9/finetuned-snli-MiniLM-L12-v2-100k-en-fr")
sentences = [
    "The church has granite statues of Jesus and the Apostles adorning its porch .",
    "There were no statues in the church .",
    "L' Afrique du sud et le reste de l' Afrique sont les mêmes .",
    "Tours on foot are a great way to see LA .",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L12-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
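The module list above reads as a pipeline: the Transformer produces one vector per token, Pooling averages them (pooling_mode_mean_tokens is True), and Normalize() scales the result to unit length. The following is a minimal plain-Python sketch of those last two steps on toy data. The function names `mean_pool` and `l2_normalize` are hypothetical helpers for illustration, not the library's API, and the 4-dimensional vectors stand in for the model's 384 dimensions.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, ignoring padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


# Toy input: three token positions, the last one is padding.
tokens = [
    [1.0, 2.0, 0.0, 0.0],
    [3.0, 0.0, 0.0, 4.0],
    [9.0, 9.0, 9.0, 9.0],  # padded position, excluded by the mask
]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final embedding has unit length, the dot product of two embeddings equals their cosine similarity.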
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Nessrine9/finetuned-snli-MiniLM-L12-v2-100k-en-fr")
# Run inference
sentences = [
    "L' ancien n' est pas une classification juridique qui entraîne une perte automatique de ces droits .",
    "Ils voulaient plaider pour les personnes âgées .",
    "Les villes grecques d' Anatolie ont été exclues de l' appartenance à la Confédération Delian .",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Dataset snli-dev, evaluated with EmbeddingSimilarityEvaluator:

| Metric | Value |
|---|---|
| pearson_cosine | 0.3542 |
| spearman_cosine | 0.3593 |
| pearson_manhattan | 0.3494 |
| spearman_manhattan | 0.3583 |
| pearson_euclidean | 0.3498 |
| spearman_euclidean | 0.3593 |
| pearson_dot | 0.3542 |
| spearman_dot | 0.3593 |
| pearson_max | 0.3542 |
| spearman_max | 0.3593 |
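The metrics above are correlations between the similarity score the model assigns to each sentence pair and the pair's gold label. The sketch below shows how a Pearson and a Spearman correlation over cosine scores could be computed in plain Python. It is an illustration on invented toy vectors and labels, not the evaluator's own code, and the helper names are hypothetical. Since the model ends with Normalize(), cosine and dot-product scores coincide, which is consistent with the identical cosine and dot rows in the table.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy sentence-pair embeddings and gold labels (invented for illustration).
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([1.0, 0.0], [1.0, 1.0]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([1.0, 0.0], [-1.0, 0.0]),
]
gold = [1.0, 0.5, 0.5, 0.0]

cos_scores = [cosine(u, v) for u, v in pairs]
print("pearson_cosine:", pearson(cos_scores, gold))
print("spearman_cosine:", spearman(cos_scores, gold))
```

Spearman only looks at the ordering of the scores, so it is unaffected by any monotonic rescaling of the similarities, whereas Pearson measures linear agreement with the labels.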
Columns: sentence_0, sentence_1, and label

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
| sentence_0 | sentence_1 | label |
|---|---|---|
| We 're off ! " | We 're not headed off . | 1.0 |
| Il y en a eu un ici récemment qui me vient à l' esprit que c' est à propos d' une femme que c' est ridicule je veux dire que c' est presque euh ce serait drôle si ce n' était pas si triste je veux dire cette femme cette femme est sortie et a engagé quelqu' un à | Cette femme a engagé quelqu' un récemment pour le faire et s' est fait prendre immédiatement . | 0.5 |
| Gentilello a précisé qu' il n' avait pas critiqué le processus d' examen par les pairs , mais que les panels qui examinent les interventions en matière d' alcool dans l' eds devraient inclure des représentants de la médecine d' urgence . | Gentilello S' est ensuite battu avec un psychiatre sur le parking . | 0.5 |
Loss: CosineSimilarityLoss with these parameters:
{
    "loss_fct": "torch.nn.modules.loss.MSELoss"
}
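With MSELoss as the loss function, this objective penalizes the squared difference between the cosine similarity of the two sentence embeddings and the float label. A minimal plain-Python sketch of that computation follows. It assumes toy 2-dimensional embeddings and invented labels, and the function `cosine_similarity_mse` is a hypothetical helper, not the library's implementation (which operates on batched tensors).

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def cosine_similarity_mse(pairs, labels):
    """Mean squared error between cos(u, v) and the gold label of each pair."""
    errors = [(cosine(u, v) - label) ** 2 for (u, v), label in zip(pairs, labels)]
    return sum(errors) / len(errors)


# Toy batch: embeddings for (sentence_0, sentence_1) and their labels.
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # identical direction, cosine = 1.0
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal, cosine = 0.0
]
labels = [1.0, 0.5]

loss = cosine_similarity_mse(pairs, labels)
print(loss)
```

The first pair contributes no error because its cosine already matches the label; the second contributes (0.0 - 0.5)^2, so training would push those two embeddings closer together.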
Non-default hyperparameters:
eval_strategy: steps
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
num_train_epochs: 4
fp16: True
multi_dataset_batch_sampler: round_robin

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1
num_train_epochs: 4
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.0
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: False
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
eval_use_gather_object: False
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin

| Epoch | Step | Training Loss | snli-dev_spearman_max |
|---|---|---|---|
| 0.08 | 500 | 0.1948 | 0.0484 |
| 0.16 | 1000 | 0.1752 | 0.1177 |
| 0.24 | 1500 | 0.1727 | 0.1136 |
| 0.32 | 2000 | 0.1668 | 0.2050 |
| 0.4 | 2500 | 0.1673 | 0.2227 |
| 0.48 | 3000 | 0.1651 | 0.1760 |
| 0.56 | 3500 | 0.1619 | 0.2195 |
| 0.64 | 4000 | 0.1625 | 0.2308 |
| 0.72 | 4500 | 0.1563 | 0.2405 |
| 0.8 | 5000 | 0.1598 | 0.2773 |
| 0.88 | 5500 | 0.1589 | 0.2359 |
| 0.96 | 6000 | 0.1587 | 0.2084 |
| 1.0 | 6250 | - | 0.2615 |
| 1.04 | 6500 | 0.158 | 0.2958 |
| 1.12 | 7000 | 0.1557 | 0.2887 |
| 1.2 | 7500 | 0.1544 | 0.2960 |
| 1.28 | 8000 | 0.1535 | 0.2977 |
| 1.36 | 8500 | 0.1559 | 0.2546 |
| 1.44 | 9000 | 0.1518 | 0.3201 |
| 1.52 | 9500 | 0.1551 | 0.2894 |
| 1.6 | 10000 | 0.149 | 0.2981 |
| 1.68 | 10500 | 0.152 | 0.3140 |
| 1.76 | 11000 | 0.1484 | 0.3056 |
| 1.84 | 11500 | 0.1497 | 0.3051 |
| 1.92 | 12000 | 0.1522 | 0.2893 |
| 2.0 | 12500 | 0.1503 | 0.2944 |
| 2.08 | 13000 | 0.1496 | 0.3039 |
| 2.16 | 13500 | 0.1462 | 0.3314 |
| 2.24 | 14000 | 0.1505 | 0.2470 |
| 2.32 | 14500 | 0.1457 | 0.3081 |
| 2.4 | 15000 | 0.1478 | 0.3204 |
| 2.48 | 15500 | 0.1464 | 0.3248 |
| 2.56 | 16000 | 0.1442 | 0.3360 |
| 2.64 | 16500 | 0.1437 | 0.3418 |
| 2.72 | 17000 | 0.1416 | 0.3496 |
| 2.8 | 17500 | 0.1434 | 0.3283 |
| 2.88 | 18000 | 0.146 | 0.3246 |
| 2.96 | 18500 | 0.1448 | 0.3352 |
| 3.0 | 18750 | - | 0.3248 |
| 3.04 | 19000 | 0.1445 | 0.3394 |
| 3.12 | 19500 | 0.1423 | 0.3430 |
| 3.2 | 20000 | 0.1415 | 0.3410 |
| 3.28 | 20500 | 0.1411 | 0.3367 |
| 3.36 | 21000 | 0.1445 | 0.3497 |
| 3.44 | 21500 | 0.1383 | 0.3640 |
| 3.52 | 22000 | 0.1408 | 0.3497 |
| 3.6 | 22500 | 0.1374 | 0.3452 |
| 3.68 | 23000 | 0.1401 | 0.3519 |
| 3.76 | 23500 | 0.137 | 0.3582 |
| 3.84 | 24000 | 0.1393 | 0.3610 |
| 3.92 | 24500 | 0.1408 | 0.3575 |
| 4.0 | 25000 | 0.1388 | 0.3593 |
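The Epoch and Step columns can be cross-checked against the hyperparameters with simple arithmetic. The sketch below assumes training ran on a single device, which the card does not state; the other values are taken from the log table and the hyperparameter list.

```python
# Values reported in this model card.
per_device_train_batch_size = 16
gradient_accumulation_steps = 1
num_train_epochs = 4
steps_per_epoch = 6250  # epoch 1.0 is logged at step 6250

# Assumption: a single training device (not stated in the card).
num_devices = 1

samples_per_step = per_device_train_batch_size * gradient_accumulation_steps * num_devices
samples_per_epoch = steps_per_epoch * samples_per_step
total_steps = steps_per_epoch * num_train_epochs

print("samples per epoch:", samples_per_epoch)
print("total steps:", total_steps)
print("epoch at step 500:", 500 / steps_per_epoch)
```

Under that assumption, one epoch covers about 100,000 sentence pairs, consistent with the "100k" in the model name, and the run ends at step 25000 as in the last row of the table. Since dataloader_drop_last is False, the final batch of an epoch may be smaller than 16, so the exact training-set size could be slightly below this figure.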
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Base model: microsoft/MiniLM-L12-H384-uncased