How to use ejun26/minilm-mrpc-clean-retrieval with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("ejun26/minilm-mrpc-clean-retrieval")
sentences = [
"The transaction will expand Callebaut 's sales revenues from its consumer products business to 45 percent from 23 percent .",
"The transaction will expand Callebaut 's sales revenues from its consumer products business by around 45 percent to some one-third of total sales .",
"Yeager said the incident appeared to be isolated , but the suspect showed tendencies of being a prior offender .",
"\" We must not engage in borough warfare , \" the Comptroller William Thompson told the Council , according to his written testimony ."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2 on the nyu-mll/glue dataset. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
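The module listing above describes a three-stage pipeline: token embeddings from the Transformer, mean pooling over tokens, then L2 normalization. The following is a minimal pure-Python sketch of the last two stages, assuming token embeddings and an attention mask are already available. The function names `mean_pool` and `l2_normalize` are illustrative and are not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padded positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, x in enumerate(vec):
                total[i] += x
    count = max(count, 1)  # guard against an all-padding input
    return [x / count for x in total]

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]

# Toy input: 3 token positions, 2 dimensions (the real model uses 384).
# The last position is padding and must not affect the result.
tokens = [[1.0, 0.0], [5.0, 8.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)        # mean of the first two tokens
sentence_vec = l2_normalize(pooled)     # unit-length sentence embedding
print(pooled, sentence_vec)
```

Because the final stage normalizes every embedding to unit length, downstream similarity scores depend only on the direction of the vectors.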
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ejun26/minilm-mrpc-clean-retrieval")
# Run inference
sentences = [
'In a statement later , he said it appeared his side may have fallen a bit short .',
'Zilkha conceded in a statement issued today that his group may have fallen " a bit short . "',
"U.S. law enforcement officials are sneering at Dar Heatherington 's version of of the events -- including a police conspiracy to discredit her -- which thrust her into the public spotlight .",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
mrpc-validation-clean-v2, evaluated with models.evaluator.CleanInformationRetrievalEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.0 |
| cosine_accuracy@3 | 1.0 |
| cosine_accuracy@5 | 1.0 |
| cosine_accuracy@10 | 1.0 |
| cosine_precision@1 | 0.0 |
| cosine_precision@3 | 0.3357 |
| cosine_precision@5 | 0.2014 |
| cosine_precision@10 | 0.1007 |
| cosine_recall@1 | 0.0 |
| cosine_recall@3 | 1.0 |
| cosine_recall@5 | 1.0 |
| cosine_recall@10 | 1.0 |
| cosine_ndcg@10 | 0.6309 |
| cosine_mrr@10 | 0.4994 |
| cosine_map@100 | 0.5 |
| dot_accuracy@1 | 0.0 |
| dot_accuracy@3 | 1.0 |
| dot_accuracy@5 | 1.0 |
| dot_accuracy@10 | 1.0 |
| dot_precision@1 | 0.0 |
| dot_precision@3 | 0.3357 |
| dot_precision@5 | 0.2014 |
| dot_precision@10 | 0.1007 |
| dot_recall@1 | 0.0 |
| dot_recall@3 | 1.0 |
| dot_recall@5 | 1.0 |
| dot_recall@10 | 1.0 |
| dot_ndcg@10 | 0.6309 |
| dot_mrr@10 | 0.4994 |
| dot_map@100 | 0.5 |
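To make the @k metrics in the table above easier to read, here is a small sketch of the textbook per-query definitions of accuracy@k, precision@k, recall@k, and reciprocal rank. It is not the code of `CleanInformationRetrievalEvaluator`, which this card does not show; the function name `retrieval_metrics` and the document ids are hypothetical.

```python
def retrieval_metrics(ranked_ids, relevant_ids, k):
    """Standard per-query ranking metrics over the top-k retrieved ids."""
    top_k = ranked_ids[:k]
    hits = [doc_id in relevant_ids for doc_id in top_k]
    num_hits = sum(hits)

    # Reciprocal rank: 1 / position of the first relevant result, else 0.
    reciprocal_rank = 0.0
    for position, hit in enumerate(hits, start=1):
        if hit:
            reciprocal_rank = 1.0 / position
            break

    return {
        "accuracy": 1.0 if num_hits > 0 else 0.0,
        "precision": num_hits / k,
        "recall": num_hits / len(relevant_ids),
        "rr": reciprocal_rank,
    }

# One query whose single relevant document is ranked second.
ranked = ["d7", "d2", "d5", "d1", "d9"]
relevant = {"d2"}

at_1 = retrieval_metrics(ranked, relevant, k=1)
at_3 = retrieval_metrics(ranked, relevant, k=3)
print(at_1)
print(at_3)
```

In this toy case accuracy@1 is 0, accuracy@3 is 1, precision@3 is 1/3, and the reciprocal rank is 0.5. Averaged over queries, a pattern like this would produce figures close to those reported, although the card itself does not say why accuracy@1 is 0.0.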
mrpc-test-clean-v2, evaluated with models.evaluator.CleanInformationRetrievalEvaluator:

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.0 |
| cosine_accuracy@3 | 0.9799 |
| cosine_accuracy@5 | 0.9895 |
| cosine_accuracy@10 | 0.9965 |
| cosine_precision@1 | 0.0 |
| cosine_precision@3 | 0.3275 |
| cosine_precision@5 | 0.1984 |
| cosine_precision@10 | 0.0999 |
| cosine_recall@1 | 0.0 |
| cosine_recall@3 | 0.9799 |
| cosine_recall@5 | 0.9891 |
| cosine_recall@10 | 0.9961 |
| cosine_ndcg@10 | 0.623 |
| cosine_mrr@10 | 0.4912 |
| cosine_map@100 | 0.4915 |
| dot_accuracy@1 | 0.0 |
| dot_accuracy@3 | 0.9799 |
| dot_accuracy@5 | 0.9895 |
| dot_accuracy@10 | 0.9965 |
| dot_precision@1 | 0.0 |
| dot_precision@3 | 0.3275 |
| dot_precision@5 | 0.1984 |
| dot_precision@10 | 0.0999 |
| dot_recall@1 | 0.0 |
| dot_recall@3 | 0.9799 |
| dot_recall@5 | 0.9891 |
| dot_recall@10 | 0.9961 |
| dot_ndcg@10 | 0.623 |
| dot_mrr@10 | 0.4912 |
| dot_map@100 | 0.4915 |
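The cosine_* and dot_* rows report the same values. A likely reason is the final Normalize() module in the architecture: for unit-length vectors, cosine similarity and the dot product are the same quantity, so both scoring functions rank documents identically. The sketch below checks this on toy vectors; the helper names are illustrative only.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

raw_u, raw_v = [3.0, 4.0], [1.0, 2.0]
u, v = normalize(raw_u), normalize(raw_v)

# Before normalization the two scores differ; after it they agree.
print(dot(raw_u, raw_v), cosine(raw_u, raw_v))
print(dot(u, v), cosine(u, v))
```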
Columns: text1, text2, and label

| | text1 | text2 | label |
|---|---|---|---|
| type | string | string | int |

| text1 | text2 | label |
|---|---|---|
| Amrozi accused his brother , whom he called " the witness " , of deliberately distorting his evidence . | Referring to him as only " the witness " , Amrozi accused his brother of deliberately distorting his evidence . | 1 |
| Yucaipa owned Dominick 's before selling the chain to Safeway in 1998 for $ 2.5 billion . | Yucaipa bought Dominick 's in 1995 for $ 693 million and sold it to Safeway for $ 1.8 billion in 1998 . | 0 |
| They had published an advertisement on the Internet on June 10 , offering the cargo for sale , he added . | On June 10 , the ship 's owners had published an advertisement on the Internet , offering the explosives for sale . | 1 |
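The card names OnlineContrastiveLoss as the training loss, and the samples above show its inputs: sentence pairs with a binary paraphrase label. The sketch below illustrates the general idea of an online contrastive loss, that is, a contrastive loss applied only to the hard pairs in a batch. It is a simplified illustration and not the library's implementation; the margin of 0.5 and the use of precomputed pair distances are assumptions, since the card does not state them.

```python
def online_contrastive_loss(distances, labels, margin=0.5):
    """Contrastive loss restricted to hard pairs (simplified sketch).

    distances: one embedding distance per sentence pair (smaller = closer)
    labels:    1 for paraphrase pairs, 0 for non-paraphrase pairs
    margin:    assumed value; negatives farther apart than this cost nothing
    """
    pos = [d for d, y in zip(distances, labels) if y == 1]
    neg = [d for d, y in zip(distances, labels) if y == 0]

    # Hard negatives: non-paraphrases closer than the farthest paraphrase pair.
    neg_cutoff = max(pos) if pos else float("inf")
    hard_neg = [d for d in neg if d < neg_cutoff]

    # Hard positives: paraphrases farther apart than the closest non-paraphrase.
    pos_cutoff = min(neg) if neg else float("-inf")
    hard_pos = [d for d in pos if d > pos_cutoff]

    positive_loss = sum(d ** 2 for d in hard_pos)                     # pull together
    negative_loss = sum(max(margin - d, 0.0) ** 2 for d in hard_neg)  # push apart
    return positive_loss + negative_loss

# Two paraphrase pairs (label 1) and two non-paraphrase pairs (label 0).
distances = [0.1, 0.6, 0.3, 0.9]
labels = [1, 1, 0, 0]
loss = online_contrastive_loss(distances, labels)
print(loss)
```

Only the paraphrase pair at distance 0.6 and the non-paraphrase pair at distance 0.3 contribute here; the two pairs that are already well separated are ignored.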
Loss: OnlineContrastiveLoss

Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 32
- learning_rate: 2e-05
- num_train_epochs: 5
- warmup_ratio: 0.1
- load_best_model_at_end: True

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 5
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | mrpc-test-clean-v2_cosine_map@100 | mrpc-validation-clean-v2_cosine_map@100 |
|---|---|---|---|---|
| 0 | 0 | - | - | 0.4961 |
| 0.0870 | 10 | 1.6166 | - | - |
| 0.1739 | 20 | 1.668 | - | - |
| 0.2609 | 30 | 1.5081 | - | - |
| 0.3478 | 40 | 1.3996 | - | - |
| 0.4348 | 50 | 1.2969 | - | 0.4985 |
| 0.5217 | 60 | 1.1771 | - | - |
| 0.6087 | 70 | 0.9977 | - | - |
| 0.6957 | 80 | 1.1213 | - | - |
| 0.7826 | 90 | 1.139 | - | - |
| 0.8696 | 100 | 1.0821 | - | 0.5 |
| 0.9565 | 110 | 1.1488 | - | - |
| 1.0435 | 120 | 0.932 | - | - |
| 1.1304 | 130 | 0.794 | - | - |
| 1.2174 | 140 | 0.9996 | - | - |
| 1.3043 | 150 | 0.9328 | - | 0.5 |
| 1.3913 | 160 | 1.1032 | - | - |
| 1.4783 | 170 | 0.9692 | - | - |
| 1.5652 | 180 | 0.9501 | - | - |
| 1.6522 | 190 | 0.7863 | - | - |
| 1.7391 | 200 | 0.8454 | - | 0.5 |
| 1.8261 | 210 | 0.9311 | - | - |
| 1.9130 | 220 | 0.8134 | - | - |
| 2.0 | 230 | 1.0013 | - | - |
| 2.0870 | 240 | 0.7564 | - | - |
| 2.1739 | 250 | 0.9165 | - | 0.5 |
| 2.2609 | 260 | 0.7668 | - | - |
| 2.3478 | 270 | 0.6587 | - | - |
| 2.4348 | 280 | 0.5904 | - | - |
| 2.5217 | 290 | 0.7431 | - | - |
| 2.6087 | 300 | 0.6133 | - | 0.5 |
| 2.6957 | 310 | 0.5994 | - | - |
| 2.7826 | 320 | 0.6256 | - | - |
| 2.8696 | 330 | 0.7294 | - | - |
| 2.9565 | 340 | 0.7527 | - | - |
| 3.0435 | 350 | 0.6908 | - | 0.5 |
| 3.1304 | 360 | 0.6455 | - | - |
| 3.2174 | 370 | 0.3765 | - | - |
| 3.3043 | 380 | 0.5955 | - | - |
| 3.3913 | 390 | 0.6239 | - | - |
| 3.4783 | 400 | 0.6666 | - | 0.5 |
| 3.5652 | 410 | 0.6498 | - | - |
| 3.6522 | 420 | 0.6363 | - | - |
| 3.7391 | 430 | 0.7046 | - | - |
| 3.8261 | 440 | 0.4384 | - | - |
| 3.9130 | 450 | 0.6721 | - | 0.5 |
| 4.0 | 460 | 0.5341 | - | - |
| 4.0870 | 470 | 0.4459 | - | - |
| 4.1739 | 480 | 0.4153 | - | - |
| 4.2609 | 490 | 0.5116 | - | - |
| 4.3478 | 500 | 0.4221 | - | 0.5 |
| 4.4348 | 510 | 0.4696 | - | - |
| 4.5217 | 520 | 0.4552 | - | - |
| 4.6087 | 530 | 0.5403 | - | - |
| 4.6957 | 540 | 0.367 | - | - |
| 4.7826 | 550 | 0.3275 | - | 0.5 |
| 4.8696 | 560 | 0.4016 | - | - |
| 4.9565 | 570 | 0.4889 | - | - |
| 5.0 | 575 | - | 0.4915 | - |
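The training log can be related to the hyperparameters with a little arithmetic: 575 steps over 5 epochs gives 115 steps per epoch, which is where fractional epochs such as 0.0870 at step 10 come from. The sketch below also traces a linear schedule with warm-up for learning_rate 2e-05 and warmup_ratio 0.1. Rounding the warm-up length up with `ceil` is an assumption about how the trainer converts the ratio into steps, and `linear_lr` is an illustrative helper rather than a library function.

```python
import math

TOTAL_STEPS = 575      # final step in the training log
NUM_EPOCHS = 5         # num_train_epochs
BASE_LR = 2e-05        # learning_rate
WARMUP_RATIO = 0.1     # warmup_ratio

steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS
# Assumption: the warm-up length is the ratio times total steps, rounded up.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)

def linear_lr(step):
    """Linear warm-up to BASE_LR, then linear decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    remaining = TOTAL_STEPS - step
    return BASE_LR * max(0.0, remaining / (TOTAL_STEPS - warmup_steps))

for step in (0, 10, warmup_steps, 300, TOTAL_STEPS):
    epoch = round(step / steps_per_epoch, 4)
    print(f"step={step:3d} epoch={epoch:<6} lr={linear_lr(step):.3e}")
```

Under these assumptions the learning rate peaks at step 58, about halfway through the first epoch, and decays to zero at step 575.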
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Base model
nreimers/MiniLM-L6-H384-uncased