How to use Hvare/Athena-indobert-finetuned-indonli with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Hvare/Athena-indobert-finetuned-indonli")
sentences = [
"Pura Ulun Danu terletak sekitar 56 kilometer dari Kota Denpasar.",
"Dalam tujuh bulan kehamilan, organ tubuh bayi sudah sempurna.",
"Dokter Adeline menjelaskan aturan-aturan agar diabetisi aman berpuasa.",
"Pura Ulun Danu terletak sekitar satu jam perjalanan dari Kota Denpasar."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from indobenchmark/indobert-base-p2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 75, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
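
The Pooling module above has only `pooling_mode_mean_tokens` enabled, so each sentence vector is the average of its token vectors. The following is a minimal NumPy sketch of that idea, not the library's own code; the function name `mean_pool` and the toy 4-dimensional inputs are illustrative assumptions (the real model uses 768 dimensions).

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sentence, ignoring padded positions.

    token_embeddings: (batch, seq_len, dim) float array
    attention_mask:   (batch, seq_len) array, 1 for real tokens and 0 for padding
    """
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # guard against division by zero
    return summed / counts

# Toy batch: 2 sentences, 3 token slots, 4-dimensional vectors (hypothetical values).
tokens = np.array([
    [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [9.0, 9.0, 9.0, 9.0]],
    [[2.0, 2.0, 2.0, 2.0], [4.0, 4.0, 4.0, 4.0], [6.0, 6.0, 6.0, 6.0]],
])
mask = np.array([[1, 1, 0], [1, 1, 1]])  # third slot of sentence 0 is padding

pooled = mean_pool(tokens, mask)
print(pooled.shape)
```

The padded slot in the first sentence does not influence its pooled vector, which is why the attention mask is applied before averaging.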
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Hvare/Athena-indobert-finetuned-indonli")
# Run inference
sentences = [
'Tumenggung Wirapraja setelah mangkat dimakamkan di Kebon Alas Warudoyong, Kecamatan Panumbangan, Kabupaten Ciamis.',
'Tumenggung Wirapraja dikremasi setelah dipastikan mangkat dan abunya kemudian dilarungkan ke Pantai Laut Selatan.',
'Di hari libur ini, Pengunjung semua taman nasional tidak dibebaskan biaya.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
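
The `similarity` call above returns a square matrix of pairwise scores. Assuming the model keeps the library's default cosine similarity, the same matrix could be computed by hand roughly as sketched below. The helper `cosine_similarity_matrix` and the toy vectors are illustrative, not part of the model or the library.

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T

# Toy stand-in for the output of model.encode (hypothetical 2-dimensional vectors).
toy = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
sims = cosine_similarity_matrix(toy, toy)
print(sims.shape)
```

Each row of the result can be sorted to rank the other sentences by closeness to the row's sentence, which is the basis of semantic search.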
Dataset: sts-dev, evaluated with EmbeddingSimilarityEvaluator

| Metric | Value |
|---|---|
| pearson_cosine | -0.053 |
| spearman_cosine | -0.0611 |
| pearson_manhattan | -0.064 |
| spearman_manhattan | -0.0684 |
| pearson_euclidean | -0.0643 |
| spearman_euclidean | -0.0691 |
| pearson_dot | -0.0245 |
| spearman_dot | -0.0242 |
| pearson_max | -0.0245 |
| spearman_max | -0.0242 |
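
The metrics above are correlations between the model's similarity scores and gold similarity labels. The sketch below shows one way Pearson and Spearman correlation could be computed in plain NumPy. The `gold` and `pred` values are hypothetical and unrelated to this model's evaluation data, and the helper names are illustrative. The last lines check that the reported `pearson_max` is consistent with being the maximum over the four similarity functions in the table.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

gold = [0.0, 1.0, 2.0, 3.0]   # hypothetical gold similarity labels
pred = [0.1, 0.2, 0.9, 0.5]   # hypothetical cosine similarities
print(pearson(gold, pred), spearman(gold, pred))

# Pearson values copied from the table above.
reported_pearson = {"cosine": -0.053, "manhattan": -0.064,
                    "euclidean": -0.0643, "dot": -0.0245}
print(max(reported_pearson.values()))
```

Values near zero, as in the table, indicate that the similarity scores carry almost no linear or rank relationship to the gold labels on this evaluation set.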
Columns: sentence_0, sentence_1, and label

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | int |

Samples:

| sentence_0 | sentence_1 | label |
|---|---|---|
| "" "Akan ada protes dan hal-hal lain, semua nya sudah direncanakan," "ungkap oposisi kepada El Mundo." | Protes dan hal-hal lain sudah direncanakan. | 0 |
| Tak jarang, bangun kesiangan pun jadi alasan untuk tak berolahraga. | Salah satu alasan tidak berolahraga adalah bangun kesiangan. | 0 |
| Namun, saingannya Prabowo Subianto juga mendeklarasikan kemenangan, membuat orang Indonesia bingung. | Prabowo menerima bahwa Dia kalah. | 2 |
MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
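
To illustrate what the `scale` and `similarity_fct` parameters do, here is a minimal NumPy sketch of an in-batch ranking loss in the spirit of MultipleNegativesRankingLoss: every other positive in the batch serves as a negative for a given anchor, and the scaled cosine similarities are fed to a cross-entropy over the batch. This is a simplified illustration under those assumptions, not the library's implementation; `mnr_loss` and the identity-matrix inputs are hypothetical.

```python
import numpy as np

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """Cross-entropy over scaled cosine similarities, with in-batch negatives.

    Row i of `positives` is the correct match for row i of `anchors`;
    all other rows of `positives` act as negatives for that anchor.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                             # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))             # target class is the diagonal

anchors = np.eye(4)                            # four mutually orthogonal toy embeddings
aligned = mnr_loss(anchors, anchors)           # every anchor matches its own positive
shuffled = mnr_loss(anchors, anchors[::-1])    # positives paired with the wrong anchors
print(aligned, shuffled)
```

With `scale` set to 20.0, a cosine gap of 1.0 between the correct pair and a negative becomes a logit gap of 20, so the loss is close to zero for well-aligned pairs and close to 20 when the correct pair is the least similar one.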
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 1
- multi_dataset_batch_sampler: round_robin

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin

| Epoch | Step | Training Loss | sts-dev_spearman_max |
|---|---|---|---|
| 0.0991 | 64 | - | -0.0411 |
| 0.1981 | 128 | - | -0.0426 |
| 0.2972 | 192 | - | -0.0419 |
| 0.3963 | 256 | - | -0.0425 |
| 0.4954 | 320 | - | -0.0384 |
| 0.5944 | 384 | - | -0.0260 |
| 0.6935 | 448 | - | -0.0216 |
| 0.7740 | 500 | 0.0531 | - |
| 0.7926 | 512 | - | -0.0243 |
| 0.8916 | 576 | - | -0.0241 |
| 0.9907 | 640 | - | -0.0242 |
| 1.0 | 646 | - | -0.0242 |
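
The Epoch column in the log above appears to be the step number divided by the total number of steps in the single training epoch (646). The short sketch below reproduces the column under that assumption. The dataset-size bounds at the end additionally assume a single device, the listed batch size of 16, no gradient accumulation, and no dropped last batch; they are an inference from the table, not a figure stated in this card.

```python
total_steps = 646   # final step in the training log
batch_size = 16     # per_device_train_batch_size from the hyperparameters

logged_steps = [64, 128, 192, 256, 320, 384, 448, 500, 512, 576, 640, 646]
epochs = [round(step / total_steps, 4) for step in logged_steps]

for step, epoch in zip(logged_steps, epochs):
    print(f"step {step:>3} -> epoch {epoch:.4f}")

# Training-set size range implied by ceil(n / batch_size) == total_steps.
min_examples = (total_steps - 1) * batch_size + 1
max_examples = total_steps * batch_size
print(min_examples, max_examples)
```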
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
indobenchmark/indobert-base-p2