Sentence Similarity
sentence-transformers
Safetensors
bert
feature-extraction
Generated from Trainer
dataset_size:616
loss:CosineSimilarityLoss
text-embeddings-inference
Instructions to use Ananthu357/Ananthus-BAAI-for-contracts8.0 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use Ananthu357/Ananthus-BAAI-for-contracts8.0 with sentence-transformers:
from sentence_transformers import SentenceTransformer model = SentenceTransformer("Ananthu357/Ananthus-BAAI-for-contracts8.0") sentences = [ "Fulfilment of contractual obligations", "Should a tenderer find discrepancies in or omissions from the drawings or any of the Tender Forms or should he be in doubt as to their meaning", "Period of Maintenance shall mean the specified period of maintenance from the date of completion of the works, as certified by the Engineer.", "A copy of certificate stating that they are not liable to be disqualified and all their statements/documents" ] embeddings = model.encode(sentences) similarities = model.similarity(embeddings, embeddings) print(similarities.shape) # [4, 4] - Notebooks
- Google Colab
- Kaggle
| { | |
| "added_tokens_decoder": { | |
| "0": { | |
| "content": "[PAD]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "100": { | |
| "content": "[UNK]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "101": { | |
| "content": "[CLS]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "102": { | |
| "content": "[SEP]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| }, | |
| "103": { | |
| "content": "[MASK]", | |
| "lstrip": false, | |
| "normalized": false, | |
| "rstrip": false, | |
| "single_word": false, | |
| "special": true | |
| } | |
| }, | |
| "clean_up_tokenization_spaces": true, | |
| "cls_token": "[CLS]", | |
| "do_basic_tokenize": true, | |
| "do_lower_case": true, | |
| "mask_token": "[MASK]", | |
| "model_max_length": 512, | |
| "never_split": null, | |
| "pad_token": "[PAD]", | |
| "sep_token": "[SEP]", | |
| "strip_accents": null, | |
| "tokenize_chinese_chars": true, | |
| "tokenizer_class": "BertTokenizer", | |
| "unk_token": "[UNK]" | |
| } | |