Khasi-English Semantic Search Model
First production-ready semantic search model for Khasi-English language pairs.
Overview
This model enables semantic search between English and Khasi, supporting Northeast India's linguistic diversity. It was trained on 66,794 English-Khasi translation pairs.
Use Cases
- Cross-lingual semantic search (English ↔ Khasi)
- Document similarity in bilingual contexts
- Cultural content discovery for Northeast India
- Educational language learning tools
Performance
- English-Khasi similarity: 0.69-0.74
- Model size: ~90MB (lightweight deployment)
- 384-dimensional embeddings
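As a rough sizing aid, the 384-dimensional figure above can be turned into a storage estimate for a precomputed search index. This is a minimal sketch that assumes embeddings are stored as 32-bit floats (4 bytes per value); the function name `index_size_bytes` and the 100,000-sentence corpus are hypothetical and chosen only for illustration.

```python
def index_size_bytes(num_sentences, dim=384, bytes_per_value=4):
    """Bytes needed to store one dense vector per sentence.

    dim=384 comes from the model card; bytes_per_value=4 assumes float32
    storage, which is an assumption rather than something the card states.
    """
    return num_sentences * dim * bytes_per_value


# One embedding: 384 values * 4 bytes each.
print(index_size_bytes(1))  # 1536

# Hypothetical corpus of 100,000 sentences, reported in megabytes.
print(f"{index_size_bytes(100_000) / 1e6:.1f} MB")  # 153.6 MB
```

Storing vectors at lower precision would shrink these numbers proportionally, at some cost in similarity accuracy.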
Quick Start
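from sentence_transformers import SentenceTransformer
# Load the model from the Hugging Face Hub (downloaded on first use)
model = SentenceTransformer('MWirelabs/khasi-english-semantic-search')
# Example inputs mixing English and Khasi
sentences = ['Hello', 'hangne', 'Good morning']
# encode() returns one 384-dimensional vector per sentence
embeddings = model.encode(sentences)
print(embeddings.shape)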
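The Quick Start stops at computing embeddings. To illustrate the cross-lingual search use case, the sketch below ranks candidate sentences against a query by cosine similarity. It is one possible approach under stated assumptions, not the authors' pipeline: the helper names `cosine` and `search` are hypothetical, and `model` is assumed to be any object whose `encode` method returns one vector per input text, as `SentenceTransformer.encode` does.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(float(a) * float(b) for a, b in zip(u, v))
    norm_u = math.sqrt(sum(float(a) ** 2 for a in u))
    norm_v = math.sqrt(sum(float(b) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as having no similarity
    return dot / (norm_u * norm_v)


def search(model, query, documents, top_k=3):
    """Return up to top_k (score, document) pairs, best match first.

    `model` is assumed to expose encode(list_of_texts) -> sequence of vectors.
    """
    vectors = model.encode([query] + list(documents))
    query_vec, doc_vecs = vectors[0], vectors[1:]
    scored = [(cosine(query_vec, vec), doc) for vec, doc in zip(doc_vecs, documents)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Usage with the real model (downloads weights on first run):
# from sentence_transformers import SentenceTransformer
# model = SentenceTransformer('MWirelabs/khasi-english-semantic-search')
# for score, doc in search(model, 'Good morning', ['Hello', 'hangne']):
#     print(f"{score:.3f}  {doc}")
```

For larger collections, document embeddings would normally be computed once and cached rather than re-encoded on every query.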
Developed by MWirelabs for Northeast India AI innovation.