---
language: ["ko", "en", "ja", "zh", "es", "fr", "de", "pt", "it", "ru", "ar", "hi", "th", "vi", "id", "tr", "nl", "pl"]
tags:
- sentence-transformers
- intent-classification
- multilingual
- distillation
- layer-pruning
library_name: sentence-transformers
pipeline_tag: sentence-similarity
license: apache-2.0
---

# Intent Classifier Student: L4_top

Compact multilingual sentence encoder for intent classification (Action / Recall / Other), created by **layer pruning** from `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`.

## Model Details

| Property | Value |
|----------|-------|
| Teacher | paraphrase-multilingual-MiniLM-L12-v2 |
| Architecture | XLM-RoBERTa (pruned) |
| Hidden dim | 384 |
| Layers | 4 (from 12) |
| Layer indices (0-based) | [8, 9, 10, 11] |
| Strategy | Top 4 of 12 layers (semantic-focused, compact) |
| Est. params | 103,283,328 |
| Est. FP32 | 394.0 MB |
| Est. INT8 | 98.5 MB |
| Est. INT8 + vocab pruned | 27.1 MB |

## Supported Languages (18)

ko, en, ja, zh, es, fr, de, pt, it, ru, ar, hi, th, vi, id, tr, nl, pl

## Intended Use

This is a **student encoder** designed as the backbone for a lightweight 3-class intent classifier (Action / Recall / Other) in multilingual dialogue systems.

- **Action**: The user requests an action (book, order, change settings, etc.)
- **Recall**: The user asks about past events or stored information
- **Other**: Greetings, chitchat, emotions, etc.
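The FP32 and INT8 size estimates in the Model Details table can be checked with a little arithmetic. The sketch below assumes that "MB" in the table means 2^20 bytes and that FP32 and INT8 weights take 4 bytes and 1 byte per parameter; under those assumptions it reproduces the two unpruned figures. The helper name `size_mib` is made up for illustration.

```python
MIB = 1024 ** 2  # assumption: the table's "MB" means 2**20 bytes


def size_mib(num_params, bytes_per_param):
    """Return the raw weight size in MiB, ignoring file-format overhead."""
    return num_params * bytes_per_param / MIB


est_params = 103_283_328  # "Est. params" from the Model Details table

fp32_mib = size_mib(est_params, 4)  # 4 bytes per FP32 weight
int8_mib = size_mib(est_params, 1)  # 1 byte per INT8 weight

print(f"FP32: {fp32_mib:.1f} MiB, INT8: {int8_mib:.1f} MiB")
# FP32: 394.0 MiB, INT8: 98.5 MiB
```

The vocabulary-pruned figure is not recomputed here because the card gives only an approximate pruned vocabulary size.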
## Usage

```python
from sentence_transformers import SentenceTransformer

# "L4_top" is the model's path or ID as given in this card
model = SentenceTransformer("L4_top")
# Korean examples: an action request, a recall question, a greeting
embeddings = model.encode(["예약 좀 해줘", "지난번 주문 뭐였지?", "안녕하세요"])
print(embeddings.shape)  # (3, 384)
```

## MTEB Results

### MassiveIntentClassification

**Average (over the four languages listed): 45.26%**

| Language | Score |
|----------|-------|
| ar | 34.77% |
| en | 55.68% |
| es | 42.64% |
| ko | 47.96% |

### MassiveScenarioClassification

**Average (over the four languages listed): 48.79%**

| Language | Score |
|----------|-------|
| ar | 36.89% |
| en | 61.66% |
| es | 47.3% |
| ko | 49.3% |

## Training / Distillation

This model was created via **layer pruning** (no additional training):

1. Load the teacher: `paraphrase-multilingual-MiniLM-L12-v2` (12 layers, hidden size 384)
2. Select layers: `[8, 9, 10, 11]`
3. Copy the embedding weights and the selected layers' weights
4. Wrap with mean pooling to produce sentence embeddings

For deployment, vocabulary pruning (250K → ~55K tokens) and INT8 quantization are applied to meet the ≤50 MB size constraint.

## Limitations

- Layer pruning without fine-tuning may lose quality compared with knowledge distillation that includes training
- Vocabulary pruning limits the model to the 18 target languages
- Designed for short dialogue utterances, not long documents
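Steps 2 and 3 under Training / Distillation (select layers, then copy embeddings and the selected layers' weights) can be sketched as a filter over a checkpoint's state dict. This is a minimal illustration, not the script used to build this model. It assumes BERT-style key names of the form `encoder.layer.<i>.…`, and `prune_state_dict` is a hypothetical helper. Values are placeholder strings here; in practice they would be tensors.

```python
import re

# Assumption: layer weights use BERT-style keys such as
# "encoder.layer.8.attention.self.query.weight".
LAYER_KEY = re.compile(r"^(.*encoder\.layer\.)(\d+)(\..*)$")


def prune_state_dict(state_dict, keep_layers):
    """Keep non-layer weights plus the selected layers, renumbered from 0."""
    new_index = {old: new for new, old in enumerate(keep_layers)}
    pruned = {}
    for key, value in state_dict.items():
        match = LAYER_KEY.match(key)
        if match is None:
            # Embeddings and any other non-layer weights are copied unchanged.
            pruned[key] = value
            continue
        prefix, old, suffix = match.group(1), int(match.group(2)), match.group(3)
        if old in new_index:
            # Teacher layer 8 becomes student layer 0, and so on.
            pruned[f"{prefix}{new_index[old]}{suffix}"] = value
        # Layers not in keep_layers are dropped.
    return pruned


# Toy stand-in for a 12-layer checkpoint.
toy = {"embeddings.word_embeddings.weight": "emb"}
toy.update({f"encoder.layer.{i}.output.dense.weight": f"w{i}" for i in range(12)})

pruned = prune_state_dict(toy, [8, 9, 10, 11])
print(sorted(pruned))
```

When loading such a dict into a Hugging Face BERT-style model, the config's `num_hidden_layers` would also need to be set to 4 so that the architecture matches the pruned weights.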