---
language:
- en
license: mit
tags:
- intent-classification
- transformer
- virtual-assistant
- nlp
- voice-assistant
- offline-ai
- edge-deployment
metrics:
- accuracy
---

# JaneGPT v2: Intent Classification Model

A lightweight, fast, and accurate intent classification model built from scratch for virtual assistant command understanding.

**7.8M parameters | 22 intent classes | 88.6% validation accuracy | ~17 ms inference on GPU**

![Loss Curves](assets/janegpt_combined_loss_curves.png)

---

## Why I Built This

I'm building JANE, a fully offline, privacy-first AI voice assistant. Llama 3 8B was causing 10–22 second delays for simple commands like "turn up the volume." That's not a voice assistant. That's a waiting game.

So I designed JaneGPT v2 from scratch: a model that does exactly one job, does it fast, and runs on consumer hardware without any cloud dependency.

---

## Model Details

| Property | Value |
|---|---|
| Architecture | Decoder-only Transformer + Classification Head |
| Parameters | ~7.8M |
| Embedding dim | 256 |
| Attention heads | 8 |
| KV heads (GQA) | 4 |
| Layers | 8 |
| FF hidden dim | 672 |
| Max sequence length | 256 |
| Vocab size | 8,192 |
| Tokenizer | Custom BPE |
| Training accuracy | ~96.7% |
| Validation accuracy | 88.6% |
| Checkpoint size | ~30 MB |

---

## Architecture Decisions & Why

| Choice | Reason |
|---|---|
| **GQA** (4 KV heads, 8 attention heads) | Halves the key/value projections relative to full multi-head attention, reducing memory without losing expressiveness |
| **RoPE** positional encoding | Better length generalization than learned embeddings |
| **SwiGLU** activation | Smoother gradients than ReLU at this model size |
| **RMSNorm** | Simpler and faster than LayerNorm |
| **Custom BPE tokenizer** | Trained specifically on command-style text |

---

## Supported Intents (22 classes)

| Category | Intents |
|---|---|
| Volume | `volume_up`, `volume_down`, `volume_set`, `volume_mute` |
| Brightness | `brightness_up`, `brightness_down`, `brightness_set` |
| Media | `media_play`, `media_pause`, `media_next`, `media_previous` |
| Apps | `app_launch`, `app_close`, `app_switch` |
| Browser | `browser_search` |
| Productivity | `set_reminder`, `screenshot` |
| Screen | `read_screen`, `explain_screen` |
| Control | `undo`, `quit_jane` |
| Conversation | `chat` |

---

## Performance

Sample predictions from the model:

| Input | Predicted Intent | Confidence |
|---|---|---|
| "increase the volume" | volume_up | 86% |
| "make it louder" | volume_up | 90% |
| "turn down the brightness" | brightness_down | 80% |
| "open chrome" | app_launch | 98% |
| "play some music" | media_play | 96% |
| "search for cats on youtube" | browser_search | 94% |
| "set a reminder for 5 minutes" | set_reminder | 96% |
| "take a screenshot" | screenshot | 88% |
| "undo that" | undo | 92% |
| "hello" | chat | 97% |

---

## Quick Start

### Installation

```bash
git clone https://huggingface.co/RavinduSen/JaneGPT-v2
cd JaneGPT-v2
pip install -r requirements.txt
```

### Basic Usage

```python
from classifier import JaneGPTClassifier

# Load the classifier once and reuse it for every command
classifier = JaneGPTClassifier()

intent, confidence = classifier.predict("turn up the volume")
print(f"Intent: {intent}, Confidence: {confidence:.2%}")
# Output: Intent: volume_up, Confidence: 86.10%

intent, confidence = classifier.predict("open chrome")
print(f"Intent: {intent}, Confidence: {confidence:.2%}")
# Output: Intent: app_launch, Confidence: 98.10%
```

### With Conversation Context

```python
# Reuses the `classifier` instance from Basic Usage.
# The previous intent helps resolve follow-ups like "not enough".
intent, confidence = classifier.predict(
    "not enough",
    context={"last_intent": "volume_up"}
)
print(f"Intent: {intent}, Confidence: {confidence:.2%}")
# Output: Intent: volume_up, Confidence: 79.00%
```

---

## Training Setup

| Component | Details |
|---|---|
| Hardware | NVIDIA RTX 3050 Ti (4 GB VRAM) |
| CPU | AMD Ryzen 9 5900HX |
| RAM | 16 GB |
| Additional | Google Colab (extended training runs) |
| Framework | PyTorch 2.0+ |
| Training data | Custom command dataset (Claude-assisted generation under author supervision) |

---

## Limitations

- Intent classification only: does not generate text
- 22 classes: commands outside the supported set are classified as `chat`
- English only
- Optimized for short inputs (1–15 words)
- No entity extraction: returns the intent label only

---

## Use Cases

- Virtual assistant command routing
- Smart home intent classification
- Voice command understanding
- Chatbot intent detection
- Edge device deployment (small enough for embedded systems)

---

## Part of the JANE Project

This model is the intelligence core of **JANE**, a fully offline, privacy-first AI voice assistant.

🔗 [JANE AI Assistant on GitHub](https://github.com/Ravindu-S/JANE-AI-Assistant)

🔗 [JaneGPT-v2 on GitHub](https://github.com/Ravindu-S/JaneGPT-v2)

---

## Created By

**Ravindu Senanayake**, Computer Science Undergraduate, Sri Lanka

Built from scratch: architecture, tokenizer, and training pipeline were designed and implemented by the author.

[![GitHub](https://img.shields.io/badge/GitHub-Ravindu--S-black?logo=github)](https://github.com/Ravindu-S)
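
---

## Appendix: Parameter Count Check

As a rough sanity check on the ~7.8M figure in the Model Details table, the sketch below recomputes the parameter count from the listed hyperparameters. It is not taken from the repository. It assumes a layout the model card does not confirm: no bias terms in the attention or feed-forward projections, a three-matrix SwiGLU block, two RMSNorm weight vectors per layer plus a final one, and a linear classification head with bias. The function name `count_parameters` is hypothetical.

```python
# Hyperparameters copied from the Model Details table.
EMBED_DIM = 256
N_HEADS = 8
N_KV_HEADS = 4
N_LAYERS = 8
FF_HIDDEN = 672
VOCAB_SIZE = 8192
N_CLASSES = 22


def count_parameters() -> dict:
    """Estimate parameter counts under the assumptions stated above."""
    head_dim = EMBED_DIM // N_HEADS      # 32 dims per attention head
    kv_dim = N_KV_HEADS * head_dim       # 128: GQA shrinks the K and V projections

    embedding = VOCAB_SIZE * EMBED_DIM
    # Q and output projections are full width; K and V use the reduced kv_dim.
    attention = 2 * EMBED_DIM * EMBED_DIM + 2 * EMBED_DIM * kv_dim
    # SwiGLU (assumed): gate, up, and down projections.
    feed_forward = 3 * EMBED_DIM * FF_HIDDEN
    norms = 2 * EMBED_DIM                # assumed: pre-attention and pre-FFN RMSNorm
    per_layer = attention + feed_forward + norms

    final_norm = EMBED_DIM
    head = EMBED_DIM * N_CLASSES + N_CLASSES  # assumed: linear head with bias

    total = embedding + N_LAYERS * per_layer + final_norm + head
    return {"embedding": embedding, "per_layer": per_layer, "head": head, "total": total}


counts = count_parameters()
print(f"total parameters: {counts['total']:,}")
# total parameters: 7,808,790
print(f"fp32 size: {counts['total'] * 4 / 1e6:.1f} MB")
# fp32 size: 31.2 MB
```

Under these assumptions the estimate lands at about 7.8M parameters, and storing them as 32-bit floats gives roughly 31 MB, which is consistent with the ~30 MB checkpoint size. With full multi-head attention (8 KV heads), the K and V projections would be twice as large, adding 65,536 parameters per layer.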