---
title: Adaptive-K Demo
emoji: 🧠
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.31.0
python_version: 3.11
app_file: app.py
pinned: true
license: mit
short_description: Dynamic Expert Selection for MoE Models
tags:
- moe
- mixture-of-experts
- inference-optimization
- llm
- adaptive-routing
---
# Adaptive-K: Dynamic Expert Selection for MoE
This demo shows how **Adaptive-K** dynamically selects the number of experts based on input complexity, reducing inference costs by 30-50% while maintaining accuracy.
## How it works
1. **Input text** → Router computes routing probabilities
2. **Entropy calculation** → Measures router uncertainty
3. **K selection** → Low entropy = fewer experts, high entropy = more experts
4. **Cost savings** → Visualize the compute reduction
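
The steps above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the API of the `adaptive-k-routing` package: the function names, the `k_min`/`k_max` bounds, and the linear mapping from normalized entropy to K are all assumptions made for the example.

```python
import math


def softmax(logits):
    """Turn raw router logits into routing probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def routing_entropy(probs):
    """Shannon entropy (in nats) of the routing distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_k(probs, k_min=1, k_max=4):
    """Map router uncertainty to a number of experts.

    Assumption: entropy is normalized by its maximum, log(num_experts),
    and mapped linearly onto [k_min, k_max]. The real method may use
    different thresholds.
    """
    max_entropy = math.log(len(probs))
    normalized = routing_entropy(probs) / max_entropy if max_entropy > 0 else 0.0
    k = k_min + round(normalized * (k_max - k_min))
    return max(k_min, min(k_max, k))


def route(logits, k_min=1, k_max=4):
    """Return the chosen K and the indices of the top-K experts."""
    probs = softmax(logits)
    k = select_k(probs, k_min, k_max)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return k, ranked[:k]


def compute_savings(k, k_baseline):
    """Fraction of expert compute saved relative to a fixed-K baseline."""
    return 1.0 - k / k_baseline


# A confident router (one dominant logit) versus an uncertain one (flat logits).
confident_k, confident_experts = route([10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
uncertain_k, uncertain_experts = route([0.0] * 8)
```

With these hypothetical settings, the confident input activates a single expert and the flat input activates `k_max` experts, so the savings relative to a fixed-K baseline depend on how often the router is confident.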
## Links
- 📄 [Paper](https://github.com/Gabrobals/sbm-efficient/blob/master/Entropy_Guided_Dynamic_Expert_Selection_in_Mixture_of_Experts_Models.pdf)
- 📖 [Whitepaper](https://adaptive-k.vertexdata.it/whitepaper.html)
- 💻 [GitHub](https://github.com/Gabrobals/sbm-efficient)
- 📦 [PyPI](https://pypi.org/project/adaptive-k-routing/)