metadata
title: Adaptive-K Demo
emoji: 🧠
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.31.0
python_version: 3.11
app_file: app.py
pinned: true
license: mit
short_description: Dynamic Expert Selection for MoE Models
tags:
- moe
- mixture-of-experts
- inference-optimization
- llm
- adaptive-routing
Adaptive-K: Dynamic Expert Selection for MoE
This demo shows how Adaptive-K dynamically selects the number of experts (K) for each input based on its complexity, reducing inference costs by 30-50% while maintaining accuracy.
How it works
- Input text → Router computes routing probabilities
- Entropy calculation → Measures router uncertainty
- K selection → Low entropy = fewer experts, high entropy = more experts
- Cost savings → Visualize the compute reduction
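The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Adaptive-K implementation: the function names (`softmax`, `normalized_entropy`, `select_k`, `route`), the linear entropy-to-K mapping, and the `k_min`/`k_max` defaults are all assumptions made for the example, and the "saving" figure simply compares against always running `k_max` experts.

```python
import math


def softmax(logits):
    """Turn router logits into routing probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy of the routing distribution, scaled to [0, 1]."""
    n = len(probs)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(n)


def select_k(probs, k_min=1, k_max=4):
    """Map router uncertainty to an expert count.

    Assumption: a simple linear mapping from normalized entropy to K.
    The real method may use thresholds or a learned rule instead.
    """
    h = normalized_entropy(probs)
    k = k_min + round(h * (k_max - k_min))
    return max(k_min, min(k_max, k))


def route(logits, k_min=1, k_max=4):
    """Run the four steps: probabilities, entropy, K selection, savings."""
    probs = softmax(logits)
    k = select_k(probs, k_min, k_max)
    # Keep the k experts with the highest routing probability.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return {
        "k": k,
        "experts": ranked[:k],
        "entropy": normalized_entropy(probs),
        # Fraction of expert compute skipped relative to always using k_max.
        "saving": 1.0 - k / k_max,
    }


if __name__ == "__main__":
    confident = route([10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
    uncertain = route([0.0] * 8)
    print("confident router:", confident["k"], "expert(s)")
    print("uncertain router:", uncertain["k"], "expert(s)")
```

With these assumed defaults, a router that is sharply peaked on one expert gets K = 1, while a router with a uniform distribution gets the full K = 4, so compute is spent only where the router is unsure.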
Links
- Paper
- Whitepaper
- GitHub
- PyPI