metadata
title: Adaptive-K Demo
emoji: 🧠
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 4.31.0
python_version: 3.11
app_file: app.py
pinned: true
license: mit
short_description: Dynamic Expert Selection for MoE Models
tags:
  - moe
  - mixture-of-experts
  - inference-optimization
  - llm
  - adaptive-routing

Adaptive-K: Dynamic Expert Selection for MoE

This demo shows how Adaptive-K dynamically selects the number of experts based on input complexity, reducing inference costs by 30-50% while maintaining accuracy.

How it works

  1. Input text → Router computes routing probabilities
  2. Entropy calculation → Measures router uncertainty
  3. K selection → Low entropy = fewer experts, high entropy = more experts
  4. Cost savings → Visualize the compute reduction
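
The steps above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the code in app.py: the function names (`softmax`, `entropy`, `select_k`, `route`), the `k_min`/`k_max` bounds, and the linear mapping from normalized entropy to K are all assumptions made for the example. The actual demo may use different thresholds or a different mapping.

```python
import math

def softmax(logits):
    """Turn raw router logits into routing probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of the routing distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_k(probs, k_min=1, k_max=4):
    """Map router uncertainty to a number of experts.

    Assumption: entropy is normalized by log(num_experts) and mapped
    linearly onto [k_min, k_max]. Requires at least two experts.
    """
    h_norm = entropy(probs) / math.log(len(probs))
    k = k_min + round(h_norm * (k_max - k_min))
    return max(k_min, min(k_max, k))

def route(logits, k_min=1, k_max=4):
    """Return (k, indices of the k highest-probability experts)."""
    probs = softmax(logits)
    k = select_k(probs, k_min, k_max)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return k, ranked[:k]

def compute_saving(k, k_max=4):
    """Fraction of expert compute skipped relative to always using k_max."""
    return 1 - k / k_max

# A confident router (one dominant logit) versus a maximally uncertain one.
confident = [10.0] + [0.0] * 7
uncertain = [0.0] * 8
for name, logits in [("confident", confident), ("uncertain", uncertain)]:
    k, experts = route(logits)
    print(name, "k =", k, "experts =", experts, "saving =", compute_saving(k))
```

With these assumed defaults, the confident input activates a single expert and the uniform input activates all four, so the saving for any one token ranges from 75% down to 0%. The average saving over a workload depends on how often the router is confident.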

Links