my_pi_agent / README.md
AlexWortega's picture
Soyuz chat + reasoning/raw viewer on ZeroGPU (llama-cpp-python)
e51ba55 verified
|
Raw
History Blame Contribute Delete
789 Bytes

A newer version of the Gradio SDK is available: 6.19.0

Upgrade
metadata
title: My Pi Agent
emoji: 🦀
colorFrom: gray
colorTo: indigo
sdk: gradio
sdk_version: 6.14.0
python_version: '3.12'
app_file: app.py
pinned: false
models:
  - AlexWortega/qwen35-4b-soyuz-merged-gguf

My Pi Agent — Soyuz

Chat with qwen35-4b-soyuz-merged (a Qwen3.5-4B hybrid linear-attention model) served as a GGUF via llama-cpp-python on ZeroGPU.

The right-hand panel surfaces what the model actually produced:

  • 🧠 Reasoning — the <think> ... </think> chain of thought.
  • 📝 Raw output — the full untouched generation, tags included.

GGUF: AlexWortega/qwen35-4b-soyuz-merged-gguf (Q4_K_M, MTP head dropped — --no-mtp — since llama.cpp does not yet run Qwen3.5 MTP).