How to use Jackrong/Qwopus3.5-27B-v3 with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
    curl -fsSL https://unsloth.ai/install.sh | sh
    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888
    # Then open http://localhost:8888 in your browser
    # Search for Jackrong/Qwopus3.5-27B-v3 to start chatting
Install Unsloth Studio (Windows)
    irm https://unsloth.ai/install.ps1 | iex
    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888
    # Then open http://localhost:8888 in your browser
    # Search for Jackrong/Qwopus3.5-27B-v3 to start chatting
Using HuggingFace Spaces for Unsloth
    # No setup required
    # Open https://huggingface.co/spaces/unsloth/studio in your browser
    # Search for Jackrong/Qwopus3.5-27B-v3 to start chatting
Load model with FastModel
    # In a shell:
    pip install unsloth

    # In Python:
    from unsloth import FastModel

    model, tokenizer = FastModel.from_pretrained(
        model_name="Jackrong/Qwopus3.5-27B-v3",
        max_seq_length=2048,
    )
Qwopus3.5 v3 0.8B?
Is there any chance we could get a Qwopus3.5 v3 0.8B so we could use it as a draft model for the 27B? Qwopus v3 tunes of the smaller models would also be nice in general.
I will
How do you draft Qwen3.5? MTP?
I am using llama.cpp, which does not support MTP yet. The older methods simply fail to enable speculative decoding.
I am using vLLM, not llama.cpp. I have gotten ngram speculation working, and running a dedicated draft model separately as its own deployment works as well (though it does require building a harness for it).
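For readers unfamiliar with the ngram approach mentioned above, here is a minimal sketch of the general idea (often called prompt lookup): the trailing n-gram of the sequence is matched against earlier context, and the tokens that followed the match are proposed as the draft. This is an illustration only, not vLLM's implementation; the function name `ngram_draft` and its parameters are hypothetical.

```python
from typing import List, Sequence


def ngram_draft(tokens: Sequence[int], max_ngram: int = 3, num_draft: int = 4) -> List[int]:
    """Propose draft tokens by matching the trailing n-gram against earlier context.

    Hypothetical helper for illustration; returns [] when nothing matches.
    """
    n_tokens = len(tokens)
    # Try the longest n-gram first, then fall back to shorter ones.
    for n in range(min(max_ngram, n_tokens - 1), 0, -1):
        suffix = list(tokens[n_tokens - n:])
        # Scan right to left so the most recent earlier occurrence wins.
        for start in range(n_tokens - n - 1, -1, -1):
            if list(tokens[start:start + n]) == suffix:
                # Propose the tokens that followed the earlier occurrence.
                return list(tokens[start + n:start + n + num_draft])
    return []


# The sequence ends in [1, 2, 3], which also appears at the start,
# so the tokens that followed it there are proposed as the draft.
draft = ngram_draft([1, 2, 3, 4, 1, 2, 3])
```

No second model is needed for this method, which is why it can work even when no small draft model exists. The target model still has to verify every proposed token, so the output is unchanged; only the speed differs.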
Yes, that would honestly be incredible! A Qwopus3.5 v3 0.8B would be a dream for draft-model / speculative decoding use cases, and a 2B variant on top of that would be even more amazing: the perfect size for local/edge deployment!
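To make the draft-model use case concrete, here is a toy sketch of one greedy speculative decoding step: the small model proposes a few tokens, and the large model keeps the longest prefix it agrees with. The two "models" are stand-in functions invented for this example (`toy_target`, `toy_draft`), and a real engine verifies all draft positions in a single batched forward pass rather than in a loop.

```python
from typing import Callable, List, Sequence

NextToken = Callable[[Sequence[int]], int]


def speculative_step(target_next: NextToken, draft_next: NextToken,
                     context: Sequence[int], num_draft: int = 4) -> List[int]:
    """One greedy speculative step: the draft proposes, the target verifies."""
    # 1. The draft model proposes num_draft tokens autoregressively.
    draft: List[int] = []
    for _ in range(num_draft):
        draft.append(draft_next(list(context) + draft))

    # 2. The target checks each proposed token in order.
    accepted: List[int] = []
    for tok in draft:
        expected = target_next(list(context) + accepted)
        if tok != expected:
            # First mismatch: keep the target's token and discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)

    # 3. Every draft token matched, so the target adds one bonus token.
    accepted.append(target_next(list(context) + accepted))
    return accepted


def toy_target(ctx: Sequence[int]) -> int:
    """Stand-in for the large model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: Sequence[int]) -> int:
    """Stand-in for the small model: agrees with the target except after a 3."""
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


new_tokens = speculative_step(toy_target, toy_draft, [1])
```

With greedy verification the accepted tokens are exactly what the target alone would have generated, so a better-matched draft model (for example, a small tune from the same family) changes only how many tokens are accepted per step.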
Jackrong u the man!
Thanks everyone for the support! I've been a bit busy lately, but I should have some time today. I'll optimize the 2B and 0.8B models as soon as I can!
Thanks a lot! And also thanks for the comprehensive PDF guide and the whole repo. Super helpful, really cool!
Thanks a lot for the support, really appreciate it!
I'll gradually put together all the guides and a complete notebook once I have more time.