Instructions to use litert-community/Qwen3-0.6B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- LiteRT-LM
How to use litert-community/Qwen3-0.6B with LiteRT-LM:
# LiteRT-LM runs on various platforms (Android, iOS, Windows, Linux, macOS, IoT, Web/WASM)
# and supports many APIs (C++, Python, Kotlin, Swift, JavaScript, Flutter).
# For platform-specific integration guides, please refer to the official developer website:
# https://ai.google.dev/edge/litert-lm

# To try LiteRT-LM, the easiest way is to use our CLI tool.
# 1. Install the LiteRT-LM CLI tool:
pip install litert-lm

# 2. Download and run this model locally:
# See: https://ai.google.dev/edge/litert-lm/cli
litert-lm run \
  --from-huggingface-repo=litert-community/Qwen3-0.6B \
  model.litertlm \
  --prompt="Write me a poem"
Function calling
Is there a way to use function calling with LiteRT-LM and Qwen models?
Not as far as I can tell from my tests (https://github.com/monday8am/edgelab).
The code is a bit old, but it may help.
It is possible; you just need to adapt it to the specific Qwen models you need. See LiteRT-LM's function calling instructions: https://developers.google.com/edge/litert-lm/api_overview#function-calling
When the model invokes a function, I get this error:
FATAL EXCEPTION: Thread-102
Process: com.uriel.inventoryagent, PID: 14862
com.google.ai.edge.litertlm.LiteRtLmJniException: Failed to start nativeSendMessageAsync: INTERNAL: Failed to apply template: unknown method: string has no method named strip (in template:45)
at com.google.ai.edge.litertlm.LiteRtLmJni.nativeSendMessageAsync(Native Method)
at com.google.ai.edge.litertlm.Conversation$JniMessageCallbackImpl.onDone(Conversation.kt:374)
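The message suggests that the template engine behind LiteRT-LM does not implement the Python string method `.strip()`, which chat templates written for Python's Jinja2 often call. One possible workaround, assuming you are able to edit the chat template before it is packaged with the model, is to swap argument-less `.strip()` calls for the standard Jinja `trim` filter. The helper name below is hypothetical, and this is only a sketch: calls that pass arguments (for example `.strip('\n')`) and related methods such as `.lstrip()` or `.rstrip()` are left untouched and would need manual review.

```python
import re

# Matches argument-less .strip() calls, e.g. `message.content.strip()`.
# Calls with arguments, such as .strip('\n'), are deliberately not matched.
_STRIP_CALL = re.compile(r"\.strip\(\s*\)")


def replace_strip_calls(template: str) -> str:
    """Rewrite `x.strip()` as `x | trim` in a Jinja-style template string."""
    return _STRIP_CALL.sub(" | trim", template)


if __name__ == "__main__":
    example = "{{ message.content.strip() }}"
    print(replace_strip_calls(example))  # prints "{{ message.content | trim }}"
```

Whether the rewritten template then renders correctly on device still has to be checked against the line the error points to (template:45).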
From what I've gathered here recently, Qwen 0.6B just isn't built for serious tool use. It chats fine, but once you ask it to actually do things, it gets shaky. Half the time it answers instead of calling the tool, and when it does try, the tool call can break and fall back to plain text.
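To make the "tool call or plain text" behavior concrete, here is a minimal sketch of how an app could classify a model reply. It assumes the model wraps each call in `<tool_call>...</tool_call>` tags around a JSON object with a `name` key, which is the convention Qwen chat templates commonly use; the function name `parse_reply` and the example tool `get_stock` are hypothetical and not part of LiteRT-LM.

```python
import json
import re

# Assumed format: <tool_call>{"name": ..., "arguments": {...}}</tool_call>
_TOOL_CALL = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)


def parse_reply(text: str):
    """Return ("tool_call", dict) for a well-formed call, else ("text", text)."""
    match = _TOOL_CALL.search(text)
    if match:
        try:
            call = json.loads(match.group(1))
        except json.JSONDecodeError:
            call = None  # Malformed JSON: treat the reply as plain text.
        if isinstance(call, dict) and "name" in call:
            return ("tool_call", call)
    return ("text", text)


if __name__ == "__main__":
    reply = '<tool_call>\n{"name": "get_stock", "arguments": {"item": "bolts"}}\n</tool_call>'
    print(parse_reply(reply))
    print(parse_reply("We have plenty of bolts."))
```

Counting how often replies land in the `"text"` branch for prompts that should trigger a tool gives a rough measure of how reliable a given model is for tool use.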
Gemma E2B is a completely different story. It's actually usable for on-device agents and simple tool workflows. It's not perfect, but it's reliable enough that people can trust it to perform actions instead of pretending it did.
That's why in my app I'm leaning toward offering both models but choosing the default from my "calculatedDevicePerformance" rating (GOOD / OK / POOR) and estimateGpuMemory compared against total device RAM. So instead of making users choose by download size, the setup flow should read the device and recommend a model for them.
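A rough sketch of that recommendation step might look like the following. Everything here is an assumption for illustration: the function name, the 50% RAM budget, the rule that only GOOD devices get the larger model, and the returned labels are placeholders, and the memory estimate is taken to mean the larger model's estimated GPU memory need.

```python
from enum import Enum


class DevicePerformance(Enum):
    GOOD = "GOOD"
    OK = "OK"
    POOR = "POOR"


def recommend_model(
    performance: DevicePerformance,
    estimated_gpu_memory_mb: float,
    total_ram_mb: float,
    max_ram_fraction: float = 0.5,  # Placeholder budget, not a measured value.
) -> str:
    """Pick a default model label from a device rating and a memory estimate."""
    # The larger model is only recommended if its estimated memory need
    # stays within the chosen fraction of total device RAM.
    fits = estimated_gpu_memory_mb <= total_ram_mb * max_ram_fraction
    if performance is DevicePerformance.GOOD and fits:
        return "gemma-e2b"  # Placeholder label for the larger model.
    return "qwen3-0.6b"  # Placeholder label for the smaller model.


if __name__ == "__main__":
    print(recommend_model(DevicePerformance.GOOD, 3000, 8000))
    print(recommend_model(DevicePerformance.POOR, 3000, 8000))
```

The user can still override the recommendation; the point is only that the default comes from the device rather than from download size.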
The whole goal of the app is to give EVERYONE a real local AI experience that can actually interact with tools when needed: send and read texts, notifications, reminders, social media posts, emails, etc.