Vision modality not enabled in Android app

#53
by Deeko76 - opened

I created a basic Android app and pushed the model to the phone. The model initialises but cannot accept image input (the image is also resized correctly). I'm using the MediaPipe dependency implementation("com.google.mediapipe:tasks-genai:0.10.29").

Error:

%aUNAVAILABLE: Vision Modality is not enabled

Does this model currently not have image input enabled?

Thanks

Google org

Hi @Deeko76 ,

Yes, the model accepts images as input, with these constraints: images are normalized to 256x256, 512x512, or 768x768 resolution and encoded to 256 tokens each.
It looks like you haven't enabled vision support for the LLM Inference API. You have to set the EnableVisionModality configuration option to true within the Graph options.
Here is the official reference: https://ai.google.dev/edge/mediapipe/solutions/genai/llm_inference/android#image-input
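
As a rough sketch of what that can look like in Kotlin, based on the pattern in the linked guide: the helper name `describeImage` and its parameters (`modelPath`, `bitmap`, `prompt`) are hypothetical placeholders, and limiting the session to one image via `setMaxNumImages(1)` is an assumption you may need to adjust for your model. Treat it as a starting point and check it against the library version you are using.

```kotlin
import android.content.Context
import android.graphics.Bitmap
import com.google.mediapipe.framework.image.BitmapImageBuilder
import com.google.mediapipe.tasks.genai.llminference.GraphOptions
import com.google.mediapipe.tasks.genai.llminference.LlmInference
import com.google.mediapipe.tasks.genai.llminference.LlmInferenceSession

// Hypothetical helper: modelPath, bitmap and prompt are supplied by your app.
fun describeImage(
    context: Context,
    modelPath: String,
    bitmap: Bitmap,
    prompt: String
): String {
    // Engine-level options: where the model lives and how many images
    // a session may receive (assumed to be 1 here).
    val engineOptions = LlmInference.LlmInferenceOptions.builder()
        .setModelPath(modelPath)
        .setMaxNumImages(1)
        .build()

    // Session-level options: this is where the vision modality is switched on.
    val sessionOptions = LlmInferenceSession.LlmInferenceSessionOptions.builder()
        .setGraphOptions(
            GraphOptions.builder()
                .setEnableVisionModality(true)
                .build()
        )
        .build()

    val llmInference = LlmInference.createFromOptions(context, engineOptions)
    try {
        val session = LlmInferenceSession.createFromOptions(llmInference, sessionOptions)
        try {
            // Add the text prompt, then the image wrapped as an MPImage.
            session.addQueryChunk(prompt)
            session.addImage(BitmapImageBuilder(bitmap).build())
            return session.generateResponse()
        } finally {
            session.close()
        }
    } finally {
        llmInference.close()
    }
}
```

If the session is created without the graph options above, `addImage` is the call I would expect to fail with the "Vision Modality is not enabled" error.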

Thank you!