Issue Loading Llama-3.2 (11B) Vision Instruct Model in Colab
Hello Unsloth Team,
I am attempting to load the Llama-3.2 (11B)-Vision-Instruct model in Colab, but I am encountering the following error:
RuntimeError: The checkpoint you are trying to load has model type `mllama` but Transformers does not recognize this model type.
Could you please confirm if this model is supported on Colab, or if there are specific configurations or dependencies I might be missing? I have updated the necessary libraries, but the issue persists.
I would appreciate your assistance in resolving this.
Best regards,
[Enes Tura]
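A note on the error above: this message usually means the installed Transformers release predates the `mllama` architecture. The sketch below is one way to check for that. It assumes `mllama` support first shipped in `transformers` 4.45.0; the helper names `supports_mllama` and `installed_transformers_version` are made up for this example and are not part of any library.

```python
import re
from importlib import metadata

# Assumption: the mllama (Llama 3.2 Vision) architecture was added in transformers 4.45.0.
MIN_MLLAMA_VERSION = (4, 45)


def supports_mllama(version: str) -> bool:
    """Return True if a transformers version string is at least 4.45."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= MIN_MLLAMA_VERSION


def installed_transformers_version():
    """Return the installed transformers version, or None if it is not installed."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_transformers_version()
    if version is None:
        print("transformers is not installed")
    elif supports_mllama(version):
        print(f"transformers {version} should recognize the mllama model type")
    else:
        print(f"transformers {version} looks too old for mllama; try upgrading")
```

In Colab, an upgraded package is typically only picked up after the runtime is restarted, so a check like this is most useful when run in a fresh session.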
is there any update on this?
Still in the works - apologies.
Hello, is there any update on this model? If not, could you please let me know when the updated model is available?
Yes, we will let you all know once it's supported so we can update all the model cards. Many thanks!
thank you :)
Hello, is there any progress?
Yes. It's done and ready to go, but we still need to announce it, so it will be this week for sure! :) I will notify you all.
Hello, is there any progress?
Hey guys, apologies for the delays. Vision models are now supported! :)
Read our blogpost: https://unsloth.ai/blog/vision
Tweet: https://x.com/UnslothAI/status/1859667930075758793
GitHub post: https://github.com/unslothai/unsloth/releases/tag/November-2024
Thank you for the update. Can I fine-tune the 11B Instruct version of Llama 3.2? I don't want to do image processing; I just want to fine-tune it for the Turkish language using Unsloth.
Yes, of course you can do that. You can fine-tune the image and language parts separately.
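For readers who want to try the language-only route, here is a minimal sketch of what that could look like. It assumes Unsloth's `FastVisionModel.get_peft_model` accepts the `finetune_vision_layers` and `finetune_language_layers` switches, as in Unsloth's published vision notebooks. The helper names `language_only_lora_kwargs` and `build_language_only_model`, and the rank of 16, are illustrative choices and not something stated in this thread.

```python
def language_only_lora_kwargs(rank: int = 16) -> dict:
    """Build LoRA settings that leave the vision tower frozen (illustrative values)."""
    return {
        "finetune_vision_layers": False,    # skip the image encoder
        "finetune_language_layers": True,   # adapt only the text side
        "finetune_attention_modules": True,
        "finetune_mlp_modules": True,
        "r": rank,
        "lora_alpha": rank,
        "lora_dropout": 0,
        "bias": "none",
    }


def build_language_only_model():
    """Load the model and attach language-only LoRA adapters.

    Assumes a GPU runtime with unsloth installed; the import is deferred so the
    settings helper above can be used without it.
    """
    from unsloth import FastVisionModel

    model, tokenizer = FastVisionModel.from_pretrained(
        "unsloth/Llama-3.2-11B-Vision-Instruct",
        load_in_4bit=True,
    )
    model = FastVisionModel.get_peft_model(model, **language_only_lora_kwargs())
    return model, tokenizer
```

With the vision layers switched off, a text-only Turkish dataset would only update adapters on the language side, which matches the use case described above.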