How to use openbmb/MiniCPM-Llama3-V-2_5-int4 with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline
    pipe = pipeline("visual-question-answering", model="openbmb/MiniCPM-Llama3-V-2_5-int4", trust_remote_code=True)

    # Load model directly
    from transformers import AutoModel
    model = AutoModel.from_pretrained("openbmb/MiniCPM-Llama3-V-2_5-int4", trust_remote_code=True, dtype="auto")

report for local try run
#2
by kennyx - opened
I'm trying to run the model locally. On both a Mac with an M3 chip and a Windows machine with a 2080 Ti,
the same error occurred:
"""
ImportError: Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i https://pypi.org/simple/ bitsandbytes
"""
I've tried reinstalling the latest versions. Unfortunately, the error still happens.
I'm not sure how to fix this.
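Since the error message names Accelerate and bitsandbytes, one quick check is to see which versions the interpreter that runs the model can actually see. The snippet below is a minimal sketch using only the standard library; the helper name `installed_versions` and the package list are illustrative assumptions, not part of the model's or Transformers' API.

```python
from importlib import metadata
import sys


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Show which interpreter is in use, since pip may install into a different one.
    print("interpreter:", sys.executable)
    # The package list is an assumption based on the error message above.
    for name, version in installed_versions(
        ["torch", "accelerate", "bitsandbytes", "transformers"]
    ).items():
        print(f"{name}: {version or 'not installed'}")
```

If a package shows as not installed even after reinstalling, the install probably went into a different Python environment than the one printed on the first line.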
I am facing the same error today. Have you fixed it?
I encountered the same error, and I resolved it by following these steps:
- Execute the command `python -c 'import torch; print(torch.version.cuda)'`. If it returns `None`, this indicates that you have installed the CPU-only version of PyTorch without the necessary CUDA runtime.
- To rectify this, use the installation commands provided here, ensuring that the installation log confirms the selection of the appropriate binary.
- Ensure that the torch version is updated from `2.1.2` to `2.1.2+cu121`, and it should work.
Good luck.
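
The check from the steps above can also be run as a small script. This is a minimal sketch: `describe_torch_build` is a hypothetical helper name, and the script relies only on `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`. It also reports the case where torch is not installed at all.

```python
import importlib.util


def describe_torch_build(torch_version, cuda_version):
    """Summarize a PyTorch build from torch.__version__ and torch.version.cuda."""
    if cuda_version is None:
        return f"torch {torch_version} has no CUDA runtime (torch.version.cuda is None)"
    return f"torch {torch_version} was built against CUDA {cuda_version}"


if __name__ == "__main__":
    if importlib.util.find_spec("torch") is None:
        print("torch is not installed in this interpreter")
    else:
        import torch

        print(describe_torch_build(torch.__version__, torch.version.cuda))
        # A CUDA build can still report False here if no usable GPU or driver is found.
        print("CUDA device usable:", torch.cuda.is_available())
```

A version string such as `2.1.2+cu121` together with a non-`None` CUDA version suggests the CUDA-enabled binary was selected, which matches the last step above.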