This doesn't work on open LLM; some error with Vision

#1
by ralong - opened
🥲 Failed to load the model

Failed to load model.

Error when loading model: ValueError: Missing 1 parameters: 
embed_vision.embedding_projection.biases.

2026-04-26 01:06:06 [DEBUG]
The tokenizer you are loading from '/Users/oskar/.lmstudio/models/zecanard/gemma-4-26B-A4B-it-Claude-Opus-Distilled-v2-MLX-8bit-mxfp8' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the fix_mistral_regex=True flag when loading this tokenizer to fix this issue.
2026-04-26 01:06:06 [DEBUG]
[model_kit][INFO]: Loading model from /Users/oskar/.lmstudio/models/zecanard/gemma-4-26B-A4B-it-Claude-Opus-Distilled-v2-MLX-8bit-mxfp8...
2026-04-26 01:06:14 [DEBUG]
The tokenizer you are loading from '/Users/oskar/.lmstudio/models/zecanard/gemma-4-26B-A4B-it-Claude-Opus-Distilled-v2-MLX-8bit-mxfp8' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the fix_mistral_regex=True flag when loading this tokenizer to fix this issue.
2026-04-26 01:06:16 [DEBUG]
The tokenizer you are loading from '/Users/oskar/.lmstudio/models/zecanard/gemma-4-26B-A4B-it-Claude-Opus-Distilled-v2-MLX-8bit-mxfp8' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the fix_mistral_regex=True flag when loading this tokenizer to fix this issue.
2026-04-26 01:06:16 [DEBUG]
/Users/oskar/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac14-arm64@22/lib/python3.11/site-packages/transformers/audio_utils.py:538: UserWarning: At least one mel filter has all zero values. The value for num_mel_filters (128) may be set too high. Or, the value for num_frequency_bins (257) may be set too low.
warnings.warn(
2026-04-26 01:06:17 [DEBUG]
ValueError: Missing 1 parameters:
embed_vision.embedding_projection.biases.

At:
/Users/oskar/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac14-arm64@22/lib/python3.11/site-packages/mlx/nn/layers/base.py(191): load_weights
/Users/oskar/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac14-arm64@22/lib/python3.11/site-packages/mlx_engine/model_kit/vision_add_ons/load_utils.py(182): prepare_components
/Users/oskar/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac14-arm64@22/lib/python3.11/site-packages/mlx_engine/model_kit/vision_add_ons/gemma4.py(71): __init__
/Users/oskar/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac14-arm64@22/lib/python3.11/site-packages/mlx_engine/model_kit/model_kit.py(120): _full_model_init
/Users/oskar/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac14-arm64@22/lib/python3.11/site-packages/mlx_engine/model_kit/model_kit.py(141): __init__
/Users/oskar/.lmstudio/extensions/backends/vendor/_amphibian/app-mlx-generate-mac14-arm64@22/lib/python3.11/site-packages/mlx_engine/generate.py(253): load_model
2026-04-26 01:06:17 [DEBUG]
[LLMProcess] Failed to load model _0x5b3431 [Error]: Failed to load model.
at _0x1186d7.loadModel (/Applications/LM Studio.app/Contents/Resources/app/.webpack/lib/llmworker.js:1:610408)
at process.processTicksAndRejections (node:internal/process/task_queues:104:5)
at async _0x1186d7.handleMessage (/Applications/LM Studio.app/Contents/Resources/app/.webpack/lib/llmworker.js:1:602598) {
cause: 'Error when loading model: ValueError: Missing 1 parameters: \n' +
'embed_vision.embedding_projection.biases.',
suggestion: undefined,
errorData: undefined,
data: undefined,
displayData: undefined,
title: 'Failed to load model.'
}
2026-04-26 01:06:17 [DEBUG]
stopGenerating() without a request_id is deprecated. Taking no action

Unfortunately there seems to be a bug in the MLX engine with MXFP builds right now. You can try one of the affine quants in the meantime:
https://huggingface.co/zecanard/gemma-4-26B-A4B-it-Claude-Opus-Distilled-v2-MLX-8bit-int8-affine
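For anyone who wants to see what the checkpoint actually contains, here is a minimal sketch that lists tensor names from the `.safetensors` shards without loading any weights. It relies only on the safetensors file layout (an 8-byte little-endian header length followed by a JSON header). The function names are made up for illustration, and the model directory you pass in is whatever path LM Studio downloaded the model to.

```python
import json
import struct
from pathlib import Path


def safetensors_tensor_names(path):
    """Return the tensor names stored in one .safetensors file.

    Only the JSON header is read, so no tensor data is loaded.
    """
    with open(path, "rb") as f:
        # First 8 bytes: header size as an unsigned little-endian 64-bit int.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return [name for name in header if name != "__metadata__"]


def find_tensors(model_dir, needle):
    """List (shard file name, tensor name) pairs whose name contains `needle`."""
    hits = []
    for shard in sorted(Path(model_dir).glob("*.safetensors")):
        for name in safetensors_tensor_names(shard):
            if needle in name:
                hits.append((shard.name, name))
    return hits
```

Calling `find_tensors(model_dir, "embed_vision.embedding_projection")` shows which tensors exist for that layer. If no name ending in `.biases` comes back, the file simply does not store one, which would match the error above.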

Will there be a v3 fix for that?

There shouldn’t be a need for a new quant since it is strictly a bug in the MLX engine. Once the fix goes in, you can try this conversion again to see if it works.

If TeichAI puts out a v3 of this finetune and I don’t notice it, just let me know and I’ll be happy to convert it to MLX.

So the fix will be in LM Studio itself, not in the model?

It will be in the MLX engine.
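To illustrate why a loader-side fix is enough, here is a rough sketch of a strict weight-loading check. This is not the MLX engine's actual code, only an illustration of the kind of comparison that produces a "Missing N parameters" error. It assumes, without confirmation in this thread, that the loader builds an affine-style quantized layer (weight, scales, biases) while the MXFP checkpoint stores only weight and scales. The function names and both name lists are hypothetical.

```python
def missing_parameters(expected, provided):
    """Names the model expects that the checkpoint does not provide."""
    return sorted(set(expected) - set(provided))


def strict_load_check(expected, provided):
    """Raise an error in the style of the one above if anything is missing."""
    missing = missing_parameters(expected, provided)
    if missing:
        raise ValueError(
            f"Missing {len(missing)} parameters: \n" + ",\n".join(missing) + "."
        )


# Assumption: what an affine-style quantized layer would ask for.
expected = [
    "embed_vision.embedding_projection.weight",
    "embed_vision.embedding_projection.scales",
    "embed_vision.embedding_projection.biases",
]

# Assumption: what an MXFP checkpoint would store for the same layer.
provided = [
    "embed_vision.embedding_projection.weight",
    "embed_vision.embedding_projection.scales",
]
```

With these lists, `strict_load_check(expected, provided)` raises a `ValueError` naming `embed_vision.embedding_projection.biases`. Under this reading the checkpoint is complete for its format, and the mismatch goes away once the loader stops expecting `biases` for MXFP layers.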
