---
language: en
pipeline_tag: text-generation
library_name: mlx
tags:
- quantized
- mlx
base_model:
- upstage/Solar-Open-100B
---
# NOTICE
This model is no longer hosted on Hugging Face due to storage restrictions. It is [archived here](https://modelscope.ai/models/inferencerlabs/Solar-Open-100B-MLX-6.5bit-archive).

## Information
**See Solar-Open-100B MLX in action - [demonstration video](https://youtu.be/2cSDvE3QeNc)**

*The q6.5 mixed quantization typically achieves a perplexity of 1.128 in our testing.*

| Quantization | Perplexity |
|:------------:|:----------:|
| **q2.5**     | 41.293     |
| **q3.5**     | 1.900      |
| **q4.5**     | 1.168      |
| **q4.8**     | 1.140      |
| **q5.5**     | 1.141      |
| **q6.5**     | 1.128      |
| **q8.5**     | 1.128      |

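For readers unfamiliar with the metric, here is a minimal sketch of how perplexity is conventionally computed: the exponential of the mean negative log-likelihood per token. This card does not state the evaluation dataset or procedure, so the function name `perplexity` and the example log-probabilities are illustrative assumptions, not the method used to produce the table.

```python
import math


def perplexity(token_logprobs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Hypothetical per-token probabilities, for illustration only (not measured data).
example_logprobs = [math.log(0.9), math.log(0.8), math.log(0.95)]
print(f"perplexity = {perplexity(example_logprobs):.3f}")
```

Under this definition, a perplexity of 1.0 means every token was predicted with probability 1, and lower values mean the quantized model's predictions are closer to the reference text.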
## Usage Notes
    
#### Tested on an M3 Ultra using [Inferencer app v1.9.1](https://inferencer.com)
- Single inference ~45 tokens/s @ 1000 tokens
- Batched inference ~72 total tokens/s across four inferences
- Memory usage: ~78 GB

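The throughput figures above can be compared with a little arithmetic. This sketch uses only the approximate numbers reported in the list; the variable names are illustrative, and it assumes the batched total is split evenly across the four inferences.

```python
# Approximate figures reported above for the M3 Ultra test.
single_tps = 45.0          # tokens/s for a single inference
batched_total_tps = 72.0   # combined tokens/s across the batch
batch_size = 4             # number of concurrent inferences

# Assumes an even split of the total across the batch.
per_stream_tps = batched_total_tps / batch_size
throughput_gain = batched_total_tps / single_tps

print(f"per-stream rate: ~{per_stream_tps:.0f} tokens/s")      # ~18 tokens/s
print(f"aggregate gain over single: ~{throughput_gain:.1f}x")  # ~1.6x
```

In other words, batching trades per-request speed (roughly 18 tokens/s each instead of 45) for about 1.6 times the total output.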
##### Quantized with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.30
##### For more details, see the [demonstration video](https://youtu.be/2cSDvE3QeNc) or visit [Solar-Open-100B](https://huggingface.co/upstage/Solar-Open-100B).