---
language: en
pipeline_tag: text-generation
library_name: mlx
tags:
- quantized
- mlx
base_model:
- upstage/Solar-Open-100B
---
# NOTICE
No longer available on HF due to storage restrictions - [archived here](https://modelscope.ai/models/inferencerlabs/Solar-Open-100B-MLX-6.5bit-archive)

## Information

**See Solar-Open-100B MLX in action - [demonstration video](https://youtu.be/2cSDvE3QeNc)**

*q6.5bit mixed quant typically achieves 1.128 perplexity in our testing*

| Quantization | Perplexity |
|:------------:|:----------:|
| **q2.5**     | 41.293 |
| **q3.5**     | 1.900 |
| **q4.5**     | 1.168 |
| **q4.8**     | 1.140 |
| **q5.5**     | 1.141 |
| **q6.5**     | 1.128 |
| **q8.5**     | 1.128 |

## Usage Notes

#### Tested on an M3 Ultra using [Inferencer app v1.9.1](https://inferencer.com)
- Single inference ~45 tokens/s @ 1000 tokens
- Batched inference ~72 total tokens/s across four inferences
- Memory usage: ~78 GB

##### Quantized with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.30
##### For more details see [demonstration video](https://youtu.be/2cSDvE3QeNc) or visit [Solar-Open-100B](https://huggingface.co/upstage/Solar-Open-100B).
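
For readers comparing the perplexity figures in the table, here is a minimal sketch of how perplexity is conventionally defined: the exponential of the mean negative log-likelihood per token. This card does not state the evaluation dataset or the exact procedure used, so the function name `perplexity` and the sample inputs below are illustrative assumptions, not the authors' evaluation code.

```python
import math


def perplexity(token_logprobs):
    """Return exp(-mean(log p)) for a sequence of per-token natural-log probabilities.

    Hypothetical helper for illustration; not taken from MLX or the Inferencer app.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


if __name__ == "__main__":
    # Made-up log-probabilities: a model that assigns p = 0.5 to every token.
    sample = [math.log(0.5)] * 4
    print(f"perplexity: {perplexity(sample):.3f}")
```

Under this definition, a value close to 1.0 means the model assigns a probability near 1 to each reference token, which is why lower values in the table indicate less degradation from quantization.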