---
license: bigscience-bloom-rail-1.0
language:
- ak
- ar
- as
- bm
- bn
- ca
- code
- en
- es
- eu
- fon
- fr
- gu
- hi
- id
- ig
- ki
- kn
- lg
- ln
- ml
- mr
- ne
- nso
- ny
- or
- pa
- pt
- rn
- rw
- sn
- st
- sw
- ta
- te
- tn
- ts
- tum
- tw
- ur
- vi
- wo
- xh
- yo
- zh
- zhs
- zht
- zu
pipeline_tag: text-generation
---

<h1 style='text-align: center '>BLOOM LM - 8bit</h1> 
<h2 style='text-align: center '><em>BigScience Large Open-science Open-access Multilingual Language Model - 8bit</em> </h2> 
<h3 style='text-align: center '>Model Card</h3>
<img src="https://s3.amazonaws.com/moonup/production/uploads/1657124309515-5f17f0a0925b9863e28ad517.png" alt="BigScience Logo" width="800" style="margin-left: auto; margin-right: auto; display: block;"/>

Version 1.0 / 26.May.2022

Related paper: https://arxiv.org/abs/2208.07339

## TL;DR

This repository contains the 8bit weights of the `bloom-1b7` model. You can load it out of the box with `transformers==4.28.0` and `bitsandbytes>0.37.2`:

```python
# pip install accelerate bitsandbytes
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("ybelkada/bloom-1b7-8bit")
```

## How to push 8bit weights?

First, make sure you are using the `transformers` and `bitsandbytes` versions stated above. Then load your model in 8bit as usual by passing `load_in_8bit=True`:

```python
# pip install accelerate bitsandbytes
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-1b7", device_map="auto", load_in_8bit=True)
```

Then call the `push_to_hub` method, or the `save_pretrained` method if you want to save your 8bit model locally:

```python
model.push_to_hub("{your_username}/bloom-1b7-8bit")
```
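For the local option, a minimal sketch could look like the following. The helper name `save_8bit_model` and the default directory name are hypothetical, chosen only for illustration; the only library call used is the `save_pretrained` method mentioned above.

```python
def save_8bit_model(model, output_dir="bloom-1b7-8bit"):
    """Write an already-quantized model to a local directory.

    `model` is expected to be the object loaded with `load_in_8bit=True`;
    `output_dir` is a hypothetical local path, not a name used by the library.
    """
    # save_pretrained writes the config and the weights into output_dir.
    model.save_pretrained(output_dir)
    return output_dir
```

Calling `save_8bit_model(model)` with the 8bit model loaded earlier should leave the weights and config in the chosen directory.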

That's it! 

## What is inside the model's `state_dict`?

The model's state dict (the `pytorch_model.bin` file) contains:

- the quantized `int8` weights
- the quantization statistics in `float16`
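To give a feel for how these two pieces relate, here is a simplified pure-Python sketch of row-wise absmax quantization, the general idea behind the related paper. It is an illustration under assumptions, not the actual `bitsandbytes` implementation: the function names are hypothetical, the statistics are plain Python floats rather than `float16`, and the paper's higher-precision handling of outlier features is left out.

```python
def quantize_row(row):
    """Absmax-quantize one row of floats: returns (int8 values, absmax statistic)."""
    absmax = max(abs(x) for x in row)
    if absmax == 0.0:
        # An all-zero row quantizes to zeros; the statistic is zero as well.
        return [0] * len(row), 0.0
    # Map the range [-absmax, absmax] onto the int8 range [-127, 127].
    quantized = [round(x * 127.0 / absmax) for x in row]
    return quantized, absmax


def dequantize_row(quantized, absmax):
    """Approximately recover the floats from the int8 values and the statistic."""
    return [q * absmax / 127.0 for q in quantized]


# Toy weight matrix (made-up numbers, for illustration only).
weights = [[0.5, -1.0, 0.25], [2.0, 0.1, -0.4]]

pairs = [quantize_row(row) for row in weights]
int8_weights = [q for q, _ in pairs]   # analogous to the quantized weights
stats = [s for _, s in pairs]          # analogous to the quantization statistics
restored = [dequantize_row(q, s) for q, s in pairs]
```

Each row keeps one statistic (its largest absolute value), which is enough to map the stored integers back to approximate floating-point weights.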