Commit aa0a62d · verified · committed by rhaymison · 1 parent: b0a2c59

Update README.md
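
The lines added in this commit recommend 4-bit or 8-bit quantization when the full model does not fit in GPU memory. A rough back-of-the-envelope check is sketched here. It is only an estimate under stated assumptions: the parameter count is a hypothetical placeholder (the README excerpt does not state it), the 1.2 overhead factor is a guess, and the GPU capacities are nominal figures rather than what Colab actually exposes.

```python
# Rough estimate of the GPU memory needed just to hold model weights.
# Assumptions (not from the README): the parameter count used below is
# hypothetical, and the overhead factor for activations/cache is a guess.

# Nominal memory in GB; the A100 also exists in an 80 GB variant.
GPU_MEMORY_GB = {"T4": 16, "L4": 24, "A100": 40}


def estimate_weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Return the size of the weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def fits(num_params: int, bits_per_param: int, gpu: str, overhead: float = 1.2) -> bool:
    """True if the weights times a safety overhead fit in the GPU's nominal memory."""
    needed_gb = estimate_weight_memory_gb(num_params, bits_per_param) * overhead
    return needed_gb <= GPU_MEMORY_GB[gpu]


hypothetical_params = 7_000_000_000  # placeholder; replace with the real count
for bits in (32, 16, 8, 4):
    size_gb = estimate_weight_memory_gb(hypothetical_params, bits)
    usable = [gpu for gpu in GPU_MEMORY_GB if fits(hypothetical_params, bits, gpu)]
    print(f"{bits:>2}-bit: {size_gb:5.1f} GB of weights, fits on: {usable or 'none listed'}")
```

Halving the bits per parameter halves the weight footprint, which is why quantized loading can bring a model within reach of smaller GPUs. Actual usage also depends on sequence length, batch size, and library overhead, so treat the result as a lower bound.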

  1. README.md +5 -1
README.md CHANGED
@@ -56,7 +56,11 @@ _ = model.generate(**inputs, streamer=streamer, max_new_tokens=200)
 
 ```
 
-# 4bits
+If you run into a memory error such as "CUDA out of memory", use 4-bit or 8-bit quantization.
+To load the full, unquantized model in Colab you will need an A100.
+With 4-bit or 8-bit quantization, a T4 or L4 is sufficient.
+
+# 4bits example
 
 ```python
 from transformers import BitsAndBytesConfig