Instructions to use LoneStriker/miquella-120b-4.5bpw-h6-exl2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use LoneStriker/miquella-120b-4.5bpw-h6-exl2 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="LoneStriker/miquella-120b-4.5bpw-h6-exl2")
```

```python
# Load model directly (text-generation checkpoints use the causal LM auto class)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("LoneStriker/miquella-120b-4.5bpw-h6-exl2")
model = AutoModelForCausalLM.from_pretrained("LoneStriker/miquella-120b-4.5bpw-h6-exl2")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use LoneStriker/miquella-120b-4.5bpw-h6-exl2 with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "LoneStriker/miquella-120b-4.5bpw-h6-exl2"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "LoneStriker/miquella-120b-4.5bpw-h6-exl2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```shell
docker model run hf.co/LoneStriker/miquella-120b-4.5bpw-h6-exl2
```
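The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. It assumes the server is listening on `http://localhost:8000` and that the response follows the OpenAI completions schema; the helper names `build_completion_request` and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "LoneStriker/miquella-120b-4.5bpw-h6-exl2"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion.

    Assumption: the response body has a "choices" list whose items
    carry a "text" field, as in the OpenAI completions schema.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Calling `complete("Once upon a time,")` sends the same payload as the curl command. Passing `base_url="http://localhost:30000"` would target a server on the port used in the SGLang instructions instead.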
- SGLang
How to use LoneStriker/miquella-120b-4.5bpw-h6-exl2 with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "LoneStriker/miquella-120b-4.5bpw-h6-exl2" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "LoneStriker/miquella-120b-4.5bpw-h6-exl2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "LoneStriker/miquella-120b-4.5bpw-h6-exl2" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "LoneStriker/miquella-120b-4.5bpw-h6-exl2",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use LoneStriker/miquella-120b-4.5bpw-h6-exl2 with Docker Model Runner:
```shell
docker model run hf.co/LoneStriker/miquella-120b-4.5bpw-h6-exl2
```
| Field | Value |
| --- | --- |
| url | https://huggingface.co/alpindale/miquella-120b |
| branch | main |
| download date | 2024-02-01 06:18:12 |

| sha256sum | File |
| --- | --- |
| 620f63c6506018074d44fb7bd789928e8348a054441dc5c946ac754bec71e8e5 | model-00001-of-00024.safetensors |
| 899cb0c2f2d710b8b04bba13c1c55cd3c676e8eb489203f75341a493d494942c | model-00002-of-00024.safetensors |
| 2922fcffb31663949e49fb0a9899b0b7a4a4f03b5de1bed1b8ac3cbedb375ae4 | model-00003-of-00024.safetensors |
| e6625575aeba5737cba5a8ec751409ebea7c503f7d18af4f1fcfaffb5756df8f | model-00004-of-00024.safetensors |
| 3da639d52b1091d8781ca8b04d492996fe24dad21e67343f428b4c8d47054515 | model-00005-of-00024.safetensors |
| 72d4282785f4f2439e9bcfb2dde427d14b9632890053ae48ffbedb917b738164 | model-00006-of-00024.safetensors |
| a1fe54da1f7bd8c5527d5f172d1ceeed0a777d050e183eeb7ffd2a3c8bb8a219 | model-00007-of-00024.safetensors |
| 9c90942fd8fc67fd6892e53452a8f412c0e182313c5a662663c2e6c600f2b20f | model-00008-of-00024.safetensors |
| 8a08b124b13061a978f9f7cd0eb2c528f66d021bd09cfebcc12e0b6d6712d746 | model-00009-of-00024.safetensors |
| b59d2cbe8e366c091bda018a31e1107722fdda5b35a98cb1f6e9e70de28921b0 | model-00010-of-00024.safetensors |
| a2bb89aec4360e493e26dac89f5c0f3889183230a48f9e4de9c89a3e5b9dcd0f | model-00011-of-00024.safetensors |
| 31a4626235159e4b7f5129ef85637b6e263a002d26d755740fb215e50f390bfe | model-00012-of-00024.safetensors |
| 19eb117b262e70bd5060829b20120b4a73541463eec551421d8872d68e27dcc8 | model-00013-of-00024.safetensors |
| 0aaea4c411ee45fbce685efcec9177b5028d82c70228483694b2d53706fc59df | model-00014-of-00024.safetensors |
| b208e8c133ca1c8d6103b501378cfab024bd69298b9a9962fbf795f37a9449ce | model-00015-of-00024.safetensors |
| ee9ba06c2d2ba13cab3739ed6f5e7e6f4e8fb650432f440c28559c3823a24b33 | model-00016-of-00024.safetensors |
| 47a06053c7fa4181c219aa981351f7bf095c3667cf922886da214b5f069718b1 | model-00017-of-00024.safetensors |
| 01a6b1b6f16e82aa4f656871c64585836a474dc505bf9ea0f02508bc13b3ef94 | model-00018-of-00024.safetensors |
| d98dae800f9f9b5c3cab4a8a8bd9669be5c446862fc962bd7d7a6be15665e746 | model-00019-of-00024.safetensors |
| 5ea2013ef676c88a3ebb24df3b2719b1065c56c365c80a626a7add6371cab1aa | model-00020-of-00024.safetensors |
| afb806c42bd08ee41e2c99f9e0edd01e85014b3c520ac55ab3067cfca256231c | model-00021-of-00024.safetensors |
| 42b170de39eb0c18d740428ea5cb4d58a01a313ad63849c97eb9c46555ffb92d | model-00022-of-00024.safetensors |
| f93387e6a51812fe7c782a921fa78c8b353390f2e56b9cc769b3fcf753bef193 | model-00023-of-00024.safetensors |
| 02580a6da004454fc4e66701b3fc340d76c72e5fdef7b37809437c6d892b50c9 | model-00024-of-00024.safetensors |
| 9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347 | tokenizer.model |
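
The checksums listed above can be used to confirm that a local copy of the source files from alpindale/miquella-120b is intact. The sketch below is one way to do that with the Python standard library. It assumes the files sit in a single local directory; the function names and the `EXPECTED` mapping are illustrative, and only two of the listed entries are copied in as a sample.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks
    so that multi-gigabyte shards are never held in memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(expected, directory="."):
    """Compare files in `directory` against a {filename: sha256} mapping.

    Returns a dict mapping each filename to "ok", "mismatch", or "missing".
    """
    results = {}
    for name, expected_hash in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif sha256_of_file(path) == expected_hash.lower():
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


# Sample: two entries copied from the checksum list; add the remaining
# shards in the same way before checking a full download.
EXPECTED = {
    "model-00001-of-00024.safetensors":
        "620f63c6506018074d44fb7bd789928e8348a054441dc5c946ac754bec71e8e5",
    "tokenizer.model":
        "9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347",
}
```

Calling `verify_checksums(EXPECTED, "path/to/download")`, where the directory path is a placeholder, reports the status of each listed file. Any entry other than "ok" suggests the file should be downloaded again.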