Text Generation
Transformers
Safetensors
English
llama
Llama-3.1
instruct
finetune
reasoning
hybrid-mode
chatml
function calling
tool use
json mode
structured outputs
atropos
dataforge
long context
roleplaying
chat
conversational
text-generation-inference
2-bit
exl3
Instructions to use cpral/Hermes-4-405B-exl3-2bpw with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use cpral/Hermes-4-405B-exl3-2bpw with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="cpral/Hermes-4-405B-exl3-2bpw")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("cpral/Hermes-4-405B-exl3-2bpw")
model = AutoModelForCausalLM.from_pretrained("cpral/Hermes-4-405B-exl3-2bpw")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use cpral/Hermes-4-405B-exl3-2bpw with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "cpral/Hermes-4-405B-exl3-2bpw"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "cpral/Hermes-4-405B-exl3-2bpw",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/cpral/Hermes-4-405B-exl3-2bpw
- SGLang
How to use cpral/Hermes-4-405B-exl3-2bpw with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "cpral/Hermes-4-405B-exl3-2bpw" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "cpral/Hermes-4-405B-exl3-2bpw",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "cpral/Hermes-4-405B-exl3-2bpw" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "cpral/Hermes-4-405B-exl3-2bpw",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Docker Model Runner
How to use cpral/Hermes-4-405B-exl3-2bpw with Docker Model Runner:
docker model run hf.co/cpral/Hermes-4-405B-exl3-2bpw
Add files using upload-large-folder tool
- model-00001-of-00065.safetensors +3 -0
- model-00002-of-00065.safetensors +3 -0
- model-00004-of-00065.safetensors +3 -0
- model-00005-of-00065.safetensors +3 -0
- model-00007-of-00065.safetensors +3 -0
- model-00008-of-00065.safetensors +3 -0
- model-00009-of-00065.safetensors +3 -0
- model-00010-of-00065.safetensors +3 -0
- model-00011-of-00065.safetensors +3 -0
- model-00012-of-00065.safetensors +3 -0
- model-00013-of-00065.safetensors +3 -0
- model-00014-of-00065.safetensors +3 -0
- model-00015-of-00065.safetensors +3 -0
- model-00016-of-00065.safetensors +3 -0
- model-00017-of-00065.safetensors +3 -0
- model-00018-of-00065.safetensors +3 -0
- model-00020-of-00065.safetensors +3 -0
- model-00022-of-00065.safetensors +3 -0
- model-00023-of-00065.safetensors +3 -0
- model-00024-of-00065.safetensors +3 -0
- model-00025-of-00065.safetensors +3 -0
- model-00026-of-00065.safetensors +3 -0
- model-00027-of-00065.safetensors +3 -0
- model-00028-of-00065.safetensors +3 -0
- model-00039-of-00065.safetensors +3 -0
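The list above covers only some of the 65 shards named in the `-of-00065` suffix. The sketch below shows one way to work out which shard indices this commit touches and which it does not. The `listed` indices are transcribed from the list above; the helper names `shard_name` and `shard_index` are hypothetical. A shard absent from this commit is not necessarily absent from the repository, since it may have been added in a different commit.

```python
import re

TOTAL_SHARDS = 65  # taken from the "-of-00065" suffix in the file names

# Shard indices that appear in this commit's file list (transcribed by hand).
listed = [1, 2, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
          20, 22, 23, 24, 25, 26, 27, 28, 39]


def shard_name(index: int, total: int = TOTAL_SHARDS) -> str:
    """Build a shard file name in the zero-padded form used by this repo."""
    return f"model-{index:05d}-of-{total:05d}.safetensors"


def shard_index(name: str) -> tuple[int, int]:
    """Extract (index, total) from a shard file name."""
    match = re.fullmatch(r"model-(\d{5})-of-(\d{5})\.safetensors", name)
    if match is None:
        raise ValueError(f"not a shard file name: {name!r}")
    return int(match.group(1)), int(match.group(2))


# Indices in 1..65 that this particular commit does not add.
not_in_commit = sorted(set(range(1, TOTAL_SHARDS + 1)) - set(listed))

print(f"{len(listed)} shards in this commit, {len(not_in_commit)} not in it")
print("first few not in this commit:", [shard_name(i) for i in not_in_commit[:4]])
```

Comparing `not_in_commit` against the repository's full file listing would show whether any shard is genuinely missing.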
model-00001-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:015a014e62ca67a1141d7a58bbfe4df76a98249341a44b8b28d78421860196b0
+size 4202692714
model-00002-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c4388de00dfaf9ef69ca1555e6f3c79f9422187d3be4d6139a38956993b8c41b
+size 1595210048
model-00004-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1bde865f0d9d64aa27db422d6774a378d054ab9b519e8100f5cede6ee1a362f1
+size 1595210048
model-00005-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:a926c947be9be82a56b9d5e16dfc4ca64100fb23208136576bd9eb86cf274ac3
+size 1595210048
model-00007-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:77f9a34a3ad4382e2e3d5896aead709c0a3002b40f020cc0f78c34a3c82bd90e
+size 1595210108
model-00008-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:14c270deab042df819bca97d55b6d0251f9d53aba4298790b2a249162ba961fd
+size 1595210108
model-00009-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1be4e13c3748bc97aec93f2a7ce2ff4fbfed6655458b90f11e43188f1d12b18b
+size 1595210108
model-00010-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3e741d80eadecf0167d5f22cf604a110a712b4f201e5331302a9f50877296bd3
+size 1595210108
model-00011-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fefa777e9ee74b55ddf3c98f0bc9dead9258e801639c6bf5c54c1cbd0c01ec19
+size 1595210108
model-00012-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6c80724fa765e841b4afce3d70ee7feff8979664c58f203e8933d9649ac77148
+size 1595210108
model-00013-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fad7dd44e39a0d803ecf9e5fb6b9c8d8be01b0ccea5b564c678d0ef8928a7dfd
+size 1595210108
model-00014-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:678f17f323a645c1b2bafd55aee6dc2d53015af02b45a005ac6ec12345bc207e
+size 1595210108
model-00015-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:aa0d953587346e063cddca0259f3c8edd84666092b036fb6be968a282152ea92
+size 1595210108
model-00016-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:091c41ad7fb5a2a9dd3397b8baa352c54dce033c0b844879b6d6cce81ada5b89
+size 1595210108
model-00017-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:367b6b53dc1bf9a4a078fb7d90e303fe6df0a0df1bcc442bd251ce3598bc9c4a
+size 1595210108
model-00018-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:827d82bafc07b79f70e4d9068dfd06ef20c90bd490c05ae12f42286ca9a2e368
+size 1595210108
model-00020-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0c5b97522f565e42f909d97134f0f9067eb9ad87748886aa899008c2c5df4233
+size 1595210108
model-00022-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:645cdb9f80941c474f0d230ac3c33b5be4b5c1f66019818b7c552bf03aa877af
+size 1595210108
model-00023-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:1482bcfb6f3fe7c39a7e7313c70c82bca9d65c9ec348926e4355de1302d325da
+size 1595210108
model-00024-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:dad99eb5b5fda0dbec70b76a88b40af3c9f748ee4edfe9278617f4933f7b5335
+size 1595210108
model-00025-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2806176ba048862b685b9e83deff84485f557c9a69cd2bef65e710a12cfec24a
+size 1595210108
model-00026-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:4d612a4f0cd2e5f05c3e1e2d646c9734a50d562ba46e5203ec6fffdd87c08a07
+size 1595210108
model-00027-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c123d287682ed6949e225c7be500c5b9714b75903ba912e09c76cf904657f0e6
+size 1595210108
model-00028-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e27ba9e7e996ae5ae9f86a490503309f122e75077f5bb75c473d4c922e706c14
+size 1595210108
model-00039-of-00065.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fdb0a282bbacd7eb1b1440ef178c095b8fea6ef492c9501f19f2b1df0d55dead
+size 1595210108
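Each file added in this commit is a three-line Git LFS pointer (`version`, `oid`, `size`) rather than the tensor data itself. As a minimal sketch, the pointer text can be parsed and a downloaded shard checked against it. The function names `parse_lfs_pointer` and `verify_shard` are hypothetical helpers, not part of any library, and the local path in the usage note is an assumption.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_shard(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's byte size and SHA-256 match the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first; avoids hashing a truncated file
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so multi-gigabyte shards are never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer contents for model-00001-of-00065.safetensors, copied from the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:015a014e62ca67a1141d7a58bbfe4df76a98249341a44b8b28d78421860196b0
size 4202692714
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

With a shard downloaded locally, `verify_shard(Path("model-00001-of-00065.safetensors"), pointer)` would confirm that the download is complete and uncorrupted.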