---
language:
- en
license: llama3
tags:
- Llama-3.1
- instruct
- finetune
- reasoning
- hybrid-mode
- chatml
- function calling
- tool use
- json mode
- structured outputs
- atropos
- dataforge
- long context
- roleplaying
- chat
base_model:
- NousResearch/Hermes-4-405B
library_name: transformers
widget:
- example_title: Hermes 4
  messages:
  - role: system
    content: >-
      You are Hermes 4, a capable, neutrally-aligned assistant. Prefer concise,
      correct answers.
  - role: user
    content: Explain what Hadamard Transform is.
model-index:
- name: Hermes-4-Llama-3.1-405B
  results: []
---

# Hermes 4 — Llama-3.1 405B EXL 3 2.00bpw

2.00 BPW H8 exllamav3 quant of Hermes 4 405B.

```
 -- A perplexity: 1.50484401
 -- B perplexity: 4.46562014
 -- A label in top-K:
      K = 1: 0.8938
      K = 2: 0.9486
      K = 3: 0.9640
      K = 4: 0.9714
      K = 5: 0.9757
 -- B label in top-K:
      K = 1: 0.6383
      K = 2: 0.7622
      K = 3: 0.8163
      K = 4: 0.8482
      K = 5: 0.8698
 -- Top-K agreement, A vs B:
      K = 1: 0.6743
      K = 2: 0.2721
      K = 3: 0.0833
      K = 4: 0.0222
      K = 5: 0.0056
 -- KL divergence (A, B): 2.27405149
 -- KL divergence (B, A): 1.05870732
```

Command used to generate this quant:

```
ulimit -n 100000
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True python convert.py -i /home/ubuntu/workspace/models/Hermes-4-405B \
    -o /home/ubuntu/workspace/models/final/hermes4-405b-2bpw \
    -w /home/ubuntu/workspace/models/workdir \
    -b 2.0 \
    -hq \
    -ss 2048 \
    -cpi 3600 \
    -hb 8 \
    -d 0
```
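
The evaluation output above reports perplexity, label-in-top-K accuracy, top-K agreement, and KL divergence in both directions. As a rough illustration of what those quantities measure, here is a minimal pure-Python sketch on toy data. It is not the exllamav3 evaluation code: the function names and the toy logits are hypothetical, and it assumes that "top-K agreement" means both models produce the same top-K tokens in the same order, which this card does not confirm.

```python
import math


def softmax(logits):
    """Convert a list of raw logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k):
    """Indices of the k most probable tokens, most probable first."""
    return sorted(range(len(probs)), key=lambda i: -probs[i])[:k]


def perplexity(prob_rows, labels):
    """exp of the mean negative log-likelihood of the reference tokens."""
    nll = -sum(math.log(row[label]) for row, label in zip(prob_rows, labels))
    return math.exp(nll / len(labels))


def label_in_top_k(prob_rows, labels, k):
    """Fraction of positions whose reference token is among the top k."""
    hits = sum(label in top_k(row, k) for row, label in zip(prob_rows, labels))
    return hits / len(labels)


def top_k_agreement(rows_a, rows_b, k):
    """Fraction of positions where both models give the same ordered top k.

    Assumption: "agreement" means identical ordered lists, not set overlap.
    """
    same = sum(top_k(a, k) == top_k(b, k) for a, b in zip(rows_a, rows_b))
    return same / len(rows_a)


def kl_divergence(rows_p, rows_q):
    """Mean KL(P || Q) in nats over all positions; not symmetric in P and Q."""
    total = 0.0
    for p, q in zip(rows_p, rows_q):
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return total / len(rows_p)


# Hypothetical toy data: 2 token positions, vocabulary of 3 tokens.
labels = [0, 1]
probs_a = [softmax(row) for row in [[2.0, 1.0, 0.0], [0.0, 3.0, 1.0]]]
probs_b = [softmax(row) for row in [[1.5, 1.6, 0.0], [0.0, 2.0, 1.0]]]

print("A perplexity:", perplexity(probs_a, labels))
print("B perplexity:", perplexity(probs_b, labels))
print("A label in top-1:", label_in_top_k(probs_a, labels, 1))
print("B label in top-1:", label_in_top_k(probs_b, labels, 1))
print("Top-1 agreement:", top_k_agreement(probs_a, probs_b, 1))
print("KL divergence (A, B):", kl_divergence(probs_a, probs_b))
print("KL divergence (B, A):", kl_divergence(probs_b, probs_a))
```

Because KL divergence is not symmetric, the two directions generally give different values, which is why the output above lists both. Under the ordered-match assumption, agreement can only fall or stay flat as K grows, since matching the top K+1 tokens requires matching the top K first.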