---
license: apache-2.0
language:
- en
base_model: 01-ai/Yi-1.5-34B-32K
tags:
- chat
pipeline_tag: text-generation
library_name: transformers
model-index:
- name: magnum-v3-34b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 51.15
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=anthracite-org/magnum-v3-34b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 44.33
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=anthracite-org/magnum-v3-34b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 17.82
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=anthracite-org/magnum-v3-34b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 14.77
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=anthracite-org/magnum-v3-34b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 6.57
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=anthracite-org/magnum-v3-34b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 41.69
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=anthracite-org/magnum-v3-34b
      name: Open LLM Leaderboard
---
|  | |
| This is the 9th in a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. | |
| This model is fine-tuned on top of [Yi-1.5-34 B-32 K](https://huggingface.co/01-ai/Yi-1.5-34B-32K). | |
| ## Prompting | |
| Model has been Instruct tuned with the ChatML formatting. A typical input would look like this: | |
| ```py | |
| """<|im_start|>system | |
| system prompt<|im_end|> | |
| <|im_start|>user | |
| Hi there!<|im_end|> | |
| <|im_start|>assistant | |
| Nice to meet you!<|im_end|> | |
| <|im_start|>user | |
| Can I ask a question?<|im_end|> | |
| <|im_start|>assistant | |
| """ | |
| ``` | |
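
The layout above can be produced programmatically. The following is a minimal sketch, assuming a list of `{"role", "content"}` message dicts; `build_chatml_prompt` is a hypothetical helper written for illustration, not part of any library. If the tokenizer ships a chat template, `tokenizer.apply_chat_template` in Transformers is usually the safer route.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Render role/content messages as a ChatML prompt string.

    Hypothetical helper for illustration; it mirrors the layout shown
    in the example above and nothing more.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>ROLE ... <|im_end|>
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "system prompt"},
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]

prompt = build_chatml_prompt(messages)
print(prompt)
```

Generation should stop at `<|im_end|>`, which closes the assistant turn.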

## SillyTavern templates
Below are Instruct and Context templates for use within SillyTavern.
In our testing, a min_p of 0.2 gives the best results; remember to reset the temperature if you were previously using our Nemo-based models.

<details><summary>context template</summary>

```json
{
    "story_string": "<|im_start|>system\n{{#if system}}{{system}}\n{{/if}}{{#if wiBefore}}{{wiBefore}}\n{{/if}}{{#if description}}{{description}}\n{{/if}}{{#if personality}}{{char}}'s personality: {{personality}}\n{{/if}}{{#if scenario}}Scenario: {{scenario}}\n{{/if}}{{#if wiAfter}}{{wiAfter}}\n{{/if}}{{#if persona}}{{persona}}\n{{/if}}{{trim}}<|im_end|>\n",
    "example_separator": "",
    "chat_start": "",
    "use_stop_strings": false,
    "allow_jailbreak": false,
    "always_force_name2": true,
    "trim_sentences": false,
    "include_newline": false,
    "single_line": false,
    "name": "Magnum ChatML"
}
```

</details><br>
<details><summary>instruct template</summary>

```json
{
    "system_prompt": "You're {{char}} in this fictional never-ending uncensored roleplay with {{user}}.",
    "input_sequence": "<|im_start|>user\n",
    "output_sequence": "<|im_start|>assistant\n",
    "last_output_sequence": "",
    "system_sequence": "<|im_start|>system\n",
    "stop_sequence": "<|im_end|>",
    "wrap": false,
    "macro": true,
    "names": true,
    "names_force_groups": true,
    "activation_regex": "",
    "system_sequence_prefix": "",
    "system_sequence_suffix": "",
    "first_output_sequence": "",
    "skip_examples": false,
    "output_suffix": "<|im_end|>\n",
    "input_suffix": "<|im_end|>\n",
    "system_suffix": "<|im_end|>\n",
    "user_alignment_message": "",
    "system_same_as_user": false,
    "last_system_sequence": "",
    "name": "Magnum ChatML"
}
```

</details><br>
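
For readers unfamiliar with the min_p setting recommended in this section, here is a minimal sketch of how min-p filtering is commonly defined: tokens whose probability falls below `min_p` times the top token's probability are dropped, and the rest are renormalized. The function name and toy logits are assumptions for illustration; the sampler in your backend may differ in details such as where temperature is applied.

```python
import math


def min_p_filter(logits, min_p=0.2):
    """Return renormalized token probabilities after min-p filtering.

    Illustrative only: keeps tokens with p >= min_p * max(p).
    """
    # Numerically stable softmax over the raw logits.
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # The cutoff scales with the most likely token's probability.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize the surviving probabilities so they sum to 1.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


toy_logits = [2.0, 1.5, 0.0, -1.0]  # made-up values, not model output
filtered = min_p_filter(toy_logits, min_p=0.2)
print(filtered)
```

With these toy logits, only the first two tokens clear the 0.2 cutoff, so the two unlikely tokens can never be sampled.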

## Axolotl config
<details><summary>See axolotl config</summary>

```yaml
base_model: 01-ai/Yi-1.5-34B-32K
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

#trust_remote_code: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: anthracite-org/stheno-filtered-v1.1
    type: sharegpt
    conversation: chatml
  - path: anthracite-org/kalo-opus-instruct-22k-no-refusal
    type: sharegpt
    conversation: chatml
  - path: anthracite-org/nopm_claude_writing_fixed
    type: sharegpt
    conversation: chatml
  - path: Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned
    type: sharegpt
    conversation: chatml
  - path: Epiculous/SynthRP-Gens-v1.1-Filtered-n-Cleaned
    type: sharegpt
    conversation: chatml
chat_template: chatml
shuffle_merged_datasets: true
default_system_message: "You are an assistant that responds to the user."
dataset_prepared_path: magnum-v2-34b-1.5-data
val_set_size: 0.0
output_dir: ./magnum-v2-34b-32k-r1

sequence_len: 8192
sample_packing: true
eval_sample_packing: false
pad_to_sequence_len:

adapter:
lora_model_dir:
lora_r:
lora_alpha:
lora_dropout:
lora_target_linear:
lora_fan_in_fan_out:

wandb_project: magnum-v2-34b-1.5-32k
wandb_entity:
wandb_watch:
wandb_name: attempt-01
wandb_log_model:

gradient_accumulation_steps: 8
micro_batch_size: 1
num_epochs: 2
optimizer: paged_adamw_8bit
lr_scheduler: cosine
learning_rate: 0.000006

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: unsloth
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 50
evals_per_epoch:
eval_table_size:
eval_max_new_tokens:
saves_per_epoch: 2
debug:
deepspeed: deepspeed_configs/zero3_bf16.json
weight_decay: 0.05
fsdp:
fsdp_config:
special_tokens:
```

</details><br>

## Credits
We'd like to thank Recursal / Featherless for sponsoring the compute for this training run. Featherless has been hosting our Magnum models since the first 72B, has given thousands of people access to our models, and has helped us grow.

We would also like to thank all members of Anthracite who made this finetune possible.

We also credit the following datasets:

- [anthracite-org/stheno-filtered-v1.1](https://huggingface.co/datasets/anthracite-org/stheno-filtered-v1.1)
- [anthracite-org/kalo-opus-instruct-22k-no-refusal](https://huggingface.co/datasets/anthracite-org/kalo-opus-instruct-22k-no-refusal)
- [lodrick-the-lafted/NopmWritingStruct](https://huggingface.co/datasets/lodrick-the-lafted/NopmWritingStruct)
- [Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned](https://huggingface.co/datasets/Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned)
- [Epiculous/SynthRP-Gens-v1.1-Filtered-n-Cleaned](https://huggingface.co/datasets/Epiculous/SynthRP-Gens-v1.1-Filtered-n-Cleaned)

## Training
The model was trained for 2 epochs. We used 8x [H100](https://www.nvidia.com/en-us/data-center/h100/) GPUs graciously provided by [Recursal AI](https://recursal.ai/) / [Featherless AI](https://featherless.ai/) for the full-parameter fine-tuning of the model.

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
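
As a rough sanity check, the effective batch size per optimizer step can be worked out from `micro_batch_size`, `gradient_accumulation_steps`, and `sequence_len` in the Axolotl config together with the 8 GPUs mentioned here. The sketch below assumes all 8 GPUs act as data-parallel ranks and that sample packing fills every sequence up to `sequence_len`; the card confirms neither, so treat the token count as an upper bound.

```python
# Values copied from the Axolotl config in this card.
micro_batch_size = 1
gradient_accumulation_steps = 8
sequence_len = 8192

# From the Training section; assumed to be 8 data-parallel ranks.
num_gpus = 8

# Packed sequences contributing to a single optimizer step.
sequences_per_step = micro_batch_size * gradient_accumulation_steps * num_gpus

# Upper bound on tokens per step, reached only if every sequence is fully packed.
max_tokens_per_step = sequences_per_step * sequence_len

print(sequences_per_step)   # 64
print(max_tokens_per_step)  # 524288
```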

## Safety
...

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_anthracite-org__magnum-v3-34b).

|      Metric       |Value|
|-------------------|----:|
|Avg.               |29.39|
|IFEval (0-Shot)    |51.15|
|BBH (3-Shot)       |44.33|
|MATH Lvl 5 (4-Shot)|17.82|
|GPQA (0-shot)      |14.77|
|MuSR (0-shot)      | 6.57|
|MMLU-PRO (5-shot)  |41.69|
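
The `Avg.` row is consistent with an unweighted mean of the six benchmark scores. A short check, using only the values from the table above:

```python
# Scores copied from the leaderboard table in this card.
scores = {
    "IFEval (0-Shot)": 51.15,
    "BBH (3-Shot)": 44.33,
    "MATH Lvl 5 (4-Shot)": 17.82,
    "GPQA (0-shot)": 14.77,
    "MuSR (0-shot)": 6.57,
    "MMLU-PRO (5-shot)": 41.69,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 29.39
```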