Instructions to use openaccess-ai-collective/manticore-30b-chat-pyg-alpha with libraries, inference providers, notebooks, and local apps.
- Libraries
- Transformers
How to use openaccess-ai-collective/manticore-30b-chat-pyg-alpha with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="openaccess-ai-collective/manticore-30b-chat-pyg-alpha")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("openaccess-ai-collective/manticore-30b-chat-pyg-alpha")
model = AutoModelForCausalLM.from_pretrained("openaccess-ai-collective/manticore-30b-chat-pyg-alpha")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use openaccess-ai-collective/manticore-30b-chat-pyg-alpha with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "openaccess-ai-collective/manticore-30b-chat-pyg-alpha"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "openaccess-ai-collective/manticore-30b-chat-pyg-alpha",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
docker model run hf.co/openaccess-ai-collective/manticore-30b-chat-pyg-alpha
- SGLang
How to use openaccess-ai-collective/manticore-30b-chat-pyg-alpha with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "openaccess-ai-collective/manticore-30b-chat-pyg-alpha" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "openaccess-ai-collective/manticore-30b-chat-pyg-alpha",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "openaccess-ai-collective/manticore-30b-chat-pyg-alpha" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "openaccess-ai-collective/manticore-30b-chat-pyg-alpha",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use openaccess-ai-collective/manticore-30b-chat-pyg-alpha with Docker Model Runner:
docker model run hf.co/openaccess-ai-collective/manticore-30b-chat-pyg-alpha
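The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming a server is already running locally and exposes the OpenAI-compatible `/v1/completions` route shown above (port 8000 for vLLM, 30000 for SGLang). The helper names `build_completion_request` and `complete` are hypothetical and not part of either project.

```python
import json
import urllib.request

MODEL_ID = "openaccess-ai-collective/manticore-30b-chat-pyg-alpha"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build the same POST request the curl examples send (hypothetical helper)."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the first completion's text.

    Assumes the response follows the OpenAI completions shape,
    i.e. {"choices": [{"text": ...}, ...]}.
    """
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a vLLM server running, `complete("Once upon a time,")` would return the generated continuation; for SGLang, pass `base_url="http://localhost:30000"`.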
```yaml
base_model: huggyllama/llama-30b
base_model_config: huggyllama/llama-30b
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
load_in_8bit: false
strict: false
push_dataset_to_hub: winglian
# dataset_shard_num: 3
# dataset_shard_idx: 0
datasets:
  - path: winglian/pygmalion-cleaned
    data_files:
      - v12_no_ai.shard_0.jsonl
    type: pygmalion
  - path: winglian/evals
    data_files:
      - hf/ARC-Challenge.jsonl
      - hf/ARC-Easy.jsonl
      - hf/riddle_sense.jsonl
    type: explainchoice:chat
  - path: winglian/evals
    data_files:
      - openai/tldr.jsonl
    type: summarizetldr:chat
  - path: winglian/evals
    data_files:
      - hf/gsm8k.jsonl
    type: alpacachat.load_qa
  - path: winglian/evals
    data_files:
      - hellaswag/hellaswag.jsonl
    type: explainchoice:chat
  - path: metaeval/ScienceQA_text_only
    type: concisechoice:chat
  - path: ehartford/WizardLM_alpaca_evol_instruct_70k_unfiltered
    type: alpaca:chat
  - path: ehartford/wizard_vicuna_70k_unfiltered
    type: sharegpt:chat
  - path: winglian/chatlogs-en-cleaned
    data_files:
      - sharegpt_cleaned.jsonl
    type: sharegpt:chat
  - path: teknium/GPT4-LLM-Cleaned
    type: alpaca:chat
  - path: teknium/GPTeacher-General-Instruct
    data_files: gpt4-instruct-similarity-0.6-dataset.json
    type: gpteacher:chat
  - path: ewof/code-alpaca-instruct-unfiltered
    type: alpaca:chat
  - path: QingyiSi/Alpaca-CoT
    data_files:
      - Chain-of-Thought/formatted_cot_data/aqua_train.json
      - Chain-of-Thought/formatted_cot_data/creak_train.json
      - Chain-of-Thought/formatted_cot_data/ecqa_train.json
      - Chain-of-Thought/formatted_cot_data/esnli_train.json
      - Chain-of-Thought/formatted_cot_data/gsm8k_train.json
      - Chain-of-Thought/formatted_cot_data/qasc_train.json
      - Chain-of-Thought/formatted_cot_data/qed_train.json
      - Chain-of-Thought/formatted_cot_data/sensemaking_train.json
      - Chain-of-Thought/formatted_cot_data/strategyqa_train.json
      - GPTeacher/Roleplay/formatted_roleplay-similarity_0.6-instruct-dataset.json
    type: alpaca:chat
dataset_prepared_path: last_run_prepared
val_set_size: 0.01
adapter:
lora_model_dir:
sequence_len: 2048
max_packed_sequence_len: 2048
lora_r:
lora_alpha:
lora_dropout:
lora_target_modules:
lora_fan_in_fan_out:
wandb_project: manticore-13b-v2
wandb_watch:
wandb_run_id:
wandb_log_model:
output_dir: ./manticore-13b-v2
batch_size: 512
micro_batch_size: 8
num_epochs: 4
optimizer:
torchdistx_path:
lr_scheduler:
learning_rate: 0.00004
train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: true
gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention: true
flash_attention:
gptq_groupsize:
gptq_model_v1:
warmup_steps: 20
eval_steps: 5
save_steps: 43
debug:
deepspeed:
weight_decay: 0.0001
fsdp:
fsdp_config:
special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"
```
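The config sets `batch_size: 512` and `micro_batch_size: 8` but no explicit gradient accumulation value. The sketch below shows how the two could relate, assuming the convention that `batch_size` is the total effective batch and that accumulation steps are derived by integer division, then divided again across data-parallel workers. That convention and the `world_size` parameter are assumptions for illustration; the config itself does not state how many GPUs were used, and `accumulation_steps` is a hypothetical helper.

```python
def accumulation_steps(batch_size: int, micro_batch_size: int, world_size: int = 1) -> int:
    """Derive gradient accumulation steps from the two batch settings.

    Assumed convention (not confirmed by the config):
        steps = batch_size // micro_batch_size // world_size
    where world_size is the number of data-parallel workers.
    """
    if micro_batch_size <= 0 or world_size <= 0:
        raise ValueError("micro_batch_size and world_size must be positive")
    steps = batch_size // micro_batch_size // world_size
    if steps < 1:
        raise ValueError("batch_size is too small for this micro batch and world size")
    return steps


# Values taken from the config above; a single worker is assumed here.
single_worker_steps = accumulation_steps(batch_size=512, micro_batch_size=8)
print(single_worker_steps)  # 64

# Hypothetical 8-worker run, for comparison.
print(accumulation_steps(batch_size=512, micro_batch_size=8, world_size=8))  # 8
```

Under this assumption, each optimizer step would accumulate 64 micro-batches of 8 sequences on one worker, or 8 micro-batches per worker across 8 workers.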