Text Generation
Transformers
Safetensors
MLX
English
Japanese
mixtral
steerlm
conversational
text-generation-inference
8-bit precision
Instructions to use mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly (Mixtral is a text-only causal LM, so use AutoModelForCausalLM)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit")
model = AutoModelForCausalLM.from_pretrained("mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Render the chat template and tokenize in one step
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- MLX
How to use mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit with MLX:
# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- vLLM
How to use mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit
- SGLang
How to use mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- MLX LM
How to use mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit with MLX LM:
Generate or start a chat session
# Install MLX LM
uv tool install mlx-lm

# Interactive chat REPL
mlx_lm.chat --model "mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit"
Run an OpenAI-compatible server
# Install MLX LM
uv tool install mlx-lm

# Start the server
mlx_lm.server --model "mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit"

# Call the OpenAI-compatible server with curl
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
- Docker Model Runner
How to use mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit with Docker Model Runner:
docker model run hf.co/mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit
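The vLLM, SGLang, and MLX LM servers above all expose the same OpenAI-compatible `/v1/chat/completions` endpoint, so the curl calls can be reproduced from Python. The following is a minimal sketch using only the standard library. The helper names `build_chat_request` and `send_chat_request` are hypothetical, and the response path `choices[0].message.content` assumes the server follows the usual OpenAI chat-completion response shape.

```python
import json
import urllib.request

MODEL_ID = "mlx-community/karakuri-lm-8x7b-instruct-v0.1-8bit"


def build_chat_request(base_url: str, model: str, user_text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl examples."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the assistant's reply text.

    Assumption: the response follows the OpenAI chat-completion layout.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]


# Port 8000 matches the vLLM and MLX LM examples; use 30000 for SGLang.
request = build_chat_request(
    "http://localhost:8000", MODEL_ID, "What is the capital of France?"
)
```

With one of the servers running, `print(send_chat_request(request))` would send the request and print the reply. Building and sending are kept separate so the payload can be inspected without a live server.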
{
  "<|CHATBOT_TOKEN|>": 32003,
  "<|END_OF_TURN_TOKEN|>": 32001,
  "<|EXTRA_100_TOKEN|>": 32100,
  "<|EXTRA_101_TOKEN|>": 32101,
  "<|EXTRA_102_TOKEN|>": 32102,
  "<|EXTRA_103_TOKEN|>": 32103,
  "<|EXTRA_104_TOKEN|>": 32104,
  "<|EXTRA_105_TOKEN|>": 32105,
  "<|EXTRA_106_TOKEN|>": 32106,
  "<|EXTRA_107_TOKEN|>": 32107,
  "<|EXTRA_108_TOKEN|>": 32108,
  "<|EXTRA_109_TOKEN|>": 32109,
  "<|EXTRA_10_TOKEN|>": 32010,
  "<|EXTRA_110_TOKEN|>": 32110,
  "<|EXTRA_111_TOKEN|>": 32111,
  "<|EXTRA_112_TOKEN|>": 32112,
  "<|EXTRA_113_TOKEN|>": 32113,
  "<|EXTRA_114_TOKEN|>": 32114,
  "<|EXTRA_115_TOKEN|>": 32115,
  "<|EXTRA_116_TOKEN|>": 32116,
  "<|EXTRA_117_TOKEN|>": 32117,
  "<|EXTRA_118_TOKEN|>": 32118,
  "<|EXTRA_119_TOKEN|>": 32119,
  "<|EXTRA_11_TOKEN|>": 32011,
  "<|EXTRA_120_TOKEN|>": 32120,
  "<|EXTRA_121_TOKEN|>": 32121,
  "<|EXTRA_122_TOKEN|>": 32122,
  "<|EXTRA_123_TOKEN|>": 32123,
  "<|EXTRA_124_TOKEN|>": 32124,
  "<|EXTRA_125_TOKEN|>": 32125,
  "<|EXTRA_126_TOKEN|>": 32126,
  "<|EXTRA_127_TOKEN|>": 32127,
  "<|EXTRA_12_TOKEN|>": 32012,
  "<|EXTRA_13_TOKEN|>": 32013,
  "<|EXTRA_14_TOKEN|>": 32014,
  "<|EXTRA_15_TOKEN|>": 32015,
  "<|EXTRA_16_TOKEN|>": 32016,
  "<|EXTRA_17_TOKEN|>": 32017,
  "<|EXTRA_18_TOKEN|>": 32018,
  "<|EXTRA_19_TOKEN|>": 32019,
  "<|EXTRA_20_TOKEN|>": 32020,
  "<|EXTRA_21_TOKEN|>": 32021,
  "<|EXTRA_22_TOKEN|>": 32022,
  "<|EXTRA_23_TOKEN|>": 32023,
  "<|EXTRA_24_TOKEN|>": 32024,
  "<|EXTRA_25_TOKEN|>": 32025,
  "<|EXTRA_26_TOKEN|>": 32026,
  "<|EXTRA_27_TOKEN|>": 32027,
  "<|EXTRA_28_TOKEN|>": 32028,
  "<|EXTRA_29_TOKEN|>": 32029,
  "<|EXTRA_30_TOKEN|>": 32030,
  "<|EXTRA_31_TOKEN|>": 32031,
  "<|EXTRA_32_TOKEN|>": 32032,
  "<|EXTRA_33_TOKEN|>": 32033,
  "<|EXTRA_34_TOKEN|>": 32034,
  "<|EXTRA_35_TOKEN|>": 32035,
  "<|EXTRA_36_TOKEN|>": 32036,
  "<|EXTRA_37_TOKEN|>": 32037,
  "<|EXTRA_38_TOKEN|>": 32038,
  "<|EXTRA_39_TOKEN|>": 32039,
  "<|EXTRA_40_TOKEN|>": 32040,
  "<|EXTRA_41_TOKEN|>": 32041,
  "<|EXTRA_42_TOKEN|>": 32042,
  "<|EXTRA_43_TOKEN|>": 32043,
  "<|EXTRA_44_TOKEN|>": 32044,
  "<|EXTRA_45_TOKEN|>": 32045,
  "<|EXTRA_46_TOKEN|>": 32046,
  "<|EXTRA_47_TOKEN|>": 32047,
  "<|EXTRA_48_TOKEN|>": 32048,
  "<|EXTRA_49_TOKEN|>": 32049,
  "<|EXTRA_50_TOKEN|>": 32050,
  "<|EXTRA_51_TOKEN|>": 32051,
  "<|EXTRA_52_TOKEN|>": 32052,
  "<|EXTRA_53_TOKEN|>": 32053,
  "<|EXTRA_54_TOKEN|>": 32054,
  "<|EXTRA_55_TOKEN|>": 32055,
  "<|EXTRA_56_TOKEN|>": 32056,
  "<|EXTRA_57_TOKEN|>": 32057,
  "<|EXTRA_58_TOKEN|>": 32058,
  "<|EXTRA_59_TOKEN|>": 32059,
  "<|EXTRA_5_TOKEN|>": 32005,
  "<|EXTRA_60_TOKEN|>": 32060,
  "<|EXTRA_61_TOKEN|>": 32061,
  "<|EXTRA_62_TOKEN|>": 32062,
  "<|EXTRA_63_TOKEN|>": 32063,
  "<|EXTRA_64_TOKEN|>": 32064,
  "<|EXTRA_65_TOKEN|>": 32065,
  "<|EXTRA_66_TOKEN|>": 32066,
  "<|EXTRA_67_TOKEN|>": 32067,
  "<|EXTRA_68_TOKEN|>": 32068,
  "<|EXTRA_69_TOKEN|>": 32069,
  "<|EXTRA_6_TOKEN|>": 32006,
  "<|EXTRA_70_TOKEN|>": 32070,
  "<|EXTRA_71_TOKEN|>": 32071,
  "<|EXTRA_72_TOKEN|>": 32072,
  "<|EXTRA_73_TOKEN|>": 32073,
  "<|EXTRA_74_TOKEN|>": 32074,
  "<|EXTRA_75_TOKEN|>": 32075,
  "<|EXTRA_76_TOKEN|>": 32076,
  "<|EXTRA_77_TOKEN|>": 32077,
  "<|EXTRA_78_TOKEN|>": 32078,
  "<|EXTRA_79_TOKEN|>": 32079,
  "<|EXTRA_7_TOKEN|>": 32007,
  "<|EXTRA_80_TOKEN|>": 32080,
  "<|EXTRA_81_TOKEN|>": 32081,
  "<|EXTRA_82_TOKEN|>": 32082,
  "<|EXTRA_83_TOKEN|>": 32083,
  "<|EXTRA_84_TOKEN|>": 32084,
  "<|EXTRA_85_TOKEN|>": 32085,
  "<|EXTRA_86_TOKEN|>": 32086,
  "<|EXTRA_87_TOKEN|>": 32087,
  "<|EXTRA_88_TOKEN|>": 32088,
  "<|EXTRA_89_TOKEN|>": 32089,
  "<|EXTRA_8_TOKEN|>": 32008,
  "<|EXTRA_90_TOKEN|>": 32090,
  "<|EXTRA_91_TOKEN|>": 32091,
  "<|EXTRA_92_TOKEN|>": 32092,
  "<|EXTRA_93_TOKEN|>": 32093,
  "<|EXTRA_94_TOKEN|>": 32094,
  "<|EXTRA_95_TOKEN|>": 32095,
  "<|EXTRA_96_TOKEN|>": 32096,
  "<|EXTRA_97_TOKEN|>": 32097,
  "<|EXTRA_98_TOKEN|>": 32098,
  "<|EXTRA_99_TOKEN|>": 32099,
  "<|EXTRA_9_TOKEN|>": 32009,
  "<|START_OF_TURN_TOKEN|>": 32000,
  "<|SYSTEM_TOKEN|>": 32004,
  "<|USER_TOKEN|>": 32002
}
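The mapping above is ordered alphabetically by token name, so the IDs do not appear in numeric order (for example, `<|EXTRA_10_TOKEN|>` follows `<|EXTRA_109_TOKEN|>`). A minimal sketch of inverting such a mapping into an ID-ordered table and checking a range for gaps is shown below. It embeds only a small excerpt of the entries, and the helper names `invert_token_map` and `missing_ids` are hypothetical.

```python
import json

# Excerpt of the mapping above; the full file covers IDs 32000 through 32127.
ADDED_TOKENS_JSON = """
{
  "<|CHATBOT_TOKEN|>": 32003,
  "<|END_OF_TURN_TOKEN|>": 32001,
  "<|EXTRA_10_TOKEN|>": 32010,
  "<|EXTRA_5_TOKEN|>": 32005,
  "<|START_OF_TURN_TOKEN|>": 32000,
  "<|SYSTEM_TOKEN|>": 32004,
  "<|USER_TOKEN|>": 32002
}
"""


def invert_token_map(token_to_id: dict) -> dict:
    """Return an ID -> token dict sorted by ID, rejecting duplicate IDs."""
    id_to_token = {}
    for token, token_id in token_to_id.items():
        if token_id in id_to_token:
            raise ValueError(f"ID {token_id} is assigned to more than one token")
        id_to_token[token_id] = token
    return dict(sorted(id_to_token.items()))


def missing_ids(id_to_token: dict, first: int, last: int) -> list:
    """List the IDs in [first, last] that have no token assigned."""
    return [i for i in range(first, last + 1) if i not in id_to_token]


token_to_id = json.loads(ADDED_TOKENS_JSON)
id_to_token = invert_token_map(token_to_id)

for token_id, token in id_to_token.items():
    print(token_id, token)
```

Run against the full mapping, `missing_ids(id_to_token, 32000, 32127)` should return an empty list, since the 128 entries cover that range without gaps.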