Instructions to use j30231/Llama-3.3-70B-Instruct_Q2_K.gguf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use j30231/Llama-3.3-70B-Instruct_Q2_K.gguf with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="j30231/Llama-3.3-70B-Instruct_Q2_K.gguf", filename="Llama-3.3-70B-Instruct_Q2_K.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use j30231/Llama-3.3-70B-Instruct_Q2_K.gguf with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K # Run inference directly in the terminal: llama-cli -hf j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K # Run inference directly in the terminal: llama-cli -hf j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K # Run inference directly in the terminal: ./llama-cli -hf j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K # Run inference directly in the terminal: ./build/bin/llama-cli -hf j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K
Use Docker
docker model run hf.co/j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K
- LM Studio
- Jan
- Ollama
How to use j30231/Llama-3.3-70B-Instruct_Q2_K.gguf with Ollama:
ollama run hf.co/j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K
- Unsloth Studio new
How to use j30231/Llama-3.3-70B-Instruct_Q2_K.gguf with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for j30231/Llama-3.3-70B-Instruct_Q2_K.gguf to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for j30231/Llama-3.3-70B-Instruct_Q2_K.gguf to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for j30231/Llama-3.3-70B-Instruct_Q2_K.gguf to start chatting
- Pi new
How to use j30231/Llama-3.3-70B-Instruct_Q2_K.gguf with Pi:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "llama-cpp": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use j30231/Llama-3.3-70B-Instruct_Q2_K.gguf with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K
Run Hermes
hermes
- Docker Model Runner
How to use j30231/Llama-3.3-70B-Instruct_Q2_K.gguf with Docker Model Runner:
docker model run hf.co/j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K
- Lemonade
How to use j30231/Llama-3.3-70B-Instruct_Q2_K.gguf with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull j30231/Llama-3.3-70B-Instruct_Q2_K.gguf:Q2_K
Run and chat with the model
lemonade run user.Llama-3.3-70B-Instruct_Q2_K.gguf-Q2_K
List all available models
lemonade list
Quantization : Q2_K (using Llama.cpp)
- llm_load_print_meta: model type = 70B
- llm_load_print_meta: model ftype = Q2_K - Medium
- llm_load_print_meta: model params = 70.55 B
- llm_load_print_meta: model size = 24.56 GiB (2.99 BPW)
- llama_model_loader: - type f32: 162 tensors
- llama_model_loader: - type q2_K: 321 tensors
- llama_model_loader: - type q3_K: 160 tensors
- llama_model_loader: - type q5_K: 80 tensors
- llama_model_loader: - type q6_K: 1 tensors
MMLU Result : 74.89%
Category STEM: 66.09% (18 subjects)
- high_school_chemistry: 64.04%
- high_school_mathematics: 46.67%
- abstract_algebra: 48.00%
- computer_security: 84.00%
- college_computer_science: 61.62%
- college_chemistry: 53.00%
- conceptual_physics: 74.89%
- high_school_statistics: 68.06%
- college_mathematics: 44.00%
- college_biology: 88.19%
- college_physics: 52.94%
- elementary_mathematics: 64.81%
- high_school_biology: 88.71%
- high_school_physics: 57.62%
- machine_learning: 56.25%
- astronomy: 88.16%
- electrical_engineering: 69.66%
- high_school_computer_science: 79.00%
Category humanities: 79.28% (13 subjects)
- world_religions: 84.80%
- high_school_us_history: 89.71%
- moral_disputes: 77.75%
- high_school_world_history: 88.61%
- formal_logic: 62.70%
- international_law: 85.12%
- jurisprudence: 76.85%
- professional_law: 59.58%
- logical_fallacies: 83.44%
- philosophy: 74.28%
- moral_scenarios: 78.66%
- prehistory: 84.26%
- high_school_european_history: 84.85%
Category social sciences: 82.11% (12 subjects)
- high_school_geography: 86.36%
- high_school_psychology: 91.19%
- sociology: 87.56%
- high_school_microeconomics: 86.55%
- professional_psychology: 76.80%
- security_studies: 77.55%
- us_foreign_policy: 91.00%
- public_relations: 70.91%
- high_school_government_and_politics: 93.78%
- econometrics: 61.40%
- human_sexuality: 81.68%
- high_school_macroeconomics: 80.51%
Category other (business, health, misc.): 75.95% (14 subjects)
- virology: 53.61%
- college_medicine: 72.25%
- global_facts: 62.00%
- miscellaneous: 87.36%
- medical_genetics: 84.00%
- human_aging: 78.48%
- nutrition: 83.33%
- marketing: 88.89%
- anatomy: 71.85%
- professional_medicine: 88.24%
- professional_accounting: 56.03%
- management: 82.52%
- clinical_knowledge: 80.75%
- business_ethics: 74.00%
Overall correct rate: 74.89% Total subjects evaluated: 57
Perplexity 6.6865 +/- 0.04336
(using wikitext-2-raw/wiki.test.raw)
- Downloads last month
- 33
2-bit
Model tree for j30231/Llama-3.3-70B-Instruct_Q2_K.gguf
Base model
meta-llama/Llama-3.1-70B