How to use from llama.cpp
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf wolfram/miquliz-120b-GGUF:IQ3_XXS
# Run inference directly in the terminal:
llama cli -hf wolfram/miquliz-120b-GGUF:IQ3_XXS
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf wolfram/miquliz-120b-GGUF:IQ3_XXS
# Run inference directly in the terminal:
llama cli -hf wolfram/miquliz-120b-GGUF:IQ3_XXS
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf wolfram/miquliz-120b-GGUF:IQ3_XXS
# Run inference directly in the terminal:
./llama-cli -hf wolfram/miquliz-120b-GGUF:IQ3_XXS
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf wolfram/miquliz-120b-GGUF:IQ3_XXS
# Run inference directly in the terminal:
./build/bin/llama-cli -hf wolfram/miquliz-120b-GGUF:IQ3_XXS
Use Docker
docker model run hf.co/wolfram/miquliz-120b-GGUF:IQ3_XXS
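Once a server is running from one of the llama-server commands above, it can be queried over its OpenAI-compatible HTTP API. The following is a minimal sketch using only the Python standard library. It assumes the server listens on http://localhost:8080 (adjust `BASE_URL` if you started it with a different host or port) and that the `/v1/chat/completions` route is enabled. The helper names `build_chat_request` and `chat` are illustrative and not part of llama.cpp.

```python
import json
import urllib.request

# Assumption: the server is reachable at this address; change it to match your setup.
BASE_URL = "http://localhost:8080"


def build_chat_request(user_message, base_url=BASE_URL, max_tokens=256):
    """Build (but do not send) a POST request for the chat completions route."""
    payload = {
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_message, **kwargs):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(user_message, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With a server running, `print(chat("Say hello in one sentence."))` should print the model's reply. Building the request separately from sending it makes the payload easy to inspect without a live server.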
miquliz-120b-GGUF

This is a 120b frankenmerge created by interleaving layers of miqu-1-70b-sf with lzlv_70b_fp16_hf using mergekit.

Inspired by goliath-120b.

Thanks for the support, CopilotKit - the open-source platform for building in-app AI Copilots into any product, with any LLM model. Check out their GitHub.

Thanks for the EXL2 and GGUF quants, Lone Striker and NanoByte!

Prompt template: Mistral

<s>[INST] {prompt} [/INST]
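As a small illustration, the template above can be filled in programmatically. This is a sketch, not an official helper: the function name `format_mistral_prompt` and the `include_bos` switch are hypothetical. Some runtimes add the beginning-of-sequence token on their own during tokenization, in which case the literal `<s>` may need to be omitted. Check your runtime's behavior before relying on either setting.

```python
# Template string copied from the model card.
MISTRAL_TEMPLATE = "<s>[INST] {prompt} [/INST]"
BOS = "<s>"


def format_mistral_prompt(prompt, include_bos=True):
    """Wrap a user prompt in the Mistral instruct template.

    include_bos=False drops the literal "<s>" for runtimes that are
    assumed to insert the BOS token themselves.
    """
    text = MISTRAL_TEMPLATE.format(prompt=prompt.strip())
    if not include_bos and text.startswith(BOS):
        text = text[len(BOS):]
    return text


print(format_mistral_prompt("Write a haiku about merging models."))
```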

See also: 🐺🐦‍⬛ LLM Prompt Format Comparison/Test: Mixtral 8x7B Instruct with 17 different instruct templates (r/LocalLLaMA)

Model Details

  • Max Context: 32768 tokens
  • Layers: 137

Merge Details

Merge Method

This model was merged using the passthrough merge method.

Models Merged

The following models were included in the merge:

  • 152334H/miqu-1-70b-sf
  • lizpreciatior/lzlv_70b_fp16_hf

Configuration

The following YAML configuration was used to produce this model:

dtype: float16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 16]
    model: 152334H/miqu-1-70b-sf
- sources:
  - layer_range: [8, 24]
    model: lizpreciatior/lzlv_70b_fp16_hf
- sources:
  - layer_range: [17, 32]
    model: 152334H/miqu-1-70b-sf
- sources:
  - layer_range: [25, 40]
    model: lizpreciatior/lzlv_70b_fp16_hf
- sources:
  - layer_range: [33, 48]
    model: 152334H/miqu-1-70b-sf
- sources:
  - layer_range: [41, 56]
    model: lizpreciatior/lzlv_70b_fp16_hf
- sources:
  - layer_range: [49, 64]
    model: 152334H/miqu-1-70b-sf
- sources:
  - layer_range: [57, 72]
    model: lizpreciatior/lzlv_70b_fp16_hf
- sources:
  - layer_range: [65, 80]
    model: 152334H/miqu-1-70b-sf
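The layer count listed under Model Details can be cross-checked against this configuration. The sketch below assumes each `layer_range` is half-open, so `[start, end]` contributes `end - start` layers. Under that assumption the slices add up to the 137 layers stated above. The `SLICES` list and the function names are illustrative transcriptions of the YAML, not mergekit code.

```python
from collections import Counter

# (start, end, source model) for each slice, transcribed from the YAML above.
SLICES = [
    (0, 16, "152334H/miqu-1-70b-sf"),
    (8, 24, "lizpreciatior/lzlv_70b_fp16_hf"),
    (17, 32, "152334H/miqu-1-70b-sf"),
    (25, 40, "lizpreciatior/lzlv_70b_fp16_hf"),
    (33, 48, "152334H/miqu-1-70b-sf"),
    (41, 56, "lizpreciatior/lzlv_70b_fp16_hf"),
    (49, 64, "152334H/miqu-1-70b-sf"),
    (57, 72, "lizpreciatior/lzlv_70b_fp16_hf"),
    (65, 80, "152334H/miqu-1-70b-sf"),
]


def count_layers(slices):
    """Total layers in the merged model, assuming half-open layer ranges."""
    return sum(end - start for start, end, _ in slices)


def layers_per_model(slices):
    """Number of layers each source model contributes."""
    counts = Counter()
    for start, end, model in slices:
        counts[model] += end - start
    return dict(counts)


print(count_layers(SLICES))      # 137
print(layers_per_model(SLICES))  # 76 from miqu-1-70b-sf, 61 from lzlv_70b_fp16_hf
```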

Credits & Special Thanks

Support

  • My Ko-fi page if you'd like to tip me to say thanks or request specific models to be tested or merged with priority. Also consider supporting your favorite model creators, quantizers, or frontend/backend devs if you can afford to do so. They deserve it!

DISCLAIMER: THIS IS BASED ON A LEAKED ASSET AND HAS NO LICENSE ASSOCIATED WITH IT. USE AT YOUR OWN RISK.

GGUF

  • Model size: 118B params
  • Architecture: llama
  • Quantization: 3-bit
