Instructions to use Inv/MoECPM-Untrained-4x2b with libraries and local apps.

Libraries

How to use Inv/MoECPM-Untrained-4x2b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Inv/MoECPM-Untrained-4x2b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Inv/MoECPM-Untrained-4x2b")
# This is a text-only model, so the causal-LM auto class is the appropriate loader
model = AutoModelForCausalLM.from_pretrained("Inv/MoECPM-Untrained-4x2b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Inv/MoECPM-Untrained-4x2b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Inv/MoECPM-Untrained-4x2b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Inv/MoECPM-Untrained-4x2b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
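
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are hypothetical, and the default `base_url` assumes the vLLM server started above is listening on port 8000.

```python
import json
import urllib.request


def build_chat_request(prompt, model="Inv/MoECPM-Untrained-4x2b",
                       base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, once the server is up:
#   reply = send_chat_request(build_chat_request("What is the capital of France?"))
#   print(reply["choices"][0]["message"]["content"])
```

For the SGLang server described further down, passing `base_url="http://localhost:30000"` should target the same route.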

Use Docker

docker model run hf.co/Inv/MoECPM-Untrained-4x2b

SGLang

How to use Inv/MoECPM-Untrained-4x2b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Inv/MoECPM-Untrained-4x2b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Inv/MoECPM-Untrained-4x2b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Inv/MoECPM-Untrained-4x2b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Inv/MoECPM-Untrained-4x2b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use Inv/MoECPM-Untrained-4x2b with Docker Model Runner:

```
docker model run hf.co/Inv/MoECPM-Untrained-4x2b
```

MoECPM-Untrained-4x2b / README.md

Inv

Adding Evaluation Results (#1)

2c50339 verified over 2 years ago

metadata

language:
  - en
  - zh
license: apache-2.0
tags:
  - Mixtral
  - openbmb/MiniCPM-2B-sft-bf16-llama-format
  - MoE
  - merge
  - mergekit
  - moerge
  - MiniCPM
base_model:
  - openbmb/MiniCPM-2B-sft-bf16-llama-format
model-index:
  - name: MoECPM-Untrained-4x2b
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 46.76
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Inv/MoECPM-Untrained-4x2b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 72.58
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Inv/MoECPM-Untrained-4x2b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 53.21
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Inv/MoECPM-Untrained-4x2b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 38.41
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Inv/MoECPM-Untrained-4x2b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 65.51
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Inv/MoECPM-Untrained-4x2b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 44.58
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Inv/MoECPM-Untrained-4x2b
          name: Open LLM Leaderboard

MoECPM Untrained 4x2b

Model Details

Model Description

A Mixture-of-Experts (MoE) model assembled from four MiniCPM-2B-sft models. It is intended as a starting point for further training. This untrained version probably does not perform well, if it works at all; I haven't tested it.
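
As a toy illustration of what "untrained" means here, the sketch below routes a single scalar "token" through four experts with a randomly initialized router. It rests on two assumptions that this card does not confirm: Mixtral-style top-2 routing (suggested only by the `Mixtral` tag) and four experts that start as identical copies of the base model. All names are hypothetical, and the experts are stand-in functions, not real networks.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    chosen = order[:k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_output(token, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs."""
    return sum(w * experts[i](token) for i, w in route_top_k(router_logits, k))


# Assumption: four identical experts, standing in for copies of one base model.
experts = [lambda x: 2.0 * x for _ in range(4)]

# An untrained router is modelled here as random logits.
rng = random.Random(0)
random_logits = [rng.gauss(0.0, 1.0) for _ in experts]

result = moe_output(3.0, experts, random_logits)
print(result)
```

Under these assumptions the routing weights sum to one and every expert returns the same value, so the random router does not change the output. The router only begins to matter once training lets the experts diverge.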

Uses

Training

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	53.51
AI2 Reasoning Challenge (25-Shot)	46.76
HellaSwag (10-Shot)	72.58
MMLU (5-Shot)	53.21
TruthfulQA (0-shot)	38.41
Winogrande (5-shot)	65.51
GSM8k (5-shot)	44.58
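
The `Avg.` row is the unweighted mean of the six benchmark scores. A short check, using only the values from the table above (the variable names are illustrative):

```python
# Scores copied from the table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 46.76,
    "HellaSwag (10-Shot)": 72.58,
    "MMLU (5-Shot)": 53.21,
    "TruthfulQA (0-shot)": 38.41,
    "Winogrande (5-shot)": 65.51,
    "GSM8k (5-shot)": 44.58,
}

# Unweighted mean, rounded to two decimals as in the table.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 53.51
```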