Instructions to use liminerity/mm4-3b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use liminerity/mm4-3b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="liminerity/mm4-3b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("liminerity/mm4-3b")
model = AutoModelForCausalLM.from_pretrained("liminerity/mm4-3b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use liminerity/mm4-3b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "liminerity/mm4-3b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "liminerity/mm4-3b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
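
The same call can be made from Python. The sketch below uses only the standard library and assumes the server from the previous step is listening on `http://localhost:8000`. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM; the response parsing assumes the usual OpenAI-style chat completion layout (`choices[0].message.content`).

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000", model="liminerity/mm4-3b"):
    """Build the same POST request that the curl command sends (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-style response body: choices[0].message.content.
    """
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("What is the capital of France?")` returns the model's reply as a string. Passing `base_url="http://localhost:30000"` should target a server started on port 30000 instead.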

Use Docker

docker model run hf.co/liminerity/mm4-3b

SGLang

How to use liminerity/mm4-3b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "liminerity/mm4-3b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "liminerity/mm4-3b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "liminerity/mm4-3b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "liminerity/mm4-3b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use liminerity/mm4-3b with Docker Model Runner:
```
docker model run hf.co/liminerity/mm4-3b
```

MM4-3b

A Llama-based model I made through extensive training and merging. I'll explain later; I literally made so many models today.

Title: Divergent Knowledge Enhancement through Retrograde Merging Strategies: Redefining Accuracy Perspectives in Language Model Evolution

Abstract: Have you picked up any bad habits, or have you ever learned to do something incorrectly, only to realize you must completely relearn whatever it is you're trying to accomplish? In this proposal, we present an innovative and unconventional approach to enhancing the performance and knowledge base of natural language models. Our proposed method, titled 'Divergent Knowledge Enhancement through Retrograde Merging Strategies' (DKE-RS), aims to challenge traditional practices in model development by incorporating a deliberate back-and-forth merger between high and low accuracy language models.

The initial conceptualization of DKE-RS stemmed from the realization that learning often encompasses both acquisition and unlearning, as encapsulated by the quote, "learning is just as sacred as unlearning." The proposed technique starts from a baseline model, 'blur-7b,' which attains an accuracy of 72.1%, and merges it with a Mistral model fine-tuned on the Dolphin dataset that achieves only 46% accuracy.

By deliberately merging with less accurate models and retracing the evolutionary process, DKE-RS aims to broaden the knowledge base of the resulting model. This strategy, dubbed 'making the bad good,' intentionally degrades the initial accuracy in an effort to refine it, thus breaking conventional iterative improvements for innovative progression.
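
To make the back-and-forth idea concrete, here is a toy sketch. This card does not say which merge algorithm was used, so the example assumes plain linear interpolation of weights; the function `linear_merge`, the `alpha` parameter, and the tiny "checkpoints" are all hypothetical stand-ins, not the actual recipe behind mm4-3b.

```python
def linear_merge(weights_a, weights_b, alpha=0.5):
    """Interpolate two checkpoints that share parameter names and shapes.

    Assumption: merging is a simple weighted average,
    alpha * a + (1 - alpha) * b, applied parameter by parameter.
    """
    if weights_a.keys() != weights_b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    merged = {}
    for name, values_a in weights_a.items():
        values_b = weights_b[name]
        if len(values_a) != len(values_b):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [alpha * a + (1 - alpha) * b for a, b in zip(values_a, values_b)]
    return merged


# Toy stand-ins for a higher-accuracy and a lower-accuracy model.
strong = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
weak = {"layer.weight": [3.0, 0.0], "layer.bias": [1.0]}

# Step 1: deliberately merge with the weaker model.
step_1 = linear_merge(strong, weak, alpha=0.5)
# Step 2: merge the result back toward the stronger model.
step_2 = linear_merge(step_1, strong, alpha=0.5)
```

In practice the same loop would run over real tensors (for example, the entries of two models' state dicts), and the merge order and ratios would be the knobs to experiment with.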

The DKE-RS method challenges the status quo by not solely relying on a linear enhancement trajectory, instead adopting a more holistic and diverse approach. We anticipate that this non-linear merger process will further diversify the model's knowledge base, thereby creating a more resilient and well-rounded language generation tool, capable of handling complex contexts with a broader understanding.

Through thorough experimentation and analysis, we plan to assess the effectiveness and potential drawbacks of DKE-RS, comparing it to traditional merging techniques. The results from such evaluations will provide valuable insights into the efficacy of this divergent strategy in the landscape of natural language model development.

We posit that the Divergent Knowledge Enhancement through Retrograde Merging Strategies approach is a significant and compelling step forward in the field, inviting discourse about the nature of accuracy refinement and model progression.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 53.22 |
| AI2 Reasoning Challenge (25-Shot) | 44.80 |
| HellaSwag (10-Shot) | 70.41 |
| MMLU (5-Shot) | 50.90 |
| TruthfulQA (0-shot) | 43.20 |
| Winogrande (5-shot) | 66.22 |
| GSM8k (5-shot) | 43.82 |
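
As a quick sanity check, the "Avg." row appears to be the unweighted mean of the six benchmark scores. The snippet below recomputes it from the values listed above; the dictionary keys are shortened labels chosen for this example.

```python
# Scores copied from the results listed above.
scores = {
    "ARC (25-shot)": 44.80,
    "HellaSwag (10-shot)": 70.41,
    "MMLU (5-shot)": 50.90,
    "TruthfulQA (0-shot)": 43.20,
    "Winogrande (5-shot)": 66.22,
    "GSM8k (5-shot)": 43.82,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.3f}")  # prints 53.225
```

The mean works out to 53.225, which matches the listed 53.22 up to rounding.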

Evaluation results

All results are from the Open LLM Leaderboard.

| Benchmark | Metric | Split | Value |
|---|---|---|---|
| AI2 Reasoning Challenge (25-Shot) | normalized accuracy | test | 44.80 |
| HellaSwag (10-Shot) | normalized accuracy | validation | 70.41 |
| MMLU (5-Shot) | accuracy | test | 50.90 |
| TruthfulQA (0-shot) | mc2 | validation | 43.20 |
| Winogrande (5-shot) | accuracy | validation | 66.22 |
| GSM8k (5-shot) | accuracy | test | 43.82 |