Instructions to use ToastyPigeon/MS3-24B-MarbleRye with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use ToastyPigeon/MS3-24B-MarbleRye with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ToastyPigeon/MS3-24B-MarbleRye")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMultimodalLM

tokenizer = AutoTokenizer.from_pretrained("ToastyPigeon/MS3-24B-MarbleRye")
model = AutoModelForMultimodalLM.from_pretrained("ToastyPigeon/MS3-24B-MarbleRye")

Inference
Notebooks
Google Colab
Kaggle
Local Apps Settings

vLLM

How to use ToastyPigeon/MS3-24B-MarbleRye with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ToastyPigeon/MS3-24B-MarbleRye"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ToastyPigeon/MS3-24B-MarbleRye",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/ToastyPigeon/MS3-24B-MarbleRye

SGLang

How to use ToastyPigeon/MS3-24B-MarbleRye with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ToastyPigeon/MS3-24B-MarbleRye" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ToastyPigeon/MS3-24B-MarbleRye",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ToastyPigeon/MS3-24B-MarbleRye" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ToastyPigeon/MS3-24B-MarbleRye",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use ToastyPigeon/MS3-24B-MarbleRye with Docker Model Runner:
```
docker model run hf.co/ToastyPigeon/MS3-24B-MarbleRye
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Marble Rye

Y'know, 'cause it's like a bread made with different colors.

This was mixed under the assumption that Sisyphus was an instruct model (later revealed to have accidentally been Ink again). But it still turned out pretty fun, just not as smart as it might have been otherwise. I might re-do it with the actual instruct model Sertraline when I have the time to test properly.

Should have some decent creative potential, with niche subject knowledge (from Roselily + Forgotten Safeword), and minimal god mode/plot armor issues (from DangerousWinds).

Instruct format is Tekken v7 (same as Mistral Small Instruct). Should also work with something like Alpaca or text completion (and possibly ChatML given the inclusion of Roselily).

Merge Details

Merge Method

This model was merged using the Model Breadcrumbs with TIES merge method using unsloth/Mistral-Small-24B-Base-2501 as a base.

Models Merged

The following models were included in the merge:

Configuration

The following YAML configuration was used to produce this model:

merge_method: breadcrumbs_ties
base_model: unsloth/Mistral-Small-24B-Base-2501
models:
  - model: allura-org/MS3-24B-Roselily-Creative
    parameters:
      weight: 0.7
  - model: allura-org/Mistral-Small-Sisyphus-24b-2503
    parameters:
      weight: 1.0
  - model: ReadyArt/Forgotten-Safeword-24B
    parameters:
      weight: 0.2
  - model: PocketDoc/Dans-DangerousWinds-V1.1.1-24b
    parameters:
      weight: 0.2
  - model: trashpanda-org/MS-24B-Mullein-v0
    parameters:
      weight: 0.2
parameters:
  density: 0.95
  gamma: 0.01
tokenizer_source: allura-org/MS3-24B-Roselily-Creative