aaronday3 committed on Aug 4, 2024

Commit 45e1a33 (verified) · 1 Parent(s): cae5b86

Update README.md

Files changed (1)

README.md +32 -13

README.md CHANGED

@@ -15,17 +15,35 @@ license: apache-2.0
     margin-bottom: 0.5em;
   }
   h1 {
-    font-size: 2em;
+    font-size: 3em;
   }
   h2 {
-    font-size: 1.3em;
+    font-size: 1.6em;
   }
-  p, ul, ol, summary {
+  p, ul, ol, strong, summary {
     font-size: 1.1em;
   }
+  .line-spaceless {
+    line-height: 1;
+    margin: 0;
+    padding: 0;
+  }
+  .half-space {
+    line-height: 0.5em;
+    margin-bottom: 0.25em;
+  }
+  .text-center {
+      text-align: center;
+  }
+  .tiny-text {
+    font-size: 0.8em;
+  }
 </style>
-<h1>Mistral Nemo 12B Celeste V1.9</h1>
+<h1 class="line-spaceless text-center">Celeste V1.9</h1>
+<p class="half-space text-center tiny-text">Based on Mistral Nemo 12B</p>
 <h2 style="color: red; font-weight: bold;">Read the Usage Tips Below! Use ChatML.</h2><h2>Join <a href="https://discord.gg/EWzsFddYAd">our Discord</a> for testing newer versions and news! We are also on KoboldAI</h2>
 <img src="https://cdn-uploads.huggingface.co/production/uploads/630cf5d14ca0a22768bbe10c/QcU3xEgVu18jeFtMFxIw-.webp" alt="" width="800"/>
@@ -75,10 +93,10 @@ If one doesn't work, try the other.
 I usually start the first few messages with Stable and see how it goes. If it falls into repetition I switch to Creative. But you can also just use either the whole way through, creative may need a few swipes from time to time.
-<h3>Stable</h3>
+<strong>> Stable</strong>
 <img src="https://cdn-uploads.huggingface.co/production/uploads/630cf5d14ca0a22768bbe10c/1m18WnuomY8jEZTA87Iun.png" alt="" width="400"/>
-<h3>Creative</h3>
+<strong>> Creative</strong>
 <img src="https://cdn-uploads.huggingface.co/production/uploads/630cf5d14ca0a22768bbe10c/DaL2hWZst0yW34CYK4df8.png" alt="" width="400"/>
 Don't shy away from experimenting after you get a feel for the model though.
@@ -96,16 +114,17 @@ Currently, your role is {{char}}, described in detail below. As {{char}}, contin
 <h2>Story Writing</h2>
-Adding this system prompt will likely increase the humanness of the prose as we trained system prompts. You can also change it to NSFW, but you should try both regardless of whether you are writing NSFW or not.<br>
-You should also force the assistant reply to start with a `*` due to how we trained on human stories.
-System Prompt: `You are a short story writer. Write a story based on prompt provided by user below. Mode: SFW`
+**Adding the below system prompt will likely increase the humanness of the prose** as we trained system prompts. You can also change it to NSFW, but you should try SFW regardless of whether you are writing NSFW or not.<br>
+You should also try forcing the assistant reply to start with a `*` due to how we trained on human stories.
+```
+You are a short story writer. Write a story based on prompt provided by user below. Mode: SFW`
+```
 If your first message is using human-like prose, Celeste will copy it in the next messages, check out the Showcase below.
 <h2>Swipes</h2>
-**Important tip** swipe 2-3 times if you dont like a response. This model gives wildly differing swipes.
+**Important: swipe 2-3 times if you dont like a response** This model gives wildly differing swipes.
 <h2>OOC Steering</h2>
@@ -113,7 +132,7 @@ If your first message is using human-like prose, Celeste will copy it in the nex
 <h2>"Dead Dove"</h2>
-For character cards with persistent motivations throughout the story, use world books [tutorial here](https://huggingface.co/nothingiisreal/how-to-use-ST-worldinfo)
+For character cards with persistent motivations throughout the story, use world books at low depth [tutorial here](https://huggingface.co/nothingiisreal/how-to-use-ST-worldinfo)
 <h2>Fewshot</h2>
@@ -132,7 +151,7 @@ If you want only SFW and are having troubles, there is probably some system prom
 <h2>Refusals</h2>
 As said, if instruct refusal (very rare,) prefill 2-3 words. **Refusal of romantic advances (which almost never happens on 12B,) are realistic and we think is good. Prefill if you don't like.** <br>
-<br>
 <h2>Mistral Context</h2>
 While trained on 8K, the model should be able to inherit longer context from Mistral 12B. Should be at minimum 16K.
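
The Story Writing tips in this README (use ChatML, add the story-writer system prompt, and start the assistant reply with a `*`) can be combined into a single raw prompt. The following is a minimal sketch, not the authors' own tooling: it assumes the conventional ChatML delimiters `<|im_start|>` and `<|im_end|>`, and the helper name `build_chatml_prompt` and the sample user message are hypothetical. The system prompt text is the one quoted in the README, without the stray trailing backtick.

```python
# Assumed ChatML delimiters; the README only says "Use ChatML" and does not spell them out.
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"

# System prompt quoted in the README's Story Writing section.
STORY_SYSTEM_PROMPT = (
    "You are a short story writer. "
    "Write a story based on prompt provided by user below. Mode: SFW"
)


def build_chatml_prompt(system: str, user: str, prefill: str = "*") -> str:
    """Return a ChatML prompt whose assistant turn is left open after `prefill`."""
    turns = [("system", system), ("user", user)]
    closed = "".join(f"{IM_START}{role}\n{text}{IM_END}\n" for role, text in turns)
    # The assistant turn is deliberately not closed, so generation continues from the prefill.
    return f"{closed}{IM_START}assistant\n{prefill}"


if __name__ == "__main__":
    # Hypothetical user message, for illustration only.
    print(build_chatml_prompt(STORY_SYSTEM_PROMPT, "A lighthouse keeper finds a message in a bottle."))
```

The same helper would cover the Refusals tip: passing two or three opening words as `prefill` instead of `*` leaves the assistant turn open at that point. The resulting string is meant for a raw text-completion call, since a chat endpoint would normally apply its own template.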