solar-kor-resume
Update @ 2024.05.27: First release of Ocelot-Ko-self-instruction-10.8B-v1.0
This model card corresponds to the 10.8B Instruct version of the Solar-Ko model.
Training was done on A100-80GB.
Resources and Technical Documentation:
Citation
@misc {cpm-ai/Ocelot-Ko-self-instruction-10.8B-v1.0,
author = { {frcp, nebchi} },
title = { solar-kor-resume},
year = 2024,
url = { https://huggingface.co/cpm-ai/Ocelot-Ko-self-instruction-10.8B-v1.0 },
publisher = { Hugging Face }
}
Model Developers: frcp, nebchi
Model Information
Resume proofreading and evaluation, with a brief description of inputs and outputs.
Description
The model has been trained on a larger number of Korean tokens than other LLMs, enabling it to generate high-quality Korean text.
Model Architecture: Solar is an auto-regressive language model scaled with the depth up-scaling (DUS) method.
*You can find the dataset list here: https://huggingface.co/datasets/cpm-ai/gpt-self-introduction-all
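As a rough illustration of depth up-scaling, the sketch below works purely on layer indices. The 32-layer base, the 8 layers trimmed from each copy, and the resulting 48-layer stack follow the published SOLAR 10.7B recipe; they are assumptions about this checkpoint rather than something this card states, and the function name `depth_up_scale` is hypothetical.

```python
def depth_up_scale(num_base_layers: int, num_trimmed: int) -> list[int]:
    """Return base-model layer indices in the order they appear after DUS.

    Assumed recipe (SOLAR 10.7B paper, not confirmed by this card):
    duplicate the base stack, drop the last `num_trimmed` layers from the
    first copy and the first `num_trimmed` layers from the second copy,
    then concatenate the two copies.
    """
    if not 0 <= num_trimmed <= num_base_layers:
        raise ValueError("num_trimmed must be between 0 and num_base_layers")
    first_copy = list(range(num_base_layers - num_trimmed))
    second_copy = list(range(num_trimmed, num_base_layers))
    return first_copy + second_copy


layers = depth_up_scale(num_base_layers=32, num_trimmed=8)
print(len(layers))    # 48
print(layers[22:26])  # [22, 23, 8, 9]
```

The seam in the middle of the stack (layer 23 followed by layer 8) is where the two copies meet; continued pretraining is what smooths over that discontinuity.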
Inputs and outputs
- Input: Text string, such as a question, a prompt, or a document to be proofread.
- Output: Generated Korean text in response to the input, such as an answer to a question or an evaluation of a resume.
Running the model on a single / multi GPU
# pip install accelerate flash_attn sentencepiece
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

tokenizer = AutoTokenizer.from_pretrained("cpm-ai/Ocelot-Ko-self-instruction-10.8B-v1.0")
model = AutoModelForCausalLM.from_pretrained("cpm-ai/Ocelot-Ko-self-instruction-10.8B-v1.0", device_map="auto")

# Stream generated tokens to stdout as they are produced.
streamer = TextStreamer(tokenizer)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, max_new_tokens=4096, streamer=streamer)

# Korean prompt. In English: "You are an expert at proofreading self-introduction
# letters. Proofread the given letter and rewrite it. Follow the output format
# below: [Proofreading]. The following is the letter: [...]"
text = """너는 자기소개서 첨삭 전문가야.
주어진 자기소개서를 첨삭해서 다시 작성해야해.
출력형식은 다음을 지켜야해.
[첨삭]
다음이 자기소개서야 :
[저는 어린 시절부터 완벽주의적인 성격을 가지고 있었습니다. 이로 인해 항상 자신의 능력에 대한 불안감을 느끼며 과도한 스트레스를 받아왔습니다. 학창 시절에는 과제나 프로젝트를 완벽하게 마무리하지 못하면 자존감이 크게 흔들렸습니다. 중학교 시절에는 한 가지 문제에 너무 오랜 시간을 투자하여 다른 학습 기회를 놓치기도 했습니다. 이러한 경험들은 저에게 완벽함을 추구하는 것이 종종 현실에 부적합하다는 것을 깨달게 했습니다.
고등학교와 대학교에 진학하면서도 이러한 완벽주의적인 성격을 극복하기 위해 노력했습니다. 하지만 여전히 실패를 받아들이는 것이 어렵고, 자신의 한계를 인정하는 것이 어려웠습니다. 이러한 과정을 통해 완벽함에 대한 강박이 저의 성장과정을 제약하는 요인이 되었음을 깨달았습니다.]"""

messages = [
    {
        "role": "user",
        "content": "{}".format(text)
    }
]

# Render the chat template to a plain string so the pipeline receives the full prompt.
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

outputs = pipe(
    prompt,
    do_sample=True,  # temperature only takes effect when sampling is enabled
    temperature=0.2,
    add_special_tokens=True
)
# Strip the prompt so only the newly generated text is printed.
print(outputs[0]["generated_text"][len(prompt):])
Results
[첨삭]
어린 시절부터 저는 완벽한 결과를 추구하며 스스로를 압박해온 성격이었습니다. 이는 학업과 관련된 스트레스로 이어져, 과제를 완수하더라도 만족보다는 불만족의 감정이 더 컸던 시기였습니다. 특히 중학교 때 한 문제를 해결하기 위해 지나치게 오랫동안 매달려 결국 중요한 시기를 놓친 경험은 제 성장에 큰 영향을 미쳤습니다. 이 과정에서 완벽주의를 추구하는 것이 현실적이지 않을 수 있다는 사실을 깨닫기 시작했습니다.
고등학교와 대학에서는 이러한 성향을 개선하고자 다양한 노력을 기울였습니다. 예를 들어, 목표를 세분화하고 단계별로 접근하면서 성취감과 자신감을 키우기 위해 노력했습니다. 또한, 팀 프로젝트에서 역할을 분담하고 협력함으로써 개인의 한계보다 전체 성과를 우선시하는 법을 배웠습니다. 비록 아직 완벽함이라는 굴레로부터 완전히 자유로워지지는 못했지만, 이를 극복하고 성장할 수 있는 방법을 찾았다는 점에서 자부심을 느낍니다.
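To reuse the prompt format for other resumes, the instruction and the response marker can be wrapped in two small helpers. This is a minimal sketch, not part of the released code: `build_messages` and `extract_revision` are hypothetical names, and it assumes the model echoes the `[첨삭]` ("proofreading") marker before its rewritten text, as in the sample output above.

```python
# Instruction lines taken from the example prompt in this card.
INSTRUCTION = (
    "너는 자기소개서 첨삭 전문가야.\n"
    "주어진 자기소개서를 첨삭해서 다시 작성해야해.\n"
    "출력형식은 다음을 지켜야해.\n"
    "[첨삭]\n"
    "다음이 자기소개서야 :\n"
)
MARKER = "[첨삭]"


def build_messages(resume_text: str) -> list[dict]:
    """Wrap a resume in the proofreading instruction as a single user turn."""
    content = f"{INSTRUCTION}[{resume_text.strip()}]"
    return [{"role": "user", "content": content}]


def extract_revision(generated: str) -> str:
    """Return the text after the last marker, or the whole text if it is absent.

    The last occurrence is used because the prompt itself contains the marker,
    so generated text that still includes the prompt has it more than once.
    """
    _, found, tail = generated.rpartition(MARKER)
    return (tail if found else generated).strip()


# Made-up strings for illustration only; they are not model output.
messages = build_messages("  저는 꼼꼼한 성격입니다.  ")
print(messages[0]["content"].endswith("[저는 꼼꼼한 성격입니다.]"))  # True
print(extract_revision("[첨삭]\n저는 세심한 성격을 지녔습니다."))
```

The returned `messages` list has the same shape as the one passed to `apply_chat_template` in the example above.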
Evaluation Results - LogicKor
| Model | Writing | Understanding | Grammar |
|---|---|---|---|
| HyperClovaX | 8.50 | 9.50 | 8.50 |
| solar-1-mini-chat | 8.50 | 7.00 | 5.21 |
| allganize/Llama-3-Alpha-Ko-8B-Instruct | 8.50 | 8.35 | 4.92 |
| Synatra-kiqu-7B | 4.42 | 5.71 | 4.50 |
| Ocelot-ko-10.8B | 8.57 | 7.00 | 6.57 |
Evaluation Results - KoBEST
| Model | Average (n=0) | Average (n=5) | HellaSwag (n=0) | HellaSwag (n=5) | COPA (n=0) | COPA (n=5) | BoolQ (n=0) | BoolQ (n=5) |
|---|---|---|---|---|---|---|---|---|
| KoGPT | 58.2 | 63.7 | 55.9 | 58.3 | 73.5 | 72.9 | 45.1 | 59.8 |
| Polyglot-ko-13B | 62.4 | 68.2 | 59.5 | 63.1 | 79.4 | 81.1 | 48.2 | 60.4 |
| LLaMA 2-13B | 45.2 | 60.5 | 41.3 | 44.0 | 59.3 | 63.8 | 34.9 | 73.8 |
| Baichuan 2-13B | 52.7 | 53.9 | 39.2 | 39.6 | 60.6 | 60.6 | 58.4 | 61.5 |
| QWEN-14B | 47.8 | 66.4 | 45.3 | 46.8 | 64.9 | 68.9 | 33.4 | 83.5 |
| Orion-14B-Chat | 68.8 | 73.2 | 47.0 | 49.6 | 77.7 | 79.4 | 81.6 | 90.7 |
| Ocelot-ko-10.8B | 72.5 | 75.9 | 50.0 | 51.4 | 75.8 | 82.5 | 91.7 | 93.8 |
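The Average column appears to be the unweighted mean of the three task scores. That is an inference from the numbers, not something the card states. A short sketch that recomputes it for the Ocelot-ko-10.8B row, with `task_average` as a hypothetical helper name:

```python
from statistics import mean

# Zero-shot (n=0) and five-shot (n=5) scores for Ocelot-ko-10.8B, copied from the table.
scores = {
    "n=0": {"HellaSwag": 50.0, "COPA": 75.8, "BoolQ": 91.7},
    "n=5": {"HellaSwag": 51.4, "COPA": 82.5, "BoolQ": 93.8},
}


def task_average(task_scores: dict[str, float]) -> float:
    """Unweighted mean over tasks, rounded to one decimal place like the table."""
    return round(mean(task_scores.values()), 1)


averages = {shots: task_average(values) for shots, values in scores.items()}
print(averages)  # {'n=0': 72.5, 'n=5': 75.9}
```

Both values match the reported averages for this row.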
Software
Training was done using QLoRA.
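QLoRA keeps the base weights frozen in 4-bit precision and trains only small low-rank adapter matrices. As a back-of-the-envelope sketch of why that is cheap, the snippet below counts the parameters a rank-r adapter adds to one weight matrix. The 4096 x 4096 projection size and the rank of 16 are illustrative assumptions; the card does not state the adapter settings used for this model, and `lora_adapter_params` is a hypothetical helper.

```python
def lora_adapter_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


# Assumed sizes for illustration only.
full = 4096 * 4096
adapter = lora_adapter_params(d_in=4096, d_out=4096, rank=16)
print(adapter)                  # 131072
print(f"{adapter / full:.2%}")  # 0.78%
```

Under these assumptions the adapter trains under one percent of the parameters of the matrix it modifies.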