Kanana
🤗 Models | Blog | Technical Report | 💻 Github
Introduction
We introduce Kanana, a series of bilingual language models developed by Kakao that demonstrate strong performance in Korean and competitive performance in English. The computational cost of Kanana is significantly lower than that of state-of-the-art models of similar size. The report details the techniques employed during pre-training to achieve compute-efficient yet competitive models, including high-quality data filtering, staged pre-training, depth up-scaling, and pruning and distillation. Furthermore, the report outlines the methodologies used during the post-training of the Kanana models, encompassing supervised fine-tuning and preference optimization, aimed at enhancing their capability for seamless interaction with users. Lastly, the report elaborates on plausible approaches used to adapt language models to specific scenarios, such as embedding, function calling, and Retrieval Augmented Generation (RAG). The Kanana model series spans from 2.1B to 32.5B parameters, with the 2.1B models (base, instruct, embedding, function call, and RAG) publicly released to promote research on Korean language models.
Neither the pre-training nor the post-training data includes Kakao user data.
News
- 2025/02/27: Released Technical Report and 🤗 HF model weights.
- 2025/01/10: Published a blog post about the development of the Kanana-Nano model. (Kanana-Nano)
- 2024/11/14: Published blog posts about the development of the Kanana models. (Kanana LLM: Pre-training, Kanana LLM: Post-training)
- 2024/11/06: Published a presentation video about the development of the Kanana models. (if(kakaoAI)2024)
Performance
Below is a partial report on the performance of the Kanana model series. Please refer to the Technical Report for the full results.
Pre-trained Model Performance
| Models | MMLU | KMMLU | HAERAE | HumanEval | MBPP | GSM8K |
|---|---|---|---|---|---|---|
| 27b+ scale | | | | | | |
| Kanana-Flag-32.5b | 77.68 | 62.10 | 90.47 | 51.22 | 63.40 | 70.05 |
| Qwen2.5-32b | 83.10 | 63.15 | 75.16 | 50.00 | 73.40 | 82.41 |
| Gemma-2-27b | 75.45 | 51.16 | 69.11 | 51.22 | 64.60 | 74.37 |
| EXAONE-3.5-32b | 72.68 | 46.36 | 82.22 | - | - | - |
| Aya-Expanse-32b | 74.52 | 49.57 | 80.66 | - | - | - |
| 7b+ scale | | | | | | |
| Kanana-Essence-9.8b | 67.61 | 50.57 | 84.98 | 40.24 | 53.60 | 63.61 |
| Llama-3.1-8b | 65.18 | 41.02 | 61.78 | 35.37 | 48.60 | 50.87 |
| Qwen2.5-7b | 74.19 | 51.68 | 67.46 | 56.71 | 63.20 | 83.85 |
| Gemma-2-9b | 70.34 | 48.18 | 66.18 | 37.20 | 53.60 | 68.16 |
| EXAONE-3.5-7.8b | 65.36 | 45.30 | 77.54 | - | - | - |
| Aya-Expanse-8b | 62.52 | 40.11 | 71.95 | - | - | - |
| 2b+ scale | | | | | | |
| Kanana-Nano-2.1b | 54.83 | 44.80 | 77.09 | 31.10 | 46.20 | 46.32 |
| Llama-3.2-3b | 56.40 | 35.57 | 47.66 | 25.61 | 39.00 | 27.37 |
| Qwen2.5-3b | 65.57 | 45.28 | 61.32 | 37.80 | 55.60 | 69.07 |
| Gemma-2-2b | 52.89 | 30.67 | 45.55 | 20.12 | 28.20 | 24.72 |
| EXAONE-3.5-2.4b | 59.27 | 43.58 | 69.65 | - | - | - |
| 70b+ scale | | | | | | |
| Llama-3.1-70b | 78.93 | 53.00 | 76.35 | 57.32 | 66.60 | 81.73 |
| Qwen2.5-72b | 86.12 | 68.57 | 80.84 | 55.49 | 76.40 | 92.04 |
Post-trained Model Performance
Instruction-following Benchmarks
| Models | MT-Bench | LogicKor | KoMT-Bench | WildBench | IFEval |
|---|---|---|---|---|---|
| 27b+ scale | | | | | |
| Kanana-Flag-32.5b | 8.356 | 9.524 | 8.058 | 54.14 | 0.856 |
| Qwen2.5-32b | 8.331 | 8.988 | 7.847 | 51.13 | 0.822 |
| Gemma-2-27b | 8.088 | 8.869 | 7.373 | 46.46 | 0.817 |
| EXAONE-3.5-32b | 8.375 | 9.202 | 7.907 | 54.30 | 0.845 |
| Aya-Expanse-32b | 7.788 | 8.941 | 7.626 | 48.36 | 0.735 |
| 7b+ scale | | | | | |
| Kanana-Essence-9.8b | 7.769 | 8.964 | 7.706 | 47.27 | 0.799 |
| Llama-3.1-8b | 7.500 | 6.512 | 5.336 | 33.20 | 0.772 |
| Qwen2.5-7b | 7.625 | 7.952 | 6.808 | 41.31 | 0.760 |
| Gemma-2-9b | 7.633 | 8.643 | 7.029 | 40.92 | 0.750 |
| EXAONE-3.5-7.8b | 8.213 | 9.357 | 8.013 | 50.98 | 0.826 |
| Aya-Expanse-8b | 7.131 | 8.357 | 7.006 | 38.50 | 0.645 |
| 2b+ scale | | | | | |
| Kanana-Nano-2.1b | 6.400 | 7.964 | 5.857 | 25.41 | 0.720 |
| Llama-3.2-3b | 7.050 | 4.452 | 3.967 | 21.91 | 0.767 |
| Qwen2.5-3b | 6.969 | 6.488 | 5.274 | 25.76 | 0.355 |
| Gemma-2-2b | 7.225 | 5.917 | 4.835 | 28.71 | 0.428 |
| EXAONE-3.5-2.4b | 7.919 | 8.941 | 7.223 | 41.68 | 0.790 |
| 70b+ scale | | | | | |
| Llama-3.1-70b | 8.275 | 8.250 | 6.970 | 46.50 | 0.875 |
| Qwen2.5-72b | 8.619 | 9.214 | 8.281 | 55.25 | 0.861 |
General Benchmarks
| Models | MMLU | KMMLU | HAE-RAE | HumanEval+ | MBPP+ | GSM8K | MATH |
|---|---|---|---|---|---|---|---|
| 27b+ scale | | | | | | | |
| Kanana-Flag-32.5b | 81.08 | 64.19 | 68.18 | 77.44 | 69.84 | 90.83 | 57.82 |
| Qwen2.5-32b | 84.40 | 59.37 | 48.30 | 82.32 | 71.96 | 95.30 | 81.90 |
| Gemma-2-27b | 78.01 | 49.98 | 46.02 | 70.12 | 70.90 | 91.05 | 53.80 |
| EXAONE-3.5-32b | 78.30 | 55.44 | 52.27 | 78.66 | 70.90 | 93.56 | 76.80 |
| Aya-Expanse-32b | 74.49 | 42.35 | 51.14 | 64.63 | 65.61 | 75.06 | 42.82 |
| 7b+ scale | | | | | | | |
| Kanana-Essence-9.8b | 70.64 | 50.76 | 47.16 | 72.56 | 69.05 | 84.91 | 42.24 |
| Llama-3.1-8b | 71.18 | 39.24 | 40.91 | 60.98 | 57.67 | 82.71 | 49.86 |
| Qwen2.5-7b | 77.23 | 46.87 | 37.50 | 73.78 | 70.63 | 91.58 | 75.22 |
| Gemma-2-9b | 73.47 | 44.47 | 39.77 | 59.76 | 64.55 | 87.72 | 48.10 |
| EXAONE-3.5-7.8b | 72.62 | 52.09 | 46.02 | 79.27 | 66.67 | 89.99 | 73.50 |
| Aya-Expanse-8b | 61.23 | 35.78 | 39.20 | 42.68 | 56.88 | 78.85 | 30.80 |
| 2b+ scale | | | | | | | |
| Kanana-Nano-2.1b | 52.48 | 38.51 | 33.52 | 63.41 | 62.43 | 72.32 | 29.26 |
| Llama-3.2-3b | 56.09 | 3.07 | 17.05 | 56.71 | 50.26 | 66.57 | 38.18 |
| Qwen2.5-3b | 69.18 | 38.33 | 32.39 | 67.68 | 64.02 | 84.00 | 65.72 |
| Gemma-2-2b | 57.69 | 6.99 | 7.95 | 35.37 | 45.24 | 49.81 | 21.68 |
| EXAONE-3.5-2.4b | 63.19 | 14.27 | 14.20 | 70.73 | 59.79 | 83.78 | 64.04 |
| 70b+ scale | | | | | | | |
| Llama-3.1-70b | 83.48 | 39.08 | 53.41 | 75.61 | 66.40 | 91.66 | 63.98 |
| Qwen2.5-72b | 87.14 | 65.78 | 60.80 | 81.10 | 75.66 | 95.45 | 82.60 |
Embedding Model Performance
| Backbone | Kanana-Nano-2.1b | Llama-3.2-3b | Qwen2.5-3b | Llama-3.2-1b | Qwen-2.5-1.5b |
|---|---|---|---|---|---|
| English | 51.56 | 53.28 | 54.00 | 48.77 | 50.60 |
| Korean | 65.00 | 59.43 | 62.10 | 54.68 | 54.60 |
| Avg. | 58.28 | 56.35 | 58.05 | 51.73 | 52.60 |
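The Avg. row appears to be the unweighted mean of the English and Korean rows; the model card does not state this, so treat it as an assumption. A minimal sketch that checks the assumption against the numbers in the table, allowing for two-decimal rounding (the helper name `mean_matches_reported` is illustrative only):

```python
# Scores copied from the embedding table: (English, Korean, reported Avg.)
scores = {
    "Kanana-Nano-2.1b": (51.56, 65.00, 58.28),
    "Llama-3.2-3b": (53.28, 59.43, 56.35),
    "Qwen2.5-3b": (54.00, 62.10, 58.05),
    "Llama-3.2-1b": (48.77, 54.68, 51.73),
    "Qwen-2.5-1.5b": (50.60, 54.60, 52.60),
}


def mean_matches_reported(english, korean, reported, tolerance=0.006):
    """Return True if the unweighted mean of the two scores agrees with the
    reported average up to two-decimal rounding."""
    return abs((english + korean) / 2 - reported) <= tolerance


# Assumption being checked: Avg. = (English + Korean) / 2
for backbone, (english, korean, reported) in scores.items():
    print(backbone, mean_matches_reported(english, korean, reported))
```

Under that assumption, every column in the table is consistent with its reported average.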
Quickstart
🤗 HuggingFace Transformers
transformers>=4.45.0 or the latest version is required to run the Kanana model.
pip install "transformers>=4.45.0"
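To confirm that an environment satisfies this requirement before loading the model, a check along the following lines may help. This is a minimal sketch using only the standard library; the helper names (`parse_version`, `meets_minimum`, `installed_transformers_ok`) are hypothetical, and the parser deliberately ignores pre-release suffixes such as `rc1` or `.dev0`, which a full PEP 440 comparison would treat differently.

```python
import re
from importlib import metadata

MINIMUM = (4, 45, 0)  # minimum transformers version stated in this model card


def parse_version(version):
    """Extract the leading (major, minor, patch) integers from a version string.

    Simplification: suffixes such as 'rc1' or '.dev0' are ignored.
    """
    parts = []
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version, minimum=MINIMUM):
    """Return True if the version string is at least the minimum tuple."""
    return parse_version(version) >= minimum


def installed_transformers_ok():
    """Return True/False for the installed package, or None if it is missing."""
    try:
        return meets_minimum(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return None


print(installed_transformers_ok())
```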
Example Usage for kanana-nano-2.1b-base
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "kakaocorp/kanana-nano-2.1b-base"

# Load the model in bfloat16 and move it to the GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to("cuda")

# Pad on the left and reuse the EOS token as the padding token.
tokenizer = AutoTokenizer.from_pretrained(model_name, padding_side="left")
tokenizer.pad_token = tokenizer.eos_token

# Korean prompt, roughly: "AI models that think and act like humans in this way ..."
prompt1 = "이처럼 인간처럼 생각하고 행동하는 AI 모델은 "
prompt2 = "Kakao is a leading company in South Korea, and it is known for "

# Tokenize both prompts as one padded batch.
input_ids = tokenizer(
    [prompt1, prompt2],
    padding=True,
    return_tensors="pt",
)["input_ids"].to("cuda")

_ = model.eval()
with torch.no_grad():
    # Greedy decoding, up to 32 new tokens per prompt.
    output = model.generate(
        input_ids,
        max_new_tokens=32,
        do_sample=False,
    )

decoded = tokenizer.batch_decode(output, skip_special_tokens=True)
for text in decoded:
    print(text)

# Output:
# 이처럼 인간처럼 생각하고 행동하는 AI 모델은 2020년대 중반에 등장할 것으로 예상된다. 2020년대 중반에 등장할 것으로 예상되는 AI 모델은 인간
# (roughly: "AI models that think and act like humans in this way are expected to appear in the mid-2020s. ...")
# Kakao is a leading company in South Korea, and it is known for 1) its innovative products and services, 2) its commitment to sustainability, and 3) its focus on customer experience. Kakao has been recognized as
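The example above loads the tokenizer with padding_side="left". A decoder-only model appends new tokens at the right end of each row, so padding on the left keeps every prompt's last real token next to its continuation. The following plain-Python sketch illustrates what left padding does to a batch; the function `left_pad` and the token ids are made up for illustration and are not part of the transformers API.

```python
def left_pad(batch, pad_id):
    """Left-pad lists of token ids to a common length.

    Returns (input_ids, attention_mask); the mask is 0 on padding positions
    and 1 on real tokens.
    """
    width = max(len(ids) for ids in batch)
    input_ids = [[pad_id] * (width - len(ids)) + ids for ids in batch]
    attention_mask = [[0] * (width - len(ids)) + [1] * len(ids) for ids in batch]
    return input_ids, attention_mask


# Hypothetical token ids for two prompts of different lengths.
batch = [[11, 12, 13, 14], [21, 22]]
input_ids, attention_mask = left_pad(batch, pad_id=0)

print(input_ids)       # [[11, 12, 13, 14], [0, 0, 21, 22]]
print(attention_mask)  # [[1, 1, 1, 1], [0, 0, 1, 1]]
```

In transformers, calling the tokenizer with padding=True returns a comparable mask under the attention_mask key alongside input_ids.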
License
The Kanana models are licensed under CC-BY-NC-4.0.
Citation
@misc{kananallmteam2025kananacomputeefficientbilinguallanguage,
title={Kanana: Compute-efficient Bilingual Language Models},
author={Kanana LLM Team and Yunju Bak and Hojin Lee and Minho Ryu and Jiyeon Ham and Seungjae Jung and Daniel Wontae Nam and Taegyeong Eo and Donghun Lee and Doohae Jung and Boseop Kim and Nayeon Kim and Jaesun Park and Hyunho Kim and Hyunwoong Ko and Changmin Lee and Kyoung-Woon On and Seulye Baeg and Junrae Cho and Sunghee Jung and Jieun Kang and EungGyun Kim and Eunhwa Kim and Byeongil Ko and Daniel Lee and Minchul Lee and Miok Lee and Shinbok Lee and Gaeun Seo},
year={2025},
eprint={2502.18934},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.18934},
}
Contributors
- Pre-training: Yunju Bak, Doohae Jung, Boseop Kim, Nayeon Kim, Hojin Lee, Jaesun Park, Minho Ryu
- Post-training: Jiyeon Ham, Seungjae Jung, Hyunho Kim, Hyunwoong Ko, Changmin Lee, Daniel Wontae Nam, Kyoung-Woon On
- Adaptation: Seulye Baeg, Junrae Cho, Taegyeong Eo, Sunghee Jung, Jieun Kang, EungGyun Kim, Eunhwa Kim, Byeongil Ko, Daniel Lee, Donghun Lee, Minchul Lee, Miok Lee, Shinbok Lee, Minho Ryu, Gaeun Seo
Contact
- Kanana LLM Team Technical Support: kanana-llm@kakaocorp.com
- Business & Partnership Contact: alpha.k@kakaocorp.com
Evaluation results
- acc on mmlu (5-shots), self-reported: 54.830
- exact_match on kmmlu-direct (5-shots), self-reported: 44.830
- acc_norm on haerae (5-shots), self-reported: 77.090
- exact_match_strict on gsm8k (5-shots), self-reported: 46.320
- pass@1 on humaneval (0-shots), self-reported: 31.100
- pass@1 on mbpp (3-shots), self-reported: 46.200