KoLlama-3.1-8B-Instruct QLoRA Adapter (SFT v0)

This model is a LoRA adapter obtained by fine-tuning meta-llama/Meta-Llama-3.1-8B-Instruct on Korean instruction data with the QLoRA method.

Model Information

  • Base Model: meta-llama/Meta-Llama-3.1-8B-Instruct
  • Training Method: QLoRA (4-bit quantization + LoRA)
  • Language: Korean (ko), English (en)
  • License: Llama 3.1 License

Training Dataset

This model was trained on a combination of the Korean instruction datasets listed below:

  1. MarkrAI/KOpen-HQ-Hermes-2.5-60K: a high-quality Korean instruction-following dataset
    • Created by: Markr AI (Seungyoo Lee, Kyujin Han)
    • License: MIT License

Dataset Acknowledgements

Sincere thanks to the Markr AI team (Seungyoo Lee, Kyujin Han) for releasing a high-quality Korean dataset. Open datasets like this one are driving research and development on Korean language models.

Training Configuration

LoRA Parameters

  • r (rank): 16
  • lora_alpha: 32
  • lora_dropout: 0.05
  • target_modules: q_proj, k_proj, v_proj, o_proj
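
To give a feel for how small the adapter is, the sketch below counts the trainable parameters these settings imply. The layer shapes (hidden size 4096, 32 decoder layers, 1024-wide key/value projections) are the published Llama 3.1 8B dimensions and are an assumption here, since the card does not list them; only r, lora_alpha and the target modules come from the card.

```python
# Rough count of trainable LoRA parameters for the settings listed above.
# ASSUMPTION: the shapes below are the published Llama 3.1 8B dimensions;
# they are not stated in this model card.
HIDDEN_SIZE = 4096   # model width
NUM_LAYERS = 32      # decoder layers
KV_DIM = 1024        # 8 key/value heads x 128 head dim (grouped-query attention)

R = 16               # LoRA rank, from the card
LORA_ALPHA = 32      # lora_alpha, from the card

# (in_features, out_features) of each targeted projection
TARGET_SHAPES = {
    "q_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
    "k_proj": (HIDDEN_SIZE, KV_DIM),
    "v_proj": (HIDDEN_SIZE, KV_DIM),
    "o_proj": (HIDDEN_SIZE, HIDDEN_SIZE),
}


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """LoRA adds A (r x in) and B (out x r) next to each frozen weight."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in TARGET_SHAPES.values())
total_trainable = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / R  # factor applied to the low-rank update

print(f"trainable LoRA parameters: {total_trainable:,}")
print(f"scaling (lora_alpha / r): {scaling}")
```

Under these assumptions the adapter has 13,631,488 trainable parameters, a small fraction of the roughly 8 billion frozen base weights, and the low-rank update is scaled by 2.0.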

Training Hyperparameters

  • Quantization: 4-bit (NF4)
  • Compute dtype: bfloat16
  • Total steps: 1,800
  • Epochs: ~0.96
  • Best validation loss: 1.0551

Prompt Template

아래는 문제를 설명하는 지시사항입니다. 이 요청에 대해 적절하게 답변해주세요.
###지시사항: {instruction}
###답변:

When an input is provided:

아래는 문제를 설명하는 지시사항과, 구체적인 답변의 방식을 요구하는 입력이 함께 있는 문장입니다. 이 요청에 대해 적절하게 답변해주세요.
###입력:{input}
###지시사항:{instruction}
###답변:
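
The two templates can be filled in programmatically. The following is a minimal sketch, not code from the training pipeline: the helper name build_prompt and its input_text parameter are hypothetical, and the template strings are the Korean text of the templates above, kept verbatim because the adapter was tuned on that wording. The Korean preamble asks the model to respond appropriately to the instruction (and, in the second form, to the accompanying input).

```python
from typing import Optional

# Templates as given in the card; {instruction} and {input} are the only slots.
PROMPT_NO_INPUT = (
    "아래는 문제를 설명하는 지시사항입니다. 이 요청에 대해 적절하게 답변해주세요.\n"
    "###지시사항: {instruction}\n"
    "###답변:"
)
PROMPT_WITH_INPUT = (
    "아래는 문제를 설명하는 지시사항과, 구체적인 답변의 방식을 요구하는 입력이 "
    "함께 있는 문장입니다. 이 요청에 대해 적절하게 답변해주세요.\n"
    "###입력:{input}\n"
    "###지시사항:{instruction}\n"
    "###답변:"
)


def build_prompt(instruction: str, input_text: Optional[str] = None) -> str:
    """Hypothetical helper: pick the template by whether an input is given."""
    if input_text:
        return PROMPT_WITH_INPUT.format(instruction=instruction, input=input_text)
    return PROMPT_NO_INPUT.format(instruction=instruction)


if __name__ == "__main__":
    print(build_prompt("인공지능의 장점과 단점을 설명해주세요."))
```

The spacing after each colon follows the templates as written, so the two forms differ slightly ("###지시사항: " versus "###지시사항:").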

Usage

Using the PEFT library

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model
base_model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load the LoRA adapter
model = PeftModel.from_pretrained(
    model,
    "jiwon9703/KoLlama-3.1-8B-Instruct-qlora-sft-v0",
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Inference
prompt = "아래는 문제를 설명하는 지시사항입니다. 이 요청에 대해 적절하게 답변해주세요.\n###지시사항: 인공지능의 장점과 단점을 설명해주세요.\n###답변:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
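
For a decoder-only model, generate returns the prompt tokens followed by the new tokens, so the decoded response above starts with the prompt itself. One way to keep only the answer is to split on the answer marker from the prompt template. This is a small sketch; the helper name extract_answer is hypothetical and not part of any library.

```python
ANSWER_MARKER = "###답변:"  # answer marker from the prompt template


def extract_answer(decoded: str, marker: str = ANSWER_MARKER) -> str:
    """Return the text after the first answer marker.

    If the marker is missing, fall back to the whole decoded string.
    """
    _, found, tail = decoded.partition(marker)
    return tail.strip() if found else decoded.strip()


if __name__ == "__main__":
    # Stand-in for the decoded model output; the answer text is made up.
    demo = "###지시사항: 1+1은?\n###답변: 2입니다."
    print(extract_answer(demo))
```

An alternative that avoids string matching is to slice off the prompt tokens before decoding, for example by decoding outputs[0][inputs["input_ids"].shape[1]:].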

Using 4-bit quantization (saves memory)

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# 4-bit quantization settings
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the base model (4-bit)
base_model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# Load the LoRA adapter
model = PeftModel.from_pretrained(
    model,
    "jiwon9703/KoLlama-3.1-8B-Instruct-qlora-sft-v0",
)

tokenizer = AutoTokenizer.from_pretrained(base_model_name)
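
As a back-of-the-envelope illustration of the memory saving, the sketch below compares the storage needed for the base weights at 16 and 4 bits per parameter. The parameter count of about 8.03 billion is an assumption not stated in this card, and the figures are lower bounds: in practice some layers stay unquantized, and quantization constants, activations and the KV cache add to the total.

```python
# ASSUMPTION: ~8.03 billion parameters for Llama 3.1 8B; not stated in the card.
NUM_PARAMS = 8.03e9


def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Storage for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


bf16_gb = weight_memory_gb(NUM_PARAMS, 16)  # bfloat16 loading, as in the first example
nf4_gb = weight_memory_gb(NUM_PARAMS, 4)    # 4-bit NF4 loading, as in this example

print(f"bfloat16 weights: ~{bf16_gb:.1f} GB")
print(f"4-bit NF4 weights: ~{nf4_gb:.1f} GB")
```

Under this assumption the weights shrink from roughly 16 GB to roughly 4 GB, a 4x reduction before the overheads mentioned above.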

Performance

  • Best Checkpoint: checkpoint-1800
  • Validation Loss: 1.0551

Limitations

  • This model was created for research and educational purposes.
  • Its performance may be limited by the quality and quantity of the training data.
  • It is subject to the license terms of the base model, Llama 3.1.

Citation

If you use this model, please cite it as follows:

@misc{kollama31-8b-qlora-sft-v0,
  title={KoLlama-3.1-8B-Instruct QLoRA Adapter (SFT v0)},
  author={jiwon9703},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/jiwon9703/KoLlama-3.1-8B-Instruct-qlora-sft-v0}}
}

Dataset Citation

@dataset{kopen_hq_hermes_2.5_60k,
  title={KOpen-HQ-Hermes-2.5-60K},
  author={Lee, Seungyoo and Han, Kyujin},
  organization={Markr AI},
  year={2024},
  url={https://huggingface.co/datasets/MarkrAI/KOpen-HQ-Hermes-2.5-60K},
  license={MIT}
}

License

This model is governed by Meta's Llama 3.1 Community License Agreement. See the Llama 3.1 License for details.
