GyunYeop committed on
Commit 1e46302 · verified · 1 Parent(s): 9b60ad2

Update README.md

Files changed (1)
  1. README.md +67 -4
README.md CHANGED
@@ -16,12 +16,75 @@ tags:

  ## Model Details

- ### Model Description
-
- <!-- Provide a longer summary of what this model is. -->



  - **Developed by:** [More Information Needed]
  - **Funded by [optional]:** [More Information Needed]
  - **Shared by [optional]:** [More Information Needed]
@@ -204,4 +267,4 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
  [More Information Needed]
  ### Framework versions

- - PEFT 0.16.0
 
  ## Model Details

+ This model is the ISNLP team's final submission to the [2025] Korean Culture Question Answering (Type A) task of the 2025 AI Korean Language Proficiency Evaluation Competition, hosted by the National Institute of Korean Language.
 
+ ## Model Description

+ <!-- Provide a longer summary of what this model is. -->
 
+ The model was trained on the train dataset provided for the [2025] Korean Culture Question Answering (Type A) task of the 2025 AI Korean Language Proficiency Evaluation Competition.
+
+ Data link (currently not downloadable): http://kli.korean.go.kr/taskOrdtm/taskList.do?taskOrdtmId=180&clCd=END_TASK&subMenuId=sub01
+
+ For this task, training started from a Midm-base + QLoRA model and used GRPO + Descriptive Answer Candidate.
+
+ See https://github.com/KimGyunYeop/2025_MalPyeong_QA_ISNLP_RLVR_WTA for the detailed training methodology and training code.
+
+ ## Model Usage
+
+ ```
+ from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig
+ from peft import LoraConfig, PeftModelForCausalLM
+ import torch
+
+ # Read the base model name from the adapter's LoRA config.
+ adapter_path = "GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA"
+ lora_config = LoraConfig.from_pretrained(adapter_path)
+ model_name = lora_config.base_model_name_or_path
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ generation_config = GenerationConfig.from_pretrained(model_name)
+
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,                      # 4-bit weights
+     bnb_4bit_quant_type="nf4",              # NormalFloat 4
+     bnb_4bit_use_double_quant=True,         # double quantization
+     bnb_4bit_compute_dtype=torch.bfloat16,  # Ada, Hopper, MI300, etc.
+     llm_int8_skip_modules=["lm_head"],      # output layer is kept in FP16
+ )
+
+ base_model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     device_map="cuda:0",
+     trust_remote_code=True,
+     quantization_config=bnb_config,
+ )
+
+ # Attach the default adapter, then load the second adapter stored in the "selection" subfolder.
+ model = PeftModelForCausalLM.from_pretrained(base_model, adapter_path)
+ model.load_adapter(adapter_path, subfolder="selection", adapter_name="selection")
+
+ # System prompt (kept in Korean as model input): think it through, write the reasoning after "생각 과정:", then give the final answer.
+ prompt = '단답형 문제에서 정답을 맞추기위해 반드시 충분히 생각해보고 "생각 과정:" 이후에 정답 근거 및 생각과정을 작성 한 다음 이후 최종 답변을 생성할 것 (문제 유형: 단답형 생각과정: {생각과정} 답변: {최종답변} 형태)'
+ # Question (Korean): which venue, opened in 2005, is Seoul's only non-profit private cinematheque theater screening films for educational and cultural purposes?
+ question = '문제 유형: 단답형 \n 질문: 2005년에 개관하였으며, 교육 및 문화적 목적으로 영화를 상영하는 서울의 유일한 비영리 민간 시네마테크 전용관은 어디인가요?'
+ inputs = tokenizer.apply_chat_template(
+     [
+         {"role": "system", "content": prompt},
+         {"role": "user", "content": question},
+     ],
+     tokenize=True,
+     add_generation_prompt=True,
+     return_tensors="pt",
+ )
+
+ answer = model.generate(
+     inputs.to("cuda:0"),
+     generation_config=generation_config,
+     max_new_tokens=1024,
+     do_sample=True,
+ )
+
+ # Decode only the newly generated tokens, skipping the prompt.
+ answer_text = tokenizer.decode(answer[0][inputs.shape[-1]:], skip_special_tokens=True)
+ print(answer_text)
+
+ ```
+
+ <!--
  - **Developed by:** [More Information Needed]
  - **Funded by [optional]:** [More Information Needed]
  - **Shared by [optional]:** [More Information Needed]

  [More Information Needed]
  ### Framework versions

+ - PEFT 0.16.0 -->
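
The system prompt in the usage snippet asks the model to reply in the form `생각과정: {reasoning} 답변: {final answer}`. A minimal sketch of how the decoded `answer_text` might be split into its two parts follows. The helper name `split_reasoning_and_answer` is hypothetical and not part of the repository, and the sketch assumes the model actually follows the requested format; when the answer marker is missing, it falls back to treating the whole completion as the answer.

```python
import re
from typing import Tuple


def split_reasoning_and_answer(text: str) -> Tuple[str, str]:
    """Split 'X 생각과정: R 답변: A' into (R, A). Hypothetical helper, not from the repository."""
    # Split on the last "답변:" so an earlier mention inside the reasoning
    # is not mistaken for the final answer.
    head, marker, tail = text.rpartition("답변:")
    if not marker:
        # Assumption: with no answer marker, treat the whole completion as the answer.
        return "", text.strip()
    # Drop everything up to the reasoning marker; the prompt spells it both
    # as "생각 과정:" and as "생각과정:".
    reasoning = re.sub(r"^.*?생각\s*과정\s*:", "", head, count=1, flags=re.S).strip()
    return reasoning, tail.strip()


# Made-up completion used only to illustrate the format; not real model output.
sample = "문제 유형: 단답형 생각과정: (reasoning text) 답변: (final answer)"
reasoning, final_answer = split_reasoning_and_answer(sample)
print(final_answer)  # prints "(final answer)"
```

In practice one would pass `answer_text` from the snippet instead of `sample`. Splitting on the last marker is a design choice for this sketch; a stricter parser could reject completions that lack either marker.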