Model Card for GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA

Model Details

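This model is the ISNLP team's final submission to the [2025] Korean Culture Question Answering (Type A) task of the 2025 AI Korean Language Proficiency Evaluation Competition, hosted by the National Institute of Korean Language.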

Model Description

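The model was trained on the train dataset provided for the [2025] Korean Culture Question Answering (Type A) task of the 2025 AI Korean Language Proficiency Evaluation Competition.

Data link (currently not available for download): http://kli.korean.go.kr/taskOrdtm/taskList.do?taskOrdtmId=180&clCd=END_TASK&subMenuId=sub01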

For this task, the model builds on Midm-base with QLoRA and was trained using GRPO + Descriptive Answer Candidate.

The detailed training methodology and training code are available at https://github.com/KimGyunYeop/2025_MalPyeong_QA_ISNLP_RLVR_WTA.
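
As a rough illustration of the GRPO step mentioned above, the sketch below shows the group-relative advantage computation that GRPO is generally built on: several completions are sampled for one question, each is scored, and the scores are normalized within the group. This is not the authors' training code (see the repository for that). The `exact_match_reward` function, the use of the population standard deviation, the `eps` value, and the toy strings are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def exact_match_reward(completion: str, gold: str) -> float:
    """Hypothetical verifiable reward: 1.0 if the text after the last
    '답변:' ("answer:") label equals the gold answer, else 0.0."""
    answer = completion.rsplit("답변:", 1)[-1].strip()
    return 1.0 if answer == gold.strip() else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps).
    Population std is an assumption; implementations differ on this detail."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy group of four sampled completions for a single question (made-up data).
gold = "정답"
completions = [
    "생각과정: ... 답변: 정답",
    "생각과정: ... 답변: 오답",
    "생각과정: ... 답변: 오답",
    "생각과정: ... 답변: 정답",
]
rewards = [exact_match_reward(c, gold) for c in completions]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

Completions that score above the group mean receive a positive advantage and those below receive a negative one; when every completion in a group gets the same reward, all advantages are zero and the group contributes no learning signal.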

Model Usage

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig
from peft import LoraConfig, PeftModelForCausalLM
import torch

# Read the base model name from the adapter's LoRA config.
adapter_path = "GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA"
lora_config = LoraConfig.from_pretrained(adapter_path)
model_name = lora_config.base_model_name_or_path
tokenizer = AutoTokenizer.from_pretrained(model_name)
generation_config = GenerationConfig.from_pretrained(model_name)

# 4-bit quantization settings for the base model.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit weights
    bnb_4bit_quant_type="nf4",              # NormalFloat 4
    bnb_4bit_use_double_quant=True,         # double quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # Ada, Hopper, MI300, etc.
    llm_int8_skip_modules=["lm_head"],      # keep the output layer unquantized (FP16)
)

base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="cuda:0",
    trust_remote_code=True,
    quantization_config=bnb_config,
)

# Load the default adapter, then the adapter stored in the "selection" subfolder.
model = PeftModelForCausalLM.from_pretrained(base_model, adapter_path)
model.load_adapter(adapter_path, subfolder="selection", adapter_name="selection")

# System prompt, kept in Korean because it is the model's input. It asks the model to
# think carefully, write its rationale after "생각 과정:" and then give the final answer.
prompt = '단답형 문제에서 정답을 맞추기위해 반드시 충분히 생각해보고 "생각 과정:" 이후에 정답 근거 및 생각과정을 작성 한 다음 이후 최종 답변을 생성할 것 (문제 유형: 단답형 생각과정: {생각과정} 답변: {최종답변} 형태)'
# Sample short-answer question: which venue, opened in 2005, is Seoul's only
# non-profit private cinematheque theater?
question = '문제 유형: 단답형 \n 질문: 2005년에 개관하였으며, 교육 및 문화적 목적으로 영화를 상영하는 서울의 유일한 비영리 민간 시네마테크 전용관은 어디인가요?'
inputs = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": prompt},
        {"role": "user", "content": question}
    ],
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
)

answer = model.generate(
    inputs.to("cuda:0"),
    generation_config=generation_config,
    max_new_tokens=1024,
    do_sample=True
)

# Decode only the newly generated tokens, skipping the prompt.
answer_text = tokenizer.decode(answer[0][inputs.shape[-1]:], skip_special_tokens=True)
print(answer_text)
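
The system prompt above asks the model to answer in the form "생각과정: {reasoning} 답변: {final answer}". A minimal sketch of how the decoded text could be split into those two parts is shown below. The helper name `split_reasoning_and_answer` and the fallback behavior (returning the whole text as the answer when the labels are missing) are assumptions for illustration, not part of the original submission.

```python
import re

# Matches "생각과정:" or "생각 과정:" followed later by "답변:".
# The reasoning group is non-greedy, so it stops at the first "답변:" label.
_OUTPUT_PATTERN = re.compile(
    r"생각\s*과정\s*:\s*(?P<reasoning>.*?)\s*답변\s*:\s*(?P<answer>.*)",
    re.DOTALL,
)


def split_reasoning_and_answer(text: str):
    """Return (reasoning, answer) parsed from a generated string.

    If the expected labels are not found, reasoning is None and the
    stripped text is returned as the answer (an assumed fallback)."""
    match = _OUTPUT_PATTERN.search(text)
    if match is None:
        return None, text.strip()
    return match.group("reasoning").strip(), match.group("answer").strip()


# Toy string standing in for a decoded generation (made-up content).
sample_output = "문제 유형: 단답형 생각과정: 가나다 답변: 라마"
reasoning, final_answer = split_reasoning_and_answer(sample_output)
print(reasoning)
print(final_answer)
```

In the usage code above, the same helper would be called on `answer_text` after decoding.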

Framework versions

  • PEFT 0.16.0