Instructions for using GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA with libraries and local apps.

Libraries

How to use GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("K-intelligence/Midm-2.0-Base-Instruct")
model = PeftModel.from_pretrained(base_model, "GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA")

Transformers

How to use GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA")

# Load model directly (requires peft, since this repository contains only a LoRA adapter)
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA", dtype="auto")

Local Apps

vLLM

How to use GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
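
The same request can be sent from Python. The sketch below mirrors the curl call using only the standard library. The helper names `build_completion_request` and `complete` are illustrative, not part of vLLM, and the code assumes a server is already listening at the given base URL (port 8000 for the vLLM command above).

```python
import json
import urllib.request

MODEL_ID = "GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model=MODEL_ID, max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the first completion's text (needs a running server)."""
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

Passing `base_url="http://localhost:30000"` targets a server on port 30000 instead, which is the port used in the SGLang commands on this page.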

Use Docker

docker model run hf.co/GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA

SGLang

How to use GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA with Docker Model Runner:
```
docker model run hf.co/GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA
```

Model Card for GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA

Model Details

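This model is team ISNLP's final submission to the [2025] Korean Culture Question Answering (Type A, "가 유형") task of the 2025 AI Korean Language Proficiency Evaluation Competition, hosted by the National Institute of Korean Language (국립국어원).

Model Description

The model was trained on the train dataset provided for the [2025] Korean Culture Question Answering (Type A) task of the same competition.

data link (currently unavailable for download): http://kli.korean.go.kr/taskOrdtm/taskList.do?taskOrdtmId=180&clCd=END_TASK&subMenuId=sub01

For this task, a Midm-base + QLoRA model was used as the starting point and trained with GRPO + Descriptive Answer Candidate.

The detailed training methodology and the training code are available at https://github.com/KimGyunYeop/2025_MalPyeong_QA_ISNLP_RLVR_WTA.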
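
As background for the GRPO step mentioned above: GRPO samples a group of completions for each prompt, scores each one, and normalizes the rewards within that group to obtain advantages. The sketch below shows only that normalization, paired with a simple exact-match reward. `exact_match_reward`, the use of the population standard deviation, and the sample strings are assumptions made for illustration. They are not the team's reward function or training code, which are documented in the linked repository.

```python
from statistics import mean, pstdev


def exact_match_reward(prediction, gold):
    """Hypothetical verifiable reward: 1.0 if the answer matches exactly, else 0.0."""
    return 1.0 if prediction.strip() == gold.strip() else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one sampled group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers for one question (placeholder strings, not real data).
gold = "answer"
samples = ["answer", "other", "answer ", "wrong"]
rewards = [exact_match_reward(s, gold) for s in samples]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group mean receive positive advantages and the rest receive negative ones, so no separate value model is needed to provide a baseline.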

Model Usage

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, GenerationConfig
from peft import LoraConfig, PeftModelForCausalLM
import torch
    
adapter_path = "GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA"
lora_config = LoraConfig.from_pretrained(adapter_path)
model_name = lora_config.base_model_name_or_path
tokenizer = AutoTokenizer.from_pretrained(model_name)
generation_config= GenerationConfig.from_pretrained(model_name)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # 4-bit weights
    bnb_4bit_quant_type="nf4",               # NormalFloat4 (NF4)
    bnb_4bit_use_double_quant=True,          # double quantization
    bnb_4bit_compute_dtype=torch.bfloat16,   # bfloat16 compute (Ada, Hopper, MI300, etc.)
    llm_int8_skip_modules=["lm_head"],       # keep the output layer out of 4-bit quantization
)
    
base_model = AutoModelForCausalLM.from_pretrained(model_name, 
                                                device_map="cuda:0",
                                                trust_remote_code=True,
                                                quantization_config=bnb_config,
                                                )

model = PeftModelForCausalLM.from_pretrained(base_model, adapter_path)
model.load_adapter(adapter_path, subfolder="selection", adapter_name="selection")

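# System prompt (Korean, kept verbatim). In English: for short-answer questions, think the problem
# through, write the grounds and reasoning after "생각 과정:", then give the final answer,
# in the form "문제 유형: 단답형 생각과정: {reasoning} 답변: {final answer}".
prompt = '단답형 문제에서 정답을 맞추기위해 반드시 충분히 생각해보고 "생각 과정:" 이후에 정답 근거 및 생각과정을 작성 한 다음 이후 최종 답변을 생성할 것 (문제 유형: 단답형 생각과정: {생각과정} 답변: {최종답변} 형태)'
# Example question (Korean). In English: which venue, opened in 2005, is Seoul's only non-profit
# private dedicated cinematheque that screens films for educational and cultural purposes?
question = '문제 유형: 단답형 \n 질문: 2005년에 개관하였으며, 교육 및 문화적 목적으로 영화를 상영하는 서울의 유일한 비영리 민간 시네마테크 전용관은 어디인가요?'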
inputs = tokenizer.apply_chat_template(
    [
        {"role": "system", "content": prompt},
        {"role": "user", "content": question}
    ],
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt"
)

answer = model.generate(
    inputs.to("cuda:0"),
    generation_config=generation_config,
    max_new_tokens=1024,
    do_sample=True
)

answer_text = tokenizer.decode(answer[0][inputs.shape[-1]:], skip_special_tokens=True)
print(answer_text)
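
The system prompt asks the model to answer in the form `생각과정: {...} 답변: {...}`. A minimal sketch for separating the reasoning from the final answer is shown below. `split_reasoning_and_answer` is a hypothetical helper, not part of the released code, and it assumes the model actually follows that format. When it does not, the whole text is returned as the answer.

```python
import re

# Matches "생각과정:" or "생각 과정:", the reasoning text, then "답변:" and the final answer.
_PATTERN = re.compile(
    r"생각\s*과정\s*:\s*(?P<reasoning>.*?)\s*답변\s*:\s*(?P<answer>.*)",
    re.DOTALL,
)


def split_reasoning_and_answer(text):
    """Return (reasoning, answer); reasoning is None if the format is not followed."""
    match = _PATTERN.search(text)
    if match is None:
        return None, text.strip()
    return match.group("reasoning").strip(), match.group("answer").strip()
```

With the usage code above, this could be called as `reasoning, final_answer = split_reasoning_and_answer(answer_text)`.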


Framework versions

PEFT 0.16.0


Model tree for GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA

Base model

K-intelligence/Midm-2.0-Base-Instruct


Paper for GyunYeop/midm-base-GRPO-lora-tuning-KoreanCultureQA

Quantifying the Carbon Emissions of Machine Learning (arXiv:1910.09700, published Oct 21, 2019)