DACON SKKU Bias VLM - v8B (robustness-aware GRPO)
Submission model for the 2026 Sungkyunkwan University Multimodal AI Bias Challenge (236722). A GRPO-only (no cold-start SFT, no reasoning expansion), single-token (0/1/2) multimodal bias-mitigation VLM. Base: Qwen3-VL-8B-Instruct (Apache-2.0).
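One-line summary
A version that recovers the option-order robustness (option-shuffle consistency) that v8A lost while raising OOD accuracy, and that still maintains OOD accuracy.
- Trained with shuffle-consistency and source-normalized rewards.
Evaluation (held-out public v8_eval, 900 items; DACON evaluation set not used)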
| Metric | v4 | v6 | v7 | v8A | v8B |
|---|---|---|---|---|---|
| BBQ amb / dis acc | 1.0/1.0 | 1.0/1.0 | 0.707/1.0 | 1.0/1.0 | 1.0/1.0 |
| OOD acc | 0.8033 | 0.8083 | 0.6733 | 0.8117 | 0.8067 |
| option-shuffle consistency | 0.9478 | 0.9456 | 0.7944 | 0.9411 | 0.9456 (recovered) |
| unknown-position consistency | 1.0 | 1.0 | - | 1.0 | 1.0 |
| amb person-error / over-abstain | 0/0 | 0/0 | 0.293/0 | 0/0 | 0/0 |
| format validity | 0.9556 | 0.9533 | - | 0.9733 | 0.9544 |
| Classification | baseline | stable | FAIL (collapse) | OOD↑ · shuffle↓ | PASS |
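- v7 (cold-start SFT + reasoning GRPO): collapsed across the board because of overfitting to the BBQ domain and forgetting of Korean (a negative result). In ablations, the cause of the collapse was the cold-start SFT, consistent with Chu 2025, "SFT Memorizes, RL Generalizes". v8 avoids this by removing the cold start.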
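The option-shuffle consistency row in the table above is not defined in detail here. As a minimal sketch, assuming it is the fraction of items for which the model picks the same option text before and after the options are reordered, it could be computed as follows. The names `shuffle_consistency`, `predict`, and `permute` are hypothetical and are not part of the released code.

```python
import random
from typing import Callable, List, Optional, Sequence


def shuffle_consistency(
    items: Sequence[Sequence[str]],
    predict: Callable[[Sequence[str]], int],
    permute: Optional[Callable[[int], List[int]]] = None,
    seed: int = 0,
) -> float:
    """Fraction of items whose chosen option TEXT survives a reordering.

    Assumption: `predict` maps a list of options to the chosen index
    (0/1/2 for three options). `permute` returns a new option order;
    by default a seeded random shuffle is used.
    """
    rng = random.Random(seed)
    if permute is None:
        permute = lambda n: rng.sample(range(n), n)

    consistent = 0
    for options in items:
        # Choice on the original option order, stored as text, not index.
        original_choice = options[predict(options)]
        # Reorder the options and ask again.
        order = permute(len(options))
        reordered = [options[i] for i in order]
        reordered_choice = reordered[predict(reordered)]
        consistent += int(original_choice == reordered_choice)
    return consistent / len(items)
```

A predictor that follows option content scores 1.0 under this definition, while one that always answers with a fixed position is penalized whenever the reordering moves that option.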
Training method
- GRPO-only, LoRA rank 16 (attention-only, MLP off, vision frozen), single-token output, lr 1e-6, 200 steps, num_gen 8.
- Six rewards: answer / shuffle_consistency / abstain / source_normalized / format / length.
- Dynamic sampling (offline approximation) + answer-option (original + shuffled) data augmentation.
- Data: public BBQ (Elfsong/BBQ) + general reasoning (SIQA/CSQA/OBQA/ARC). The DACON evaluation set and test set were not used.
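To make the GRPO and dynamic-sampling items above concrete, here is a minimal sketch of the group-relative advantage for one prompt with `num_gen` completions. It rests on two assumptions that the card does not confirm: that "dynamic sampling" means dropping groups whose completions all receive the same reward (so they carry no learning signal), and that rewards are normalized by the population standard deviation. The function name `group_advantages` is hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Optional


def group_advantages(rewards: List[float], eps: float = 1e-6) -> Optional[List[float]]:
    """GRPO-style advantages for one prompt's group of completions.

    Returns None when every completion got the same reward; under the
    assumed dynamic-sampling rule such a group would be skipped.
    """
    spread = pstdev(rewards)  # population std (an assumption)
    if spread == 0.0:
        return None
    center = mean(rewards)
    # Each completion is scored relative to its own group.
    return [(r - center) / (spread + eps) for r in rewards]


# Example with num_gen = 8: one correct completion out of eight.
advantages = group_advantages([1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```

In this example the single rewarded completion gets a positive advantage and the other seven get a small negative one; a group of eight identical rewards returns `None`.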
Usage
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText
# Load the weights in bfloat16 and let device_map="auto" place them on the available devices.
m = AutoModelForImageTextToText.from_pretrained("psh3333/dacon-skku-bias-vlm-v8b", torch_dtype=torch.bfloat16, device_map="auto")
proc = AutoProcessor.from_pretrained("psh3333/dacon-skku-bias-vlm-v8b")
# system + user (context / question / 3 options + image) -> single token 0/1/2 (parsed from the model-generated text)
The label is parsed from the model-generated text (no rule-based selection). No external API is used for inference. Compatible with the reference environment, torch 2.6.
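The parsing step is not shown in this card. A minimal sketch, assuming the label is the first standalone digit 0, 1, or 2 in the decoded output and that unparseable outputs are reported as `None`, might look like this. The helper name `parse_label` and the `None` fallback are assumptions, not the author's code.

```python
import re
from typing import Optional

# A standalone 0, 1, or 2: not part of a longer number such as "10".
_LABEL = re.compile(r"(?<!\d)[012](?!\d)")


def parse_label(generated_text: str) -> Optional[int]:
    """Extract the 0/1/2 label from decoded model output, or None if absent."""
    match = _LABEL.search(generated_text)
    return int(match.group()) if match else None
```

Because the function only reads what the model generated, the final answer remains the model's own text rather than a rule-based choice among the options.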
Rule compliance
DACON evaluation set not used · evaluation-set pattern mining: 0 · rule-based answer selection: 0 · final answer = model text · base Apache-2.0.