Delete README.md with huggingface_hub
README.md
DELETED
@@ -1,33 +0,0 @@
----
-license: apache-2.0
-base_model: Qwen/Qwen3-VL-8B-Instruct
-tags: [multimodal, bias, grpo, vlm, dacon, bbq, negative-result]
----
-
-# DACON SKKU Bias VLM - v7 (cold-start SFT + reasoning GRPO) - NEGATIVE RESULT
-
-Experimental model for the 2026 Sungkyunkwan University Multimodal AI Bias Challenge (236722). Base: Qwen3-VL-8B-Instruct (Apache-2.0).
-**v7 is a deliberately preserved failed (negative-result) version.** It attempted cold-start SFT + reasoning GRPO, but
-overfitting and forgetting of Korean caused OOD generalization to collapse.
-
-## Configuration
-- Cold-start SFT (BBQ 100%) + reasoning-label template + reasoning GRPO (difficulty mix), LoRA rank 64.
-
-## Evaluation (held-out public v8_eval, 900 items; DACON evaluation set not used)
-| Metric | v4 (baseline) | **v7** |
-|---|---|---|
-| BBQ ambiguous acc | 1.0 | **0.7067** ⬇ |
-| BBQ disambiguated acc | 1.0 | 1.0 |
-| OOD acc | 0.8033 | **0.6733** ⬇ |
-| option-shuffle consistency | 0.9478 | **0.7944** ⬇ |
-| ambiguous person-selection error | 0.0 | **0.2933** ⬆ |
-| bias s_AMB | 0.0 | **-0.0667** (reverse bias) |
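
The three ambiguous-context rows in the table are arithmetically linked, and a short sketch may make that easier to see. The README does not say which definition of s_AMB it uses; the code below assumes the one from the original BBQ paper, where s_DIS = 2 * n_biased / n_non_unknown - 1 and s_AMB = (1 - accuracy) * s_DIS. The function name and the counts (212 "unknown" answers, 34 stereotype-aligned, 54 counter-stereotype, out of a hypothetical 300 ambiguous items) are illustrative assumptions chosen to be consistent with the reported v7 values, not figures taken from the repo.

```python
def bbq_ambiguous_scores(n_unknown, n_biased, n_counter):
    """Return (accuracy, person-selection error, s_AMB) for ambiguous items.

    In ambiguous BBQ contexts the correct answer is "unknown", so every
    answer that names a person counts as an error.
    Assumes the BBQ-paper definition of the bias score (see lead-in).
    """
    n_total = n_unknown + n_biased + n_counter
    n_person = n_biased + n_counter
    acc = n_unknown / n_total
    person_selection_error = n_person / n_total
    # s_DIS lies in [-1, 1]; negative means counter-stereotype answers dominate.
    s_dis = 2 * n_biased / n_person - 1 if n_person else 0.0
    s_amb = (1 - acc) * s_dis
    return acc, person_selection_error, s_amb


# Hypothetical counts, chosen only to match the v7 column of the table.
acc, err, s_amb = bbq_ambiguous_scores(n_unknown=212, n_biased=34, n_counter=54)
print(round(acc, 4), round(err, 4), round(s_amb, 4))  # 0.7067 0.2933 -0.0667
```

Under this definition s_AMB reduces to (n_biased - n_counter) / n_total, so a negative value means the person-selection errors lean against the stereotype, which matches the "reverse bias" label in the table.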
-
-(On a separate OOD-only set of 800 items, OOD accuracy fell as low as 0.416.)
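
The README does not define "option-shuffle consistency" either. One plausible reading, sketched below, is the fraction of questions for which the model picks the same underlying option regardless of the order in which the options are shown. The function names and the `runs` data layout are hypothetical and only illustrate the idea.

```python
def to_original_index(chosen_position, permutation):
    """Map a chosen display position back to the original option index.

    permutation[i] is the original index of the option shown at position i.
    """
    return permutation[chosen_position]


def option_shuffle_consistency(runs):
    """runs maps item_id -> list of (chosen_position, permutation) trials.

    An item is consistent when every trial resolves to the same original option.
    """
    n_consistent = 0
    for trials in runs.values():
        originals = {to_original_index(pos, perm) for pos, perm in trials}
        n_consistent += len(originals) == 1
    return n_consistent / len(runs)


# Toy data (hypothetical): two questions, two option orders each.
runs = {
    "q1": [(0, [0, 1, 2]), (2, [1, 2, 0])],  # original option 0 both times
    "q2": [(1, [0, 1, 2]), (1, [2, 0, 1])],  # option 1, then option 0
}
print(option_shuffle_consistency(runs))  # 0.5
```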
-
-## Lessons
-- Ablation: with cold-start SFT alone, OOD was already 0.42 before GRPO, so **the collapse is caused by cold-start SFT** (GRPO is not to blame).
-- Consistent with the prediction of Chu et al. 2025, "SFT Memorizes, RL Generalizes": "the more a model specializes and memorizes, the more generalization it loses."
-- The follow-up v8A/v8B **remove the cold start** (GRPO-only) and retain generalization, which is the right remedy.
-
-The repo is published for reproduction and comparison (negative-result) purposes. The submission candidate is v8B (`psh3333/dacon-skku-bias-vlm-v8b`).