psh3333's picture
Upload folder using huggingface_hub
40460e0 verified
|
Raw
History Blame
2.36 kB
metadata
license: apache-2.0
base_model: Qwen/Qwen3-VL-8B-Instruct
tags:
  - multimodal
  - bias
  - grpo
  - vlm
  - dacon
  - bbq

DACON SKKU Bias VLM โ€” v8B (robustness-aware GRPO)

2026 ์„ฑ๊ท ๊ด€๋Œ€ ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ AI Bias ์ฑŒ๋ฆฐ์ง€(236722) ์ถœํ’ˆ ๋ชจ๋ธ. GRPO-only(์ฝœ๋“œ์Šคํƒ€ํŠธ SFT ์—†์Œ, ์ถ”๋ก ํ™•์žฅ ์—†์Œ) ๋‹จ์ผํ† ํฐ(0/1/2) ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ ํŽธํ–ฅ์™„ํ™” VLM. base: Qwen3-VL-8B-Instruct (Apache-2.0).

ํ•œ ์ค„ ์š”์•ฝ

v8A๊ฐ€ OOD๋ฅผ ์˜ฌ๋ฆฌ๋ฉฐ ์žƒ์—ˆ๋˜ ์˜ต์…˜์ˆœ์„œ ๊ฐ•๊ฑด์„ฑ(option-shuffle consistency)์„ ํšŒ๋ณตํ•˜๋ฉด์„œ OOD๋ฅผ ์œ ์ง€ํ•œ ๋ฒ„์ „. ์…”ํ”Œ์ง ๋ฐ์ดํ„ฐ์ฆ๊ฐ• + shuffle-consistency/source-normalized ๋ณด์ƒ์œผ๋กœ ํ•™์Šต.

ํ‰๊ฐ€ (held-out ๊ณต๊ฐœ v8_eval 900, DACON ํ‰๊ฐ€์…‹ ๋ฏธ์‚ฌ์šฉ)

์ง€ํ‘œ v4 v6 v8A v8B
BBQ amb/dis acc 1.0/1.0 1.0/1.0 1.0/1.0 1.0/1.0
OOD acc 0.8033 0.8083 0.8117 0.8067
option-shuffle consistency 0.9478 0.9456 0.9411 0.9456 (ํšŒ๋ณต)
unknown-position consistency 1.0 1.0 1.0 1.0
over-abstain / person-error 0/0 0/0 0/0 0/0
format validity 0.9556 0.9533 0.9733 0.9544
๋ถ„๋ฅ˜ โ€” โ€” โ€” PASS

ํ•™์Šต ๋ฐฉ๋ฒ•

  • GRPO-only, LoRA rank 16 (attention-only, MLP off, vision frozen), ๋‹จ์ผํ† ํฐ ์ถœ๋ ฅ, lr 1e-6, 200 steps, num_gen 8.
  • ๋ณด์ƒ 6์ข…: answer / shuffle_consistency / abstain / source_normalized / format / length.
  • ๋™์ ์ƒ˜ํ”Œ๋ง(์˜คํ”„๋ผ์ธ ๊ทผ์‚ฌ) + ์…”ํ”Œ์ง(์›๋ณธ+์…”ํ”Œ) ๋ฐ์ดํ„ฐ์ฆ๊ฐ•.
  • ๋ฐ์ดํ„ฐ: ๊ณต๊ฐœ BBQ(Elfsong/BBQ) + ์ผ๋ฐ˜์ถ”๋ก (SIQA/CSQA/OBQA/ARC). DACON ํ‰๊ฐ€์…‹ยทํ…Œ์ŠคํŠธ ๋ฏธ์‚ฌ์šฉ.

์‚ฌ์šฉ

import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForImageTextToText
m = AutoModelForImageTextToText.from_pretrained("psh3333/dacon-skku-bias-vlm-v8b", torch_dtype=torch.bfloat16, device_map="auto")
proc = AutoProcessor.from_pretrained("psh3333/dacon-skku-bias-vlm-v8b")
# system + user(context/question/3 options + image) โ†’ ๋‹จ์ผํ† ํฐ 0/1/2 (๋ชจ๋ธ ์ƒ์„ฑ ํ…์ŠคํŠธ์—์„œ ํŒŒ์‹ฑ)

์ตœ์ข… ๋ผ๋ฒจ์€ ๋ชจ๋ธ ์ƒ์„ฑ ํ…์ŠคํŠธ์—์„œ ํŒŒ์‹ฑ(๊ทœ์น™๊ธฐ๋ฐ˜ ์„ ํƒ ์•„๋‹˜). ์™ธ๋ถ€ API ์ถ”๋ก  ์—†์Œ. ๊ธฐ์ค€ํ™˜๊ฒฝ torch 2.6 ํ˜ธํ™˜.

๊ทœ์น™ ์ค€์ˆ˜

DACON ํ‰๊ฐ€์…‹ ๋ฏธ์‚ฌ์šฉ ยท ํ‰๊ฐ€์…‹ ํŒจํ„ด๋งˆ์ด๋‹ 0 ยท ๊ทœ์น™๊ธฐ๋ฐ˜ ์ •๋‹ต์„ ํƒ 0 ยท ์ตœ์ข…๋‹ต=๋ชจ๋ธํ…์ŠคํŠธ ยท base Apache-2.0.