KoHRM-Text-1.4B FullSFT BarExam MCQ + Hard Current-Law Precedent Epoch2
Full fine-tune that continues from LLM-OS-Models/KoHRM-Text-1.4B-FullSFT-BarExam-MCQ-1-14-Epoch2 and adds the gyung/korean-bar-exam-hard-current-law-precedent-sft-1000 corpus.
Base Model
- Base: LLM-OS-Models/KoHRM-Text-1.4B-FullSFT-BarExam-MCQ-1-14-Epoch2
- Relation: full fine-tune (continuation)
- Runtime: local KoHRM/HRM-Text PrefixLM runtime
- Export format: single-file model.safetensors plus tokenizer/config
Training
- Dataset: gyung/korean-bar-exam-hard-current-law-precedent-sft-1000, sft/train.jsonl
- Rows kept: 1,000 (0 dropped)
- Tokens: 594,803 (avg sample 595, max 865)
- Epochs: 2
- Global batch size: 4,096 tokens
- Learning rate: 2.0e-5, cosine, warmup 10 steps
- Hardware: single H200 (CUDA index 7), torchrun nproc_per_node=1
- Train loss (final): 0.243
- Run time: ~7 minutes
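The number of optimizer steps is not reported, but it can be estimated from the figures above. The following is a rough sketch, assuming every step consumes exactly one fully packed 4,096-token global batch and that the schedule is linear warmup followed by cosine decay to zero. Neither assumption is confirmed by the run itself, and the function names are illustrative.

```python
import math

TOTAL_TOKENS = 594_803   # tokens in the SFT set (from the list above)
EPOCHS = 2
BATCH_TOKENS = 4_096     # global batch size in tokens
PEAK_LR = 2.0e-5
WARMUP_STEPS = 10


def estimated_steps(total_tokens: int, epochs: int, batch_tokens: int) -> int:
    """Estimate optimizer steps, assuming fully packed token batches."""
    return math.ceil(total_tokens * epochs / batch_tokens)


def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup, then cosine decay to zero (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return 0.5 * PEAK_LR * (1.0 + math.cos(math.pi * progress))


steps = estimated_steps(TOTAL_TOKENS, EPOCHS, BATCH_TOKENS)
print("estimated steps:", steps)
print("lr at step 0:", lr_at(0, steps))
print("lr after warmup:", lr_at(WARMUP_STEPS, steps))
```

Under these assumptions the run is only a few hundred steps long, so the 10 warmup steps cover a small fraction of training.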
Subject distribution of the additional SFT set:
| subject | count |
|---|---|
| 공법 (public law) | 270 |
| 민사법 (civil law) | 460 |
| 형사법 (criminal law) | 270 |
Assistant response template (from the source dataset, preserved verbatim):
정답: <번호>
해설: 정답은 <번호>번이다. ㄱ은 옳다/옳지 않다. {법령 인용} ... ㄴ은 ... ㄷ은 ... ㄹ은 ...
참고 법령: <법령1>(url); <법령2>(url); ...
Every response opens with the "정답: <번호>" line, so the answer number sits at a fixed position at the start of the generation. This makes answer extraction from a greedy decode simple.
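The extraction step can be sketched as follows. This is a minimal example, assuming the generation is available as a decoded string and that answer choices are the digits 1 to 5; `extract_answer` is a hypothetical helper, not a function from the source repo.

```python
import re
from typing import Optional

# Matches a leading "정답: <번호>" line; choices are assumed to be digits 1-5.
# The negative lookahead rejects multi-digit values such as "12".
ANSWER_RE = re.compile(r"^\s*정답\s*[:：]\s*([1-5])(?!\d)")


def extract_answer(generation: str) -> Optional[int]:
    """Return the answer number, or None if the output does not parse."""
    match = ANSWER_RE.match(generation)
    return int(match.group(1)) if match else None


print(extract_answer("정답: 3\n해설: 정답은 3번이다."))
print(extract_answer("해설: 정답은 3번이다."))
```

Outputs that return `None` can be counted as parse failures, which is one way to obtain a parse-rate figure. How the parse rate reported for this model was actually computed is not stated.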
Evaluation (round 15)
Round 15 of gyung/korean-bar-exam-moj-multiple-choice is held out and evaluated on 145 single-answer questions.
| run | condition | accuracy | parse rate |
|---|---|---|---|
| base (no SFT) | direct | 13.1 % (19/145) | 61.4 % |
| parent (1-14 SFT only) | direct | 26.9 % (39/145) | 100 % |
| this checkpoint | cot | 20.0 % (29/145) | 100 % |
| this checkpoint | direct | 22.1 % (32/145) | 100 % |
By subject (this checkpoint, direct condition):
| subject | accuracy |
|---|---|
| 공법 (public law) | 20.5 % (8/39) |
| 민사법 (civil law) | 20.9 % (14/67) |
| 형사법 (criminal law) | 25.6 % (10/39) |
Random baseline (single-answer 5-way) = 20 %. The hard-current-law continuation did not improve over the parent checkpoint on round 15: in the direct condition it scores 22.1 %, below the parent's 26.9 % and only slightly above the random baseline. Inspecting the generations shows that the model still produces short "정답: X" outputs (inherited from the parent run) instead of the longer 정답/해설/참고 법령 format from this SFT set, so the additional signal does not fully transfer at inference time.
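To put the scores in context against the 20 % baseline, the sketch below recomputes each accuracy from the correct/total counts in the table and estimates how likely that many correct answers would be under pure guessing. It assumes independent questions with a uniform 1-in-5 chance each; the binomial tail is an illustrative check added here, not part of the original evaluation.

```python
from math import comb

N = 145
CHANCE = 0.20  # single-answer 5-way random baseline

# Correct-answer counts taken from the evaluation table.
correct = {
    "base (no SFT), direct": 19,
    "parent (1-14 SFT only), direct": 39,
    "this checkpoint, cot": 29,
    "this checkpoint, direct": 32,
}


def upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of guessing k or more right."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


for run, k in correct.items():
    acc = 100 * k / N
    tail = upper_tail(k, N, CHANCE)
    print(f"{run}: {acc:.1f} %, P(>= {k} correct by chance) = {tail:.3f}")
```

With 145 questions, each question is worth about 0.7 percentage points, so small differences between runs should be read cautiously.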
Usage
This is not a standard Hugging Face AutoModelForCausalLM chat-model export. It uses the KoHRM/HRM-Text PrefixLM runtime. Tokenizer special tokens (no chat_template):
<|im_start|> boq (id 2)
<|im_end|> eoq (id 3)
<|box_end|> eoa (id 35, eos)
<|object_ref_start|> direct condition (id 32)
<|object_ref_end|> cot condition (id 33)
The prompt is tokenized as <|im_start|><condition_token>{instruction}<|im_end|>, and generation stops at <|box_end|>. See simple_inference_engine.py in the source repo.
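The prompt layout can be illustrated at the token-id level. This is a sketch only: it uses the special-token ids listed above, while `build_prompt_ids`, `trim_at_eoa`, and the toy encoder are hypothetical names introduced for illustration and do not reflect the API of simple_inference_engine.py.

```python
from typing import Callable, List

# Special-token ids as listed above.
BOQ_ID = 2    # <|im_start|>
EOQ_ID = 3    # <|im_end|>
EOA_ID = 35   # <|box_end|>, also eos
CONDITION_IDS = {"direct": 32, "cot": 33}


def build_prompt_ids(instruction: str, condition: str,
                     encode: Callable[[str], List[int]]) -> List[int]:
    """Lay out <|im_start|><condition_token>{instruction}<|im_end|> as ids.

    `encode` is a hypothetical callable that tokenizes plain text without
    adding special tokens.
    """
    return [BOQ_ID, CONDITION_IDS[condition], *encode(instruction), EOQ_ID]


def trim_at_eoa(generated_ids: List[int]) -> List[int]:
    """Keep generated ids up to, but not including, the first <|box_end|>."""
    if EOA_ID in generated_ids:
        return generated_ids[:generated_ids.index(EOA_ID)]
    return generated_ids


def toy_encode(text: str) -> List[int]:
    """Toy stand-in encoder for illustration; use the model's real tokenizer."""
    return [1000 + b for b in text.encode("utf-8")]


prompt = build_prompt_ids("Q?", "direct", toy_encode)
print(prompt)
print(trim_at_eoa([5, 6, EOA_ID, 7]))
```

Switching between the direct and cot conditions changes only the second id in the prompt.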
Source
- Additional SFT dataset: gyung/korean-bar-exam-hard-current-law-precedent-sft-1000
- Parent checkpoint: LLM-OS-Models/KoHRM-Text-1.4B-FullSFT-BarExam-MCQ-1-14-Epoch2
- Source license: Korea Open Government License Type 1 (KOGL Type 1) for statute/precedent data
Model tree for LLM-OS-Models/KoHRM-Text-1.4B-FullSFT-BarExam-MCQ-1-14-HardCurrentLaw-1000-Epoch2
Base model
LLM-OS-Models/KoHRM-Text-1.4B