KoHRM-Text-1.4B FullSFT BarExam MCQ + Hard Current-Law Precedent Epoch2

Full fine-tune that continues from LLM-OS-Models/KoHRM-Text-1.4B-FullSFT-BarExam-MCQ-1-14-Epoch2 and adds the gyung/korean-bar-exam-hard-current-law-precedent-sft-1000 corpus.

Base Model

Base: LLM-OS-Models/KoHRM-Text-1.4B-FullSFT-BarExam-MCQ-1-14-Epoch2
Relation: full fine-tune (continuation)
Runtime: local KoHRM/HRM-Text PrefixLM runtime
Export format: single-file model.safetensors plus tokenizer/config

Training

Dataset: gyung/korean-bar-exam-hard-current-law-precedent-sft-1000 (file: sft/train.jsonl)
Rows kept: 1,000 (0 dropped)
Tokens: 594,803 total (average 595 per sample, max 865)
Epochs: 2
Global batch size: 4,096 tokens
Learning rate: 2.0e-5, cosine schedule, 10 warmup steps
Hardware: single H200 (CUDA index 7), launched with torchrun --nproc_per_node=1
Train loss (final): 0.243
Run time: ~7 minutes
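
The figures above imply a rough step count and learning-rate curve. The sketch below is only an illustration under assumptions the card does not confirm: the 594,803 tokens are consumed once per epoch, each optimizer step uses exactly one 4,096-token global batch, warmup is linear, and the cosine decays to zero. The function names total_steps and lr_at are hypothetical.

```python
import math

TOTAL_TOKENS = 594_803   # dataset tokens (from the card), assumed per epoch
EPOCHS = 2
BATCH_TOKENS = 4_096     # global batch size in tokens
PEAK_LR = 2.0e-5
WARMUP_STEPS = 10


def total_steps(tokens: int = TOTAL_TOKENS, epochs: int = EPOCHS,
                batch_tokens: int = BATCH_TOKENS) -> int:
    """Approximate optimizer steps, assuming one full token batch per step."""
    return math.ceil(tokens * epochs / batch_tokens)


def lr_at(step: int, n_steps: int, peak: float = PEAK_LR,
          warmup: int = WARMUP_STEPS) -> float:
    """Assumed schedule: linear warmup, then cosine decay to zero."""
    if step < warmup:
        return peak * (step + 1) / warmup
    progress = (step - warmup) / max(1, n_steps - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))
```

Under these assumptions the run is about 291 optimizer steps, so the 10-step warmup covers only the first few percent of training.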

Subject distribution of the additional SFT set:

Subject	Count
Public law (공법)	270
Civil law (민사법)	460
Criminal law (형사법)	270

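Assistant response template (from the source dataset, preserved verbatim). 정답 is the answer line, 해설 the explanation, and 참고 법령 the referenced statutes; <번호> is the answer number, and ㄱ, ㄴ, ㄷ, ㄹ label the statements judged correct (옳다) or incorrect (옳지 않다):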

정답: <번호>

해설: 정답은 <번호>번이다. ㄱ은 옳다/옳지 않다. {법령 인용} ... ㄴ은 ... ㄷ은 ... ㄹ은 ...

참고 법령: <법령1>(url); <법령2>(url); ...

Each response opens with the 정답: <번호> line, so the answer number appears at the very start of generation, which makes greedy-decode answer extraction simple.
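
Because the answer sits on the leading 정답 line, extraction can be a single regular expression. This is a minimal sketch, not the evaluation code used for this card: it assumes answer choices are the ASCII digits 1 to 5, and the name extract_answer is hypothetical.

```python
import re
from typing import Optional

# Matches the first "정답: <번호>" occurrence; choices are assumed to be digits 1-5.
# Both the ASCII colon and the full-width colon are accepted.
ANSWER_RE = re.compile(r"정답\s*[:：]\s*([1-5])")


def extract_answer(generation: str) -> Optional[int]:
    """Return the answer number from a generation, or None if it cannot be parsed."""
    match = ANSWER_RE.search(generation)
    return int(match.group(1)) if match else None
```

A generation that never emits the 정답 line returns None, which can then be counted as a parse failure.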

Evaluation (round 15)

Round 15 of gyung/korean-bar-exam-moj-multiple-choice is held out for evaluation, giving 145 single-answer questions.

run	condition	accuracy	parse rate
base (no SFT)	direct	13.1 % (19/145)	61.4 %
parent (1-14 SFT only)	direct	26.9 % (39/145)	100 %
this checkpoint	cot	20.0 % (29/145)	100 %
this checkpoint	direct	22.1 % (32/145)	100 %
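
The base row reports 19/145 correct alongside a 61.4 % parse rate, which suggests that accuracy is taken over all 145 questions and that unparsed generations count as wrong. The sketch below computes both metrics under that assumption; the function name score is hypothetical and this is not the evaluation script itself.

```python
from typing import Optional, Sequence, Tuple


def score(predictions: Sequence[Optional[int]],
          gold: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, parse_rate) as fractions of all questions.

    A prediction of None means the answer could not be parsed; it is
    counted as wrong rather than dropped from the denominator.
    """
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    total = len(gold)
    if total == 0:
        raise ValueError("no questions to score")
    parsed = sum(p is not None for p in predictions)
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / total, parsed / total
```

Keeping unparsed items in the denominator makes accuracy comparable across runs whose parse rates differ.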

By subject (this checkpoint, direct condition):

Subject	Accuracy
Public law (공법)	20.5 % (8/39)
Civil law (민사법)	20.9 % (14/67)
Criminal law (형사법)	25.6 % (10/39)

Random baseline (single-answer, 5-way) = 20 %. The hard-current-law continuation did not improve over the parent checkpoint on round 15: direct accuracy fell from 26.9 % (39/145) to 22.1 % (32/145), only slightly above chance, and the cot condition sits at chance (20.0 %). Inspecting the generations, the model still produces short "정답: X" outputs (inherited from the parent run) instead of the longer 정답/해설/참고 법령 (answer/explanation/referenced statutes) format from this SFT set, so the additional signal does not fully transfer at inference time.
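
To judge how far these scores sit from the 20 % random baseline, one option is an exact one-sided binomial tail. This is a sketch added for illustration, not an analysis from the original evaluation; it assumes each of the 145 questions is an independent 5-way guess under the null hypothesis, and binom_upper_tail is a hypothetical helper.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_QUESTIONS = 145
CHANCE = 0.20  # five answer choices, one correct

p_this = binom_upper_tail(32, N_QUESTIONS, CHANCE)    # this checkpoint, direct
p_parent = binom_upper_tail(39, N_QUESTIONS, CHANCE)  # parent checkpoint, direct
```

Here p_this comes out well above the conventional 0.05 threshold, consistent with this checkpoint being hard to distinguish from guessing on round 15, while p_parent is smaller.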

Usage

This is not a standard Hugging Face AutoModelForCausalLM chat-model export. It uses the KoHRM/HRM-Text PrefixLM runtime. Tokenizer special tokens (no chat_template):

<|im_start|>          boq (id 2)
<|im_end|>            eoq (id 3)
<|box_end|>           eoa (id 35, eos)
<|object_ref_start|>  direct condition (id 32)
<|object_ref_end|>    cot condition    (id 33)

The prompt is tokenized as <|im_start|><condition_token>{instruction}<|im_end|>, and generation stops at <|box_end|>. See simple_inference_engine.py in the source repo.
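
The prompt layout can be sketched at the token-id level as follows. This is not the API of simple_inference_engine.py, which is not shown here: build_prompt_ids and truncate_at_eoa are hypothetical helpers, and encode stands in for whatever tokenizer call turns plain text into ids without adding special tokens of its own.

```python
from typing import Callable, List

BOQ_ID = 2       # <|im_start|>
EOQ_ID = 3       # <|im_end|>
EOA_ID = 35      # <|box_end|>, also eos
DIRECT_ID = 32   # <|object_ref_start|>, direct condition
COT_ID = 33      # <|object_ref_end|>, cot condition


def build_prompt_ids(instruction: str, encode: Callable[[str], List[int]],
                     condition: str = "direct") -> List[int]:
    """Assemble <|im_start|><condition_token>{instruction}<|im_end|> as token ids."""
    condition_ids = {"direct": DIRECT_ID, "cot": COT_ID}
    if condition not in condition_ids:
        raise ValueError(f"unknown condition: {condition!r}")
    return [BOQ_ID, condition_ids[condition], *encode(instruction), EOQ_ID]


def truncate_at_eoa(generated_ids: List[int]) -> List[int]:
    """Drop <|box_end|> and everything after it from generated token ids."""
    if EOA_ID in generated_ids:
        return generated_ids[: generated_ids.index(EOA_ID)]
    return list(generated_ids)
```

Passing the encoder as a callable keeps the sketch independent of any particular tokenizer class; the ids returned by build_prompt_ids would be fed to the PrefixLM runtime, and truncate_at_eoa applied to its output before decoding.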

Source

Additional SFT dataset: gyung/korean-bar-exam-hard-current-law-precedent-sft-1000
Parent checkpoint: LLM-OS-Models/KoHRM-Text-1.4B-FullSFT-BarExam-MCQ-1-14-Epoch2
Source license: Korea Open Government License Type 1 (KOGL Type 1) for statute/precedent data

Safetensors
Model size: 1B params
Tensor type: BF16

Model tree for LLM-OS-Models/KoHRM-Text-1.4B-FullSFT-BarExam-MCQ-1-14-HardCurrentLaw-1000-Epoch2

Base model: LLM-OS-Models/KoHRM-Text-1.4B
Finetuned: LLM-OS-Models/KoHRM-Text-1.4B-FullSFT-BarExam-MCQ-1-14-Epoch2
Finetuned: this model