{
  "_comment": "Stop generation on <|im_end|> (248046). The merged model's config.json sets eos_token_id to <|endoftext|> (248044), which the chat format never emits at turn end, so generation ran to max_new_tokens. Upload this to the merged model repo with finetune/push_generation_config.py.",
  "bos_token_id": 248045,
  "pad_token_id": 248044,
  "eos_token_id": [248046, 248044],
  "do_sample": false,
  "max_new_tokens": 2048,
  "transformers_version": "5.7.0"
}