The tokenizer has been fixed for the Korean -> English model.

However, the same change was never applied to the English -> Korean model.
This pull request applies the same fix here.

English:
Personal transportation devices (skateboards, hover boards, scooters, etc.)

Korean:
개인 수송 장치 (스케이트 보드, 호버 보드, 스쿠터 등)

johnchen95 changed pull request title from Upload 2 files to Fix Tokenizer
