Korean Multilingual is broken.

#6
by DFveloper - opened

The Korean output is badly broken.
It mixes five different languages in a single reply.
Input understanding is fine, and the latent reasoning is fine too.
I'll provide a translation below.

Original

User: 한국어로 말해라.
LLM:

서로를로(Sử-ro-lo)로,我是由 Google 提供的 AI 模型.

我可以被看作是一个智能助手(Intelligent Assistant),能够理解并处理文本、代码,并回答各种问题。

如果您希望我用韩语(Korean)进行对话,我会根据您的要求切换。

**현재モード (Current Mode):** 我正在使用中文回答,但如果您需要我用韩语对话,请告诉我。

(Nếuคุณ muốn我用韩语回答,请写 "请用韩语回答" 或 "请用韩语继续".)

Translation

User: Speak in Korean.
LLM:

As Seororeullo (Sử-ro-lo), I am an AI model provided by Google.

I can be regarded as an intelligent assistant that understands and processes text and code and answers all kinds of questions.

If you would like me to converse in Korean, I will switch as you request.

**Current Mode:** I am currently answering in Chinese, but if you need me to converse in Korean, please tell me.

(If you want me to answer in Korean, write, in Chinese, "please answer in Korean" or "please continue in Korean".)

The sentences themselves are fine, but the language filter layers are gone.
Also, the model's attitude has gone straight to hell 🤣

PS: Don't worry, it could be fixed with SFT.
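
The script mixing reported above can be checked mechanically. The following is a minimal sketch, not part of the original report: it buckets each letter by the prefix of its Unicode character name and reports how much of a reply falls outside the expected script. The function names `script_counts` and `off_script_ratio` and the coarse script labels are assumptions chosen for illustration, and the name-prefix heuristic is only an approximation of real script detection.

```python
import unicodedata
from collections import Counter
from typing import Optional

# Coarse mapping from Unicode-name prefixes to script labels (illustrative only).
_PREFIXES = (
    ("HANGUL", "Hangul"),
    ("CJK", "Han"),
    ("HIRAGANA", "Kana"),
    ("KATAKANA", "Kana"),
    ("THAI", "Thai"),
    ("LATIN", "Latin"),
)

def script_of(ch: str) -> Optional[str]:
    """Return a coarse script label for a letter, or None for non-letters."""
    if not ch.isalpha():
        return None  # skip digits, punctuation, spaces, combining marks
    name = unicodedata.name(ch, "")
    for prefix, label in _PREFIXES:
        if name.startswith(prefix):
            return label
    return "Other"

def script_counts(text: str) -> dict:
    """Count letters per coarse script label."""
    labels = (script_of(ch) for ch in text)
    return dict(Counter(label for label in labels if label is not None))

def off_script_ratio(text: str, expected: str) -> float:
    """Fraction of letters that do not belong to the expected script."""
    counts = script_counts(text)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - counts.get(expected, 0) / total

if __name__ == "__main__":
    reply = "현재モード: 我正在使用中文回答"
    print(script_counts(reply))
    print(off_script_ratio(reply, expected="Hangul"))
```

A reply to a "speak in Korean" prompt would be expected to score close to 0.0 for `expected="Hangul"`; what threshold counts as "broken" is a judgment call and is not specified in this thread.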

REAP knowingly removes experts with lower saliency on a targeted set of samples. I chose to protect coding and agentic
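
To illustrate why pruning experts by saliency on a targeted sample set can cost a language, here is a toy sketch. It is not the REAP implementation: the scoring rule used here (mean of gate weight times expert output norm over the calibration tokens routed to each expert), the expert names, and all numbers are assumptions made up for illustration. Real experts are also not cleanly tied to one language or task.

```python
from collections import defaultdict

def expert_saliency(records):
    """Mean of gate_weight * output_norm per expert.

    records: iterable of (expert_id, gate_weight, output_norm),
    one entry per calibration token routed to that expert.
    The scoring rule is an assumption for this sketch.
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for expert_id, gate, norm in records:
        total[expert_id] += gate * norm
        count[expert_id] += 1
    return {e: total[e] / count[e] for e in total}

def experts_to_prune(saliency, all_experts, keep):
    """Return the experts ranked below the top `keep` by saliency.

    Experts that were never routed to on the calibration set score 0.
    """
    ranked = sorted(all_experts, key=lambda e: saliency.get(e, 0.0), reverse=True)
    return set(ranked[keep:])

# Hypothetical calibration set containing only code-like tokens, so the
# expert that would mostly serve Korean text is never activated.
calibration = [
    ("code_a", 0.7, 2.0),
    ("code_b", 0.6, 1.5),
    ("code_a", 0.8, 2.2),
    ("general", 0.3, 1.0),
]
experts = ["code_a", "code_b", "general", "korean"]

saliency = expert_saliency(calibration)
pruned = experts_to_prune(saliency, experts, keep=3)

if __name__ == "__main__":
    print(saliency)
    print(pruned)
```

In this toy setup the expert that never sees calibration tokens ranks last and is the one removed, which matches the general concern raised in this thread: capabilities absent from the calibration samples are the ones most at risk.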

I know, but was the whole dataset in English?
If so, could it be normalized with SFT?
I didn't major in AI.
