Question from a newb

by yasor84052 - opened May 1

May 1

I am confused here. I was under the impression that the workflow should be:

1. Abliterate the base model: remove refusal directions from the residual stream before any personality is baked in
2. Then finetune, so the finetuning reinforces the abliterated behavior rather than fighting against it

Doing it the other way around (finetune first, then abliterate) means you're trying to surgically remove refusals that are now entangled with the roleplay/personality training. You get a higher chance of coherence degradation.
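
For anyone else new to the term, here is a rough toy sketch of what "removing a refusal direction" is usually taken to mean: estimate a direction as the difference between mean activations on two prompt sets, then project that direction out. This is only an illustration under my own assumptions, not what Heretic or any particular tool actually runs. The function names and the three-dimensional "activations" are made up for the example; real abliteration works on a model's hidden states and weight matrices.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of the two mean activations."""
    diff = [a - b for a, b in zip(mean_vector(harmful_acts),
                                  mean_vector(harmless_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]

def ablate(vec, direction):
    """Remove the component of vec that lies along the unit vector direction."""
    dot = sum(v * d for v, d in zip(vec, direction))
    return [v - dot * d for v, d in zip(vec, direction)]

# Hypothetical toy "residual stream" activations (made-up numbers).
harmful = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
harmless = [[0.0, 1.0, 0.0], [0.0, 1.0, 2.0]]

r = refusal_direction(harmful, harmless)
cleaned = ablate([3.0, 1.0, 0.5], r)

# After ablation the vector has (numerically) no component left along r.
residual = sum(c * d for c, d in zip(cleaned, r))
```

The ordering question in this thread is about when a projection like this is applied: on the base model before finetuning, or on the finetuned weights afterwards. The sketch says nothing about which order gives better quality.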

llmfan46

Owner May 2

I am not the author of this finetune; ConicCat is. Hereticating finetunes is fine. I have done plenty, and you can see the benchmarks on the model cards. Feel free to test the models yourself too, if you prefer that over benchmarks.

yasor84052

May 2

I guess what I was saying is that it might produce better quality if you guys collaborated on this: for example, you abliterate the base gemma4 model and then they finetune off that. I am just curious how much of a difference in model quality that would make.
