Something's broken

#1
by ggnoy - opened

The model is very schizophrenic. It starts taking over for the user, but not completely, and ends up nonsensically talking to itself within the same turn. The original 12B by @grimjim doesn't act like that.

Which quant are you using? I will give it a test. Are you using it in SillyTavern for RP?

Yes, SillyTavern. I used a Q5_K_M quant I made with GGUF-my-repo. I tried both the OpenAI-compatible endpoint and regular completion to rule out template errors.

Thanks! I will get back to you as soon as I finish testing.

For my attempt on the 12B version, I applied the equivalent of "--projected" for both measurement and ablation, and "--normpreserve" for ablation.
I also clipped/capped during measurement at 0.995 strength for the peak model when it came to UGI NatInt. The resulting model did retain some soft refusals.
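For readers unfamiliar with the flags mentioned above, a minimal sketch of what "projected" ablation with norm preservation does to a hidden state: remove (most of) the component along the measured refusal direction, then rescale the result back to the original norm. The function name, tensor shapes, and the epsilon guard are illustrative, not taken from any particular repository:

```python
import torch

def ablate_normpreserve(h: torch.Tensor, r: torch.Tensor, strength: float = 1.0) -> torch.Tensor:
    """Projected ablation of direction r from hidden states h, norm-preserving.

    h: hidden states [..., d]; r: refusal direction [d] (normalized here).
    strength < 1.0 (e.g. the 0.995 cap mentioned above) removes only part
    of the component, leaving some soft refusals intact.
    """
    r = r / r.norm()
    orig_norm = h.norm(dim=-1, keepdim=True)
    coeff = (h * r).sum(dim=-1, keepdim=True)   # projection coefficient h·r
    h_out = h - strength * coeff * r            # "--projected": subtract the component along r
    new_norm = h_out.norm(dim=-1, keepdim=True).clamp_min(1e-8)
    return h_out * (orig_norm / new_norm)       # "--normpreserve": restore the per-token norm
```

The norm-preserving rescale is what keeps the edited activations on the same "scale" the downstream layers were trained to expect, which is one reason this variant tends to damage quality less than plain subtraction.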

> The resulting model did retain some soft refusals.

Yeah, it's not exactly uncensored per se, but it's a bit more pliant than the original without being waaay dumber, so it's an improvement. So I really wanted that, but in 27b.

> Yeah, it's not exactly uncensored per se, but it's a bit more pliant than the original without being waaay dumber, so it's an improvement. So I really wanted that, but in 27b.

I've been using gemma3-27b-abliterated-dpo for months; is it really that much dumber than the normpreserve version?

I’ve noticed that this abliterated version can have trouble following instructions, especially when the context gets long. The safety mechanisms in the original model were quite stubborn, so I had to abliterate it rather aggressively to remove refusals on my test prompts, and I guess that this has damaged the model’s capacity to some extent. The goal is to strike a balance between refusal removal and output quality, which is a delicate art. I’ll keep working on this and come back with an updated version. Thank you so much for your feedback and time!
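The balance described above can be framed as a simple sweep: evaluate several ablation strengths, measure a refusal rate and a quality score for each, and keep the gentlest setting that still clears the refusal bar. A hypothetical sketch (the tuple format, threshold, and function name are illustrative, not from any repository):

```python
def choose_strength(results, max_refusal=0.05):
    """Pick an ablation strength from eval runs.

    results: list of (strength, refusal_rate, quality_score) tuples,
    one per evaluated ablation strength.
    Returns the strength with the best quality score among runs whose
    refusal rate is acceptable; falls back to the strongest ablation
    if no run clears the bar.
    """
    acceptable = [r for r in results if r[1] <= max_refusal]
    if not acceptable:
        return max(results, key=lambda r: r[0])[0]
    return max(acceptable, key=lambda r: r[2])[0]
```

With a stubborn model, every low-strength run fails the refusal bar, which forces the aggressive setting and the quality loss that comes with it; that is the trade-off described above.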

@grimjim Thanks again for the tips. I also want to take this chance to report an issue in the llm-abliteration repository. When running python chat.py -m <path_to_your_abliterated_model> I got an error, but when I added -p fp32 the chat ran normally—just painfully slow, since it was using fp32.

I admit to having neglected that script. Thanks for the error report! I'll check what's going on.

Hello everyone,
A new set of GGUF files (and safetensors files) has been uploaded. The quality of the model's responses has improved significantly, thanks to tips from @grimjim.

I have also released YanLabs/gemma-3-27b-abliterated-normpreserve-v1, which is even less abliterated than the current model, in order to better preserve model quality. For this v1 version, only the Q8_0 quantization is recommended. At quantization levels lower than Q8_0, refusals still occur, but with Q8_0 and F16 the model does not refuse.

I’ll leave this thread open, and your feedback is much appreciated.

I've downloaded the new model in Q5_K_M and it's a lot better. Very stable and relatively non-nanny. It doesn't feel as smart, however. Maybe the abliteration could be tuned down just a notch? Or maybe it's just the pretraining data being filtered?

@ggnoy Thanks for the feedback! If you want a less abliterated version, I recommend YanLabs/gemma-3-27b-abliterated-normpreserve-v1-GGUF Q8_0, where abliteration is applied as lightly as possible (with any lower quants, refusals are still present!).
Quantization always hurts model performance. I can even sense some loss from Q8_0 to Q6_K, though very subtle, from the model's choice of words. Abliteration also hurts. That said, I have some ideas for making abliteration more precise, but haven't tested them yet. I believe the community will find better ways to unleash LLM potential in the near future. Stay tuned!
