Request for Q8_K_XL

#1
by thataigod - opened

I tried to make the GGUFs earlier but failed; not sure how you did it, but currently yours is the only repo with them. Would love that sweet Q8_K_XL if possible.

Just saw that you updated the model card. Damn, bro, even Q3_K_M at 4 GB is killing it. This is the model people have been waiting for since SD1.5. I can't even imagine what we can do once the base model is released for fine-tuning.

I only see K_XL quants from unsloth

> I tried to make the GGUFs earlier but failed; not sure how you did it, but currently yours is the only repo with them. Would love that sweet Q8_K_XL if possible.

Is Q8_K_XL better than the regular Q8? Q8 has always been considered equivalent to BF16, which means it's already the best. I googled for info but couldn't find anything. What makes K_XL better?

Can you guys explain how I got 6 fingers on his hand using bf16.safetensors on the first try, with a randomly chosen fixed seed of 7? BF16 can't be worse than a Q8 GGUF, right?

[attached image: 00119_]

This prompt is from their HF Space demo.
Update: Oh wow... I just tried 9 steps instead of 8, and he miraculously has 5 fingers now!? So just that last 9th step can fix fingers?

> I tried to make the GGUFs earlier but failed; not sure how you did it, but currently yours is the only repo with them. Would love that sweet Q8_K_XL if possible.

> Is Q8_K_XL better than the regular Q8? Q8 has always been considered equivalent to BF16, which means it's already the best. I googled for info but couldn't find anything. What makes K_XL better?

Q8_K_XL should have a bit more accuracy than even regular Q8. Q8 (specifically Q8_0) uses a basic, uniform 8-bit quantization scheme with one scale per small block of weights, whereas Q8_K_XL uses a more advanced group-wise (K-quant) scheme and additionally leaves selected sensitive tensors at higher precision, which generally lands closer to the original full-precision model. You can see that in this image, where some of the weights are left at FP32 and BF16 even though the others are Q8.

[image: what's the difference between Q8_K_XL and Q8_0]
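To make the difference concrete, here is a minimal sketch of a Q8_0-style blockwise quantizer in NumPy: each block of 32 weights shares one float scale, and values are stored as signed 8-bit integers. This is an illustration of the general scheme, not the actual ggml implementation; the _XL "dynamic" idea is sketched by the optional `keep_fp` flag (a hypothetical parameter I added) that leaves a tensor unquantized, as the higher-precision tensors in the image are.

```python
import numpy as np

def q8_0_quantize(weights, block_size=32):
    # Q8_0-style: one FP scale per block of 32 weights,
    # values rounded to signed 8-bit integers in [-127, 127].
    w = weights.reshape(-1, block_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scale[scale == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def q8_0_dequantize(q, scale, shape):
    # Reconstruct approximate weights from int8 values and per-block scales.
    return (q.astype(np.float32) * scale).reshape(shape)

def maybe_quantize(weights, keep_fp=False):
    # Hypothetical illustration of the "dynamic" idea: sensitive tensors
    # are kept at full precision instead of being quantized.
    if keep_fp:
        return weights  # stored as-is (BF16/FP32 in the real quants)
    q, s = q8_0_quantize(weights)
    return q8_0_dequantize(q, s, weights.shape)

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)
q, s = q8_0_quantize(w)
err = np.abs(q8_0_dequantize(q, s, w.shape) - w).max()
# Max error per weight is bounded by half the block scale, so it is
# small relative to the weights themselves.
```

The per-block scale is why Q8_0 tracks BF16 so closely in practice; the accuracy gap only shows up in outlier-heavy tensors, which is exactly where the dynamic quants spend their extra bits.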
