Where is the workflow?
Sorry, I’m a beginner. Could you please tell me where ComfyUI workflows can be found?
Also, what do Q2, Q4, Q6, K, S, K_S, and K_M each represent, what are the differences between them, and how should I choose among them?
Can they be used together with LightX LoRA?
I would be very grateful if I could get an answer.
I'm also new and trying to learn! Ideally I'd like to start with a pre-configured workflow because I'm still wrapping my head around it.
Use the standard Wan workflow template in ComfyUI, double-click to add a node called "Unet Loader (GGUF)", and replace the two "Load Diffusion Model" nodes in the standard workflow with it. Just connect the new ones to the same points the old ones were connected to. In those loaders you select the high-noise and low-noise GGUF models the same way the fp8 models were selected in the old loaders. (You need both; drop them into ComfyUI/models/diffusion_models and it will find them after a restart.)
Everything else (settings, LoRAs, ...) works exactly the same; it's a 1:1 replacement.
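If you'd rather do that swap outside the UI, here's a minimal Python sketch that patches a workflow exported in API format. The class names ("UNETLoader" for the stock loader, "UnetLoaderGGUF" for the replacement) are what the ComfyUI-GGUF node pack uses as far as I know, and the file names are placeholders, so check against your install.

```python
# Minimal sketch: swap the two stock "Load Diffusion Model" (UNETLoader)
# nodes in an API-format ComfyUI workflow for the GGUF loader node.
# Assumptions: class names from the ComfyUI-GGUF pack, exactly two
# UNETLoader nodes (high noise + low noise), placeholder file names.
import json

WORKFLOW_IN  = "wan_workflow_api.json"        # exported via "Save (API format)"
WORKFLOW_OUT = "wan_workflow_gguf_api.json"
GGUF_MODELS  = ["wan_high_noise-Q6_K.gguf",   # placeholder names; use the
                "wan_low_noise-Q6_K.gguf"]    # files in models/diffusion_models

with open(WORKFLOW_IN) as f:
    workflow = json.load(f)                   # {node_id: {"class_type", "inputs", ...}}

gguf_names = iter(GGUF_MODELS)
for node in workflow.values():
    if node.get("class_type") == "UNETLoader":
        node["class_type"] = "UnetLoaderGGUF"             # the GGUF loader node
        node["inputs"] = {"unet_name": next(gguf_names)}  # it only takes a file name
        # Downstream links reference this node by id, so samplers and
        # LoRA loaders stay wired to it exactly as before.

with open(WORKFLOW_OUT, "w") as f:
    json.dump(workflow, f, indent=2)
```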
As for which to pick: higher (bigger) is better, Q4 < Q5 < Q6 < Q8. BUT if the model is bigger than your VRAM you take a massive performance hit (much more relevant for a video model than a single-image model). You usually won't get any errors, because ComfyUI handles the VRAM swapping. If the model is also bigger than your free system RAM, the hit gets much worse, because it has to shuffle many GB in and out of your (hopefully) SSD. That said, Q8 is often considered overkill.
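To put rough numbers on the "does it fit" question, here's a back-of-the-envelope sketch. The bits-per-weight figures are approximate averages for llama.cpp-style quants, and the parameter count and VRAM size are placeholders, not claims about any specific checkpoint or card.

```python
# Back-of-the-envelope: estimated file size per quant level vs. your VRAM.
# Bits-per-weight values are rough averages for llama.cpp-style quants;
# the parameter count and VRAM size are placeholders, not measurements.
PARAMS  = 14e9    # e.g. a ~14B-parameter video model (assumption)
VRAM_GB = 16      # your GPU (assumption)
BPW = {"Q4_K_M": 4.8, "Q5_K_M": 5.5, "Q6_K": 6.6, "Q8_0": 8.5, "fp16": 16.0}

for quant, bits in BPW.items():
    size_gb = PARAMS * bits / 8 / 1e9
    verdict = "fits" if size_gb < VRAM_GB else "spills into RAM/SSD"
    print(f"{quant:>6}: ~{size_gb:4.1f} GB -> {verdict} on a {VRAM_GB} GB card")
```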
Look things up, or ask a (web-connected!) AI; it's great for that stuff. It was a massive benefit for me getting into this. It answers quickly, it can summarize quickly, it doesn't get annoyed, and it is not allowed to tell you that your question is stupid (I certainly had some silly questions at the start, as EVERYBODY has). Add a system prompt including "avoid speculation; if in doubt, look it up online".
Btw: you can replace any normal model with an alternative GGUF version that way. You can also use GGUF versions of other things like CLIP and text encoders (same approach, just the (GGUF) version of the node, e.g. "DualCLIPLoader (GGUF)").
That is especially useful if one of the text encoders is a full-on LLM, like for Qwen Image, which is very large. Your mileage on how well they work compared to the normal or fp8 versions may vary from case to case; both fp8 and GGUF ARE compromises for speed and VRAM size.
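Same patching idea, sketched for a text encoder node. Which loader your workflow actually uses depends on the model; this assumes a "DualCLIPLoader" node and the "DualCLIPLoaderGGUF" class name from the GGUF pack, and the file name is a placeholder.

```python
# Sketch: point a dual text-encoder loader at a quantized GGUF encoder.
# Assumptions: "DualCLIPLoader" / "DualCLIPLoaderGGUF" class names from
# the ComfyUI-GGUF pack, placeholder encoder file name.
import json

with open("wan_workflow_gguf_api.json") as f:
    workflow = json.load(f)

for node in workflow.values():
    if node.get("class_type") == "DualCLIPLoader":
        node["class_type"] = "DualCLIPLoaderGGUF"
        # Point one slot at the quantized encoder; the other inputs
        # (clip_name2, type) are left exactly as they were.
        node["inputs"]["clip_name1"] = "umt5_xxl-Q8_0.gguf"

with open("wan_workflow_gguf_api.json", "w") as f:
    json.dump(workflow, f, indent=2)
```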
I got SO much better results with this (using Q6) than with the fp8 models. With longer, complicated prompts, both Wan in fp8 and umt5_xxl in fp8 are just baaad. Q6 plus umt5_xxl in fp16 is much better on such prompts: no (!) messed-up hands, low-res output, warping image, or people just morphing instead of moving. That will likely get worse again if you go lower than Q6, though.
I've now tried umt5_xxl in the fp8 version (with a long, complicated prompt!) and then umt5_xxl in a Q8 GGUF version, always together with this Wan Q6 GGUF: a prompt that gave me a blurry, morphy mess every time before instantly does exactly what it should after swapping umt5 out for the Q8 version. It may be slower, but the result not being garbage makes that kind of irrelevant. Next I'll check it against the full umt5 fp16 version.
Simpler prompts are likely just fine with the fp8 text encoder.