GGUF versions and comfyui support
#2
by maroo87 - opened
Impossible to run on low-end devices with 12 GB VRAM; please provide a GGUF version that runs in 12 GB VRAM or less.
Thanks for the feedback; this is a very valid request.
At the moment, you can reduce GPU memory usage by lowering `max_memory_per_gpu`. This forces more aggressive CPU offloading, so inference runs slower but can fit on smaller cards.
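As a rough sketch of how such a per-GPU cap drives offloading: Hugging Face `transformers`/`accelerate` accept a `max_memory` mapping alongside `device_map="auto"`, and anything that does not fit under the GPU cap is dispatched to CPU. The helper name, the 10 GiB cap, and the CPU budget below are illustrative, not taken from this model's docs:

```python
# Sketch: build a max_memory mapping that caps GPU usage so that
# device_map="auto" offloads the remaining layers to CPU RAM.
# (Assumed interface: the transformers/accelerate `max_memory` dict;
# the exact knob in this repo's scripts may differ.)
def build_max_memory(gpu_gb: int, cpu_gb: int = 64, num_gpus: int = 1):
    """Return a max_memory mapping for transformers/accelerate dispatch."""
    mapping = {i: f"{gpu_gb}GiB" for i in range(num_gpus)}
    mapping["cpu"] = f"{cpu_gb}GiB"  # budget for offloaded weights
    return mapping

# Leave some headroom on a 12 GiB card for activations and CUDA overhead.
max_memory = build_max_memory(gpu_gb=10)

# With transformers installed, this would typically be used as:
# model = AutoModelForCausalLM.from_pretrained(
#     model_id,
#     device_map="auto",
#     max_memory=max_memory,
#     torch_dtype=torch.bfloat16,
# )
```

Lower `gpu_gb` trades speed for fit: more layers live in CPU RAM and are streamed to the GPU per forward pass.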
Regarding low-bit options:
- BF16 is the currently tested/recommended path.
- NF4 / INT8 are technically possible.
- GGUF is not officially provided yet; we understand the demand for devices with 12 GB VRAM or less and are investigating feasibility.
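For readers who want to try the "technically possible" NF4 path today, a hedged sketch of the standard Hugging Face 4-bit settings follows. The kwarg names are the usual `BitsAndBytesConfig` fields from `transformers`; whether this particular model stays stable at 4-bit is exactly what remains untested:

```python
# Illustrative NF4 settings in the form transformers' BitsAndBytesConfig
# accepts, kept as a plain dict so the sketch stands alone.
nf4_kwargs = {
    "load_in_4bit": True,                  # quantize weights to 4-bit on load
    "bnb_4bit_quant_type": "nf4",          # NormalFloat4 rather than plain FP4
    "bnb_4bit_use_double_quant": True,     # also quantize the quantization constants
    "bnb_4bit_compute_dtype": "bfloat16",  # matches the recommended BF16 path
}

# With transformers + bitsandbytes installed, this would be used as:
# from transformers import AutoModelForCausalLM, BitsAndBytesConfig
# model = AutoModelForCausalLM.from_pretrained(
#     model_id,
#     quantization_config=BitsAndBytesConfig(**nf4_kwargs),
#     device_map="auto",
# )
```

NF4 roughly quarters weight memory versus BF16, which is why it is the usual first stop before a GGUF conversion exists; output quality should be checked against the BF16 baseline.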
For ComfyUI support: it is under consideration, and we will share updates once we have a stable integration plan.
We appreciate the request and will prioritize better low-VRAM usability in upcoming updates.