Qwen-Image-Edit-2509-Turbo-Lightning / MULTI_LORA_DOCUMENTATION.md

Multi-LoRA Image Editing Implementation (Simplified)

Overview

This implementation provides a simplified multi-LoRA (Low-Rank Adaptation) system for the Qwen-Image-Edit application. The Lightning LoRA is always active as the base optimization, and Object Remover is the only additional LoRA enabled during this testing phase.

Architecture

Core Components

  1. LoRAManager (lora_manager.py)

    • Centralized management of multiple LoRA adapters
    • Registry system for storing LoRA configurations
    • Dynamic loading and fusion capabilities
    • Memory management and cleanup
  2. LoRA Configuration (app.py)

    • Centralized LORA_CONFIG dictionary
    • Lightning LoRA configured as always-loaded base
    • Simplified to Object Remover for focused testing
  3. Dynamic UI System (app.py)

    • Conditional component visibility based on LoRA selection
    • Lightning LoRA status indication
    • Type-specific UI adaptations (edit vs base)
    • Real-time interface updates

⚡ Lightning LoRA Always-On Architecture

Core Principle

Lightning LoRA is always loaded as the base model for fast 4-step generation, regardless of which other LoRA is selected. This provides:

  • Consistent Performance: Always-on 4-step generation
  • Enhanced Speed: Lightning's optimization applies to all operations
  • Multi-LoRA Fusion: Combine Lightning speed with Object Remover capabilities

Implementation Details

1. Always-On Loading

```python
from huggingface_hub import hf_hub_download

# Lightning LoRA is loaded first and always remains active
LIGHTNING_LORA_NAME = "Lightning (4-Step)"
lightning_config = LORA_CONFIG[LIGHTNING_LORA_NAME]

print(f"Loading always-active Lightning LoRA: {LIGHTNING_LORA_NAME}")
lightning_lora_path = hf_hub_download(
    repo_id=lightning_config["repo_id"],
    filename=lightning_config["filename"],
)

lora_manager.register_lora(LIGHTNING_LORA_NAME, lightning_lora_path, **lightning_config)
lora_manager.configure_lora(LIGHTNING_LORA_NAME, {
    "description": lightning_config["description"],
    "is_base": True,
})

# Load Lightning LoRA and keep it always active
lora_manager.load_lora(LIGHTNING_LORA_NAME)
lora_manager.fuse_lora(LIGHTNING_LORA_NAME)
```

2. Multi-LoRA Combination

```python
def load_and_fuse_additional_lora(lora_name):
    """
    Load an additional LoRA while keeping Lightning LoRA always active.
    This enables combining Lightning's speed with Object Remover capabilities.
    """
    config = LORA_CONFIG[lora_name]
    lora_path = hf_hub_download(repo_id=config["repo_id"], filename=config["filename"])

    # Always keep Lightning LoRA loaded; load the additional LoRA
    # without resetting the pipeline to its base state.
    if config["method"] == "standard":
        print("Using standard loading method...")
        # Load the additional LoRA without fusing (to preserve Lightning);
        # diffusers expects a single adapter_name here, not a list
        pipe.load_lora_weights(lora_path, adapter_name=lora_name)
        # Set both adapters as active
        pipe.set_adapters([LIGHTNING_LORA_NAME, lora_name])
        print(f"Lightning + {lora_name} now active.")
```

3. Lightning Preservation in Inference

```python
def infer(lora_name, image_for_pipeline, final_prompt, num_inference_steps):
    """Main inference function with Lightning always active."""
    # Load the additional LoRA while keeping Lightning active
    load_and_fuse_lora(lora_name)

    print("--- Running Inference ---")
    print(f"LoRA: {lora_name} (with Lightning always active)")

    # Generate with Lightning + additional LoRA
    result_image = pipe(
        image=image_for_pipeline,
        prompt=final_prompt,
        num_inference_steps=int(num_inference_steps),
        # ... other parameters
    ).images[0]

    # Don't unfuse Lightning - keep it active for the next inference
    if lora_name != LIGHTNING_LORA_NAME:
        pipe.set_adapters([LIGHTNING_LORA_NAME])  # drop the additional LoRA, keep Lightning

    return result_image
```

Simplified LoRA Configuration

Current Supported LoRAs

| LoRA Name | Type | Method | Always-On | Description |
|---|---|---|---|---|
| ⚡ Lightning (4-Step) | base | standard | ✅ Always | Fast 4-step generation - always active |
| None | edit | none | ❌ | Base model with Lightning optimization |
| Object Remover | edit | standard | ⚡ Lightning+ | Removes objects from an image while maintaining background consistency |

Lightning + Object Remover Combination

Lightning + Object Remover: Fast object removal with 4-step generation optimization

LoRA Type Classifications

  • Base LoRA: Lightning (always loaded, always active)
  • Edit LoRAs: Object Remover (requires input images, uses standard fusion)
  • None: Base model with Lightning optimization
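
These classifications could drive the interface logic directly. A minimal illustrative sketch (the mapping and function name are hypothetical, not the app's actual API):

```python
# Hypothetical type table mirroring the classifications above
LORA_TYPES = {
    "Lightning (4-Step)": "base",
    "Object Remover": "edit",
    "None": "edit",
}

def requires_input_image(lora_name: str) -> bool:
    """Edit LoRAs operate on an uploaded image; the base LoRA does not."""
    return LORA_TYPES.get(lora_name, "edit") == "edit"
```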

Key Features

1. Dynamic UI Components

The system automatically adapts the user interface and shows Lightning status:

```python
def on_lora_change(lora_name):
    """Dynamic UI component visibility handler"""
    config = LORA_CONFIG[lora_name]
    # "style" is reserved for future style-transfer LoRAs; the current
    # simplified config only uses "base" and "edit"
    is_style_lora = config["type"] == "style"

    # Lightning LoRA info
    lightning_info = "⚡ **Lightning LoRA always active** - Fast 4-step generation enabled"

    return {
        lora_description: gr.Markdown(visible=True, value=f"{lightning_info}  \n\n**Description:** {config['description']}"),
        input_image_box: gr.Image(visible=not is_style_lora, type="pil"),
        style_image_box: gr.Image(visible=is_style_lora, type="pil"),
        prompt_box: gr.Textbox(visible=(config["prompt_template"] != "change the face to face segmentation mask"))
    }
```

2. Always-On Lightning Performance

```python
# Lightning configuration as always-loaded base
"Lightning (4-Step)": {
    "repo_id": "lightx2v/Qwen-Image-Lightning",
    "filename": "Qwen-Image-Lightning-4steps-V2.0.safetensors",
    "type": "base",
    "method": "standard",
    "always_load": True,
    "prompt_template": "{prompt}",
    "description": "Fast 4-step generation LoRA - always loaded as base optimization.",
}
```

3. Simplified Multi-LoRA Fusion

  • Lightning Base: Always loaded, always active
  • Object Remover: Loaded alongside Lightning using standard fusion
  • None: Lightning-only operation
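
The three operating modes above reduce to a small adapter-selection rule. A minimal sketch (the function name is illustrative):

```python
LIGHTNING_LORA_NAME = "Lightning (4-Step)"

def adapters_for_selection(lora_name):
    """Return the adapter list to activate: Lightning alone, or
    Lightning plus the selected additional LoRA."""
    if lora_name in (None, "None", LIGHTNING_LORA_NAME):
        return [LIGHTNING_LORA_NAME]          # Lightning-only operation
    return [LIGHTNING_LORA_NAME, lora_name]   # e.g. Lightning + Object Remover
```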

4. Memory Management with Lightning

  • Lightning LoRA remains loaded throughout session
  • Object Remover LoRA loaded/unloaded as needed
  • GPU memory optimized for Lightning + one additional LoRA
  • Automatic cleanup of non-Lightning adapters
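
The lifecycle above can be modeled as a registry where Lightning is pinned and any other adapter is swapped on demand. A toy simulation, purely illustrative (the class is not part of the app):

```python
class AdapterRegistry:
    """Toy model of the session's adapter lifecycle: Lightning stays
    loaded for the whole session, any other adapter is loaded/unloaded
    per request."""
    BASE = "Lightning (4-Step)"

    def __init__(self):
        self.loaded = {self.BASE}

    def activate(self, lora_name):
        # Automatic cleanup: drop any previously loaded non-Lightning adapter
        self.loaded = {self.BASE}
        if lora_name not in (None, "None", self.BASE):
            self.loaded.add(lora_name)
        return sorted(self.loaded)
```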

5. Prompt Template System

Each LoRA has a custom prompt template (Lightning provides base 4-step generation):

```python
"Object Remover": {
    "prompt_template": "Remove {prompt}",
    "type": "edit"
}
```
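
Templates expand via standard `str.format` substitution. A minimal sketch using the Object Remover template (the helper name and config subset are illustrative):

```python
# Illustrative subset of the LoRA configuration
LORA_CONFIG = {
    "Object Remover": {"prompt_template": "Remove {prompt}", "type": "edit"},
    "Lightning (4-Step)": {"prompt_template": "{prompt}", "type": "base"},
}

def build_final_prompt(lora_name, user_prompt):
    """Expand the selected LoRA's prompt template with the user's input."""
    return LORA_CONFIG[lora_name]["prompt_template"].format(prompt=user_prompt)

# build_final_prompt("Object Remover", "the red car") -> "Remove the red car"
```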

Usage

Basic Usage with Always-On Lightning

  1. Lightning is Always Active: No selection needed - Lightning runs all operations
  2. Select Additional LoRA: Choose "Object Remover" to combine with Lightning
  3. Upload Images: Upload input image to edit
  4. Enter Prompt: Describe the object to remove
  5. Configure Settings: Adjust advanced parameters (4-step generation always enabled)
  6. Generate: Click "Generate!" to process with Lightning optimization

Object Remover Usage

  1. Select "Object Remover" from the dropdown
  2. Upload Input Image: The image containing the object to remove
  3. Enter Prompt: Describe the object to remove (e.g., "person", "car", "tree")
  4. Generate: Lightning + Object Remover will remove the specified object

Advanced Configuration

Adding New LoRAs (with Lightning Always-On)

  1. Add to LORA_CONFIG:

```python
"Custom LoRA": {
    "repo_id": "username/custom-lora",
    "filename": "custom.safetensors",
    "type": "edit",  # or "style"
    "method": "standard",  # or "manual_fuse"
    "prompt_template": "Custom instruction: {prompt}",
    "description": "Description of the LoRA capabilities"
}
```

  2. Register with LoRAManager:

```python
lora_path = hf_hub_download(repo_id=config["repo_id"], filename=config["filename"])
lora_manager.register_lora("Custom LoRA", lora_path, **config)
```

  3. Lightning + Custom LoRA: Automatically combines with always-on Lightning

Technical Implementation

Lightning Always-On Process

  1. Initialization: Load Lightning LoRA first
  2. Fusion: Fuse Lightning weights permanently
  3. Persistence: Keep Lightning active throughout session
  4. Combination: Load Object Remover alongside Lightning
  5. Preservation: Never unload Lightning LoRA
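
The ordering constraints above can be sketched as a small simulation that records operations (purely illustrative; the function is not part of the app):

```python
def startup_sequence(additional_lora=None):
    """Simulate the always-on ordering: Lightning is loaded and fused
    first, and any additional LoRA only ever loads alongside it."""
    ops = ["load:Lightning", "fuse:Lightning"]  # steps 1-2, before anything else
    if additional_lora:
        ops.append(f"load:{additional_lora}")               # step 4: combine
        ops.append(f"activate:Lightning+{additional_lora}")  # Lightning never unloaded
    return ops
```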

Lightning Loading Process

```python
def load_and_fuse_lora(lora_name):
    """Legacy function for backward compatibility"""
    if lora_name == LIGHTNING_LORA_NAME:
        # Lightning is already loaded, just ensure it's active
        print("Lightning LoRA is already active.")
        pipe.set_adapters([LIGHTNING_LORA_NAME])
        return

    load_and_fuse_additional_lora(lora_name)
```

Memory Management with Lightning

```python
import gc
import torch

# Don't unfuse Lightning - keep it active for the next inference
if lora_name != LIGHTNING_LORA_NAME:
    pipe.set_adapters([LIGHTNING_LORA_NAME])  # drop the additional LoRA, keep Lightning
gc.collect()
torch.cuda.empty_cache()
```

Testing and Validation

Validation Scripts

  • test_lora_logic.py: Validates implementation logic without dependencies
  • test_lightning_always_on.py: Validates Lightning always-on functionality
  • test_lora_implementation.py: Full integration testing (requires PyTorch)

Lightning Always-On Test Coverage

✅ Lightning LoRA configured as always-loaded base
✅ Lightning LoRA loaded and fused on startup
✅ Inference preserves Lightning LoRA state
✅ Multi-LoRA combination supported
✅ UI indicates Lightning always active
✅ Proper loading sequence implemented

Object Remover Testing

✅ Object Remover loads alongside Lightning
✅ Lightning + Object Remover combination works
✅ Prompt template "Remove {prompt}" functions correctly
✅ Memory management for Lightning + Object Remover

Performance Considerations

Lightning Always-On Benefits

  • Consistent Speed: All operations use 4-step generation
  • Reduced Latency: No loading time for Lightning between requests
  • Enhanced Performance: Lightning optimization applies to Object Remover
  • Memory Efficiency: Lightning stays in memory, Object Remover loaded as needed

Speed Optimization

  • 4-Step Generation: Lightning provides ultra-fast inference
  • AOT Compilation: Ahead-of-time compilation with Lightning active
  • Adapter Combination: Lightning + Object Remover for optimal results
  • Optimized Attention Processors: FA3 attention with Lightning

Memory Optimization

  • Lightning LoRA always in memory (base memory usage)
  • Object Remover LoRA loaded on-demand
  • Efficient adapter switching
  • GPU memory management for multiple adapters

Troubleshooting

Common Issues

  1. Lightning Not Loading

    • Check HuggingFace Hub connectivity for Lightning repo
    • Verify lightx2v/Qwen-Image-Lightning repository exists
    • Ensure sufficient GPU memory for Lightning LoRA
  2. Slow Performance (Lightning Not Active)

    • Check Lightning LoRA is loaded: Look for "Lightning LoRA is already active"
    • Verify adapter status: pipe.get_active_adapters()
    • Ensure Lightning is not being disabled
  3. Object Remover Issues

    • Check Object Remover loading: Look for "Lightning + Object Remover now active"
    • Verify prompt format: Should be "Remove {object}"
    • Monitor memory usage for Lightning + Object Remover

Debug Mode

Enable debug logging to see Lightning always-on status:

```python
import logging
logging.basicConfig(level=logging.DEBUG)

# Check Lightning status
print(f"Lightning active: {LIGHTNING_LORA_NAME in pipe.get_active_adapters()}")
print(f"All active adapters: {pipe.get_active_adapters()}")
```

Future Enhancements

Planned Features

  1. Additional LoRAs: Add more LoRAs after successful Object Remover testing
  2. LoRA Blending: Advanced blending of multiple LoRAs with Lightning
  3. Lightning Optimization: Dynamic Lightning parameter adjustment
  4. Performance Monitoring: Real-time Lightning performance metrics
  5. Batch Processing: Process multiple images with Lightning always-on

Extension Points

  • Custom Lightning optimization strategies
  • Multiple base LoRAs (beyond Lightning)
  • Advanced multi-LoRA combination algorithms
  • Lightning performance profiling

Simplified Configuration Benefits

Focused Testing

  • Reduced Complexity: Only Lightning + Object Remover to test
  • Clear Validation: Easy to verify Lightning always-on functionality
  • Debugging: Simplified troubleshooting with fewer variables
  • Performance: Clear performance benefits of Lightning always-on

Risk Mitigation

  • Gradual Rollout: Test one LoRA before adding more
  • Validation: Ensure Lightning + LoRA combination works correctly
  • Memory Management: Verify memory usage with Lightning + one LoRA
  • User Experience: Validate simplified UI with fewer options


License

This implementation follows the same license as the original Qwen-Image-Edit project.