Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2403.03206

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

papers [image generation]

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

Image Generation

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Paper • 2506.07977 • Published Jun 9 • 41
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Paper • 2506.07986 • Published Jun 9 • 19
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Paper • 2506.06276 • Published Jun 6 • 26
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5 • 27

Diffusion-Papers

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 115
Elucidating the Design Space of Diffusion-Based Generative Models

Paper • 2206.00364 • Published Jun 1, 2022 • 18
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

Diffusion Models

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Paper • 2408.14176 • Published Aug 26, 2024 • 62
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 63
OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model

Paper • 2409.01199 • Published Sep 2, 2024 • 14

FastVLM: Efficient Vision Encoding for Vision Language Models

Paper • 2412.13303 • Published Dec 17, 2024 • 72
rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28 • 116
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications

Paper • 2508.16279 • Published Aug 22 • 53
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

Paper • 2509.12201 • Published Sep 15 • 104

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

image-generation

aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3, 2024 • 31
black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Jun 27 • 1.22M • • 12k
Qwen/Qwen2-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Feb 6 • 1.57M • • 1.24k
zer0int/CLIP-GmP-ViT-L-14

Zero-Shot Image Classification • 0.4B • Updated Jul 16 • 7.23k • 507

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

FastVLM: Efficient Vision Encoding for Vision Language Models

Paper • 2412.13303 • Published Dec 17, 2024 • 72
rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28 • 116
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications

Paper • 2508.16279 • Published Aug 22 • 53
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

Paper • 2509.12201 • Published Sep 15 • 104

papers [image generation]

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

Image Generation

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Paper • 2506.07977 • Published Jun 9 • 41
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Paper • 2506.07986 • Published Jun 9 • 19
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Paper • 2506.06276 • Published Jun 6 • 26
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5 • 27

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

Diffusion-Papers

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published Sep 17, 2024 • 115
Elucidating the Design Space of Diffusion-Based Generative Models

Paper • 2206.00364 • Published Jun 1, 2022 • 18
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5, 2024 • 71

Diffusion Models

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Paper • 2408.14176 • Published Aug 26, 2024 • 62
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27, 2024 • 126
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 63
OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model

Paper • 2409.01199 • Published Sep 2, 2024 • 14

image-generation

aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3, 2024 • 31
black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Jun 27 • 1.22M • • 12k
Qwen/Qwen2-VL-7B-Instruct

Image-Text-to-Text • 8B • Updated Feb 6 • 1.57M • • 1.24k
zer0int/CLIP-GmP-ViT-L-14

Zero-Shot Image Classification • 0.4B • Updated Jul 16 • 7.23k • 507

Previous
1
2
3
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs