OpenGPT-4o-Image: A Comprehensive Dataset for Advanced Image Generation and Editing Paper • 2509.24900 • Published Sep 29 • 53
NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks Article • Aug 11 • 75
Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation Paper • 2508.09987 • Published Aug 13 • 25
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model Paper • 2507.01953 • Published Jul 2 • 19
BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing Paper • 2506.17450 • Published Jun 20 • 64
LettinGo: Explore User Profile Generation for Recommendation System Paper • 2506.18309 • Published Jun 23 • 11
AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation Paper • 2506.10540 • Published Jun 12 • 37
MiniMax-Speech: Intrinsic Zero-Shot Text-to-Speech with a Learnable Speaker Encoder Paper • 2505.07916 • Published May 12 • 133