MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory Paper • 2605.15128 • Published 5 days ago • 60
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation Paper • 2605.13724 • Published 6 days ago • 92
CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives Paper • 2605.12496 • Published 7 days ago • 28
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published 7 days ago • 179
HumanNet: Scaling Human-centric Video Learning to One Million Hours Paper • 2605.06747 • Published 12 days ago • 51
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling Paper • 2604.28185 • Published 19 days ago • 90
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published 20 days ago • 106
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation Paper • 2604.24764 • Published 22 days ago • 118
ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents Paper • 2604.23781 • Published 23 days ago • 33
MultiWorld: Scalable Multi-Agent Multi-View Video World Models Paper • 2604.18564 • Published 29 days ago • 45
NEO-unify: Building Native Multimodal Unified Models End to End Article • sensenova • Mar 5 • 162
Elucidating the SNR-t Bias of Diffusion Probabilistic Models Paper • 2604.16044 • Published Apr 17 • 74
ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents Paper • 2604.11784 • Published Apr 13 • 143
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation Paper • 2604.11804 • Published Apr 13 • 72
Gen-Searcher: Reinforcing Agentic Search for Image Generation Paper • 2603.28767 • Published Mar 30 • 58
Lost in Stories: Consistency Bugs in Long Story Generation by LLMs Paper • 2603.05890 • Published Mar 6 • 93