Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published 1 day ago • 103
Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO Paper • 2511.16669 • Published 21 days ago • 31
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13 • 176
LongLive: Real-time Interactive Long Video Generation Paper • 2509.22622 • Published Sep 26 • 184
MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO Paper • 2505.13031 • Published May 19 • 4
SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation Paper • 2305.17011 • Published May 26, 2023
GrootVL: Tree Topology is All You Need in State Space Model Paper • 2406.02395 • Published Jun 4, 2024 • 1
COVE: Unleashing the Diffusion Feature Correspondence for Consistent Video Editing Paper • 2406.08850 • Published Jun 13, 2024
HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding Paper • 2503.14694 • Published Mar 12
HaploOmni: Unified Single Transformer for Multimodal Video Understanding and Generation Paper • 2506.02975 • Published Jun 3