Emergent Social Intelligence Risks in Generative Multi-Agent Systems Paper • 2603.27771 • Published 11 days ago • 50
Gen-Searcher: Reinforcing Agentic Search for Image Generation Paper • 2603.28767 • Published 10 days ago • 56
Vega: Learning to Drive with Natural Language Instructions Paper • 2603.25741 • Published 14 days ago • 6
Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought Paper • 2603.22847 • Published 16 days ago • 25
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published 23 days ago • 94
EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation Paper • 2603.12267 • Published 28 days ago • 13
Beyond Imitation: Reinforcement Learning for Active Latent Planning Paper • 2601.21598 • Published Jan 29 • 10
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning Paper • 2512.20605 • Published Dec 23, 2025 • 62
Bottom-up Policy Optimization: Your Language Model Policy Secretly Contains Internal Policies Paper • 2512.19673 • Published Dec 22, 2025 • 66
Reinforcement Learning for Self-Improving Agent with Skill Library Paper • 2512.17102 • Published Dec 18, 2025 • 42