Active Video Perception: Iterative Evidence Seeking for Agentic Long Video Understanding Paper • 2512.05774 • Published 5 days ago • 5
Scaling Spatial Intelligence with Multimodal Foundation Models Paper • 2511.13719 • Published 23 days ago • 44
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 8 days ago • 193
Flash-DMD: Towards High-Fidelity Few-Step Image Generation with Efficient Distillation and Joint Reinforcement Learning Paper • 2511.20549 • Published 15 days ago • 24
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning Paper • 2511.22570 • Published 13 days ago • 69
Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens Paper • 2511.19418 • Published 16 days ago • 26
VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models Paper • 2511.11007 • Published 26 days ago • 15
P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published 23 days ago • 132