view article Article Introducing smolagents: simple agents that write actions in code. +1 Dec 31, 2024 • 1.15k
Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Delibration Paper • 2509.14760 • Published Sep 18 • 53
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models Paper • 2508.09834 • Published Aug 13 • 53
C3: A Bilingual Benchmark for Spoken Dialogue Models Exploring Challenges in Complex Conversations Paper • 2507.22968 • Published Jul 30 • 24
Pixels, Patterns, but No Poetry: To See The World like Humans Paper • 2507.16863 • Published Jul 21 • 68
SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks Paper • 2506.10954 • Published Jun 12 • 52
Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations Paper • 2506.04633 • Published Jun 5 • 19
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models Paper • 2505.23656 • Published May 29 • 25
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models Paper • 2505.23656 • Published May 29 • 25 • 2
Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning Paper • 2506.04207 • Published Jun 4 • 48
CRASH: Crash Recognition and Anticipation System Harnessing with Context-Aware and Temporal Focus Attentions Paper • 2407.17757 • Published Jul 25, 2024 • 1
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28 • 131
FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow Paper • 2505.17399 • Published May 23 • 14
CRASH: Crash Recognition and Anticipation System Harnessing with Context-Aware and Temporal Focus Attentions Paper • 2407.17757 • Published Jul 25, 2024 • 1
FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow Paper • 2505.17399 • Published May 23 • 14
Learn to Reason Efficiently with Adaptive Length-based Reward Shaping Paper • 2505.15612 • Published May 21 • 34
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models Paper • 2505.14810 • Published May 20 • 62