ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning Paper • 2512.05111 • Published 5 days ago • 45
Article OpenEvolve: An Open Source Implementation of Google DeepMind's AlphaEvolve May 20 • 52
Confidence as a Reward: Transforming LLMs into Reward Models Paper • 2510.13501 • Published Oct 15 • 1
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space Paper • 2504.13835 • Published Apr 18 • 38
SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution Paper • 2501.05040 • Published Jan 9 • 15