ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning Paper • 2512.05111 • Published 5 days ago • 45
OpenEvolve: An Open Source Implementation of Google DeepMind's AlphaEvolve Article • May 20 • 52
EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning Paper • 2510.17928 • Published Oct 20 • 2
Confidence as a Reward: Transforming LLMs into Reward Models Paper • 2510.13501 • Published Oct 15 • 1
DevBench: A Comprehensive Benchmark for Software Development Paper • 2403.08604 • Published Mar 13, 2024 • 2
SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution Paper • 2501.05040 • Published Jan 9 • 15
Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation Paper • 2502.06563 • Published Feb 10
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space Paper • 2504.13835 • Published Apr 18 • 38