duhe

Elynden

kinza99

AI & ML interests

None yet

Recent Activity

upvoted a paper 4 days ago

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

liked a dataset about 1 month ago

Elynden/AgentBench-EvoSyn

upvoted an article about 2 months ago

OpenEvolve: An Open Source Implementation of Google DeepMind's AlphaEvolve

Organizations

None yet

upvoted a paper 4 days ago

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Paper • 2512.05111 • Published 5 days ago • 45

liked a dataset about 1 month ago

Elynden/AgentBench-EvoSyn

Updated Oct 23 • 13 • 1

upvoted an article about 2 months ago

Article

OpenEvolve: An Open Source Implementation of Google DeepMind's AlphaEvolve

May 20

updated a dataset about 2 months ago

Elynden/AgentBench-EvoSyn

Updated Oct 23 • 13 • 1

authored a paper about 2 months ago

EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning

Paper • 2510.17928 • Published Oct 20 • 2

commented on a paper about 2 months ago

EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning

Paper • 2510.17928 • Published Oct 20 • 2

updated a dataset about 2 months ago

Elynden/LiveCodeBench-EvoSyn

Updated Oct 22 • 29

published 2 datasets about 2 months ago

Elynden/AgentBench-EvoSyn

Updated Oct 23 • 13 • 1

Elynden/LiveCodeBench-EvoSyn

Updated Oct 22 • 29

updated a collection about 2 months ago

EvoSyn

Collection

2 items • Updated Oct 20

upvoted a paper about 2 months ago

Confidence as a Reward: Transforming LLMs into Reward Models

Paper • 2510.13501 • Published Oct 15 • 1

authored 4 papers about 2 months ago

DevBench: A Comprehensive Benchmark for Software Development

Paper • 2403.08604 • Published Mar 13, 2024 • 2

SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution

Paper • 2501.05040 • Published Jan 9 • 15

Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation

Paper • 2502.06563 • Published Feb 10

Confidence as a Reward: Transforming LLMs into Reward Models

Paper • 2510.13501 • Published Oct 15 • 1

liked a dataset 4 months ago

open-r1/ioi

Viewer • Updated Mar 12 • 270 • 248 • 10

upvoted a paper 6 months ago

SWE-bench Goes Live!

Paper • 2505.23419 • Published May 29 • 21

liked a Space 7 months ago

Open LMM Subjective Leaderboard

🌎

VLMEvalKit Subjective Benchmark Results

upvoted a paper 8 months ago

MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space

Paper • 2504.13835 • Published Apr 18 • 38
