Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published 26 days ago • 112
How Far Are We from Believable AI Agents? A Framework for Evaluating the Believability of Human Behavior Simulation Paper • 2312.17115 • Published Dec 28, 2023 • 2
Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States Paper • 2505.17663 • Published May 23, 2025 • 15
LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling Paper • 2505.19187 • Published May 25, 2025 • 13 • 3