BESPOKE: Benchmark for Search-Augmented Large Language Model Personalization via Diagnostic Feedback • Paper • 2509.21106 • Published Sep 25 • 7
In Their Own Words: Reasoning Traces Tailored for Small Models Make Them Better Reasoners • Paper • 2509.22230 • Published Sep 26 • 5
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation • Paper • 2410.13232 • Published Oct 17, 2024 • 44