The Alignment Waltz: Jointly Training Agents to Collaborate for Safety Paper • 2510.08240 • Published Oct 9, 2025 • 41
IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning Paper • 2509.22621 • Published Sep 26, 2025 • 8
Article Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment Published Feb 11, 2025 • 106
Jointly Reinforcing Diversity and Quality in Language Model Generations Paper • 2509.02534 • Published Sep 2, 2025 • 25
Feedback Friction: LLMs Struggle to Fully Incorporate External Feedback Paper • 2506.11930 • Published Jun 13, 2025 • 53
Optimizing Decomposition for Optimal Claim Verification Paper • 2503.15354 • Published Mar 19, 2025 • 18