Adaptive TT ManCAR: Manifold-Constrained Latent Reasoning with Adaptive Test-Time Computation for Sequential Recommendation Paper • 2602.20093 • Published 23 days ago • 29
ManCAR: Manifold-Constrained Latent Reasoning with Adaptive Test-Time Computation for Sequential Recommendation Paper • 2602.20093 • Published 23 days ago • 29
RL QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025 • 182 Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 262 Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use Paper • 2603.03205 • Published 15 days ago • 11 π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs Paper • 2603.02083 • Published 16 days ago • 9
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025 • 182
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 262
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use Paper • 2603.03205 • Published 15 days ago • 11
π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs Paper • 2603.02083 • Published 16 days ago • 9
Adaptive TT ManCAR: Manifold-Constrained Latent Reasoning with Adaptive Test-Time Computation for Sequential Recommendation Paper • 2602.20093 • Published 23 days ago • 29
ManCAR: Manifold-Constrained Latent Reasoning with Adaptive Test-Time Computation for Sequential Recommendation Paper • 2602.20093 • Published 23 days ago • 29
RL QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025 • 182 Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 262 Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use Paper • 2603.03205 • Published 15 days ago • 11 π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs Paper • 2603.02083 • Published 16 days ago • 9
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13, 2025 • 182
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 262
Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use Paper • 2603.03205 • Published 15 days ago • 11
π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs Paper • 2603.02083 • Published 16 days ago • 9