Xiao Liang

MasterVito

AI & ML interests

None yet

Recent Activity

upvoted a paper 12 days ago

Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

Paper • 2606.06428 • Published 13 days ago • 25

upvoted a paper 28 days ago

EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL

Paper • 2605.18703 • Published 30 days ago • 50

upvoted a paper about 1 month ago

ClawEnvKit: Automatic Environment Generation for Claw-Like Agents

Paper • 2604.18543 • Published Apr 20 • 30

upvoted 2 papers 2 months ago

OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks

Paper • 2604.08539 • Published Apr 9 • 50

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Paper • 2604.02268 • Published Apr 2 • 101

upvoted 2 papers 3 months ago

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Paper • 2603.17024 • Published Mar 17 • 110

When AI Navigates the Fog of War

Paper • 2603.16642 • Published Mar 17 • 31

upvoted an article 4 months ago

DenseR: Dense Rewards For Free in LLM Reasoning

Article • hbXNov • Feb 18 • 21

upvoted 3 papers 4 months ago

Improving Data and Reward Design for Scientific Reasoning in Large Language Models

Paper • 2602.08321 • Published Feb 9 • 44

MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration

Paper • 2602.01734 • Published Feb 2 • 32

Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability

Paper • 2602.02477 • Published Feb 2 • 11

authored a paper 4 months ago

Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability

Paper • 2602.02477 • Published Feb 2 • 11

submitted a paper to Daily Papers 4 months ago

Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability

Paper • 2602.02477 • Published Feb 2 • 11

updated a dataset 4 months ago

HAGeo-IMO/HAGeo-409

Viewer • Updated Feb 2 • 409 • 62

authored a paper 5 months ago

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

Paper • 2601.14004 • Published Jan 20 • 49

upvoted a paper 5 months ago

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

Paper • 2601.14004 • Published Jan 20 • 49

upvoted a paper 6 months ago

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Paper • 2512.22615 • Published Dec 27, 2025 • 51

liked a model 6 months ago

RLVR-SvS/SvS-Qwen-Code-7B

Reinforcement Learning • 8B • Updated Dec 11, 2025 • 3 • 3

updated 2 models 6 months ago

RLVR-SvS/SvS-Qwen-3B

Reinforcement Learning • 3B • Updated Dec 11, 2025 • 2

RLVR-SvS/SvS-Qwen-32B

Reinforcement Learning • 33B • Updated Dec 11, 2025 • 1
