liyaxuan

lllyx

AI & ML interests

None yet

Recent Activity

updated a collection about 24 hours ago

Rethinking OPD

updated a dataset 1 day ago

lllyx/OpenThought3-Qwen3-4B

updated a model 1 day ago

lllyx/Qwen3-1.7B-SFT

Organizations

None yet

updated a collection about 24 hours ago

Rethinking OPD

Collection

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip • 4 items • Updated about 24 hours ago • 1

updated a dataset 1 day ago

lllyx/OpenThought3-Qwen3-4B

Viewer • Updated about 24 hours ago • 305k • 28 • 1

updated a model 1 day ago

lllyx/Qwen3-1.7B-SFT

Text Generation • 2B • Updated about 24 hours ago • 882 • 2

published a dataset 1 day ago

lllyx/OpenThought3-Qwen3-4B

Viewer • Updated about 24 hours ago • 305k • 28 • 1

upvoted a collection 2 days ago

Rethinking OPD

Collection

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip • 4 items • Updated about 24 hours ago • 1

upvoted a paper 2 days ago

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

Paper • 2605.08083 • Published 5 days ago • 60

upvoted 4 papers 3 days ago

Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL

Paper • 2604.28123 • Published 12 days ago • 47

updated a collection 10 days ago

Rethinking OPD

Collection

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip • 4 items • Updated about 24 hours ago • 1

upvoted a paper 10 days ago

MAIC-UI: Making Interactive Courseware with Generative UI

Paper • 2604.25806 • Published 15 days ago • 8

updated a model 10 days ago

lllyx/Qwen3-4B-Base-GRPO

Text Generation • 4B • Updated 10 days ago • 163 • 2

published a model 10 days ago

lllyx/Qwen3-4B-Base-GRPO

Text Generation • 4B • Updated 10 days ago • 163 • 2

upvoted a paper 10 days ago

Co-Evolving Policy Distillation

Paper • 2604.27083 • Published 14 days ago • 63

upvoted a paper 19 days ago

Near-Future Policy Optimization

Paper • 2604.20733 • Published 21 days ago • 74

updated a collection 26 days ago

Rethinking OPD

Collection

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip • 4 items • Updated about 24 hours ago • 1

authored a paper 28 days ago

DeepPrune: Parallel Scaling without Inter-trace Redundancy

Paper • 2510.08483 • Published Oct 9, 2025 • 24
