This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe".
- lllyx/Qwen3-1.7B-SFT
  Text Generation • 2B • Updated • 835 • 2
- Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
  Paper • 2604.13016 • Published • 91
- lllyx/Qwen3-4B-Base-GRPO
  Text Generation • 4B • Updated • 152 • 2
- lllyx/OpenThought3-Qwen3-4B
  Viewer • Updated • 305k