Submitted by
Chenyang Song
Papers
DECO: Sparse Mixture-of-Experts with Dense-Comparable Performance on End-Side Devices
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe