Beyond Fixed Frames: Dynamic Character-Aligned Speech Tokenization Paper • 2601.23174 • Published 9 days ago • 2
Learning Rate Matters: Vanilla LoRA May Suffice for LLM Fine-tuning Paper • 2602.04998 • Published 4 days ago • 3
Late-to-Early Training: LET LLMs Learn Earlier, So Faster and Better Paper • 2602.05393 • Published 4 days ago • 5
DASH: Faster Shampoo via Batched Block Preconditioning and Efficient Inverse-Root Solvers Paper • 2602.02016 • Published 6 days ago • 9
Privileged Information Distillation for Language Models Paper • 2602.04942 • Published 4 days ago • 20
Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing Paper • 2602.03845 • Published 5 days ago • 24
Search-R2: Enhancing Search-Integrated Reasoning via Actor-Refiner Collaboration Paper • 2602.03647 • Published 5 days ago • 7
Horizon-LM: A RAM-Centric Architecture for LLM Training Paper • 2602.04816 • Published 4 days ago • 16
EPAS: Efficient Training with Progressive Activation Sharing Paper • 2601.19089 • Published 13 days ago • 1
Expanding the Capabilities of Reinforcement Learning via Text Feedback Paper • 2602.02482 • Published 6 days ago • 2
Beyond Output Critique: Self-Correction via Task Distillation Paper • 2602.00871 • Published 8 days ago • 2
Chronicals: A High-Performance Framework for LLM Fine-Tuning with 3.51x Speedup over Unsloth Paper • 2601.02609 • Published Jan 6 • 2
No One-Size-Fits-All: Building Systems For Translation to Bashkir, Kazakh, Kyrgyz, Tatar and Chuvash Using Synthetic And Original Data Paper • 2602.04442 • Published 4 days ago • 3
Falcon-H1-Tiny Collection A series of extremely small, yet powerful language models redefining capabilities at small scale • 22 items • Updated 24 days ago • 35
Reinforcement Learning from Meta-Evaluation: Aligning Language Models Without Ground-Truth Labels Paper • 2601.21268 • Published 11 days ago • 4
FROST: Filtering Reasoning Outliers with Attention for Efficient Reasoning Paper • 2601.19001 • Published 13 days ago • 4
Mechanistic Data Attribution: Tracing the Training Origins of Interpretable LLM Units Paper • 2601.21996 • Published 10 days ago • 4
ECO: Quantized Training without Full-Precision Master Weights Paper • 2601.22101 • Published 10 days ago • 6
KromHC: Manifold-Constrained Hyper-Connections with Kronecker-Product Residual Matrices Paper • 2601.21579 • Published 10 days ago • 6