RaBiT: Residual-Aware Binarization Training for Accurate and Efficient LLMs Paper • 2602.05367 • Published 10 days ago • 7 upvotes
DFlash: Block Diffusion for Flash Speculative Decoding Paper • 2602.06036 • Published 10 days ago • 41 upvotes
POP: Prefill-Only Pruning for Efficient Large Model Inference Paper • 2602.03295 • Published 12 days ago • 4 upvotes
Fairy2i: Training Complex LLMs from Real LLMs with All Parameters in {pm 1, pm i} Paper • 2512.02901 • Published Dec 2, 2025 • 6 upvotes
Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models Paper • 2511.23319 • Published Nov 28, 2025 • 24 upvotes
Metis: Training Large Language Models with Advanced Low-Bit Quantization Paper • 2509.00404 • Published Aug 30, 2025 • 7 upvotes