HypeNet Collection The models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts • 2 items • Updated 24 days ago
HypeNet Collection The models for the paper: Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts • 2 items • Updated 24 days ago
view article Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 898
Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts Paper • 2601.22156 • Published Jan 29 • 14
Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts Paper • 2601.22156 • Published Jan 29 • 14
Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts Paper • 2601.22156 • Published Jan 29 • 14
Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30, 2025 • 133
StateX: Enhancing RNN Recall via Post-training State Expansion Paper • 2509.22630 • Published Sep 26, 2025 • 4
StateX: Enhancing RNN Recall via Post-training State Expansion Paper • 2509.22630 • Published Sep 26, 2025 • 4
StateX: Enhancing RNN Recall via Post-training State Expansion Paper • 2509.22630 • Published Sep 26, 2025 • 4 • 2