OpenLearnLM/special-r1-deepseek-qwen3-8b-sped-adaptive-think-noreward Text Generation • 8B • Updated 2 days ago • 185
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published Oct 22, 2025 • 117