The Ultra-Scale Playbook • Space • 3.55k likes • The ultimate guide to training LLMs on large GPU clusters
The Smol Training Playbook • Space • 2.56k likes • The secrets to building world-class LLMs
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing • Paper • arXiv:2110.13900 • Published Oct 26, 2021 • 1 upvote
Gradio Hackathon Registration Winter 25 • Space • 179 likes • Gradio Agents & MCP Hackathon Winter 2025 Registration Page
Post • LiquidAI/LFM2-8B-A1B just dropped! 8.3B params with only 1.5B active per token
> Quality ≈ 3–4B dense, yet faster than Qwen3-1.7B
> MoE designed to run on phones/laptops (llama.cpp / vLLM)
> Pre-trained on 12T tokens → strong math/code/IF
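The post names llama.cpp and vLLM as the local inference paths. Below is a minimal sketch of loading the checkpoint with vLLM's offline API, assuming the model is supported there as the post claims; the prompt text and sampling values are illustrative, not taken from the post.

# Minimal sketch: offline inference for LiquidAI/LFM2-8B-A1B via vLLM,
# assuming the checkpoint is supported as the post claims.
from vllm import LLM, SamplingParams

# MoE checkpoint: 8.3B total parameters, ~1.5B active per token (per the post).
llm = LLM(model="LiquidAI/LFM2-8B-A1B")

# Illustrative sampling settings, not taken from the post.
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Explain mixture-of-experts routing in two sentences."], params)
print(outputs[0].outputs[0].text)

Because only ~1.5B parameters are active per token, decode-time compute is closer to a small dense model, which is what makes the phone/laptop deployment claim plausible.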
Moshi: a speech-text foundation model for real-time dialogue • Paper • arXiv:2410.00037 • Published Sep 17, 2024 • 8 upvotes