Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published 26 days ago • 123
mistralai/Voxtral-Mini-4B-Realtime-2602 Automatic Speech Recognition • 4B • Updated Mar 11 • 904k • 826