Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding Paper • 2601.10611 • Published Jan 15 • 28
OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models Paper • 2601.21639 • Published 23 days ago • 50
Qwen3-TTS Demo 🎙 Generate custom speech from text, voice descriptions, or samples • Running on Zero • Featured • 1.48k
LightOnOCR-2 🦉 Collection LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family • 12 items • Updated 1 day ago • 22