Images in Sentences: Scaling Interleaved Instructions for Unified Visual Generation • Paper • 2605.12305 • Published 22 days ago • 2
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details • Paper • 2604.06870 • Published Apr 8 • 43
docling-project/SmolDocling-256M-preview • Image-Text-to-Text • 0.3B • Updated Sep 17, 2025 • 31.7k • 1.61k