Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training Paper • 2603.12255 • Published 3 days ago • 70
BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published 28 days ago • 52
Qwen/Qwen3.5-397B-A17B Image-Text-to-Text • 403B • Updated about 5 hours ago • 1.73M • 1.33k
fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA Image-to-Image • Updated Jan 7 • 53.4k • 1.13k
PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence Paper • 2512.16793 • Published Dec 18, 2025 • 75
LongVie 2: Multimodal Controllable Ultra-Long Video World Model Paper • 2512.13604 • Published Dec 15, 2025 • 74