Think, Act, Build: An Agentic Framework with Vision Language Models for Zero-Shot 3D Visual Grounding Paper • 2604.00528 • Published 5 days ago • 8
Learn2Fold: Structured Origami Generation with World Model Planning Paper • 2603.29585 • Published Feb 2 • 15
RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation Paper • 2603.25804 • Published 10 days ago • 28
Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis Paper • 2603.29620 • Published 5 days ago • 45
TAPS: Task Aware Proposal Distributions for Speculative Sampling Paper • 2603.27027 • Published 9 days ago • 140
Cohere Multilingual ASR 🎙 Space • Transcribe audio clips to text in many languages • Running on CPU Upgrade • Featured • 90
UniVideo: Unified Understanding, Generation, and Editing for Videos Paper • 2510.08377 • Published Oct 9, 2025 • 81
AutoDeco Collection Chat with truly end-to-end LLMs with AutoDeco heads • 8 items • Updated Dec 20, 2025 • 6
Australian-made LLM beats OpenAI and Google at legal retrieval Article • Oct 23, 2025 • 26
Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention Paper • 2507.17745 • Published Jul 23, 2025 • 36
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact Paper • 2507.00951 • Published Jul 1, 2025 • 24