AudioMosaic Collection ICML 2026 AudioMosaic: Contrastive Masked Audio Representation Learning • 15 items • Updated 26 days ago • 3
MOSS-Audio Collection An open-source audio understanding model supporting speech recognition, environmental sound analysis, music understanding, time-aware QA, and complex • 7 items • Updated May 2 • 66
Cosmos3 Collection Omnimodal World Models for Physical AI • 15 items • Updated about 22 hours ago • 83
gliner2 family Collection GLiNER2 extends the original GLiNER architecture to support multi-task information extraction with a schema-driven interface. • 7 items • Updated 20 days ago • 48
CubePart: An Open-Vocabulary Part-Controllable 3D Generator Paper • 2605.28763 • Published 10 days ago • 14
GEM: Generative Supervision Helps Embodied Intelligence Paper • 2605.28548 • Published 10 days ago • 41
InstructSAM: Segment Any Instance with Any Instructions Paper • 2605.26102 • Published 12 days ago • 17
ControlLight: Towards Controllable, Consistent, and Generalizable Low-Light Enhancement Paper • 2605.25569 • Published 12 days ago • 21
Lens: Rethinking Training Efficiency for Foundational Text-to-Image Models Paper • 2605.21573 • Published 17 days ago • 108
TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction Paper • 2605.26115 • Published 12 days ago • 51
MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU Paper • 2604.05091 • Published Apr 6 • 47
RADIO-ViPE: Online Tightly Coupled Multi-Modal Fusion for Open-Vocabulary Semantic SLAM in Dynamic Environments Paper • 2604.26067 • Published Apr 28 • 74
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published Apr 15 • 123
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published Apr 8 • 189
DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models Paper • 2603.26164 • Published Mar 27 • 366
Code-as-Room: Generating 3D Rooms from Top-Down View Images via Agentic Code Synthesis Paper • 2605.18451 • Published 19 days ago • 41
PhysX-Omni: Unified Simulation-Ready Physical 3D Generation for Rigid, Deformable, and Articulated Objects Paper • 2605.21572 • Published 17 days ago • 52
DexJoCo: A Benchmark and Toolkit for Task-Oriented Dexterous Manipulation on MuJoCo Paper • 2605.16257 • Published 22 days ago • 53