ColorAgent: Building A Robust, Personalized, and Interactive OS Agent Paper • 2510.19386 • Published Oct 22 • 8
DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents Paper • 2510.19336 • Published Oct 22 • 16
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization Paper • 2504.00999 • Published Apr 1 • 95
Improved Visual-Spatial Reasoning via R1-Zero-Like Training Paper • 2504.00883 • Published Apr 1 • 66