OmniAlpha: A Sequence-to-Sequence Framework for Unified Multi-Task RGBA Generation Paper • 2511.20211 • Published 13 days ago • 12
UI2Code^N: A Visual Language Model for Test-Time Scalable Interactive UI-to-Code Generation Paper • 2511.08195 • Published 27 days ago • 30
Glyph: Scaling Context Windows via Visual-Text Compression Paper • 2510.17800 • Published Oct 20 • 67
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 240
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training Paper • 2503.08525 • Published Mar 11 • 17
ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation Paper • 2304.05977 • Published Apr 12, 2023 • 3
VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation Paper • 2412.21059 • Published Dec 30, 2024 • 18