Talk2Move: Reinforcement Learning for Text-Instructed Object-Level Geometric Transformation in Scenes
Paper • 2601.02356
ARC focuses on computer vision, speech, and natural language processing, including speech and video generation, enhancement, retrieval, and understanding, as well as AutoML. Guided by research developments and industry trends, ARC pursues exploration, innovation, and technological breakthroughs.
TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
ARC-Chapter: Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries