Enrich VLMs’ vision-centric reasoning capabilities via Chain-of-Visual-Thought!
YM Qin
Wakals
AI & ML interests
Computer Vision, Vision-language Model, Generative Model
Recent Activity
upvoted a paper about 1 month ago
Render-of-Thought: Rendering Textual Chain-of-Thought as Images for Visual Latent Reasoning
upvoted a paper about 2 months ago
Vision-as-Inverse-Graphics Agent via Interleaved Multimodal Reasoning
upvoted a collection 2 months ago
Qwen3.5
Organizations
None yet