A benchmark multimodal oro-dental dataset for large vision-language models Paper • 2511.04948 • Published Nov 7
PsOCR: Benchmarking Large Multimodal Models for Optical Character Recognition in Low-resource Pashto Language Paper • 2505.10055 • Published May 15
Transformer-based Spatial Grounding: A Comprehensive Survey Paper • 2507.12739 • Published Jul 17