X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding Paper • 2606.02482 • Published 2 days ago • 23
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation Paper • 2605.25874 • Published 9 days ago • 101
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 21 days ago • 270
AnalogRetriever: Learning Cross-Modal Representations for Analog Circuit Retrieval Paper • 2604.23195 • Published Apr 25 • 3
Credal Concept Bottleneck Models for Epistemic-Aleatoric Uncertainty Decomposition Paper • 2604.24170 • Published Apr 27 • 2
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 242
ClawArena: Benchmarking AI Agents in Evolving Information Environments Paper • 2604.04202 • Published Apr 5 • 37
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 504
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise Paper • 2602.12783 • Published Feb 13 • 246