IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST
Enterprise AI and ML, Foundation Models, Responsible AI
VAREX: A Benchmark for Multi-Modal Structured Extraction from Documents
Look Where It Matters: High-Resolution Crops Retrieval for Efficient VLMs