Grounding Computer Use Agents on Human Demonstrations • Paper 2511.07332 • Published 26 days ago • 104
AU-Harness: An Open-Source Toolkit for Holistic Evaluation of Audio LLMs • Paper 2509.08031 • Published Sep 9 • 21