OpenAgents: An Open Platform for Language Agents in the Wild Paper • 2310.10634 • Published Oct 16, 2023 • 9
DocBank: A Benchmark Dataset for Document Layout Analysis Paper • 2006.01038 • Published Jun 1, 2020
LayoutReader: Pre-training of Text and Layout for Reading Order Detection Paper • 2108.11591 • Published Aug 26, 2021 • 1
In-Context Learning with Many Demonstration Examples Paper • 2302.04931 • Published Feb 9, 2023 • 2
VideoAgentTrek: Computer Use Pretraining from Unlabeled Videos Paper • 2510.19488 • Published Oct 22 • 19
Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents Paper • 2510.24702 • Published Oct 28 • 27
AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant Paper • 2410.18603 • Published Oct 24, 2024 • 32
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows Paper • 2505.19897 • Published May 26 • 104
xbench: Tracking Agents Productivity Scaling with Profession-Aligned Real-World Evaluations Paper • 2506.13651 • Published Jun 16 • 8
MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents Paper • 2507.19478 • Published Jul 25 • 31
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis Paper • 2505.13227 • Published May 19 • 45
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14 • 97
Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments Paper • 2501.10893 • Published Jan 18 • 26
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials Paper • 2412.09605 • Published Dec 12, 2024 • 30
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction Paper • 2412.04454 • Published Dec 5, 2024 • 72