AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios Paper • 2602.23166 • Published 22 days ago • 44
SimpleOCR: Rendering Visualized Questions to Teach MLLMs to Read Paper • 2602.22426 • Published 23 days ago
Reliable and Responsible Foundation Models: A Comprehensive Survey Paper • 2602.08145 • Published Feb 4 • 8
SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning Paper • 2602.08234 • Published Feb 9 • 70
MedVerse: Efficient and Reliable Medical Reasoning via DAG-Structured Parallel Execution Paper • 2602.07529 • Published Feb 7
Skywork-R1V4: Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch Paper • 2512.02395 • Published Dec 2, 2025 • 50
SimWorld: An Open-ended Realistic Simulator for Autonomous Agents in Physical and Social Worlds Paper • 2512.01078 • Published Nov 30, 2025 • 34
Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning Paper • 2511.19900 • Published Nov 25, 2025 • 49
Mimicking the Physicist's Eye: A VLM-centric Approach for Physics Formula Discovery Paper • 2508.17380 • Published Aug 24, 2025 • 7
Position: The Hidden Costs and Measurement Gaps of Reinforcement Learning with Verifiable Rewards Paper • 2509.21882 • Published Sep 26, 2025
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning Paper • 2511.16043 • Published Nov 20, 2025 • 110
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent Paper • 2508.05748 • Published Aug 7, 2025 • 142
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving Paper • 2507.06229 • Published Jul 8, 2025 • 76
MMedAgent-RL: Optimizing Multi-Agent Collaboration for Multimodal Medical Reasoning Paper • 2506.00555 • Published May 31, 2025 • 1
PhysUniBench: An Undergraduate-Level Physics Reasoning Benchmark for Multimodal Models Paper • 2506.17667 • Published Jun 21, 2025 • 4