Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem • Paper • 2512.24873 • Published 2 days ago • 36
Part II: ROLL Flash -- Accelerating RLVR and Agentic Training with Asynchrony • Paper • 2510.11345 • Published Oct 13, 2025 • 17
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines • Paper • 2502.14739 • Published Feb 20, 2025 • 106
ProgCo: Program Helps Self-Correction of Large Language Models • Paper • 2501.01264 • Published Jan 2, 2025 • 26
MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models • Paper • 2410.11710 • Published Oct 15, 2024 • 20