Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives • Paper • 2601.20833 • Published Jan 28 • 183
HaluMem: Evaluating Hallucinations in Memory Systems of Agents • Paper • 2511.03506 • Published Nov 5, 2025 • 95
BixBench: a Comprehensive Benchmark for LLM-based Agents in Computational Biology • Paper • 2503.00096 • Published Feb 28, 2025 • 3