Reinforcement Learning Foundations for Deep Research Systems: A Survey • Paper • 2509.06733 • Published Sep 8 • 32
DivMerge: A divergence-based model merging method for multi-tasking • Paper • 2509.02108 • Published Sep 2 • 25
MolmoAct: Action Reasoning Models that can Reason in Space • Paper • 2508.07917 • Published Aug 11 • 44
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development • Paper • 2506.05010 • Published Jun 5 • 79
Absolute Zero: Reinforced Self-play Reasoning with Zero Data • Paper • 2505.03335 • Published May 6 • 188