Search Self-play: Pushing the Frontier of Agent Capability without Supervision • Paper • 2510.18821 • Published Oct 21 • 17
Scaling Language-Centric Omnimodal Representation Learning • Paper • 2510.11693 • Published Oct 13 • 100
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification • Paper • 2508.05629 • Published Aug 7 • 180