ROOT: Robust Orthogonalized Optimizer for Neural Network Training • Paper • 2511.20626 • Published 14 days ago • 169
DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation • Paper • 2511.06307 • Published about 1 month ago • 50
π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models • Paper • 2510.25889 • Published Oct 29 • 64
Depth Anything 3: Recovering the Visual Space from Any Views • Paper • 2511.10647 • Published 26 days ago • 93
Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models • Paper • 2511.08577 • Published 28 days ago • 104
P1: Mastering Physics Olympiads with Reinforcement Learning • Paper • 2511.13612 • Published 22 days ago • 132
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance • Paper • 2511.13254 • Published 22 days ago • 134