E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models • Paper • 2601.00423 • Published 22 days ago • 9
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss • Paper • 2512.23447 • Published 25 days ago • 95
FineWeb: decanting the web for the finest text data at scale 🍷 • Running • Featured • 1.27k • Generate high-quality text data for LLMs using FineWeb