Article: Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries (8 days ago)
QED-Nano: Teaching a Tiny Model to Prove Hard Theorems. Who needs 1T parameters? Olympiad proofs with a 4B model