Defeating the trainer-generator precision mismatch in TRL 🎯 Space • Running • 13
LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories Paper • 2604.15311 • Published 8 days ago • 12
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 9 days ago • 151
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 10 days ago • 85
How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs Article • 16 days ago • 59
Welcome Gemma 4: Frontier multimodal intelligence on device Article • 22 days ago • 876
GPT-1900 Collection Pre-1900 LLMs for physics reasoning. RL models are physics-only; use the SFT model for general chat. Tune temperature (0.6-0.7). • 11 items • Updated 21 days ago • 6