On the Optimal Reasoning Length for RL-Trained Language Models Paper • 2602.09591 • Published 16 days ago • 5
mmnga-o/NVIDIA-Nemotron-Nano-9B-v2-Japanese-gguf Text Generation • 9B • Updated 8 days ago • 10.8k • 48
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 Dec 1, 2025 • 302