Music Flamingo 🎵 • Running on A100 • 102 • Upload music or YouTube videos and ask detailed questions about them
ResearchRubrics: A Benchmark of Prompts and Rubrics For Evaluating Deep Research Agents Paper • 2511.07685 • Published Nov 10, 2025 • 9
Rubric-Based Benchmarking and Reinforcement Learning for Advancing LLM Instruction Following Paper • 2511.10507 • Published Nov 13, 2025 • 6
Too Good to be Bad: On the Failure of LLMs to Role-Play Villains Paper • 2511.04962 • Published Nov 7, 2025 • 54
The Smol Training Playbook 📚 • Running on CPU Upgrade • Featured • 2.85k • The secrets to building world-class LLMs