In-Context Reinforcement Learning for Tool Use in Large Language Models • Paper • 2603.08068 • Published 7 days ago • 28
Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning • Paper • 2603.04597 • Published 11 days ago • 157
Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents • Paper • 2505.22954 • Published May 29, 2025 • 15
Learning to Continually Learn via Meta-learning Agentic Memory Designs • Paper • 2602.07755 • Published Feb 8 • 7
Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions • Paper • 1901.01753 • Published Jan 7, 2019 • 2
Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning • Paper • 2509.24372 • Published Sep 29, 2025 • 12
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights • Paper • 2603.12228 • Published 3 days ago • 6
ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning • Paper • 2603.05863 • Published 10 days ago • 4
Test-Driven AI Agent Definition (TDAD): Compiling Tool-Using Agents from Behavioral Specifications • Paper • 2603.08806 • Published 6 days ago • 7
Lost in Backpropagation: The LM Head is a Gradient Bottleneck • Paper • 2603.10145 • Published 5 days ago • 7
TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment • Paper • 2602.23068 • Published 17 days ago • 6
Reasoning Models Struggle to Control their Chains of Thought • Paper • 2603.05706 • Published 10 days ago • 27
Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains • Paper • 2507.17746 • Published Jul 23, 2025 • 5
MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents • Paper • 2602.02474 • Published Feb 2 • 59
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text • Paper • 2601.22975 • Published Jan 30 • 109
PaperBanana: Automating Academic Illustration for AI Scientists • Paper • 2601.23265 • Published Jan 30 • 217