Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published 19 days ago • 134
timm DINOv3 Collection Meta AI's DINOv3 weights in timm. ViTs with `qkvb` have a zero QV bias present; otherwise, bias is disabled. QKV biases are all 0 in the original weights. • 18 items • Updated Sep 19 • 22
Black-Box On-Policy Distillation of Large Language Models Paper • 2511.10643 • Published 23 days ago • 46
π_RL: Online RL Fine-tuning for Flow-based Vision-Language-Action Models Paper • 2510.25889 • Published Oct 29 • 64
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning Paper • 2509.08755 • Published Sep 10 • 56
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory Paper • 2508.09736 • Published Aug 13 • 57
Hybrid Linear Attention Research Collection All 1.3B & 340M hybrid linear-attention experiments. • 62 items • Updated Sep 11 • 12
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities Paper • 2507.13158 • Published Jul 17 • 23
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation Paper • 2507.02608 • Published Jul 3 • 21
FaSTA*: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing Paper • 2506.20911 • Published Jun 26 • 41
A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA Paper • 2312.03732 • Published Nov 28, 2023 • 11
MoTE: Mixture of Ternary Experts for Memory-efficient Large Multimodal Models Paper • 2506.14435 • Published Jun 17 • 7