LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation Paper • 2605.18739 • Published 2 days ago • 93
MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory Paper • 2605.15128 • Published 6 days ago • 60
Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video Paper • 2605.15182 • Published 6 days ago • 38
SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer Paper • 2605.15178 • Published 6 days ago • 76
WorldReasonBench: Human-Aligned Stress Testing of Video Generators as Future World-State Predictors Paper • 2605.10434 • Published 9 days ago • 30
World Model for Robot Learning: A Comprehensive Survey Paper • 2605.00080 • Published 20 days ago • 16
HumanNet: Scaling Human-centric Video Learning to One Million Hours Paper • 2605.06747 • Published 13 days ago • 51
ExoActor: Exocentric Video Generation as Generalizable Interactive Humanoid Control Paper • 2604.27711 • Published 20 days ago • 41
Map2World: Segment Map Conditioned Text to 3D World Generation Paper • 2605.00781 • Published 19 days ago • 25
UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors Paper • 2605.00658 • Published 19 days ago • 82
Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling Paper • 2604.28185 • Published 20 days ago • 90
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published 21 days ago • 106
From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company Paper • 2604.22446 • Published 26 days ago • 121
PhyCo: Learning Controllable Physical Priors for Generative Motion Paper • 2604.28169 • Published 20 days ago • 13
MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons Paper • 2604.28130 • Published 20 days ago • 22
Synthetic Computers at Scale for Long-Horizon Productivity Simulation Paper • 2604.28181 • Published 20 days ago • 19