Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills Paper • 2603.25158 • Published 5 days ago • 34
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models Paper • 2603.25716 • Published 4 days ago • 135
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference Paper • 2603.25730 • Published 4 days ago • 39
RealRestorer: Towards Generalizable Real-World Image Restoration with Large-Scale Image Editing Models Paper • 2603.25502 • Published 5 days ago • 50
PixelSmile: Toward Fine-Grained Facial Expression Editing Paper • 2603.25728 • Published 4 days ago • 114
WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG Paper • 2603.23497 • Published 6 days ago • 86
MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding Paper • 2603.22458 • Published 7 days ago • 130
GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents Paper • 2603.24329 • Published 6 days ago • 20
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs? Paper • 2603.24472 • Published 5 days ago • 41
EVA: Efficient Reinforcement Learning for End-to-End Video Agent Paper • 2603.22918 • Published 7 days ago • 40
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 5 days ago • 89
LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation Paper • 2603.20192 • Published 10 days ago • 22
HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning Paper • 2603.17024 • Published 13 days ago • 106
EvoClaw: Evaluating AI Agents on Continuous Software Evolution Paper • 2603.13428 • Published 18 days ago • 20
HSImul3R: Physics-in-the-Loop Reconstruction of Simulation-Ready Human-Scene Interactions Paper • 2603.15612 • Published 14 days ago • 151