MultiWorld: Scalable Multi-Agent Multi-View Video World Models Paper • 2604.18564 • Published 8 days ago • 43
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 13 days ago • 115
MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data Paper • 2603.25319 • Published Mar 26 • 32
Fair splits flip the leaderboard: CHANRG reveals limited generalization in RNA secondary-structure prediction Paper • 2603.22330 • Published Mar 20 • 6
Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens Paper • 2603.19232 • Published Mar 19 • 33
EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation Paper • 2603.12267 • Published Mar 12 • 13
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow Paper • 2410.07303 • Published Oct 9, 2024 • 19
OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens Paper • 2603.02138 • Published Mar 2 • 151
Video Reality Test: Can AI-Generated ASMR Videos fool VLMs and Humans? Paper • 2512.13281 • Published Dec 15, 2025 • 65
Ground Slow, Move Fast: A Dual-System Foundation Model for Generalizable Vision-and-Language Navigation Paper • 2512.08186 • Published Dec 9, 2025 • 23
PhysX-Anything: Simulation-Ready Physical 3D Assets from Single Image Paper • 2511.13648 • Published Nov 17, 2025 • 53
Part-X-MLLM: Part-aware 3D Multimodal Large Language Model Paper • 2511.13647 • Published Nov 17, 2025 • 72
OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes Paper • 2510.26800 • Published Oct 30, 2025 • 22
Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published Oct 27, 2025 • 181
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks Paper • 2510.15019 • Published Oct 16, 2025 • 65
CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images Paper • 2510.11718 • Published Oct 13, 2025 • 14
LongLive: Real-time Interactive Long Video Generation Paper • 2509.22622 • Published Sep 26, 2025 • 189