Unified Multimodal Autoregressive Modeling with Shared Context-Visual Tokenizer is Key to Unification Paper • 2606.18249 • Published 5 days ago • 14
UniAR Collection Model checkpoints for UniAR: Unified Multimodal Autoregressive Modeling with Shared Context-Visual Tokenizer is Key to Unification. • 2 items • Updated 4 days ago
ARM: An AutoRegressive Large Multimodal Model with Unified Discrete Representations Paper • 2606.11188 • Published 12 days ago • 26
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments Paper • 2605.30280 • Published 24 days ago • 146
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published May 12 • 194
CaTok: Taming Mean Flows for One-Dimensional Causal Image Tokenization Paper • 2603.06449 • Published Mar 6 • 6
RoboOmni: Proactive Robot Manipulation in Omni-modal Context Paper • 2510.23763 • Published Oct 27, 2025 • 62
LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models Paper • 2510.13626 • Published Oct 15, 2025 • 48
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning Paper • 2508.20751 • Published Aug 28, 2025 • 90
Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment Paper • 2505.18600 • Published May 24, 2025 • 49
CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models Paper • 2505.12504 • Published May 18, 2025 • 24