On Vacation 🏝️

Chengxuan Qian

Raymond-Qiancx

https://qiancx.com/

AI & ML interests

Vision-Language Models

Organizations

None yet

Recent Activity
upvoted a paper about 7 hours ago

LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation

Paper • 2605.18739 • Published 2 days ago • 93

liked a model 2 days ago

google/gemma-2-2b-it

Text Generation • 3B • Updated Aug 27, 2024 • 387k • 1.35k

upvoted 3 papers 3 days ago

MemEye: A Visual-Centric Evaluation Framework for Multimodal Agent Memory

Paper • 2605.15128 • Published 6 days ago • 60

Warp-as-History: Generalizable Camera-Controlled Video Generation from One Training Video

Paper • 2605.15182 • Published 6 days ago • 38

SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer

Paper • 2605.15178 • Published 6 days ago • 76

upvoted 3 papers 6 days ago

WorldReasonBench: Human-Aligned Stress Testing of Video Generators as Future World-State Predictors

Paper • 2605.10434 • Published 9 days ago • 30

World Model for Robot Learning: A Comprehensive Survey

Paper • 2605.00080 • Published 20 days ago • 16

World Action Models: The Next Frontier in Embodied AI

Paper • 2605.12090 • Published 8 days ago • 64

upvoted 2 papers 7 days ago

HumanNet: Scaling Human-centric Video Learning to One Million Hours

Paper • 2605.06747 • Published 13 days ago • 51

Qwen-Image-2.0 Technical Report

Paper • 2605.10730 • Published 9 days ago • 106

upvoted 3 papers 15 days ago

ExoActor: Exocentric Video Generation as Generalizable Interactive Humanoid Control

Paper • 2604.27711 • Published 20 days ago • 41

Map2World: Segment Map Conditioned Text to 3D World Generation

Paper • 2605.00781 • Published 19 days ago • 25

UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors

Paper • 2605.00658 • Published 19 days ago • 82

upvoted 2 papers 17 days ago

Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling

Paper • 2604.28185 • Published 20 days ago • 90

GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents

Paper • 2604.26752 • Published 21 days ago • 106

upvoted 5 papers 18 days ago

From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company

Paper • 2604.22446 • Published 26 days ago • 121

A Survey on LLM-based Conversational User Simulation

Paper • 2604.24977 • Published 23 days ago • 8

PhyCo: Learning Controllable Physical Priors for Generative Motion

Paper • 2604.28169 • Published 20 days ago • 13

MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons

Paper • 2604.28130 • Published 20 days ago • 22

Synthetic Computers at Scale for Long-Horizon Productivity Simulation

Paper • 2604.28181 • Published 20 days ago • 19