
Yusu Qian

YusuQian

AI & ML interests

multimodal llm research

Organizations

Apple

upvoted 3 papers 3 months ago

PRISM-Bench: A Benchmark of Puzzle-Based Visual Tasks with CoT Error Detection

Paper • 2510.23594 • Published Oct 27, 2025 • 6

Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing

Paper • 2510.19808 • Published Oct 22, 2025 • 30

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17, 2025 • 91
upvoted a paper 8 months ago

GIE-Bench: Towards Grounded Evaluation for Text-Guided Image Editing

Paper • 2505.11493 • Published May 16, 2025 • 3
upvoted a paper over 1 year ago

Understanding Alignment in Multimodal LLMs: A Comprehensive Study

Paper • 2407.02477 • Published Jul 2, 2024 • 24
upvoted 2 papers almost 2 years ago

How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts

Paper • 2402.13220 • Published Feb 20, 2024 • 14

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Paper • 2403.09611 • Published Mar 14, 2024 • 129