From Pixels to Words -- Towards Native Vision-Language Primitives at Scale
Haiwen Diao
Paranioar
AI & ML interests
Vision-and-Language, Parameter-efficient Transfer Learning, Multi-modal Large Language Models
Recent Activity
upvoted a paper about 2 hours ago: HSImul3R: Physics-in-the-Loop Reconstruction of Simulation-Ready Human-Scene Interactions
updated a collection 4 days ago: NEO1_5