Multimodal Understanding - a VincentWei1021 Collection

Multimodal Understanding

updated Oct 13, 2025

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Paper • 2501.05874 • Published Jan 10, 2025 • 75

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26, 2025 • 140