Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation
- VisPer-LM
  Visualize image depth, segmentation, and generation
- shi-labs/OLA-VLM-CLIP-ViT-Llama3-8b
  Image-Text-to-Text • 8B
- shi-labs/OLA-VLM-CLIP-ConvNeXT-Phi3-4k-mini
  Image-Text-to-Text • 5B
- shi-labs/OLA-VLM-CLIP-ConvNeXT-Llama3-8b
  Image-Text-to-Text • 9B