Ovis2.5 Collection Our next-generation MLLMs for native-resolution vision and advanced reasoning • 5 items • Updated Aug 19 • 57
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13 • 173
My MCP-ready spaces [WIP] Collection Progressive list of MCP server ready trending spaces maintained by fffiloni • 24 items • Updated Aug 27 • 9
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community +1 Apr 15, 2024 • 190
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving Paper • 2404.16771 • Published Apr 25, 2024 • 19
Idefics2 🐶 Collection Idefics2-8B is a foundation vision-language model. In this collection, you will find the models, datasets and demo related to its creation. • 11 items • Updated May 6, 2024 • 92
HF-curated models available on Workers AI Collection A collection of models curated with Hugging Face that can be run on Cloudflare's Workers AI serverless inference platform. • 15 items • Updated Apr 2, 2024 • 52
StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control Paper • 2403.09055 • Published Mar 14, 2024 • 27
Foundation Models for Vision 🧩 Collection Foundation models for computer vision. • 24 items • Updated Mar 11, 2024 • 20
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers Paper • 2402.19479 • Published Feb 29, 2024 • 35
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Jul 10 • 345