Emu3.5: Native Multimodal Models are World Learners Paper • 2510.26583 • Published Oct 30, 2025 • 108
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 3 days ago • 672
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22, 2024 • 133
GenCA: A Text-conditioned Generative Model for Realistic and Drivable Codec Avatars Paper • 2408.13674 • Published Aug 24, 2024 • 18