One speech model with seven voices, combined with multimodal capabilities for vision tasks. It performs vision (image-text) to audio inference with Qwen2.5-VL + VibeVoice-Realtime-0.5B. Vision to VibeVoice (EN): the demo is live. 🗣️🔥

🤗 Vision-to-VibeVoice-en [Demo]: prithivMLmods/Vision-to-VibeVoice-en
✨ Collection: https://huggingface.co/collections/prithivMLmods/multimodal-implementations
✨ Speech [VibeVoice-Realtime-0.5B]: microsoft/VibeVoice-Realtime-0.5B
✨ Vision [Qwen2.5-VL]: Qwen/Qwen2.5-VL-7B-Instruct

To learn more, visit the app page or the respective model pages!
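
The two-stage flow described in this post (image and prompt in, text out of the vision model, then text into the speech model) can be sketched roughly as below. This is only an illustration under stated assumptions: `VisionToSpeech`, `describe`, `speak`, and the stand-in functions are hypothetical names, not the demo's code or the actual Qwen2.5-VL or VibeVoice APIs, which the post does not show.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; the real demo's interfaces are not shown in the post.
DescribeFn = Callable[[bytes, str], str]  # (image bytes, text prompt) -> generated text
SpeakFn = Callable[[str, str], bytes]     # (text, voice name) -> audio bytes


@dataclass
class VisionToSpeech:
    """Chains a vision-language stage into a text-to-speech stage."""
    describe: DescribeFn
    speak: SpeakFn

    def run(self, image: bytes, prompt: str, voice: str) -> bytes:
        # Stage 1: the vision model turns image + prompt into text.
        text = self.describe(image, prompt)
        if not text.strip():
            raise ValueError("vision stage returned empty text")
        # Stage 2: the speech model renders that text in the chosen voice.
        return self.speak(text, voice)


# Stand-in stages so the sketch runs without downloading any model.
def fake_describe(image: bytes, prompt: str) -> str:
    return f"{prompt}: an image of {len(image)} bytes"


def fake_speak(text: str, voice: str) -> bytes:
    return f"[{voice}] {text}".encode("utf-8")


pipeline = VisionToSpeech(describe=fake_describe, speak=fake_speak)
audio = pipeline.run(b"\x00\x01\x02", "Describe", voice="voice-1")
```

In a real setup, the two callables would wrap the actual model calls; keeping them as plain functions means either stage can be swapped without touching the other.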

liked 2 Spaces 2 months ago

Z Image Turbo 🏃 (1.68k): Generate realistic images from text descriptions

DeepSite v3 🐳 (16.4k): Generate any application by Vibe Coding

reacted to MonsterMMORPG's post with 🚀 2 months ago

FLUX 2 vs FLUX SRPO, New FLUX Training Kohya SS GUI Premium App With Presets & Features : https://youtu.be/RQHmyJVOHXo

FLUX 2 has been published, and I have compared it with the very best FLUX base model, FLUX SRPO. Moreover, we have taken our FLUX training app and presets to the next level: massive speed-up gains with zero quality loss and lots of new features. I will show all of the new features in the new SECourses Kohya SS GUI Premium app and compare results from a FLUX SRPO trained model with FLUX 2.

https://youtu.be/RQHmyJVOHXo

Get the SECourses Premium Kohya Trainer DreamBooth / Fine Tuning : [ https://www.patreon.com/posts/Kohya-FLUX-DreamBooth-Trainer-App-112099700 ]

Get the SECourses Premium Kohya Trainer LoRA : [ https://www.patreon.com/posts/Kohya-FLUX-LoRA-Trainer-App-110879657 ]

DreamBooth Training Tutorial: [ https://www.youtube.com/watch?v=FvpWy1x5etM ]

LoRA Training Tutorial: [ https://www.youtube.com/watch?v=nySGu12Y05k ]

Qwen Image Realism Tutorial: [ https://youtu.be/XWzZ2wnzNuQ ]

Join our Discord Community: [ https://discord.com/servers/secourses-Discord-772774097734074388 ]

⏱️ Video Chapters:
0:00 Introduction to New FLUX Training Improvements and Local Training Showcase
0:24 Understanding FLUX SRPO Model: High Realism with Minimal VRAM Requirements
0:38 Updated Configurations for Training Realism on 6GB VRAM GPUs Locally
1:07 FLUX 2 Announcement and Setting Up Comparisons with BFL Playground
1:45 FLUX 2 Dev Model Technical Specs: 32 Billion Parameters and Hardware Challenges
2:11 Overview of Changes in SECourses Premium Kohya Trainer Version 35
2:46 Development Updates: GUI Improvements and Full Torch Compile Support
3:13 LoRA Presets Update: VRAM Optimization and Speed Improvements via Torch Compile
3:27 Introducing On-the-Fly FP8 Scaled LoRA Training Support
3:42 Quality Comparison Analysis: BF16 vs FP8 Scaled Weights LoRA
4:24 VRAM Usage and Speed Analysis: Block Swap Count Reduction with FP8 Scaled
....