Model Card for Omni DeepSeek

Model Training Details

Training Steps

Model Description

I trained a text-image-video chain-of-thought (CoT) LLM in three stages: image-text space alignment, CoT-SFT warm-up, and GRPO training. I've observed that Chain-of-Thought reasoning lets the model give more accurate answers, and that it does particularly well at following complex instructions and producing formatted output.

  • Developed by: princepride
  • Model type: text-image-video CoT LLM
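
To make the GRPO stage named in the description more concrete, here is a minimal sketch of its core idea: several completions are sampled for one prompt, each is scored, and each score is normalized against its own group. The `<think>`/`<answer>` template and the `format_reward` function are hypothetical illustrations, since this card does not state which tags or rewards were used in training. Population standard deviation is used here for simplicity; some implementations use the sample standard deviation.

```python
import re
from statistics import mean, pstdev

# Hypothetical output template: the tags actually used in training
# are not stated in this model card.
_TEMPLATE = re.compile(
    r"^<think>.+?</think>\s*<answer>.+?</answer>$", re.DOTALL
)


def format_reward(completion: str) -> float:
    """Return 1.0 if the completion follows the assumed template, else 0.0."""
    return 1.0 if _TEMPLATE.match(completion.strip()) else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every completion gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group of sampled completions for a single (made-up) prompt
completions = [
    "<think>count the frames</think><answer>3</answer>",
    "3",
    "<think>unclear</think><answer>4</answer>",
    "<answer>3</answer>",
]
rewards = [format_reward(c) for c in completions]
advantages = group_relative_advantages(rewards)

for c, a in zip(completions, advantages):
    print(f"{a:+.3f}  {c}")
```

Completions that match the template end up with a positive advantage and the others with a negative one, which is the signal the policy update would then use. A real reward would likely also score answer correctness, not only format.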

Model Sources

  • Demo:
  • It can analyze a video with chain-of-thought reasoning to produce a better result
  • Input:
  • Click to watch the demo
  • Output:
  • Click to watch the demo
  • Thanks to chain-of-thought reasoning, it follows complex instructions more reliably and generates accurately formatted output:
  • Input:
  • Click to watch the demo
  • Output:
  • Click to watch the demo

Uses (at least 24 GB of VRAM is required)

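# Install the Python helpers and download the ngrok binary
!pip install --ignore-installed flask flask-ngrok
!wget https://bin.equinox.io/c/bNyj1mQVY4c/ngrok-v3-stable-linux-amd64.tgz
# Update system packages and install p7zip
!apt update && apt upgrade -y
!apt-get install p7zip-full -y
# Unpack ngrok and register your auth token
!tar -xvzf ngrok-v3-stable-linux-amd64.tgz
!./ngrok authtoken YOUR_NGROK_TOKEN
# Install the project dependencies
!pip install -r requirements.txt
# Run these two in separate terminals: streamlit blocks while serving,
# and 8501 is Streamlit's default port, which ngrok then exposes
streamlit run app.py
./ngrok http 8501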
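
Since the app needs at least 24 GB of VRAM, it can help to check the GPU before launching. The following is a rough sketch, not part of the original project: the helper names are hypothetical, and the threshold of 24 × 1024 MiB is an assumption (some cards marketed as 24 GB report slightly less, so you may want to lower it). It relies on `nvidia-smi` being on the PATH.

```python
import shutil
import subprocess

# Assumption: "24 GB" is interpreted as 24 * 1024 MiB.
REQUIRED_MIB = 24 * 1024


def parse_total_memory_mib(smi_output: str) -> list:
    """Parse one integer (MiB) per line, skipping blank or non-numeric lines."""
    values = []
    for line in smi_output.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values


def query_total_memory_mib() -> list:
    """Ask nvidia-smi for each GPU's total memory; return [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    return parse_total_memory_mib(result.stdout)


def meets_requirement(totals_mib, required_mib=REQUIRED_MIB) -> bool:
    """True if at least one GPU has the required amount of memory."""
    return any(t >= required_mib for t in totals_mib)


if __name__ == "__main__":
    totals = query_total_memory_mib()
    if meets_requirement(totals):
        print("GPU memory looks sufficient:", totals, "MiB")
    else:
        print("No GPU with enough memory detected:", totals)
```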

  • Model size: 8B params
  • Tensor type: BF16
  • Weights format: Safetensors