**VideoSSR-8B** is a multimodal large language model (MLLM) fine-tuned from `Qwen-VL-8B-Instruct` for enhanced video understanding. It is trained with the **Video Self-Supervised Reinforcement Learning (VideoSSR)** framework, which generates its own high-quality training data directly from videos, so no manual annotation is required.

- **Base Model:** `Qwen-VL-8B-Instruct`
- **Paper:** [VideoSSR: Video Self-Supervised Reinforcement Learning](https://arxiv.org/abs/2511.06281)
- **Code:** [https://github.com/lcqysl/VideoSSR](https://github.com/lcqysl/VideoSSR)
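
To illustrate how training data with verifiable answers can be derived from a video alone, here is a minimal sketch of one generic self-supervised pretext task, clip reordering. This card does not describe the pretext tasks VideoSSR actually uses, so the task and every name here (`PretextSample`, `make_ordering_sample`, `reward`) are hypothetical and chosen only for illustration; see the paper and code linked above for the real method. The video is stood in for by a list of frame indices.

```python
from __future__ import annotations

import random
from dataclasses import dataclass


@dataclass
class PretextSample:
    # Frame indices of each clip, in the shuffled order shown to the model.
    shuffled_clips: list[list[int]]
    # answer[k] is the position in shuffled_clips of the k-th original clip.
    answer: list[int]


def make_ordering_sample(num_frames: int, num_clips: int, rng: random.Random) -> PretextSample:
    """Split a video into equal-length clips, shuffle them, and keep the true order.

    Hypothetical illustration: the label comes from the video itself,
    so no human annotation is involved.
    """
    clip_len = num_frames // num_clips
    clips = [list(range(i * clip_len, (i + 1) * clip_len)) for i in range(num_clips)]
    order = list(range(num_clips))
    rng.shuffle(order)
    shuffled = [clips[i] for i in order]
    answer = [order.index(k) for k in range(num_clips)]
    return PretextSample(shuffled_clips=shuffled, answer=answer)


def reward(predicted: list[int], sample: PretextSample) -> float:
    """Verifiable reward: 1.0 if the predicted order matches exactly, else 0.0."""
    return 1.0 if predicted == sample.answer else 0.0


if __name__ == "__main__":
    sample = make_ordering_sample(num_frames=16, num_clips=4, rng=random.Random(0))
    print(sample.shuffled_clips)
    print(reward(sample.answer, sample))
```

Because the correct answer is known by construction, the reward can be computed automatically, which is the property a reinforcement learning loop needs when no annotated labels exist.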