**VideoSSR-8B** is a multimodal large language model (MLLM) fine-tuned from `Qwen-VL-8B-Instruct` for enhanced video understanding. It is trained with the **Video Self-Supervised Reinforcement Learning (VideoSSR)** framework, which generates its training data directly from videos, so no manual annotation is required.

- **Base Model:** `Qwen-VL-8B-Instruct`
- **Paper:** [VideoSSR: Video Self-Supervised Reinforcement Learning](https://arxiv.org/abs/2511.06281)
- **Code:** [https://github.com/lcqysl/VideoSSR](https://github.com/lcqysl/VideoSSR)
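
To make the idea of generating training data directly from videos more concrete, here is a minimal sketch of one possible self-supervised task: shuffle the segments of a video and ask the model to recover the original order. This is an illustration only and is not taken from the VideoSSR code. The task choice, the function names `make_shuffle_task` and `reward`, and the exact-match reward are all assumptions. The point is that the label comes from the shuffle itself, so the answer can be checked automatically and used as a reward signal.

```python
import random


def make_shuffle_task(num_frames, num_segments, seed=0):
    """Build one self-labelled sample from frame indices alone (hypothetical helper)."""
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = random.Random(seed)
    seg_len = num_frames // num_segments
    # Split frame indices into equal contiguous segments; leftover frames are dropped.
    segments = [
        list(range(i * seg_len, (i + 1) * seg_len)) for i in range(num_segments)
    ]
    # order[k] is the original index of the segment shown at position k.
    order = list(range(num_segments))
    rng.shuffle(order)
    shuffled = [segments[i] for i in order]
    return {"segments": shuffled, "answer": order}


def reward(predicted_order, answer):
    """Assumed verifiable reward: 1.0 for an exact match, otherwise 0.0."""
    return 1.0 if list(predicted_order) == list(answer) else 0.0


if __name__ == "__main__":
    sample = make_shuffle_task(num_frames=32, num_segments=4, seed=0)
    print(sample["answer"])
    print(reward(sample["answer"], sample["answer"]))
```

In a real pipeline the frame indices would be used to load and reorder actual video frames before prompting the model. The sketch leaves that step out to stay self-contained.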