yhx12/VideoSSR


	VideoSSR-8B is a multimodal large language model (MLLM) fine-tuned from `Qwen-VL-8B-Instruct` for enhanced video understanding. It is trained using a novel Video Self-Supervised Reinforcement Learning (VideoSSR) framework, which generates its own high-quality training data directly from videos, eliminating the need for manual annotation.

	- Base Model: `Qwen-VL-8B-Instruct`
	- Paper: [VideoSSR: Video Self-Supervised Reinforcement Learning](https://arxiv.org/abs/2511.06281)
	- Code: [https://github.com/lcqysl/VideoSSR](https://github.com/lcqysl/VideoSSR)
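
To make "generates its own training data directly from videos" concrete, here is a minimal sketch of one generic self-supervised pretext task: a temporal jigsaw, where contiguous video segments are shuffled and the model must recover their order. This is an illustrative assumption, not necessarily a task used in the VideoSSR paper. The names `JigsawSample`, `make_temporal_jigsaw`, and `reward` are hypothetical and do not come from the linked repository. The point is only that the correct answer is known by construction, so the reward can be computed without manual annotation.

```python
import random
from dataclasses import dataclass


@dataclass
class JigsawSample:
    # Frame indices of each segment, in the (shuffled) order shown to the model.
    shuffled_segments: list[list[int]]
    # For each presented segment, the position it had in the original video.
    answer: list[int]


def make_temporal_jigsaw(num_frames: int, num_segments: int,
                         rng: random.Random) -> JigsawSample:
    """Split frame indices into contiguous segments and shuffle them.

    Hypothetical helper: the label comes from the shuffle itself,
    so no human annotation is involved.
    """
    if not 2 <= num_segments <= num_frames:
        raise ValueError("need 2 <= num_segments <= num_frames")
    # Integer boundaries dividing [0, num_frames) into near-equal segments.
    bounds = [i * num_frames // num_segments for i in range(num_segments + 1)]
    segments = [list(range(bounds[i], bounds[i + 1]))
                for i in range(num_segments)]
    order = list(range(num_segments))
    rng.shuffle(order)
    return JigsawSample([segments[i] for i in order], order)


def reward(prediction: list[int], sample: JigsawSample) -> float:
    """Verifiable reward: fraction of segments assigned the right position."""
    if len(prediction) != len(sample.answer):
        return 0.0
    hits = sum(p == a for p, a in zip(prediction, sample.answer))
    return hits / len(sample.answer)


if __name__ == "__main__":
    sample = make_temporal_jigsaw(num_frames=16, num_segments=4,
                                  rng=random.Random(0))
    print(sample.shuffled_segments)
    print(reward(sample.answer, sample))
```

In a reinforcement learning loop, a score like the one returned by `reward` would stand in for a human-provided label. How VideoSSR actually constructs its tasks and rewards is described in the paper and code linked above.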