VideoSSR-8B is a multimodal large language model (MLLM) fine-tuned from Qwen-VL-8B-Instruct for improved video understanding. It is trained with the Video Self-Supervised Reinforcement Learning (VideoSSR) framework, which derives its training data directly from videos and therefore requires no manual annotation.
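To make the idea of annotation-free training data concrete, the following is a minimal sketch of one possible self-supervised pretext task: the frames of a video are split into segments, the segments are shuffled, and the original order serves as an automatically known answer that a reward function can check. This is an illustration only, not necessarily a task VideoSSR itself uses; the names `make_jigsaw_sample` and `jigsaw_reward` are hypothetical and do not come from the VideoSSR code.

```python
import random
from typing import List, Sequence, Tuple


def make_jigsaw_sample(
    num_frames: int, num_segments: int, rng: random.Random
) -> Tuple[List[range], List[int]]:
    """Build one training sample from frame indices alone (hypothetical helper).

    Returns the shuffled segments and the answer: for each displayed
    position, the index of the segment in the original video.
    """
    if not 0 < num_segments <= num_frames:
        raise ValueError("num_segments must be between 1 and num_frames")
    # Contiguous, near-equal segments covering every frame exactly once.
    bounds = [i * num_frames // num_segments for i in range(num_segments + 1)]
    segments = [range(bounds[i], bounds[i + 1]) for i in range(num_segments)]
    order = list(range(num_segments))
    rng.shuffle(order)
    shuffled = [segments[i] for i in order]
    return shuffled, order


def jigsaw_reward(predicted: Sequence[int], answer: Sequence[int]) -> float:
    """Verifiable reward (assumed form): fraction of positions predicted correctly.

    A prediction that is not a permutation of the segment indices scores 0.0.
    """
    if not answer or sorted(predicted) != list(range(len(answer))):
        return 0.0
    correct = sum(p == a for p, a in zip(predicted, answer))
    return correct / len(answer)


if __name__ == "__main__":
    shuffled, answer = make_jigsaw_sample(num_frames=10, num_segments=3, rng=random.Random(0))
    print("segments shown to the model:", [list(s) for s in shuffled])
    print("reward for a perfect prediction:", jigsaw_reward(answer, answer))
```

Because the label is produced by the shuffling step itself, no human annotation is involved, and the reward can be computed exactly for any model output.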