---
license: apache-2.0
tags:
- multimodal
- vision-language
- video understanding
- spatial reasoning
- visuospatial cognition
- llava
- qwen
- llava-video
datasets:
- nkkbr/ViCA-322K
- nkkbr/ViCA-thinking-2.68k
language:
- en
library_name: transformers
pipeline_tag: video-text-to-text
model_name: ViCA-Thinking-7B
base_model: lmms-lab/LLaVA-Video-7B-Qwen2
---

## Usage and Full Documentation

For detailed model description, training setup, datasets, evaluation results, and inference code, **please refer to the main ViCA-7B README**:

[**nkkbr/ViCA**](https://huggingface.co/nkkbr/ViCA)
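
As a quick orientation while you consult that README, here is a minimal loading sketch. It assumes the LLaVA-NeXT codebase is installed (the toolchain LLaVA-Video-7B-Qwen2 checkpoints are typically loaded with), and the repo id `nkkbr/ViCA-thinking-7b` and the `llava_qwen` loader key are assumptions for illustration; the inference code in the main ViCA README is authoritative.

```python
# Minimal loading sketch, not the official inference pipeline.
# Assumes the LLaVA-NeXT codebase is installed, e.g.:
#   pip install git+https://github.com/LLaVA-VL/LLaVA-NeXT.git
# The repo id and "llava_qwen" loader key below are assumptions;
# see the main ViCA README for the verified inference code.
from llava.model.builder import load_pretrained_model

pretrained = "nkkbr/ViCA-thinking-7b"  # assumed Hugging Face repo id

# Load tokenizer, model, and image processor with the standard LLaVA loader.
tokenizer, model, image_processor, max_length = load_pretrained_model(
    pretrained,
    None,           # no separate base model; weights are loaded directly
    "llava_qwen",   # loader key used for Qwen2-backed LLaVA-Video models
    torch_dtype="bfloat16",
    device_map="auto",
)
model.eval()
```

From there, video frames are preprocessed with `image_processor` and passed to `model.generate` alongside the tokenized prompt, as demonstrated end to end in the main ViCA README linked above.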