---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-7B
tags:
- capability-tagging
- task
- qwen
---

# Model Card for CDT-Task-Tagger

This model is a component of the **Cognition-Domain-Task (CDT) framework**, a comprehensive capability framework for Large Language Models presented in our paper [CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task](https://arxiv.org/abs/2509.24422). It has been fine-tuned to classify a given instruction into one of 16 task types.

## Model Details

### Model Description

This model identifies the fundamental task a user wants the LLM to perform.

- **Model type:** Qwen2ForCausalLM
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** Qwen2.5-7B-Base

### Model Sources

- **Repository:** https://github.com/Alessa-mo/CDT
- **Paper Link:** https://arxiv.org/abs/2509.24422

### Basic Usage

Please refer to https://github.com/Alessa-mo/CDT for the full setup. Run the following script to annotate your data with task labels:

```bash
cd tag_annotate
export CUDA_VISIBLE_DEVICES=0
python annotate.py \
    --data_path path/to/your/data \
    --output_dir path/to/output/dir \
    --model_path CDT-Task-Tagger \
    --prompt_file ./prompt/annotation_prompt.jsonl \
    --cognition_skill_file ./prompt/cognition.json \
    --domain_skill_file ./prompt/domain.json \
    --task_skill_file ./prompt/task.json \
    --tag_type "task" \
    --batch_size 32
```

**Note**: Your data must be a JSON file containing a list of records in the following format:

```json
[
    {
        "messages": [
            {
                "role": "user",
                "content": "xxxx"
            },
            {
                "role": "assistant",
                "content": "xxxx"
            }
        ]
    }
]
```

## Citation

If you find this model useful, please cite:

```bibtex
@misc{mo2025cdtcomprehensivecapabilityframework,
      title={CDT: A Comprehensive Capability Framework for Large Language Models Across Cognition, Domain, and Task},
      author={Haosi Mo and Xinyu Ma and Xuebo Liu and Derek F. Wong and Yu Li and Jie Liu and Min Zhang},
      year={2025},
      eprint={2509.24422},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.24422},
}
```
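
## Appendix: Preparing the Input File

The input format described under Basic Usage (a JSON list of records, each with a `messages` list of `role`/`content` entries) can be produced with a few lines of Python. The following is a minimal sketch, not part of the CDT repository: the helper names `to_record`, `validate`, and `dump_dataset` are hypothetical, and the checks only mirror the layout shown in this card. They are an assumption about what `annotate.py` expects, not a copy of its own validation.

```python
import json


def to_record(instruction: str, response: str) -> dict:
    """Wrap one instruction/response pair in the documented "messages" layout."""
    return {
        "messages": [
            {"role": "user", "content": instruction},
            {"role": "assistant", "content": response},
        ]
    }


def validate(records: list) -> None:
    """Raise ValueError if a record does not match the layout shown in this card.

    Assumption: these checks reflect the example format only; the actual
    annotation script may be stricter or more lenient.
    """
    for i, rec in enumerate(records):
        messages = rec.get("messages") if isinstance(rec, dict) else None
        if not isinstance(messages, list) or not messages:
            raise ValueError(f"record {i}: missing or empty 'messages' list")
        for msg in messages:
            if not isinstance(msg, dict) or not {"role", "content"}.issubset(msg):
                raise ValueError(f"record {i}: each message needs 'role' and 'content'")


def dump_dataset(pairs) -> str:
    """Build, validate, and serialize (instruction, response) pairs as JSON text."""
    records = [to_record(q, a) for q, a in pairs]
    validate(records)
    # json.dumps never emits a trailing comma, so the output is always valid JSON.
    return json.dumps(records, ensure_ascii=False, indent=2)
```

Writing the returned string to a file (for example with `pathlib.Path("my_data.json").write_text(text, encoding="utf-8")`, where the file name is only a placeholder) gives a file that can be passed to `--data_path`.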