---
license: mit
language:
- en
pipeline_tag: text-generation
datasets:
- FreedomIntelligence/medical-o1-reasoning-SFT
base_model:
- unsloth/DeepSeek-R1-Distill-Llama-8B
---
# DeepSeek-R1-Distill-Llama-8B - Fine-Tuned for Medical Chain-of-Thought Reasoning
## Model Overview
The **DeepSeek-R1-Distill-Llama-8B** model has been fine-tuned for medical chain-of-thought (CoT) reasoning. This fine-tuning process enhances the model's ability to generate structured, concise, and accurate medical reasoning outputs. The model was trained using a 500-sample subset of the **medical-o1-reasoning-SFT** dataset, with optimizations including **4-bit quantization** and **LoRA adapters** to improve efficiency and reduce memory usage.
### Key Features
- **Base Model:** [unsloth/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B)
- **Fine-Tuning Objective:** Adaptation for structured, step-by-step medical reasoning tasks.
- **Training Dataset:** 500 samples from **medical-o1-reasoning-SFT** dataset.
- **Tools Used:**
- **Unsloth:** Accelerates training by 2x.
- **4-bit Quantization:** Reduces model memory usage.
- **LoRA Adapters:** Enables parameter-efficient fine-tuning.
- **Training Time:** 44 minutes.
### Performance Improvements
- **Response Length:** Reduced from an average of 450 words to 150 words, improving conciseness.
- **Reasoning Style:** Shifted from verbose explanations to more focused, structured reasoning.
- **Answer Format:** Transitioned from bulleted lists to paragraph-style answers for clarity.
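A minimal inference sketch is shown below. The repo id is a placeholder for this model's actual Hub path, and the prompt wording is illustrative; the exact chain-of-thought template used during fine-tuning is defined in the notebook linked at the end of this card.

```python
from unsloth import FastLanguageModel

# Placeholder repo id: replace with this model's actual Hugging Face Hub path.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="SURESHBEEKHANI/DeepSeek-R1-Distill-Llama-8B-Medical-CoT",
    max_seq_length=2048,
    load_in_4bit=True,   # load with 4-bit quantization to reduce memory usage
)
FastLanguageModel.for_inference(model)  # enable Unsloth's optimized inference mode

# Illustrative prompt; see the notebook for the exact template used in training.
prompt = (
    "Below is a medical question. Reason through it step by step, "
    "then give a concise final answer.\n\n"
    "### Question:\nA 60-year-old patient presents with sudden chest pain radiating "
    "to the left arm and diaphoresis. What is the most likely diagnosis?\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```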
## Intended Use
This model is designed for use by:
- **Medical professionals** requiring structured diagnostic reasoning.
- **Researchers** seeking assistance in medical knowledge extraction.
- **Developers** integrating the model into applications for medical CoT tasks such as clinical diagnostics, treatment planning, and education.
Typical use cases include:
- Clinical diagnostics
- Treatment planning
- Medical education and training
- Research assistance
## Training Details
### Key Components
- **Model:** [unsloth/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B)
- **Dataset:** **medical-o1-reasoning-SFT** (500 samples)
- **Training Tools:**
- **Unsloth:** Optimized training for faster results (2x speedup).
- **4-bit Quantization:** Optimized memory usage for efficient training.
- **LoRA Adapters:** Enables lightweight fine-tuning with reduced computational costs.
### Fine-Tuning Process
The main stages are summarized below; illustrative code sketches for the key stages follow the list.
1. **Install Required Packages:**
Installed necessary libraries, including **unsloth** and **kaggle**.
2. **Authentication:**
Authenticated with the **Hugging Face Hub** and **Weights & Biases** for experiment tracking and model versioning.
3. **Model Initialization:**
Initialized the base model with **4-bit quantization** and a sequence length of up to 2048 tokens.
4. **Pre-Fine-Tuning Inference:**
Conducted an initial inference to establish the model’s baseline performance on a medical question.
5. **Dataset Preparation:**
Structured and formatted the training data using a custom template tailored to medical CoT reasoning tasks.
6. **Application of LoRA Adapters:**
Incorporated **LoRA adapters** for efficient parameter tuning during fine-tuning.
7. **Supervised Fine-Tuning:**
Used **SFTTrainer** with tuned hyperparameters to fine-tune the model; the training run took roughly 44 minutes.
8. **Post-Fine-Tuning Inference:**
Evaluated the model’s improved performance by testing it on the same medical question after fine-tuning.
9. **Saving and Loading:**
Stored the fine-tuned model, including **LoRA adapters**, for easy future use and deployment.
10. **Model Deployment:**
Pushed the fine-tuned model to the **Hugging Face Hub** in **GGUF format** with 4-bit quantization for efficient deployment.
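The snippets below are minimal sketches of these stages, written against the Unsloth, TRL, and Hugging Face APIs. Hyperparameters, the prompt template, and repo ids are assumptions rather than values confirmed by this card, so the linked notebook remains the authoritative reference. The sketches build on one another (`model`, `tokenizer`, and `dataset` carry over between them). First, package installation and authentication (steps 1 and 2):

```python
# Install the core dependencies (run in a shell or notebook cell):
#   pip install unsloth trl peft transformers datasets huggingface_hub wandb kaggle

from huggingface_hub import login
import wandb

login(token="hf_...")   # Hugging Face token with write access, needed to push the model
wandb.login(key="...")  # Weights & Biases API key for experiment tracking
```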
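Model initialization with 4-bit quantization and a 2048-token sequence length (step 3), followed by attaching LoRA adapters (step 6). The rank, alpha, and target modules shown are common Unsloth defaults, assumed here for illustration:

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/DeepSeek-R1-Distill-Llama-8B",
    max_seq_length=2048,   # sequence length used during fine-tuning
    load_in_4bit=True,     # 4-bit quantization to reduce memory usage
)

# Attach LoRA adapters for parameter-efficient fine-tuning.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,            # LoRA rank (illustrative value)
    lora_alpha=16,
    lora_dropout=0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    use_gradient_checkpointing="unsloth",  # reduces VRAM usage for long sequences
    random_state=3407,
)
```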
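Dataset preparation (step 5): loading the 500-sample subset and mapping each record into a chain-of-thought prompt. The `en` configuration, the `Question` / `Complex_CoT` / `Response` column names, and the template wording are assumptions based on the public dataset card; the exact template is defined in the notebook:

```python
from datasets import load_dataset

dataset = load_dataset(
    "FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train[:500]"
)

# Illustrative chain-of-thought template; the notebook defines the exact wording.
PROMPT_TEMPLATE = (
    "Below is a medical question. Reason through it step by step, "
    "then give a concise final answer.\n\n"
    "### Question:\n{question}\n\n"
    "### Response:\n<think>\n{cot}\n</think>\n{answer}"
)

def format_example(example):
    text = PROMPT_TEMPLATE.format(
        question=example["Question"],
        cot=example["Complex_CoT"],
        answer=example["Response"],
    ) + tokenizer.eos_token  # append EOS so the model learns where to stop
    return {"text": text}

dataset = dataset.map(format_example)
```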
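Supervised fine-tuning with TRL's `SFTTrainer` (step 7). The hyperparameters below are illustrative placeholders, not the values behind the 44-minute run; depending on the TRL version, some arguments may need to be passed via `SFTConfig` instead of `TrainingArguments`:

```python
import torch
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        optim="adamw_8bit",
        logging_steps=10,
        output_dir="outputs",
        report_to="wandb",   # log training metrics to Weights & Biases
    ),
)
trainer.train()
```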
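Saving the LoRA adapters and pushing a 4-bit GGUF export to the Hub (steps 9 and 10). The repo id is a placeholder, and `q4_k_m` is a commonly used 4-bit GGUF quantization method assumed here:

```python
# Save the LoRA adapters locally for later reuse.
model.save_pretrained("deepseek-r1-medical-lora")
tokenizer.save_pretrained("deepseek-r1-medical-lora")

# Push a GGUF export with 4-bit quantization to the Hugging Face Hub.
# Replace the repo id with your own; "q4_k_m" is an assumed quantization method.
model.push_to_hub_gguf(
    "SURESHBEEKHANI/DeepSeek-R1-Distill-Llama-8B-Medical-CoT",
    tokenizer,
    quantization_method="q4_k_m",
    token="hf_...",
)
```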
## Notebook
The implementation notebook for this model is available [here](https://github.com/SURESHBEEKHANI/Advanced-LLM-Fine-Tuning/blob/main/Deep-seek-R1-Medical-reasoning-SFT.ipynb). It walks through each fine-tuning and deployment step in detail.