---
tags:
- medical
- disease
- symptoms
- treatment
- gpt2
license: apache-2.0
datasets:
- QuyenAnhDE/Diseases_Symptoms
---
# SmallMedLM

**SmallMedLM** is a fine-tuned [`distilgpt2`](https://huggingface.co/distilgpt2) model trained on medical text data about diseases, symptoms, and treatments.

---
## Model Description

This model is designed to **generate medical information** from a disease or symptom prompt. It can output possible **symptoms** for a given disease or suggest **treatment directions** based on symptoms.

⚠️ **Disclaimer**: This model is for research and educational purposes only. It is **not a substitute for professional medical advice**. Always consult a qualified healthcare professional.

---
## Training Data

- Dataset: [Diseases_Symptoms](https://huggingface.co/datasets/QuyenAnhDE/Diseases_Symptoms)
- Domain: Disease → Symptoms → Treatment mapping (see the preprocessing sketch below)
- Base model: `distilgpt2`
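### Preprocessing (illustrative sketch)

The exact preprocessing used for fine-tuning is not documented here. The snippet below is a minimal sketch of how dataset rows could be flattened into training strings that match the prompt format used in the inference example. The column names `Name`, `Symptoms`, and `Treatments` are assumptions about the dataset schema, not verified facts.

```python
# Hypothetical preprocessing: flatten each row into a
# "Disease: ... | Symptoms: ... | Treatment: ..." training string.
# Column names are assumed; adjust them to the actual dataset schema.
from datasets import load_dataset

dataset = load_dataset("QuyenAnhDE/Diseases_Symptoms", split="train")

def to_training_text(row):
    return (
        f"Disease: {row['Name']} | "
        f"Symptoms: {row['Symptoms']} | "
        f"Treatment: {row['Treatments']}"
    )

texts = [to_training_text(row) for row in dataset]
print(texts[0])
```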
---
## Usage

### Inference Example
```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_name = "sumanthmandavalli/SmallMedLM"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)


def generate_medical_info(disease_name, max_length=100):
    # The model expects prompts of the form "Disease: <name> | Symptoms: "
    prompt = f"Disease: {disease_name} | Symptoms: "
    inputs = tokenizer.encode(prompt, return_tensors="pt")

    outputs = model.generate(
        inputs,
        max_length=max_length,
        num_return_sequences=1,
        no_repeat_ngram_size=2,
        top_k=50,
        top_p=0.95,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 models define no pad token
    )

    return tokenizer.decode(outputs[0], skip_special_tokens=True)


print(generate_medical_info("Diabetes"))
```
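### Pipeline Alternative

For quick experiments, the model can also be loaded through the `transformers` text-generation pipeline. The snippet below is a minimal sketch that reuses the model id and sampling settings from the example above.

```python
# Minimal sketch: same model, loaded via the high-level pipeline API.
from transformers import pipeline

generator = pipeline("text-generation", model="sumanthmandavalli/SmallMedLM")

result = generator(
    "Disease: Diabetes | Symptoms: ",
    max_length=100,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    temperature=0.7,
)
print(result[0]["generated_text"])
```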