---
license: apache-2.0
tags:
- question-answering
- complexity-classification
- distilbert
datasets:
- wesley7137/question_complexity_classification
---

# question-complexity-classifier

🤖 Fine-tuned DistilBERT model for classifying question complexity (Simple vs Complex)

## Model Details

### Model Description

- **Architecture:** DistilBERT base uncased
- **Fine-tuned on:** Question Complexity Classification Dataset
- **Language:** English
- **License:** Apache 2.0
- **Max Sequence Length:** 128 tokens

## Uses

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="grahamaco/question-complexity-classifier",
    tokenizer="grahamaco/question-complexity-classifier",
    truncation=True,
    max_length=128,  # matches the training configuration
)

result = classifier("Explain quantum computing in simple terms")
# Example output: [{'label': 'COMPLEX', 'score': 0.97}]
```

## Training Details

- **Epochs:** 5
- **Batch Size:** 32 (global)
- **Learning Rate:** 2e-5
- **Train/Val/Test Split:** 80/10/10 (stratified)
- **Early Stopping:** patience of 2 epochs

A sketch of a `Trainer` setup matching these hyperparameters appears at the end of this card.

## Evaluation Results

| Metric   | Value |
|----------|-------|
| Accuracy | 0.92  |
| F1 Score | 0.91  |

## Performance

| Metric            | Value                  |
|-------------------|------------------------|
| Inference Latency | 15.2 ms (CPU)          |
| Throughput        | 68.4 samples/sec (GPU) |

## Ethical Considerations

This model is intended for educational content classification only. Developers should:

- Regularly audit performance across different question types
- Monitor for unintended bias in complexity assessments
- Provide human-review mechanisms for high-stakes classifications
- Validate classifications against the original context when the model is used in RAG systems (see the routing sketch below)
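
For the RAG point above, here is a minimal routing sketch. It is illustrative only: the confidence threshold, the `top_k` values, and the human-review fallback are assumptions for your own pipeline, not behavior shipped with this model.

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="grahamaco/question-complexity-classifier",
    truncation=True,
    max_length=128,
)

CONFIDENCE_FLOOR = 0.75  # assumed threshold; tune on your own validation data

def route_question(question: str) -> dict:
    """Decide retrieval depth from the predicted complexity label."""
    pred = classifier(question)[0]  # e.g. {'label': 'COMPLEX', 'score': 0.97}
    if pred["score"] < CONFIDENCE_FLOOR:
        # Low-confidence predictions fall back to the deeper retrieval
        # path and are flagged for human review.
        return {"label": "UNCERTAIN", "top_k": 8, "needs_review": True}
    top_k = 8 if pred["label"] == "COMPLEX" else 2
    return {"label": pred["label"], "top_k": top_k, "needs_review": False}

print(route_question("Explain quantum computing in simple terms"))
```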
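
For reference, the hyperparameters in Training Details map roughly onto the following Hugging Face `Trainer` configuration. This is a sketch under assumptions, not the original training script: the dataset column names (`text`, `label`), the split handling, and the early-stopping metric are guesses; check the dataset card before running.

```python
from datasets import DatasetDict, load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

raw = load_dataset("wesley7137/question_complexity_classification")
# The card reports an 80/10/10 stratified split; if the hub dataset ships
# only a "train" split, carve out validation/test sets manually.
if "validation" not in raw:
    tmp = raw["train"].train_test_split(test_size=0.2, seed=42)
    holdout = tmp["test"].train_test_split(test_size=0.5, seed=42)
    raw = DatasetDict(
        train=tmp["train"], validation=holdout["train"], test=holdout["test"]
    )

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize(batch):
    # Assumed text column name; 128 matches the card's max sequence length.
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = raw.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

args = TrainingArguments(
    output_dir="question-complexity-classifier",
    num_train_epochs=5,
    per_device_train_batch_size=32,  # global batch size 32 on one device
    learning_rate=2e-5,
    eval_strategy="epoch",  # evaluation_strategy in transformers < 4.41
    save_strategy="epoch",
    load_best_model_at_end=True,     # required by EarlyStoppingCallback
    metric_for_best_model="eval_loss",  # assumed early-stopping metric
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    tokenizer=tokenizer,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```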