---
library_name: transformers
license: apache-2.0
base_model: distilbert-base-uncased
tags:
  - text-classification
  - sentiment-analysis
  - emotion-classification
  - generated_from_trainer
metrics:
  - f1_macro
  - accuracy
model-index:
  - name: SentimentAnalysis-distilbert-base-uncased-finetuned-emotion
    results:
      - task:
          type: text-classification
          name: Emotion Classification
        dataset:
          name: Emotion Dataset
          type: text
        metrics:
          - name: F1 Macro
            type: f1_macro
            value: 0.8902
          - name: Accuracy
            type: accuracy
            value: 0.927
---

# SentimentAnalysis-distilbert-base-uncased-finetuned-emotion

This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) for emotion classification of short texts (tweets).
It predicts one of six emotions:

- sadness
- joy
- love
- anger
- fear
- surprise
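
Assuming the label ids follow the order listed above (a common convention for emotion datasets — verify against the checkpoint's `config.json`), the label mapping looks like:

```json
{
  "id2label": {
    "0": "sadness",
    "1": "joy",
    "2": "love",
    "3": "anger",
    "4": "fear",
    "5": "surprise"
  }
}
```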

The model was trained using the 🤗 Transformers Trainer API with class-weighted loss to handle class imbalance.
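
The class weighting can be sketched as below. This is an illustrative reconstruction, not the original training code: `compute_class_weights` is a hypothetical helper, and the `Trainer` subclass is shown as a comment because it requires 🤗 Transformers at runtime.

```python
# Inverse-frequency class weights for the weighted cross-entropy loss
# (illustrative sketch; names are not from the original training code).
from collections import Counter

def compute_class_weights(labels, num_classes):
    """Weight each class by total / (num_classes * count): rare classes
    receive larger weights, so the loss does not ignore minority emotions."""
    counts = Counter(labels)
    total = len(labels)
    return [total / (num_classes * counts.get(c, 1)) for c in range(num_classes)]

# With the 🤗 Trainer API, such weights are typically applied by
# overriding compute_loss, e.g.:
#
#   import torch
#   from transformers import Trainer
#
#   class WeightedLossTrainer(Trainer):
#       def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
#           labels = inputs.pop("labels")
#           outputs = model(**inputs)
#           loss_fct = torch.nn.CrossEntropyLoss(
#               weight=torch.tensor(class_weights,
#                                   device=outputs.logits.device))
#           loss = loss_fct(outputs.logits, labels)
#           return (loss, outputs) if return_outputs else loss
```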


## Model performance

Evaluation results on the test set:

- Loss: 0.2094
- F1 Macro: 0.8902
- Accuracy: 0.927

Note: F1 Macro is the primary metric since the dataset is imbalanced.
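
To see why: macro F1 averages per-class F1 scores, so minority emotions (e.g. surprise) count as much as the majority class. A minimal pure-Python version, equivalent to `sklearn.metrics.f1_score(..., average="macro")`:

```python
def f1_macro(y_true, y_pred, num_classes):
    """Average the per-class F1 scores, weighting every class equally."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return sum(scores) / num_classes

# Predicting the majority class everywhere scores high accuracy (0.75)
# but low macro F1, which is exactly the failure mode this metric exposes.
```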


## Intended uses

This model is suitable for:

- Emotion analysis of tweets and other short social-media texts
- NLP research and educational projects
- Sentiment-aware chatbots or analytics dashboards

## Limitations

- Trained on short texts (tweets); performance may degrade on long documents
- English-only
- May inherit biases present in the training data
- Not intended for high-stakes or sensitive decision-making

## Training data

The model was trained on an emotion-labeled tweet dataset with six emotion classes, split into training, validation, and test sets.

Preprocessing steps included:

- Tokenization with the DistilBERT tokenizer
- Padding and truncation to a fixed maximum length
- Label encoding with the Hugging Face `ClassLabel` feature
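
The padding/truncation step can be illustrated in pure Python (a toy sketch; the actual preprocessing relies on the tokenizer's built-in `padding`/`truncation` arguments, shown in the comment):

```python
# Toy illustration of fixed-length padding and truncation
# (the real preprocessing uses the DistilBERT tokenizer directly).
def pad_and_truncate(token_ids, max_length, pad_id=0):
    kept = min(len(token_ids), max_length)
    ids = token_ids[:max_length] + [pad_id] * (max_length - kept)
    attention_mask = [1] * kept + [0] * (max_length - kept)
    return ids, attention_mask

# Equivalent with 🤗 Transformers (requires downloading the tokenizer;
# max_length=128 is an illustrative choice, not taken from the model card):
#
#   from transformers import AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
#   encoded = dataset.map(
#       lambda batch: tokenizer(batch["text"], padding="max_length",
#                               truncation=True, max_length=128),
#       batched=True)
```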

## Training procedure

### Hyperparameters

- Base model: distilbert-base-uncased
- Learning rate: 2e-5
- Train batch size: 32
- Eval batch size: 32
- Epochs: 10
- Optimizer: AdamW (betas = (0.9, 0.999), epsilon = 1e-8)
- Learning rate scheduler: linear
- Loss function: cross-entropy with class weights
- Seed: 42
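
For reference, these hyperparameters map onto `TrainingArguments` roughly as follows. This is a sketch, not the original script: `output_dir` and the evaluation/best-model options are assumptions, and the AdamW betas/epsilon above are already the optimizer defaults.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="finetuned-emotion",    # illustrative path
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=10,
    lr_scheduler_type="linear",        # the default scheduler
    seed=42,
    eval_strategy="epoch",             # assumed: eval once per epoch
    save_strategy="epoch",
    load_best_model_at_end=True,       # matches "best model selected by macro F1"
    metric_for_best_model="f1_macro",
)
```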

## Training results

| Training Loss | Epoch | Step | Validation Loss | F1 Macro | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|
| 0.6445        | 1.0   | 500  | 0.2374          | 0.8993   | 0.9235   |
| 0.1811        | 2.0   | 1000 | 0.1683          | 0.9109   | 0.9340   |
| 0.1357        | 3.0   | 1500 | 0.1686          | 0.9157   | 0.9380   |
| 0.1036        | 4.0   | 2000 | 0.1737          | 0.9192   | 0.9400   |
| 0.0816        | 5.0   | 2500 | 0.2204          | 0.9086   | 0.9345   |
| 0.0629        | 6.0   | 3000 | 0.2197          | 0.9142   | 0.9385   |
| 0.0475        | 7.0   | 3500 | 0.3064          | 0.9081   | 0.9355   |

The best model was selected based on macro F1 score.


## Framework versions

- Transformers: 4.44.2
- PyTorch: 2.6.0+cu124
- Datasets: 4.4.1
- Tokenizers: 0.19.1

## Source code

Training and evaluation code is available on GitHub:
👉 https://github.com/Abdelrahmanemam01/Sentiment-Analysis