---
tags:
  - roberta
  - email-classification
  - text-classification
language: en
license: apache-2.0
datasets:
  - Tobi-Bueck/customer-support-tickets
metrics:
  - accuracy
model_type: xlm-roberta
pipeline_tag: text-classification
---

# xlm-roberta-email-classifier

A fine-tuned version of `xlm-roberta-base` for multi-class classification of English-language emails into 10 categories.  
It is intended for automatically routing or tagging incoming messages based on their content.

## Model Overview

- **Base Model**: `xlm-roberta-base`
- **Task**: Email classification (10 categories)
- **Language**: English
- **Frameworks**: Hugging Face Transformers, PyTorch Lightning
- **Training Tracker**: Weights & Biases

## Performance

- Accuracy: 0.42  
- F1 Score: 0.436  
- Precision: 0.527  
- Recall: 0.42  
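
The card does not state how the per-class scores were averaged. Recall equalling accuracy (both 0.42) is what support-weighted averaging would produce, since weighted recall is always identical to accuracy; this reading is an assumption, not something the card confirms. A minimal sketch with toy labels (the helper names and data below are illustrative and are not the model's evaluation set):

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_recall(y_true, y_pred):
    """Per-class recall, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    return sum((support[c] / n) * (hits[c] / support[c]) for c in support)


# Toy labels for illustration only.
y_true = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]
y_pred = [0, 1, 0, 1, 2, 2, 2, 0, 2, 1]

print(accuracy(y_true, y_pred), weighted_recall(y_true, y_pred))
```

The two printed values agree up to floating-point rounding, because the support weights cancel the per-class denominators.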

## Class Labels

The model predicts one of the following categories:

| Label ID | Category                        |
|----------|---------------------------------|
| 0        | Billing and Payments            |
| 1        | Customer Service                |
| 2        | General Inquiry                 |
| 3        | Human Resources                 |
| 4        | IT Support                      |
| 5        | Product Support                 |
| 6        | Returns and Exchanges           |
| 7        | Sales and Pre-Sales             |
| 8        | Service Outages and Maintenance |
| 9        | Technical Support               |


## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "ale-dp/xlm-roberta-email-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

email_text = "I'd like to return the item I purchased last week."
# Truncate long emails to the model's maximum input length
inputs = tokenizer(email_text, return_tensors="pt", truncation=True)
with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)

# Index of the highest-scoring class for the single input email
predicted_class_id = outputs.logits.argmax(dim=-1).item()
# Label IDs as listed in the Class Labels table
id2label = {
    0: 'Billing and Payments',
    1: 'Customer Service',
    2: 'General Inquiry',
    3: 'Human Resources',
    4: 'IT Support',
    5: 'Product Support',
    6: 'Returns and Exchanges',
    7: 'Sales and Pre-Sales',
    8: 'Service Outages and Maintenance',
    9: 'Technical Support'
}
predicted_label = id2label[predicted_class_id]
print(predicted_label)
```
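
Given the modest accuracy reported above, it may be more useful to look at several candidate categories with their probabilities than at a single prediction. The following is a minimal sketch, not part of the model's own API: `rank_labels` and `ID2LABEL` are hypothetical names, and the logits are made up for illustration. In practice the list could come from `outputs.logits[0].tolist()` in the snippet above.

```python
import math

# Label order copied from the Class Labels table (label ID = list index).
ID2LABEL = [
    "Billing and Payments",
    "Customer Service",
    "General Inquiry",
    "Human Resources",
    "IT Support",
    "Product Support",
    "Returns and Exchanges",
    "Sales and Pre-Sales",
    "Service Outages and Maintenance",
    "Technical Support",
]


def rank_labels(logits, top_k=3):
    """Return the top_k (label, probability) pairs for one email's logits."""
    if len(logits) != len(ID2LABEL):
        raise ValueError("expected one logit per class")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(ID2LABEL, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Made-up logits for illustration only.
example_logits = [0.1, 0.4, 0.2, -1.0, -0.5, 0.3, 2.0, 0.0, -0.8, 0.1]
for label, prob in rank_labels(example_logits):
    print(f"{label}: {prob:.3f}")
```

A routing system could, for example, send an email to manual review when the top probability falls below a threshold chosen on held-out data.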