SetFit with sentence-transformers/paraphrase-mpnet-base-v2
This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
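The contrastive step above works on sentence pairs built from the few labeled examples. The following is a minimal sketch of that pair construction, assuming same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0, with the loss taken as the squared error between the embeddings' cosine similarity and that target. `make_pairs` and `cosine_similarity_loss` are hypothetical helper names for illustration and are not part of the SetFit API.

```python
from itertools import combinations
import math


def make_pairs(examples):
    """Build every unordered sentence pair with a 1.0/0.0 similarity target.

    `examples` is a list of (text, label) tuples. Pairs with matching labels
    are positives (target 1.0); all other pairs are negatives (target 0.0).
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(emb_a, emb_b, target):
    """Squared error between the pair's cosine similarity and its target."""
    return (cosine_similarity(emb_a, emb_b) - target) ** 2


# Three examples taken from the label table in this card.
few_shot = [
    ("What is the price of the organic honey?", "product discoverability"),
    ("Variety of cookie boxes", "product discoverability"),
    ("Do you offer any bulk discounts on organic honey?", "product faq"),
]

pairs = make_pairs(few_shot)
for text_a, text_b, target in pairs:
    print(f"{target:.1f}  {text_a!r} <-> {text_b!r}")
```

Even three labeled sentences yield three training pairs, which is why pair-based fine-tuning can get useful signal from very small datasets.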
Model Details
Model Description
Model Sources
Model Labels

| Label | Examples |
|:------|:---------|
| order tracking | <ul><li>'What is the delivery status for my order placed using phone number 123456789?'</li><li>'I ordered the Cake Decorating Kit 4 days ago, can you provide the tracking information?'</li><li>'I ordered the Cake Stands 2 days ago with order no 54321 how long will it take to deliver?'</li></ul> |
| general faq | <ul><li>'How do the traditional hand-woven Banarasi sarees from HKV Benaras differ from those made by machine-driven industries?'</li><li>'What are the key factors to consider when developing a personalized diet plan for weight loss?'</li><li>"Are there any scientific studies that support Green Tea's role in preventing Alzheimer's and Parkinson's diseases?"</li></ul> |
| product policy | <ul><li>'How do you use the information collected through tracking tools like Google Analytics and cookies?'</li><li>'How does bakeyy handle returns for items that were purchased with a thank you discount?'</li><li>'What is the procedure for returning a product that was part of a special occasion promotion?'</li></ul> |
| product discoverability | <ul><li>'What is the price of the organic honey?'</li><li>'Variety of cookie boxes'</li><li>'what apparells do you have from Drew House'</li></ul> |
| product faq | <ul><li>'What is the price of the bestseller honey?'</li><li>'Do you offer any bulk discounts on organic honey?'</li><li>'Are the big plum cake boxes available in packs of 30?'</li></ul> |

Evaluation
Metrics
Uses
Direct Use for Inference
First install the SetFit library:
```bash
pip install setfit
```
Then you can load this model and run inference.
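```python
from setfit import SetFitModel

# Download the model from the Hugging Face Hub
model = SetFitModel.from_pretrained("Shankhdhar/classifier_woog_firstbud_updated")
# Run inference on a single query; pass a list of strings to classify a batch
preds = model("cookie boxes with dividers")
```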
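In practice the predicted label is usually used to route a query to the right handler. The sketch below groups a batch of queries by predicted intent. `group_by_intent` and `keyword_stub` are hypothetical names introduced here, and the stub stands in for the real model so the example runs without downloading anything.

```python
from collections import defaultdict


def group_by_intent(classify, queries):
    """Classify each query and bucket the queries by predicted label.

    `classify` is any callable that maps a list of strings to a sequence of
    labels of the same length.
    """
    predictions = classify(queries)
    buckets = defaultdict(list)
    for query, label in zip(queries, predictions):
        buckets[str(label)].append(query)
    return dict(buckets)


def keyword_stub(queries):
    """Toy stand-in for the real classifier; it only looks for the word 'order'."""
    return [
        "order tracking" if "order" in q.lower() else "product discoverability"
        for q in queries
    ]


queries = [
    "cookie boxes with dividers",
    "Where is my order 54321?",
]
grouped = group_by_intent(keyword_stub, queries)
print(grouped)
```

With the model loaded as shown above, `group_by_intent(model.predict, queries)` would be the analogous call, assuming `predict` returns one label per input string.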
Training Details
Training Set Metrics

| Training set | Min | Median | Max |
|:-------------|:----|:-------|:----|
| Word count | 4 | 11.9760 | 28 |

| Label | Training Sample Count |
|:------|:----------------------|
| general faq | 24 |
| order tracking | 34 |
| product discoverability | 50 |
| product faq | 50 |
| product policy | 50 |

Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (2, 2)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: True
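The list gives `warmup_proportion: 0.1` and a body learning rate of `2e-05` for the embedding fine-tuning phase, but not the shape of the schedule. The sketch below assumes a linear warmup followed by a linear decay to zero, which this card does not confirm, and takes 4250 (the last step logged under Training Results) as an approximate total step count. `warmup_linear_lr` is a hypothetical helper, not a SetFit function.

```python
import math


def warmup_linear_lr(step, total_steps, peak_lr=2e-05, warmup_proportion=0.1):
    """Learning rate at `step`, assuming linear warmup then linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_proportion)
    if step < warmup_steps:
        # Ramp from 0 up to peak_lr over the warmup steps.
        return peak_lr * (step / max(1, warmup_steps))
    # Decay from peak_lr down to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return peak_lr * (remaining / max(1, total_steps - warmup_steps))


total_steps = 4250  # assumption: last logged step, used as an approximate total
for step in (0, 200, 425, 2000, 4250):
    print(step, warmup_linear_lr(step, total_steps))
```

Under these assumptions the rate peaks at step 425, which falls inside the range where the logged training loss drops from about 0.2 to below 0.001.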
Training Results

| Epoch | Step | Training Loss | Validation Loss |
|:------|:-----|:--------------|:----------------|
| 0.0005 | 1 | 0.2048 | - |
| 0.0235 | 50 | 0.2874 | - |
| 0.0470 | 100 | 0.126 | - |
| 0.0705 | 150 | 0.0388 | - |
| 0.0940 | 200 | 0.0786 | - |
| 0.1175 | 250 | 0.0049 | - |
| 0.1410 | 300 | 0.0048 | - |
| 0.1646 | 350 | 0.0018 | - |
| 0.1881 | 400 | 0.0011 | - |
| 0.2116 | 450 | 0.0004 | - |
| 0.2351 | 500 | 0.0006 | - |
| 0.2586 | 550 | 0.0005 | - |
| 0.2821 | 600 | 0.0012 | - |
| 0.3056 | 650 | 0.0004 | - |
| 0.3291 | 700 | 0.0003 | - |
| 0.3526 | 750 | 0.0002 | - |
| 0.3761 | 800 | 0.0002 | - |
| 0.3996 | 850 | 0.0002 | - |
| 0.4231 | 900 | 0.0002 | - |
| 0.4466 | 950 | 0.0008 | - |
| 0.4701 | 1000 | 0.0002 | - |
| 0.4937 | 1050 | 0.0003 | - |
| 0.5172 | 1100 | 0.0001 | - |
| 0.5407 | 1150 | 0.0002 | - |
| 0.5642 | 1200 | 0.0001 | - |
| 0.5877 | 1250 | 0.0001 | - |
| 0.6112 | 1300 | 0.0001 | - |
| 0.6347 | 1350 | 0.0004 | - |
| 0.6582 | 1400 | 0.0002 | - |
| 0.6817 | 1450 | 0.0001 | - |
| 0.7052 | 1500 | 0.0002 | - |
| 0.7287 | 1550 | 0.0001 | - |
| 0.7522 | 1600 | 0.0001 | - |
| 0.7757 | 1650 | 0.0001 | - |
| 0.7992 | 1700 | 0.0001 | - |
| 0.8228 | 1750 | 0.0001 | - |
| 0.8463 | 1800 | 0.0001 | - |
| 0.8698 | 1850 | 0.0001 | - |
| 0.8933 | 1900 | 0.0001 | - |
| 0.9168 | 1950 | 0.0001 | - |
| 0.9403 | 2000 | 0.0001 | - |
| 0.9638 | 2050 | 0.0001 | - |
| 0.9873 | 2100 | 0.0002 | - |
| 1.0108 | 2150 | 0.0001 | - |
| 1.0343 | 2200 | 0.0001 | - |
| 1.0578 | 2250 | 0.0001 | - |
| 1.0813 | 2300 | 0.0001 | - |
| 1.1048 | 2350 | 0.0001 | - |
| 1.1283 | 2400 | 0.0 | - |
| 1.1519 | 2450 | 0.0001 | - |
| 1.1754 | 2500 | 0.0 | - |
| 1.1989 | 2550 | 0.0001 | - |
| 1.2224 | 2600 | 0.0007 | - |
| 1.2459 | 2650 | 0.0001 | - |
| 1.2694 | 2700 | 0.0001 | - |
| 1.2929 | 2750 | 0.0001 | - |
| 1.3164 | 2800 | 0.0001 | - |
| 1.3399 | 2850 | 0.0001 | - |
| 1.3634 | 2900 | 0.0001 | - |
| 1.3869 | 2950 | 0.0001 | - |
| 1.4104 | 3000 | 0.0001 | - |
| 1.4339 | 3050 | 0.0001 | - |
| 1.4575 | 3100 | 0.0001 | - |
| 1.4810 | 3150 | 0.0001 | - |
| 1.5045 | 3200 | 0.0001 | - |
| 1.5280 | 3250 | 0.0001 | - |
| 1.5515 | 3300 | 0.0001 | - |
| 1.5750 | 3350 | 0.0001 | - |
| 1.5985 | 3400 | 0.0001 | - |
| 1.6220 | 3450 | 0.0001 | - |
| 1.6455 | 3500 | 0.0001 | - |
| 1.6690 | 3550 | 0.0001 | - |
| 1.6925 | 3600 | 0.0001 | - |
| 1.7160 | 3650 | 0.0 | - |
| 1.7395 | 3700 | 0.0001 | - |
| 1.7630 | 3750 | 0.0001 | - |
| 1.7866 | 3800 | 0.0 | - |
| 1.8101 | 3850 | 0.0001 | - |
| 1.8336 | 3900 | 0.0001 | - |
| 1.8571 | 3950 | 0.0 | - |
| 1.8806 | 4000 | 0.0001 | - |
| 1.9041 | 4050 | 0.0001 | - |
| 1.9276 | 4100 | 0.0001 | - |
| 1.9511 | 4150 | 0.0001 | - |
| 1.9746 | 4200 | 0.0001 | - |
| 1.9981 | 4250 | 0.0001 | - |

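The Epoch column can be cross-checked against the training sample counts and batch size reported in this card. The sketch below assumes that contrastive pairs are all unordered pairs of training sentences and that `oversampling` repeats the smaller of the positive and negative sets until both are the same size. That rule is an assumption about the sampler, not something this card states, but it reproduces the logged epoch fractions.

```python
from math import comb

# Training sample counts per label, as listed under Training Set Metrics.
train_counts = {
    "general faq": 24,
    "order tracking": 34,
    "product discoverability": 50,
    "product faq": 50,
    "product policy": 50,
}
batch_size = 16  # first element of batch_size: (16, 16)

total_samples = sum(train_counts.values())
all_pairs = comb(total_samples, 2)
# Positive pairs share a label; every other pair is a negative.
positive_pairs = sum(comb(n, 2) for n in train_counts.values())
negative_pairs = all_pairs - positive_pairs

# Assumed oversampling rule: grow the smaller set to match the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = pairs_per_epoch // batch_size

print(positive_pairs, negative_pairs)     # 4512 17016
print(steps_per_epoch)                    # 2127
print(round(50 / steps_per_epoch, 4))     # 0.0235
print(round(4250 / steps_per_epoch, 4))   # 1.9981
```

Under these assumptions one epoch is 2127 steps, so step 50 corresponds to epoch 0.0235 and step 4250 to epoch 1.9981, matching the table.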
Framework Versions
- Python: 3.10.13
- SetFit: 1.0.3
- Sentence Transformers: 3.0.1
- Transformers: 4.39.0
- PyTorch: 2.2.2+cu121
- Datasets: 2.19.2
- Tokenizers: 0.15.2
Citation
BibTeX
@article{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}