Paper: Efficient Few-Shot Learning Without Prompts (arXiv:2209.11055)
This is a SetFit model that can be used for text classification. It uses JohanHeinsen/Old_News_Segmentation_SBERT_V0.1 as the Sentence Transformer embedding model, with a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning the Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
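The two steps can be illustrated with a small, self-contained sketch. This is not the SetFit library itself: random vectors stand in for the Sentence Transformer embeddings, and `contrastive_pairs` is a hypothetical helper showing how labeled examples are turned into similarity pairs for contrastive fine-tuning.

```python
from itertools import combinations

import numpy as np
from sklearn.linear_model import LogisticRegression


def contrastive_pairs(texts, labels):
    """Build (text_a, text_b, similarity) pairs: similarity is 1 for examples
    that share a label and 0 otherwise. SetFit fine-tunes the Sentence
    Transformer on pairs like these before fitting the head."""
    return [
        (a, b, 1 if la == lb else 0)
        for (a, la), (b, lb) in combinations(zip(texts, labels), 2)
    ]


texts = ["text A", "text B", "text C", "text D"]
labels = [0, 0, 1, 1]

# Step 1: generate contrastive pairs from the few labeled examples.
pairs = contrastive_pairs(texts, labels)
print(len(pairs))  # 6 pairs from 4 examples
print(sum(sim for _, _, sim in pairs))  # 2 positive (same-label) pairs

# Step 2: fit a LogisticRegression head on sentence embeddings.
# Random vectors stand in for the fine-tuned encoder's output here.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(texts), 8))
head = LogisticRegression().fit(embeddings, labels)
print(head.predict(embeddings).shape)  # (4,)
```

The pair generation is why SetFit is sample-efficient: n labeled examples yield on the order of n² contrastive training pairs for the encoder.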
| Label | Accuracy | F1 | Precision | Recall |
|---|---|---|---|---|
| all | 0.98 | 0.9371 | 0.9437 | 0.9306 |
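As a quick sanity check, the reported F1 is consistent with the reported precision and recall, since F1 is their harmonic mean:

```python
# Evaluation metrics from the table above
precision, recall = 0.9437, 0.9306

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9371, matching the reported F1
```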
First install the SetFit library:

```
pip install setfit
```

Then you can load this model and run inference:

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference (Danish: "2) A male, 19-24 years old, slightly below medium
# height, blond, no beard, ruddy, black cloth coat and a black, flat cap, –
# charged with the theft No. 765. (II).")
preds = model("2) En Mandsperson, 19-24 Aar gl., lidt under Middelhøide, blond, uden Skjæg, rødmusset, sort Klædesfrakke og sort, flad Kaskjet, – sigtes for Tyveriet Nr. 765. (II).")
```
| Training set | Min | Median | Max |
|---|---|---|---|
| Word count | 7 | 56.6019 | 1181 |
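Statistics like these can be reproduced from any list of training texts; the snippet below assumes word count means a simple whitespace split, and uses a three-sentence toy corpus rather than the actual 1,050-example training set:

```python
import statistics

# Toy corpus standing in for the real training texts
texts = [
    "Short notice.",
    "A somewhat longer description of a stolen black cloth coat.",
    "One more example sentence for the statistics.",
]

# Word count per text, assuming a simple whitespace split
counts = [len(t.split()) for t in texts]
print(min(counts), statistics.median(counts), max(counts))  # 2 7 10
```

A maximum of 1,181 words is worth noting: it far exceeds the token limit of typical Sentence Transformer encoders, so the longest texts are likely truncated when embedded.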
| Label | Training Sample Count |
|---|---|
| 0 | 861 |
| 1 | 189 |
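The training set is imbalanced (861 vs. 189 examples), which puts the reported accuracy of 0.98 in context: assuming the evaluation split has a similar balance, a classifier that always predicts the majority label would already reach about 0.82.

```python
# Training sample counts from the table above
counts = {0: 861, 1: 189}
total = sum(counts.values())  # 1050

# Share of the majority class: the accuracy of always predicting label 0
# on a split with the same class balance
majority_baseline = max(counts.values()) / total
print(majority_baseline)  # 0.82
```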
| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0.0003 | 1 | 0.2644 | - |
| 0.0130 | 50 | 0.2902 | - |
| 0.0260 | 100 | 0.1297 | - |
| 0.0390 | 150 | 0.0355 | - |
| 0.0519 | 200 | 0.02 | - |
| 0.0649 | 250 | 0.0086 | - |
| 0.0779 | 300 | 0.0039 | - |
| 0.0909 | 350 | 0.0024 | - |
| 0.1039 | 400 | 0.0019 | - |
| 0.1169 | 450 | 0.0007 | - |
| 0.1299 | 500 | 0.0001 | - |
| 0.1429 | 550 | 0.0001 | - |
| 0.1558 | 600 | 0.0001 | - |
| 0.1688 | 650 | 0.0001 | - |
| 0.1818 | 700 | 0.0001 | - |
| 0.1948 | 750 | 0.0 | - |
| 0.2078 | 800 | 0.0 | - |
| 0.2208 | 850 | 0.0 | - |
| 0.2338 | 900 | 0.0 | - |
| 0.2468 | 950 | 0.0 | - |
| 0.2597 | 1000 | 0.0 | - |
| 0.2727 | 1050 | 0.0 | - |
| 0.2857 | 1100 | 0.0 | - |
| 0.2987 | 1150 | 0.0 | - |
| 0.3117 | 1200 | 0.0 | - |
| 0.3247 | 1250 | 0.0 | - |
| 0.3377 | 1300 | 0.0 | - |
| 0.3506 | 1350 | 0.0 | - |
| 0.3636 | 1400 | 0.0 | - |
| 0.3766 | 1450 | 0.0 | - |
| 0.3896 | 1500 | 0.0 | - |
| 0.4026 | 1550 | 0.0 | - |
| 0.4156 | 1600 | 0.0 | - |
| 0.4286 | 1650 | 0.0 | - |
| 0.4416 | 1700 | 0.0 | - |
| 0.4545 | 1750 | 0.0 | - |
| 0.4675 | 1800 | 0.0 | - |
| 0.4805 | 1850 | 0.0 | - |
| 0.4935 | 1900 | 0.0 | - |
| 0.5065 | 1950 | 0.0 | - |
| 0.5195 | 2000 | 0.0 | - |
| 0.5325 | 2050 | 0.0 | - |
| 0.5455 | 2100 | 0.0 | - |
| 0.5584 | 2150 | 0.0 | - |
| 0.5714 | 2200 | 0.0 | - |
| 0.5844 | 2250 | 0.0 | - |
| 0.5974 | 2300 | 0.0 | - |
| 0.6104 | 2350 | 0.0 | - |
| 0.6234 | 2400 | 0.0 | - |
| 0.6364 | 2450 | 0.0 | - |
| 0.6494 | 2500 | 0.0 | - |
| 0.6623 | 2550 | 0.0 | - |
| 0.6753 | 2600 | 0.0 | - |
| 0.6883 | 2650 | 0.0 | - |
| 0.7013 | 2700 | 0.0 | - |
| 0.7143 | 2750 | 0.0 | - |
| 0.7273 | 2800 | 0.0 | - |
| 0.7403 | 2850 | 0.0 | - |
| 0.7532 | 2900 | 0.0 | - |
| 0.7662 | 2950 | 0.0 | - |
| 0.7792 | 3000 | 0.0 | - |
| 0.7922 | 3050 | 0.0 | - |
| 0.8052 | 3100 | 0.0 | - |
| 0.8182 | 3150 | 0.0 | - |
| 0.8312 | 3200 | 0.0 | - |
| 0.8442 | 3250 | 0.0 | - |
| 0.8571 | 3300 | 0.0 | - |
| 0.8701 | 3350 | 0.0 | - |
| 0.8831 | 3400 | 0.0 | - |
| 0.8961 | 3450 | 0.0 | - |
| 0.9091 | 3500 | 0.0 | - |
| 0.9221 | 3550 | 0.0 | - |
| 0.9351 | 3600 | 0.0 | - |
| 0.9481 | 3650 | 0.0 | - |
| 0.9610 | 3700 | 0.0 | - |
| 0.9740 | 3750 | 0.0 | - |
| 0.9870 | 3800 | 0.0 | - |
| 1.0 | 3850 | 0.0 | - |
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
  doi = {10.48550/ARXIV.2209.11055},
  url = {https://arxiv.org/abs/2209.11055},
  author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
  keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title = {Efficient Few-Shot Learning Without Prompts},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```
Base model: CALDISS-AAU/DA-BERT_Old_News_V1