SetFit with nomic-ai/modernbert-embed-base

This is a SetFit model for text classification. It uses nomic-ai/modernbert-embed-base as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
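
As a minimal sketch of these two steps (assuming the setfit Trainer API and a tiny, hypothetical dataset with "text" and "label" columns):

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical few-shot training data
train_dataset = Dataset.from_dict({
    "text": ["a sentence about nuclear deterrence", "a sentence about something else"],
    "label": [1, 0],
})

# The Sentence Transformer body; by default SetFit attaches a
# scikit-learn LogisticRegression head for classification
model = SetFitModel.from_pretrained("nomic-ai/modernbert-embed-base")

args = TrainingArguments(batch_size=20, num_epochs=20)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)

# Step 1: contrastive fine-tuning of the embedding body
# Step 2: fitting the LogisticRegression head on the fine-tuned embeddings
trainer.train()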

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: nomic-ai/modernbert-embed-base
  • Classification head: a LogisticRegression instance
  • Number of Classes: 2
  • Model size: ~0.1B parameters (F32, safetensors)

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: https://arxiv.org/abs/2209.11055

Model Labels

Label 0 examples:
  • 'there has been significant reconstruction and development, especially in the north of the country, and afghanistan’s gross national product has tripled over the past few years.'
  • 'a number of commentators wrongly analysed the debate of last february as the end of the alliance.'
  • 'china has the right to, as all other nations to exercise their forces.'
Label 1 examples:
  • 'but we also need to take into account the security consequence for us here by the rise of china, investing in hypersonic glide vehicles, long range … significantly increasing their nuclear arsenals.'
  • 'as a first step, we are proposing mutual briefings on exercises and nuclear policies in the nato-russia council.'
  • "We underscore that Russia's irresponsible nuclear rhetoric is unacceptable and that any use of nuclear weapons would meet with unequivocal international condemnation and severe consequences."

Evaluation

Metrics

Accuracy (all labels): 0.9168
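
As a rough sketch of how such an accuracy figure can be reproduced on a held-out split (the evaluation texts and labels below are hypothetical placeholders):

from sklearn.metrics import accuracy_score
from setfit import SetFitModel

model = SetFitModel.from_pretrained("fefofico/nuclear_trained")

# Hypothetical held-out examples with gold labels
eval_texts = [
    "a sentence discussing nuclear arsenals",
    "a sentence about unrelated policy",
]
eval_labels = [1, 0]

preds = model.predict(eval_texts)
print("accuracy:", accuracy_score(eval_labels, preds))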

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference:

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("fefofico/nuclear_trained")
# Run inference
preds = model("so, this is a modernization of the nuclear deterrent we have for many years.")
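
The model also accepts a batch of sentences, and per-class probabilities are available from the LogisticRegression head via predict_proba (a small sketch; the second sentence is just an illustrative negative example):

sentences = [
    "so, this is a modernization of the nuclear deterrent we have for many years.",
    "there has been significant reconstruction and development in the north of the country.",
]

# Predicted labels for a batch of sentences
preds = model.predict(sentences)

# Per-class probabilities from the classification head
probs = model.predict_proba(sentences)
print(preds, probs)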

Training Details

Training Set Metrics

Training set  Min  Median   Max
Word count    2    24.7149  132

Label  Training Sample Count
0      1017
1      856

Training Hyperparameters

  • batch_size: (20, 20)
  • num_epochs: (20, 20)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 3
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
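
These values map onto SetFit's TrainingArguments; a sketch of the non-default settings (the two-value tuples correspond to the embedding and classifier training phases):

from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(20, 20),                 # (embedding phase, classifier phase)
    num_epochs=(20, 20),
    num_iterations=3,                    # contrastive pairs generated per sample
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    sampling_strategy="oversampling",
    margin=0.25,
    l2_weight=0.01,
    warmup_proportion=0.1,
    seed=42,
)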

Training Results

Epoch Step Training Loss Validation Loss
0.0018 1 0.258 -
0.0890 50 0.2535 -
0.1779 100 0.2445 -
0.2669 150 0.2423 -
0.3559 200 0.2315 -
0.4448 250 0.2077 -
0.5338 300 0.1586 -
0.6228 350 0.136 -
0.7117 400 0.1016 -
0.8007 450 0.0879 -
0.8897 500 0.0641 -
0.9786 550 0.0523 -
1.0676 600 0.0456 -
1.1566 650 0.0358 -
1.2456 700 0.0243 -
1.3345 750 0.0197 -
1.4235 800 0.0173 -
1.5125 850 0.0103 -
1.6014 900 0.0105 -
1.6904 950 0.0118 -
1.7794 1000 0.0202 -
1.8683 1050 0.0124 -
1.9573 1100 0.0118 -
2.0463 1150 0.0074 -
2.1352 1200 0.0045 -
2.2242 1250 0.0036 -
2.3132 1300 0.0068 -
2.4021 1350 0.0032 -
2.4911 1400 0.0012 -
2.5801 1450 0.0021 -
2.6690 1500 0.0021 -
2.7580 1550 0.0003 -
2.8470 1600 0.0025 -
2.9359 1650 0.0003 -
3.0249 1700 0.0002 -
3.1139 1750 0.0002 -
3.2028 1800 0.0001 -
3.2918 1850 0.0001 -
3.3808 1900 0.0001 -
3.4698 1950 0.0001 -
3.5587 2000 0.0003 -
3.6477 2050 0.0001 -
3.7367 2100 0.0004 -
3.8256 2150 0.0009 -
3.9146 2200 0.0001 -
4.0036 2250 0.0006 -
4.0925 2300 0.0005 -
4.1815 2350 0.0001 -
4.2705 2400 0.0001 -
4.3594 2450 0.0001 -
4.4484 2500 0.0001 -
4.5374 2550 0.0001 -
4.6263 2600 0.0001 -
4.7153 2650 0.0001 -
4.8043 2700 0.0001 -
4.8932 2750 0.0 -
4.9822 2800 0.0003 -
5.0712 2850 0.0 -
5.1601 2900 0.0 -
5.2491 2950 0.0 -
5.3381 3000 0.0 -
5.4270 3050 0.0 -
5.5160 3100 0.0 -
5.6050 3150 0.0002 -
5.6940 3200 0.0 -
5.7829 3250 0.0 -
5.8719 3300 0.0001 -
5.9609 3350 0.0 -
6.0498 3400 0.0 -
6.1388 3450 0.0 -
6.2278 3500 0.0 -
6.3167 3550 0.0 -
6.4057 3600 0.0 -
6.4947 3650 0.0 -
6.5836 3700 0.0 -
6.6726 3750 0.0 -
6.7616 3800 0.0 -
6.8505 3850 0.0 -
6.9395 3900 0.0 -
7.0285 3950 0.0 -
7.1174 4000 0.0 -
7.2064 4050 0.0 -
7.2954 4100 0.0 -
7.3843 4150 0.0 -
7.4733 4200 0.0 -
7.5623 4250 0.0 -
7.6512 4300 0.0 -
7.7402 4350 0.0 -
7.8292 4400 0.0 -
7.9181 4450 0.0 -
8.0071 4500 0.0 -
8.0961 4550 0.0 -
8.1851 4600 0.0 -
8.2740 4650 0.0 -
8.3630 4700 0.0 -
8.4520 4750 0.0 -
8.5409 4800 0.0 -
8.6299 4850 0.0 -
8.7189 4900 0.0 -
8.8078 4950 0.0 -
8.8968 5000 0.0 -
8.9858 5050 0.0 -
9.0747 5100 0.0 -
9.1637 5150 0.0 -
9.2527 5200 0.0 -
9.3416 5250 0.0 -
9.4306 5300 0.0 -
9.5196 5350 0.0 -
9.6085 5400 0.0 -
9.6975 5450 0.0 -
9.7865 5500 0.0 -
9.8754 5550 0.0 -
9.9644 5600 0.0 -
10.0534 5650 0.0 -
10.1423 5700 0.0 -
10.2313 5750 0.0 -
10.3203 5800 0.0 -
10.4093 5850 0.0 -
10.4982 5900 0.0 -
10.5872 5950 0.0 -
10.6762 6000 0.0 -
10.7651 6050 0.0 -
10.8541 6100 0.0 -
10.9431 6150 0.0 -
11.0320 6200 0.0 -
11.1210 6250 0.0 -
11.2100 6300 0.0 -
11.2989 6350 0.0 -
11.3879 6400 0.0 -
11.4769 6450 0.0 -
11.5658 6500 0.0 -
11.6548 6550 0.0 -
11.7438 6600 0.0 -
11.8327 6650 0.0 -
11.9217 6700 0.0 -
12.0107 6750 0.0 -
12.0996 6800 0.0 -
12.1886 6850 0.0 -
12.2776 6900 0.0 -
12.3665 6950 0.0 -
12.4555 7000 0.0 -
12.5445 7050 0.0 -
12.6335 7100 0.0 -
12.7224 7150 0.0 -
12.8114 7200 0.0 -
12.9004 7250 0.0 -
12.9893 7300 0.0 -
13.0783 7350 0.0 -
13.1673 7400 0.0 -
13.2562 7450 0.0 -
13.3452 7500 0.0 -
13.4342 7550 0.0 -
13.5231 7600 0.0 -
13.6121 7650 0.0 -
13.7011 7700 0.0 -
13.7900 7750 0.0 -
13.8790 7800 0.0 -
13.9680 7850 0.0 -
14.0569 7900 0.0 -
14.1459 7950 0.0 -
14.2349 8000 0.0 -
14.3238 8050 0.0 -
14.4128 8100 0.0 -
14.5018 8150 0.0 -
14.5907 8200 0.0 -
14.6797 8250 0.0 -
14.7687 8300 0.0 -
14.8577 8350 0.0 -
14.9466 8400 0.0 -
15.0356 8450 0.0 -
15.1246 8500 0.0 -
15.2135 8550 0.0 -
15.3025 8600 0.0 -
15.3915 8650 0.0 -
15.4804 8700 0.0 -
15.5694 8750 0.0 -
15.6584 8800 0.0 -
15.7473 8850 0.0 -
15.8363 8900 0.0 -
15.9253 8950 0.0 -
16.0142 9000 0.0 -
16.1032 9050 0.0 -
16.1922 9100 0.0 -
16.2811 9150 0.0 -
16.3701 9200 0.0 -
16.4591 9250 0.0 -
16.5480 9300 0.0 -
16.6370 9350 0.0 -
16.7260 9400 0.0 -
16.8149 9450 0.0 -
16.9039 9500 0.0 -
16.9929 9550 0.0 -
17.0819 9600 0.0 -
17.1708 9650 0.0 -
17.2598 9700 0.0 -
17.3488 9750 0.0 -
17.4377 9800 0.0 -
17.5267 9850 0.0 -
17.6157 9900 0.0 -
17.7046 9950 0.0 -
17.7936 10000 0.0 -
17.8826 10050 0.0 -
17.9715 10100 0.0 -
18.0605 10150 0.0 -
18.1495 10200 0.0 -
18.2384 10250 0.0 -
18.3274 10300 0.0 -
18.4164 10350 0.0 -
18.5053 10400 0.0 -
18.5943 10450 0.0 -
18.6833 10500 0.0 -
18.7722 10550 0.0 -
18.8612 10600 0.0 -
18.9502 10650 0.0 -
19.0391 10700 0.0 -
19.1281 10750 0.0 -
19.2171 10800 0.0 -
19.3060 10850 0.0 -
19.3950 10900 0.0 -
19.4840 10950 0.0 -
19.5730 11000 0.0 -
19.6619 11050 0.0 -
19.7509 11100 0.0 -
19.8399 11150 0.0 -
19.9288 11200 0.0 -

Framework Versions

  • Python: 3.12.12
  • SetFit: 1.1.3
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.8.0+cu126
  • Datasets: 4.0.0
  • Tokenizers: 0.22.1
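
To approximate this environment, the listed versions can be pinned at install time (a sketch; the exact PyTorch CUDA build may differ):

pip install setfit==1.1.3 sentence-transformers==5.1.2 transformers==4.57.1 torch==2.8.0 datasets==4.0.0 tokenizers==0.22.1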

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}