SetFit with nomic-ai/modernbert-embed-base

This is a SetFit model for text classification. It uses nomic-ai/modernbert-embed-base as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
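
As a minimal sketch of these two steps (assuming the setfit Trainer API and a tiny, hypothetical dataset with "text" and "label" columns):

from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical few-shot training data
train_dataset = Dataset.from_dict({
    "text": ["a sentence about nuclear deterrence", "a sentence about something else"],
    "label": [1, 0],
})

# The Sentence Transformer body; by default SetFit attaches a
# scikit-learn LogisticRegression head for classification
model = SetFitModel.from_pretrained("nomic-ai/modernbert-embed-base")

args = TrainingArguments(batch_size=20, num_epochs=20)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)

# Step 1: contrastive fine-tuning of the embedding body
# Step 2: fitting the LogisticRegression head on the fine-tuned embeddings
trainer.train()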

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: nomic-ai/modernbert-embed-base
  • Classification head: a LogisticRegression instance
  • Number of Classes: 2
  • Model size: ~0.1B parameters (F32, safetensors)

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: https://arxiv.org/abs/2209.11055

Model Labels

Label 0 examples:
  • 'there has been significant reconstruction and development, especially in the north of the country, and afghanistan’s gross national product has tripled over the past few years.'
  • 'a number of commentators wrongly analysed the debate of last february as the end of the alliance.'
  • 'china has the right to, as all other nations to exercise their forces.'
Label 1 examples:
  • 'but we also need to take into account the security consequence for us here by the rise of china, investing in hypersonic glide vehicles, long range … significantly increasing their nuclear arsenals.'
  • 'as a first step, we are proposing mutual briefings on exercises and nuclear policies in the nato-russia council.'
  • "We underscore that Russia's irresponsible nuclear rhetoric is unacceptable and that any use of nuclear weapons would meet with unequivocal international condemnation and severe consequences."

Evaluation

Metrics

Accuracy (all labels): 0.9168
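
As a rough sketch of how such an accuracy figure can be reproduced on a held-out split (the evaluation texts and labels below are hypothetical placeholders):

from sklearn.metrics import accuracy_score
from setfit import SetFitModel

model = SetFitModel.from_pretrained("fefofico/nuclear_trained")

# Hypothetical held-out examples with gold labels
eval_texts = [
    "a sentence discussing nuclear arsenals",
    "a sentence about unrelated policy",
]
eval_labels = [1, 0]

preds = model.predict(eval_texts)
print("accuracy:", accuracy_score(eval_labels, preds))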

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference:

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("fefofico/nuclear_trained")
# Run inference
preds = model("so, this is a modernization of the nuclear deterrent we have for many years.")
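
The model also accepts a batch of sentences, and per-class probabilities are available from the LogisticRegression head via predict_proba (a small sketch; the second sentence is just an illustrative negative example):

sentences = [
    "so, this is a modernization of the nuclear deterrent we have for many years.",
    "there has been significant reconstruction and development in the north of the country.",
]

# Predicted labels for a batch of sentences
preds = model.predict(sentences)

# Per-class probabilities from the classification head
probs = model.predict_proba(sentences)
print(preds, probs)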

Training Details

Training Set Metrics

Training set  Min  Median   Max
Word count    2    24.7149  132

Label  Training Sample Count
0      1017
1      856

Training Hyperparameters

  • batch_size: (20, 20)
  • num_epochs: (20, 20)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 3
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
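
These values map onto SetFit's TrainingArguments; a sketch of the non-default settings (the two-value tuples correspond to the embedding and classifier training phases):

from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(20, 20),                 # (embedding phase, classifier phase)
    num_epochs=(20, 20),
    num_iterations=3,                    # contrastive pairs generated per sample
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    sampling_strategy="oversampling",
    margin=0.25,
    l2_weight=0.01,
    warmup_proportion=0.1,
    seed=42,
)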

Training Results

Epoch Step Training Loss Validation Loss
0.0018 1 0.258 -
0.0890 50 0.2535 -
0.1779 100 0.2445 -
0.2669 150 0.2423 -
0.3559 200 0.2315 -
0.4448 250 0.2077 -
0.5338 300 0.1586 -
0.6228 350 0.136 -
0.7117 400 0.1016 -
0.8007 450 0.0879 -
0.8897 500 0.0641 -
0.9786 550 0.0523 -
1.0676 600 0.0456 -
1.1566 650 0.0358 -
1.2456 700 0.0243 -
1.3345 750 0.0197 -
1.4235 800 0.0173 -
1.5125 850 0.0103 -
1.6014 900 0.0105 -
1.6904 950 0.0118 -
1.7794 1000 0.0202 -
1.8683 1050 0.0124 -
1.9573 1100 0.0118 -
2.0463 1150 0.0074 -
2.1352 1200 0.0045 -
2.2242 1250 0.0036 -
2.3132 1300 0.0068 -
2.4021 1350 0.0032 -
2.4911 1400 0.0012 -
2.5801 1450 0.0021 -
2.6690 1500 0.0021 -
2.7580 1550 0.0003 -
2.8470 1600 0.0025 -
2.9359 1650 0.0003 -
3.0249 1700 0.0002 -
3.1139 1750 0.0002 -
3.2028 1800 0.0001 -
3.2918 1850 0.0001 -
3.3808 1900 0.0001 -
3.4698 1950 0.0001 -
3.5587 2000 0.0003 -
3.6477 2050 0.0001 -
3.7367 2100 0.0004 -
3.8256 2150 0.0009 -
3.9146 2200 0.0001 -
4.0036 2250 0.0006 -
4.0925 2300 0.0005 -
4.1815 2350 0.0001 -
4.2705 2400 0.0001 -
4.3594 2450 0.0001 -
4.4484 2500 0.0001 -
4.5374 2550 0.0001 -
4.6263 2600 0.0001 -
4.7153 2650 0.0001 -
4.8043 2700 0.0001 -
4.8932 2750 0.0 -
4.9822 2800 0.0003 -
5.0712 2850 0.0 -
5.1601 2900 0.0 -
5.2491 2950 0.0 -
5.3381 3000 0.0 -
5.4270 3050 0.0 -
5.5160 3100 0.0 -
5.6050 3150 0.0002 -
5.6940 3200 0.0 -
5.7829 3250 0.0 -
5.8719 3300 0.0001 -
5.9609 3350 0.0 -
6.0498 3400 0.0 -
6.1388 3450 0.0 -
6.2278 3500 0.0 -
6.3167 3550 0.0 -
6.4057 3600 0.0 -
6.4947 3650 0.0 -
6.5836 3700 0.0 -
6.6726 3750 0.0 -
6.7616 3800 0.0 -
6.8505 3850 0.0 -
6.9395 3900 0.0 -
7.0285 3950 0.0 -
7.1174 4000 0.0 -
7.2064 4050 0.0 -
7.2954 4100 0.0 -
7.3843 4150 0.0 -
7.4733 4200 0.0 -
7.5623 4250 0.0 -
7.6512 4300 0.0 -
7.7402 4350 0.0 -
7.8292 4400 0.0 -
7.9181 4450 0.0 -
8.0071 4500 0.0 -
8.0961 4550 0.0 -
8.1851 4600 0.0 -
8.2740 4650 0.0 -
8.3630 4700 0.0 -
8.4520 4750 0.0 -
8.5409 4800 0.0 -
8.6299 4850 0.0 -
8.7189 4900 0.0 -
8.8078 4950 0.0 -
8.8968 5000 0.0 -
8.9858 5050 0.0 -
9.0747 5100 0.0 -
9.1637 5150 0.0 -
9.2527 5200 0.0 -
9.3416 5250 0.0 -
9.4306 5300 0.0 -
9.5196 5350 0.0 -
9.6085 5400 0.0 -
9.6975 5450 0.0 -
9.7865 5500 0.0 -
9.8754 5550 0.0 -
9.9644 5600 0.0 -
10.0534 5650 0.0 -
10.1423 5700 0.0 -
10.2313 5750 0.0 -
10.3203 5800 0.0 -
10.4093 5850 0.0 -
10.4982 5900 0.0 -
10.5872 5950 0.0 -
10.6762 6000 0.0 -
10.7651 6050 0.0 -
10.8541 6100 0.0 -
10.9431 6150 0.0 -
11.0320 6200 0.0 -
11.1210 6250 0.0 -
11.2100 6300 0.0 -
11.2989 6350 0.0 -
11.3879 6400 0.0 -
11.4769 6450 0.0 -
11.5658 6500 0.0 -
11.6548 6550 0.0 -
11.7438 6600 0.0 -
11.8327 6650 0.0 -
11.9217 6700 0.0 -
12.0107 6750 0.0 -
12.0996 6800 0.0 -
12.1886 6850 0.0 -
12.2776 6900 0.0 -
12.3665 6950 0.0 -
12.4555 7000 0.0 -
12.5445 7050 0.0 -
12.6335 7100 0.0 -
12.7224 7150 0.0 -
12.8114 7200 0.0 -
12.9004 7250 0.0 -
12.9893 7300 0.0 -
13.0783 7350 0.0 -
13.1673 7400 0.0 -
13.2562 7450 0.0 -
13.3452 7500 0.0 -
13.4342 7550 0.0 -
13.5231 7600 0.0 -
13.6121 7650 0.0 -
13.7011 7700 0.0 -
13.7900 7750 0.0 -
13.8790 7800 0.0 -
13.9680 7850 0.0 -
14.0569 7900 0.0 -
14.1459 7950 0.0 -
14.2349 8000 0.0 -
14.3238 8050 0.0 -
14.4128 8100 0.0 -
14.5018 8150 0.0 -
14.5907 8200 0.0 -
14.6797 8250 0.0 -
14.7687 8300 0.0 -
14.8577 8350 0.0 -
14.9466 8400 0.0 -
15.0356 8450 0.0 -
15.1246 8500 0.0 -
15.2135 8550 0.0 -
15.3025 8600 0.0 -
15.3915 8650 0.0 -
15.4804 8700 0.0 -
15.5694 8750 0.0 -
15.6584 8800 0.0 -
15.7473 8850 0.0 -
15.8363 8900 0.0 -
15.9253 8950 0.0 -
16.0142 9000 0.0 -
16.1032 9050 0.0 -
16.1922 9100 0.0 -
16.2811 9150 0.0 -
16.3701 9200 0.0 -
16.4591 9250 0.0 -
16.5480 9300 0.0 -
16.6370 9350 0.0 -
16.7260 9400 0.0 -
16.8149 9450 0.0 -
16.9039 9500 0.0 -
16.9929 9550 0.0 -
17.0819 9600 0.0 -
17.1708 9650 0.0 -
17.2598 9700 0.0 -
17.3488 9750 0.0 -
17.4377 9800 0.0 -
17.5267 9850 0.0 -
17.6157 9900 0.0 -
17.7046 9950 0.0 -
17.7936 10000 0.0 -
17.8826 10050 0.0 -
17.9715 10100 0.0 -
18.0605 10150 0.0 -
18.1495 10200 0.0 -
18.2384 10250 0.0 -
18.3274 10300 0.0 -
18.4164 10350 0.0 -
18.5053 10400 0.0 -
18.5943 10450 0.0 -
18.6833 10500 0.0 -
18.7722 10550 0.0 -
18.8612 10600 0.0 -
18.9502 10650 0.0 -
19.0391 10700 0.0 -
19.1281 10750 0.0 -
19.2171 10800 0.0 -
19.3060 10850 0.0 -
19.3950 10900 0.0 -
19.4840 10950 0.0 -
19.5730 11000 0.0 -
19.6619 11050 0.0 -
19.7509 11100 0.0 -
19.8399 11150 0.0 -
19.9288 11200 0.0 -

Framework Versions

  • Python: 3.12.12
  • SetFit: 1.1.3
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.8.0+cu126
  • Datasets: 4.0.0
  • Tokenizers: 0.22.1
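
To approximate this environment, the listed versions can be pinned at install time (a sketch; the exact PyTorch CUDA build may differ):

pip install setfit==1.1.3 sentence-transformers==5.1.2 transformers==4.57.1 torch==2.8.0 datasets==4.0.0 tokenizers==0.22.1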

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}