---
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget: []
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
library_name: setfit
inference: true
license: mit
datasets:
- NLBSE/nlbse25-code-comment-classification
language:
- en
base_model:
- sentence-transformers/all-MiniLM-L6-v2
---
# Python comment classifier
This is a [SetFit](https://github.com/huggingface/setfit) model for classifying Python code comments.
The model was trained with few-shot learning in two steps:
1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned model.
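Step 1 above relies on pairs of sentences marked as similar or dissimilar. The sketch below shows one simplified way such pairs could be built from a handful of labeled comments. It is not the sampling code used by SetFit or by this model: the function name `make_contrastive_pairs`, the example comments, and the label names are all hypothetical, and each comment is assumed to carry a single label.

```python
from itertools import combinations


def make_contrastive_pairs(examples):
    """Build (text_a, text_b, similarity) triples from (text, label) examples.

    Pairs that share a label get similarity 1.0; all other pairs get 0.0.
    This is an illustrative simplification, not SetFit's own sampler.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        similarity = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, similarity))
    return pairs


# Hypothetical labeled comments; the label names are illustrative only.
examples = [
    ("Returns the sorted list.", "summary"),
    ("Computes the mean of the values.", "summary"),
    ("x: the input array.", "parameters"),
    ("n: number of items to keep.", "parameters"),
]

pairs = make_contrastive_pairs(examples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Even a few labeled comments yield many pairs, which is why this style of training can work in a few-shot setting.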
## Model Description
- **Model Type:** SetFit
- **Classification head:** [RandomForestClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html)
## Sources
- **Repository:** [GitHub](https://github.com/fabiancpl/sbert-comment-classification/)
- **Paper:** [Evaluating the Performance and Efficiency of Sentence-BERT for Code Comment Classification](https://ieeexplore.ieee.org/document/11029440)
- **Dataset:** [HF Dataset](https://huggingface.co/datasets/NLBSE/nlbse25-code-comment-classification)
## How to use it
First, install the dependencies:
```bash
pip install setfit scikit-learn
```
Then, load the model and run inference:
```python
from setfit import SetFitModel
# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("fabiancpl/nlbse25_python")
# Run inference on a single comment and show the prediction
preds = model("This function sorts a list of numbers.")
print(preds)
```
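To classify several comments at once, the model can also be called with a list of strings. The helper below is a minimal sketch of that pattern: `classify_comments` is a hypothetical name, not part of SetFit, and it only assumes that the model is a callable that takes a list of strings and returns one prediction per string. A stand-in callable is used here so the example runs without downloading anything.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def classify_comments(
    model: Callable[[List[str]], Sequence],
    comments: Iterable[str],
) -> List[Tuple[str, object]]:
    """Pair each non-empty comment with the prediction the model returns for it."""
    # Strip whitespace and drop empty comments before calling the model.
    batch = [c.strip() for c in comments if c.strip()]
    if not batch:
        return []
    preds = model(batch)
    return list(zip(batch, preds))


# Stand-in for the real model (assumption: it behaves like a callable on a list).
# With SetFit installed, you could instead pass:
#   model = SetFitModel.from_pretrained("fabiancpl/nlbse25_python")
def fake_model(texts: List[str]) -> List[int]:
    return [len(t) for t in texts]


results = classify_comments(fake_model, ["  Sorts a list. ", "", "x: input array."])
for comment, pred in results:
    print(comment, "->", pred)
```

The shape and meaning of each prediction depend on how the model was trained, so check the output of a single call before building on it.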
## Cite as
```bibtex
@inproceedings{11029440,
author={Peña, Fabian C. and Herbold, Steffen},
booktitle={2025 IEEE/ACM International Workshop on Natural Language-Based Software Engineering (NLBSE)},
title={Evaluating the Performance and Efficiency of Sentence-BERT for Code Comment Classification},
year={2025},
pages={21-24},
doi={10.1109/NLBSE66842.2025.00010}}
```