πŸ“ T5-small fine-tuned on XSUM for Summarization

This model is T5-small fine-tuned on the Extreme Summarization (XSUM) dataset for abstractive summarization.
Given an English news article, it generates a single-sentence summary.

πŸ“¦ Model Details

  • Base model: T5-small
  • Task: Abstractive summarization
  • Language: English
  • Dataset: XSUM
  • Fine-tuning: 1 epoch (12,753 steps)
  • Max input length: 1024 tokens
  • Max target length: 128 tokens (see the preprocessing sketch below)
  • Parameters: 60.5M (FP32, safetensors)
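
The exact training script is not published with this card. A preprocessing sketch consistent with the lengths above might look like the following; the document/summary field names come from the XSUM dataset, while the trainer setup itself is an assumption:

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
dataset = load_dataset("EdinburghNLP/xsum")  # fields: "document" (article), "summary" (one sentence)

def preprocess(batch):
    # Prefix each article with the T5 summarization task tag,
    # truncating to the 1024-token input limit listed above
    model_inputs = tokenizer(
        ["summarize: " + doc for doc in batch["document"]],
        max_length=1024,
        truncation=True,
    )
    # Tokenize the reference summaries as targets (128-token limit)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True)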

πŸš€ How to Use

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("ShahzebKhoso/T5-small-xsum")
tokenizer = AutoTokenizer.from_pretrained("ShahzebKhoso/T5-small-xsum")

text = "The Prime Minister held a meeting today with the cabinet..."
inputs = tokenizer("summarize: " + text, return_tensors="pt", truncation=True)

summary_ids = model.generate(**inputs, max_length=64, min_length=10, length_penalty=2.0, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
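
The model also works with the high-level pipeline API. Whether the "summarize: " prefix is applied automatically depends on the saved model config, so this sketch adds it explicitly:

from transformers import pipeline

summarizer = pipeline("summarization", model="ShahzebKhoso/T5-small-xsum")

article = "The Prime Minister held a meeting today with the cabinet..."
# Prepend the task prefix in case it is not stored in the model config
result = summarizer("summarize: " + article, max_length=64, min_length=10, num_beams=4)
print(result[0]["summary_text"])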

πŸ“Š Evaluation Results

Evaluated on the XSUM validation set:

Metric       Score
ROUGE-1      28.591
ROUGE-2       7.8217
ROUGE-L      22.38
ROUGE-Lsum   22.382
Gen. Len.    19.71
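
Scores on this scale can be reproduced with the evaluate library's ROUGE implementation. A minimal sketch follows; the predictions/references are placeholders, and the exact generation settings behind the table above are not specified in the card:

import evaluate

rouge = evaluate.load("rouge")

# Placeholder strings; in practice, decode model outputs for the
# XSUM validation split and pair them with the reference summaries
predictions = ["the prime minister met the cabinet to discuss the budget."]
references = ["the pm held a cabinet meeting about the budget."]

scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# compute() returns F-measures in [0, 1]; multiply by 100 to match the table
print({k: round(v * 100, 4) for k, v in scores.items()})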

⚠️ Limitations and Bias

  • The model is trained only on English news articles from the XSUM dataset.
  • It may hallucinate facts that are not present in the source text.
  • Summaries are very short (one sentence), consistent with XSUM style.

