Paper: LoRA: Low-Rank Adaptation of Large Language Models (arXiv:2106.09685)
This model is a parameter-efficient fine-tune of facebook/mbart-large-50-many-to-many-mmt, adapted with LoRA (Low-Rank Adaptation) via the Hugging Face PEFT library.
It was fine-tuned in a few-shot setting on the HackHedron English-Telugu Parallel Corpus, using just 1% of the data (~4.3k sentence pairs).
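The 1% few-shot subset can be reproduced with the datasets library. The sketch below is only illustrative: the Hub dataset ID is a placeholder (the HackHedron corpus may be hosted under a different name), and it assumes the english/telugu columns described further down this card.

```python
from datasets import load_dataset

# Placeholder Hub ID for the HackHedron English-Telugu Parallel Corpus;
# replace with the path of the copy you actually use.
raw = load_dataset("your-username/hackhedron-en-te-parallel", split="train")

# Keep roughly 1% of the pairs (~4.3k examples) for few-shot fine-tuning.
few_shot = raw.shuffle(seed=42).select(range(int(0.01 * len(raw))))

print(few_shot)      # expected columns: "english", "telugu"
print(few_shot[0])   # e.g. {"english": "...", "telugu": "..."}
```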
- Base model: facebook/mbart-large-50-many-to-many-mmt
- Language pair: en_XX → te_IN
- Libraries: peft, transformers, datasets
- Dataset columns: english (source text), telugu (target translation)

Usage:

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
from peft import PeftModel

# Load base model & tokenizer
base_model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
tokenizer = MBart50TokenizerFast.from_pretrained("your-username/lora-mbart-en-te")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "your-username/lora-mbart-en-te")

# Set source and target languages
tokenizer.src_lang = "en_XX"
tokenizer.tgt_lang = "te_IN"

# Prepare input and generate the Telugu translation
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
generated_ids = model.generate(**inputs, forced_bos_token_id=tokenizer.lang_code_to_id["te_IN"])
translation = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(translation)
```
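If you want to serve the model without the PEFT wrapper, the LoRA adapter can optionally be merged into the base weights. This is a standard PeftModel operation rather than something shipped with this checkpoint, and the output directory below is just an example path.

```python
# Optional: fold the LoRA weights into the base model for standalone use.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("lora-mbart-en-te-merged")  # example local path
tokenizer.save_pretrained("lora-mbart-en-te-merged")
```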
Training configuration:

| Setting | Value |
|---|---|
| Base model | mBART-50 |
| LoRA rank (r) | 8 |
| LoRA alpha | 32 |
| Dropout | 0.1 |
| Optimizer | AdamW |
| Batch size | 8 |
| Epochs | 3 |
| Mixed precision | fp16 |
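Below is a minimal sketch of how these settings map onto a PEFT LoRA setup. The target_modules choice and the learning rate are assumptions that are not recorded in the table; the other values mirror the table above.

```python
from transformers import MBartForConditionalGeneration, Seq2SeqTrainingArguments
from peft import LoraConfig, TaskType, get_peft_model

base = MBartForConditionalGeneration.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt"
)

# LoRA settings mirroring the table; target_modules is an assumption
# (attention query/value projections are a common choice for mBART).
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# Trainer-style arguments matching the table; AdamW is the Trainer default
# optimizer, and the learning rate here is illustrative only.
training_args = Seq2SeqTrainingArguments(
    output_dir="lora-mbart-en-te",
    per_device_train_batch_size=8,
    num_train_epochs=3,
    fp16=True,
    learning_rate=2e-4,  # assumption, not listed above
)
```

These arguments would then be passed to a Seq2SeqTrainer together with the tokenized few-shot split shown earlier.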
The model supports translation between en_XX (English) and te_IN (Telugu) only at this stage.

If you use this model, please cite the base model:
```bibtex
@article{liu2020mbart,
  title={Multilingual Denoising Pre-training for Neural Machine Translation},
  author={Liu, Yinhan and others},
  journal={Transactions of the Association for Computational Linguistics},
  year={2020}
}
```
Fine-tuned by Koushik Reddy, ML & DL Enthusiast | NLP | LoRA | mBART | Hugging Face
Connect: Hugging Face