Sinhala Nawarasa Emotion Classifier (SinLlama LoRA)
A Sinhala emotion classification model based on the classical Nawarasa framework.
This LoRA adapter is fine-tuned on top of polyglots/SinLlama_v01.
language:
- si
license: llama3
tags:
- text-classification
- emotion-recognition
- sinhala
- nawarasa
- nlp
- unsloth
- lora
base_model: polyglots/SinLlama_v01
Model Details
Model Description
This model is a culturally grounded emotion classification system for Sinhala text, based on the classical Nawarasa framework. It is a LoRA adapter fine-tuned on top of polyglots/SinLlama_v01 (a Sinhala-extended Llama-3-8B model).
The model identifies the dominant emotional category expressed in Sinhala song lyrics and poetic text, mapping poetic metaphors and idioms to one of the nine classical aesthetic emotions (Nawarasa).
- Developed by: ovinduG
- Model Type: Decoder-only autoregressive LLM (Instruction Fine-Tuned via LoRA)
- Language: Sinhala (si)
- Base Model: polyglots/SinLlama_v01
- Fine-Tuning Framework: Unsloth
- Inference Optimized For: 4-bit quantized GPU inference
The 9 Nawarasa Categories Supported
- Shringara – Romance / Love
- Hasya – Humor / Comedy
- Karuna – Sadness / Compassion / Pathos
- Roudhra – Anger / Fury
- Veera – Heroism / Bravery
- Bhayanakam – Fear / Terror
- Bhibatsa – Disgust / Aversion
- Adbhutha – Wonder / Amazement
- Shantha – Peace / Serenity
Training Details
Dataset
The model was trained on a custom, manually annotated dataset of Sinhala song lyrics and poetic segments.
Each segment was:
- Carefully reviewed
- Assigned a single dominant Nawarasa label
- Evaluated based on semantic meaning, emotional tone, metaphor usage, and Sri Lankan cultural context
Fine-Tuning Methodology
- Prompt Format: Alpaca Instruction Format
- Technique: Parameter-Efficient Fine-Tuning (PEFT) using LoRA
- Precision: bfloat16
- Quantization: 4-bit (QLoRA-style loading during inference)
- Epochs: 3
- Training Objective: Single-label dominant rasa classification
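The Alpaca instruction format listed above can be sketched as a small helper that turns one annotated (segment, label) pair into a training string. This is an illustration only: the actual training script is not part of this card, and the function name `build_training_example`, the exact whitespace between sections, and the choice to append an EOS token are assumptions.

```python
# Hypothetical helper, not the author's training code: formats one annotated
# segment as an Alpaca-style training string for single-label classification.

NAWARASA_LABELS = (
    "Shringara", "Hasya", "Karuna", "Roudhra", "Veera",
    "Bhayanakam", "Bhibatsa", "Adbhutha", "Shantha",
)

# Assumed layout; the whitespace used during real training may differ.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n"
    "### Instruction:\n{instruction}\n"
    "### Input:\n{text}\n"
    "### Response:\n{label}"
)


def build_training_example(instruction, text, label, eos_token):
    """Return one training string that ends with the given EOS token."""
    if label not in NAWARASA_LABELS:
        raise ValueError(f"Unknown Nawarasa label: {label!r}")
    body = ALPACA_TEMPLATE.format(
        instruction=instruction.strip(), text=text.strip(), label=label
    )
    # Ending on EOS encourages the model to stop right after the label.
    return body + eos_token
```

In practice `eos_token` would typically be taken from the tokenizer (for example `tokenizer.eos_token`) rather than hard-coded.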
Intended Use
Direct Use
This model is intended for use in:
- NLP research
- Computational literary analysis
- Sinhala sentiment & emotion analysis
- Digital humanities research
- Music/lyric emotional tagging systems
Example Applications
- Sinhala song emotion tagging
- Cultural AI research
- Poetry classification
- Literary corpus analysis
- Emotion-aware Sinhala chat systems
Out-of-Scope Usage
This model is not designed for:
- Multi-label emotion detection
- Psychological diagnosis
- Political persuasion analysis
- Legal or medical decision-making
- Real-time safety-critical systems
Limitations & Bias
Class Imbalance
Classical Sinhala poetry and modern lyrics skew toward:
- Shringara (Love)
- Karuna (Sadness)
As a result, the model may statistically favor these classes when handling ambiguous or emotionally mixed text.
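One way to check whether this skew affects your own data is to compare, per class, how often a label occurs in a held-out set with how often the model predicts it. The sketch below assumes you already have parallel lists of gold and predicted labels; the function name `per_class_report` and the toy data are made up for illustration and are not results from this model.

```python
from collections import Counter


def per_class_report(true_labels, predicted_labels):
    """Return {label: (support, times_predicted, recall)} for every label seen."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true_labels and predicted_labels must have equal length")
    support = Counter(true_labels)
    predicted = Counter(predicted_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    report = {}
    for label in sorted(set(support) | set(predicted)):
        n = support[label]
        # Recall is undefined for labels that never occur in the gold data.
        recall = correct[label] / n if n else None
        report[label] = (n, predicted[label], recall)
    return report


# Made-up toy data, not measurements from this model.
toy_true = ["Karuna", "Karuna", "Veera", "Shantha"]
toy_pred = ["Karuna", "Karuna", "Karuna", "Shantha"]
for label, (n, n_pred, recall) in per_class_report(toy_true, toy_pred).items():
    print(f"{label}: support={n} predicted={n_pred} recall={recall}")
```

A label that is predicted far more often than it occurs, alongside low recall on rarer classes, would be consistent with the bias described above.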
Prompt Sensitivity
Because this model was trained using an Alpaca-style instruction format, performance depends heavily on correct prompt formatting. Deviating from the expected format may reduce classification accuracy.
Cultural Context Dependency
The model performs best on:
- Sinhala poetic language
- Song lyrics
- Culturally contextual metaphors
Performance may degrade on:
- Informal social media Sinhala
- Code-mixed Sinhala-English
- Highly modern slang
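A rough pre-filter can flag inputs that are likely to fall outside the model's comfort zone before they are classified. The sketch below measures the share of Sinhala-script characters relative to Latin letters; the function names and the `0.9` threshold are arbitrary assumptions, and the heuristic detects code-mixing only, not slang or informal register.

```python
# Heuristic sketch: flags text that mixes Sinhala script with Latin letters.
SINHALA_BLOCK = range(0x0D80, 0x0E00)  # Unicode Sinhala block, U+0D80..U+0DFF


def _script_counts(text):
    """Count Sinhala-block characters and ASCII Latin letters in text."""
    sinhala = sum(1 for ch in text if ord(ch) in SINHALA_BLOCK)
    latin = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    return sinhala, latin


def sinhala_ratio(text):
    """Share of Sinhala characters among Sinhala + Latin characters (0.0 if none)."""
    sinhala, latin = _script_counts(text)
    total = sinhala + latin
    return sinhala / total if total else 0.0


def looks_out_of_domain(text, threshold=0.9):
    """True when the text has script characters but too few are Sinhala."""
    sinhala, latin = _script_counts(text)
    return (sinhala + latin) > 0 and sinhala_ratio(text) < threshold
```

Flagged inputs can still be classified; the flag only suggests treating the predicted label with more caution.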
Base Model Inheritance
As a LoRA adapter built on the Llama-3 architecture, this model inherits the general limitations of large language models, including:
- Hallucination under improper prompting
- Sensitivity to formatting
- Bias inherited from base training data
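Because the model generates free text rather than choosing from a fixed set, it can help to validate its reply against the nine labels before using it downstream. The following is a minimal sketch; `normalize_prediction` is a hypothetical helper, and matching on a case-insensitive prefix is one possible policy, not something the model card prescribes.

```python
NAWARASA_LABELS = (
    "Shringara", "Hasya", "Karuna", "Roudhra", "Veera",
    "Bhayanakam", "Bhibatsa", "Adbhutha", "Shantha",
)


def normalize_prediction(raw_output):
    """Return the canonical label that starts the model's reply, else None."""
    cleaned = raw_output.strip().lower()
    for label in NAWARASA_LABELS:
        # Prefix match tolerates trailing punctuation or extra generated text.
        if cleaned.startswith(label.lower()):
            return label
    return None  # Off-label or hallucinated output
```

Returning `None` instead of guessing lets the caller decide whether to retry, skip the item, or send it for manual review.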
How to Get Started
For best performance and memory efficiency, use Unsloth for inference.
1️⃣ Install Dependencies
pip install "unsloth @ git+https://github.com/unslothai/unsloth.git"
pip install transformers huggingface_hub torch
🚀 Load the Model
from unsloth import FastLanguageModel
import torch

# Load the base model + LoRA adapter
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="ovinduG/sinllama-nawarasa-lora",
    max_seq_length=2048,
    dtype=torch.bfloat16,
    load_in_4bit=True,
    resize_model_vocab=139336,  # Required for SinLlama's extended vocabulary
)

# Enable optimized inference
FastLanguageModel.for_inference(model)
🧾 Define the Alpaca Prompt
# The Sinhala instruction in the prompt reads: "Identify the dominant rasa in the following verse according to the Nawarasa", followed by the nine label names.
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
පහත පදය තුළ පවතින ප්රමුඛ රසය නවරස අනුව හඳුනාගන්න: Shringara, Hasya, Karuna, Roudhra, Veera, Bhayanakam, Bhibatsa, Adbhutha, Shantha.
### Input:
{}
### Response:
"""
🧠 Prediction Function
def predict_nawarasa(lyric_text):
    # Fill the lyric into the Alpaca template and move the tensors to the GPU
    prompt = alpaca_prompt.format(lyric_text)
    inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
    outputs = model.generate(
        **inputs,
        max_new_tokens=10,
        use_cache=True,
        do_sample=False,  # Greedy decoding for stable labels; temperature has no effect without sampling
    )
    decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
    # Keep only the text generated after the response marker
    return decoded.split("### Response:\n")[-1].strip()
🧪 Example Usage
# English gloss of the lyric: "I take up the sword and go to the very front line; I give my blood for the motherland."
lyric = "කඩුව අතට ගෙන පෙරමුණටම යන්නෙමි, මව්බිම වෙනුවෙන් රුධිරය දන් දෙන්නෙමි"
prediction = predict_nawarasa(lyric)
print("Predicted Nawarasa:", prediction)
# Expected Output:
# Veera
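For tagging a collection of lyrics, the single-item call can be wrapped in a small batch helper. The sketch below is an assumption about how one might do this: `tag_lyrics` is a hypothetical name, and the prediction function is passed in as an argument so the helper does not depend on a loaded model.

```python
from collections import Counter


def tag_lyrics(lyrics, predict_fn):
    """Apply predict_fn to each lyric; return (lyric, label) pairs and label counts."""
    pairs = [(lyric, predict_fn(lyric)) for lyric in lyrics]
    counts = Counter(label for _, label in pairs)
    return pairs, counts
```

With the model loaded as shown earlier, this would be called as `tag_lyrics(my_lyrics, predict_nawarasa)`, where `my_lyrics` is your own list of Sinhala text segments. The returned counts give a quick view of how predictions are distributed across the nine categories.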
🎭 Supported Nawarasa Categories
- Shringara (Love)
- Hasya (Humor)
- Karuna (Sadness)
- Roudhra (Anger)
- Veera (Heroism)
- Bhayanakam (Fear)
- Bhibatsa (Disgust)
- Adbhutha (Wonder)
- Shantha (Peace)
⚙️ Model Notes
- Base Model: polyglots/SinLlama_v01
- Fine-Tuning: LoRA (Parameter Efficient Fine-Tuning)
- Framework: Unsloth
- Quantization: 4-bit loading supported
- Precision: bfloat16
- Optimized for GPU inference
📬 Contact
For research collaboration, issues, or feature requests:
- Open a discussion on the Hugging Face repository
- Developer: ovinduG