
---
language:
  - si
license: llama3
tags:
  - text-classification
  - emotion-recognition
  - sinhala
  - nawarasa
  - nlp
  - unsloth
  - lora
base_model: polyglots/SinLlama_v01
---

Sinhala Nawarasa Emotion Classifier (SinLlama LoRA)

Model Details

Model Description

This model is a culturally grounded emotion classification system for Sinhala text, based on the classical Nawarasa framework. It is a LoRA adapter fine-tuned on top of polyglots/SinLlama_v01 (a Sinhala-extended Llama-3-8B model).

The model identifies the dominant emotional category expressed in Sinhala song lyrics and poetic text, mapping complex poetic metaphors and idioms to one of the nine classical aesthetic emotions (Nawarasa).

  • Developed by: ovinduG
  • Model Type: Decoder-only autoregressive LLM (Instruction Fine-Tuned via LoRA)
  • Language: Sinhala (si)
  • Base Model: polyglots/SinLlama_v01
  • Fine-Tuning Framework: Unsloth
  • Inference Optimized For: 4-bit quantized GPU inference

The 9 Nawarasa Categories Supported

  1. Shringara – Romance / Love
  2. Hasya – Humor / Comedy
  3. Karuna – Sadness / Compassion / Pathos
  4. Roudhra – Anger / Fury
  5. Veera – Heroism / Bravery
  6. Bhayanakam – Fear / Terror
  7. Bhibatsa – Disgust / Aversion
  8. Adbhutha – Wonder / Amazement
  9. Shantha – Peace / Serenity
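For downstream processing, the nine categories above can be kept as a small label-to-gloss mapping. This is an illustrative sketch (the `NAWARASA_LABELS` name is hypothetical, not part of the released code); the spellings match the label strings the model emits.

```python
# Canonical Nawarasa labels as emitted by the model, with English glosses.
NAWARASA_LABELS = {
    "Shringara": "Romance / Love",
    "Hasya": "Humor / Comedy",
    "Karuna": "Sadness / Compassion / Pathos",
    "Roudhra": "Anger / Fury",
    "Veera": "Heroism / Bravery",
    "Bhayanakam": "Fear / Terror",
    "Bhibatsa": "Disgust / Aversion",
    "Adbhutha": "Wonder / Amazement",
    "Shantha": "Peace / Serenity",
}
```

A mapping like this is useful for validating model outputs and for rendering human-readable tags in an application.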

Training Details

Dataset

The model was trained on a custom, manually annotated dataset of Sinhala song lyrics and poetic segments.

Each segment was:

  • Carefully reviewed
  • Assigned a single dominant Nawarasa label
  • Evaluated based on semantic meaning, emotional tone, metaphor usage, and Sri Lankan cultural context

Fine-Tuning Methodology

  • Prompt Format: Alpaca Instruction Format
  • Technique: Parameter-Efficient Fine-Tuning (PEFT) using LoRA
  • Precision: bfloat16
  • Quantization: 4-bit (QLoRA-style loading during inference)
  • Epochs: 3
  • Training Objective: Single-label dominant rasa classification
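The Alpaca-style formatting described above can be sketched as follows. This is a minimal illustration, not the released training code: `format_example` is a hypothetical helper, and the exact Sinhala instruction text used in training appears in the inference section below.

```python
# Hypothetical sketch: render one (lyric, label) pair into the Alpaca
# instruction format used for supervised fine-tuning.
ALPACA_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
{response}"""

def format_example(instruction: str, lyric: str, label: str) -> str:
    """Render one training example as a single prompt string."""
    return ALPACA_TEMPLATE.format(
        instruction=instruction, input=lyric, response=label
    )
```

At inference time the same template is used with the response left empty, so the model learns to complete it with a single label.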

Intended Use

Direct Use

This model is intended for:

  • NLP researchers
  • Computational literary analysis
  • Sinhala sentiment & emotion analysis
  • Digital humanities research
  • Music/lyric emotional tagging systems

Example Applications

  • Sinhala song emotion tagging
  • Cultural AI research
  • Poetry classification
  • Literary corpus analysis
  • Emotion-aware Sinhala chat systems

Out-of-Scope Usage

This model is not designed for:

  • Multi-label emotion detection
  • Psychological diagnosis
  • Political persuasion analysis
  • Legal or medical decision-making
  • Real-time safety-critical systems

Limitations & Bias

Class Imbalance

Classical Sinhala poetry and modern lyrics skew toward:

  • Shringara (Love)
  • Karuna (Sadness)

As a result, the model may statistically favor these classes when handling ambiguous or emotionally mixed text.

Prompt Sensitivity

Because this model was trained using an Alpaca-style instruction format, performance depends heavily on correct prompt formatting. Deviating from the expected format may reduce classification accuracy.

Cultural Context Dependency

The model performs best on:

  • Sinhala poetic language
  • Song lyrics
  • Culturally contextual metaphors

Performance may degrade on:

  • Informal social media Sinhala
  • Code-mixed Sinhala-English
  • Highly modern slang

Base Model Inheritance

As a LoRA adapter built on the Llama-3 architecture, this model inherits the general limitations of large language models, including:

  • Hallucination under improper prompting
  • Sensitivity to formatting
  • Bias inherited from base training data

How to Get Started

For best performance and memory efficiency, use Unsloth for inference.


1️⃣ Install Dependencies

```bash
pip install "unsloth @ git+https://github.com/unslothai/unsloth.git"
pip install transformers huggingface_hub torch
```

🚀 Load the Model

```python
from unsloth import FastLanguageModel
import torch

# Load the base model + LoRA adapter
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="ovinduG/sinllama-nawarasa-lora",
    max_seq_length=2048,
    dtype=torch.bfloat16,
    load_in_4bit=True,
    resize_model_vocab=139336,  # Required for SinLlama's extended vocabulary
)

# Enable optimized inference
FastLanguageModel.for_inference(model)
```

🧾 Define the Alpaca Prompt

```python
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
පහත පදය තුළ පවතින ප්‍රමුඛ රසය නවරස අනුව හඳුනාගන්න: Shringara, Hasya, Karuna, Roudhra, Veera, Bhayanakam, Bhibatsa, Adbhutha, Shantha.

### Instruction (gloss): Identify the dominant rasa present in the verse below, according to the Nawarasa: Shringara, Hasya, Karuna, Roudhra, Veera, Bhayanakam, Bhibatsa, Adbhutha, Shantha.

### Input:
{}

### Response:
"""
```

(The Sinhala instruction asks the model to identify the dominant rasa present in the verse, according to the nine Nawarasa categories.)

🧠 Prediction Function

```python
def predict_nawarasa(lyric_text):
    prompt = alpaca_prompt.format(lyric_text)
    inputs = tokenizer([prompt], return_tensors="pt").to("cuda")

    outputs = model.generate(
        **inputs,
        max_new_tokens=10,
        use_cache=True,
        do_sample=False,  # Greedy decoding for stable classification output
    )

    decoded = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
    return decoded.split("### Response:")[-1].strip()
```

🧪 Example Usage

```python
# "Taking the sword in hand, I march to the front line;
#  for the motherland, I give my blood."
lyric = "කඩුව අතට ගෙන පෙරමුණටම යන්නෙමි, මව්බිම වෙනුවෙන් රුධිරය දන් දෙන්නෙමි"

prediction = predict_nawarasa(lyric)
print("Predicted Nawarasa:", prediction)

# Expected Output:
# Veera
```
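Because generation is free-form text, it can help to normalize the decoded output against the nine canonical labels before using it downstream. This is a defensive sketch; `normalize_label` is a hypothetical helper, not part of the released code.

```python
# The nine canonical labels the model is trained to emit.
NAWARASA = [
    "Shringara", "Hasya", "Karuna", "Roudhra", "Veera",
    "Bhayanakam", "Bhibatsa", "Adbhutha", "Shantha",
]

def normalize_label(generated: str):
    """Map raw generated text to a canonical Nawarasa label, or None."""
    text = generated.strip().lower()
    for label in NAWARASA:
        if label.lower() in text:
            return label
    return None
```

Returning `None` for unrecognized outputs lets an application fall back gracefully instead of propagating a malformed label.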



📬 Contact

For research collaboration, issues, or feature requests:

  • Open a discussion on the Hugging Face repository
  • Developer: ovinduG
