DiiaLLM-270m-it-sft-full-21-10
Base model: google/gemma-3-270m-it
Task: Supervised fine-tuning (SFT) for a Ukrainian e-government assistant (Diia)
Primary language: Ukrainian (uk)
Release date: 2025-10-21
Owner: dovcharenko (private use)
A compact (270M) instruction-tuned chat model specialized for Diia product support: concise guidance, portal/app navigation, and polite clarifications in Ukrainian. Designed to pair well with retrieval (RAG) and tools for time-sensitive facts (fees, phone numbers, region lists, eligibility rules).
TL;DR
- Why this model? Lightweight helper for Ukrainian public-service flows where latency/cost matter.
- Good at: Clear, short answers; step-by-step “how to” instructions; polite confirmations; follow-ups.
- Not for: Legal authority or live policy updates — use RAG/tools for that.
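Since the card recommends pairing the model with retrieval for time-sensitive facts, here is a minimal sketch of that pairing. It is an illustration only: `retrieve` is a hypothetical helper standing in for your own search over Diia help-center content, and the prompt wording is not part of this release.

```python
# Minimal RAG-style prompt assembly (sketch). `retrieve` is a hypothetical
# helper that returns the top-k relevant Diia help-center snippets.
def build_rag_chat(question: str, retrieve, k: int = 3) -> list[dict]:
    snippets = retrieve(question, k=k)  # e.g. current fees, phone numbers, region lists
    context = "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    user_msg = (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the facts from the context."
    )
    return [{"role": "user", "content": user_msg}]
```

The returned chat list drops straight into `tokenizer.apply_chat_template`, exactly as in the Quick Start below.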
Evaluation (LLM-as-Judge, Ukrainian)
Evaluation on 100 internal prompts (diia_eval.jsonl) using a Ukrainian LLM-as-Judge with two criteria (1–10):
- Accuracy & Completeness
- Clarity & Fluency (answers not in Ukrainian are penalized)
| Model | N (accuracy) | Accuracy mean | Accuracy median | N (clarity) | Clarity mean | Clarity median |
|---|---|---|---|---|---|---|
| dovcharenko/DiiaLLM-270m-it-sft-full-21-10 | 100 | 5.52 | 6 | 100 | 7.39 | 8 |
| google/gemma-3-4b-it | 98 | 4.91 | 5 | 98 | 7.24 | 8 |
Method. For each (question, reference) pair, the model generated an answer; a judge model produced strict-JSON scores.
Caveat. LLM-as-Judge correlates with human ratings but does not replace human review—especially for sensitive/edge cases.
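The judging protocol above can be approximated with a small scoring helper. This is a sketch, not the published evaluation script: the prompt is an English paraphrase (the actual judge prompt was in Ukrainian), and `judge_generate` stands in for whatever judge model or API is used.

```python
import json

# English paraphrase of the judging instructions; the real judge prompt is Ukrainian.
JUDGE_PROMPT = """Rate the assistant's answer on two criteria from 1 to 10:
1) Accuracy & Completeness versus the reference answer.
2) Clarity & Fluency in Ukrainian (non-Ukrainian answers are penalized).
Return strict JSON only: {{"accuracy": <1-10>, "clarity": <1-10>}}

Question: {question}
Reference: {reference}
Assistant answer: {answer}"""

def judge_one(question, reference, answer, judge_generate):
    # `judge_generate` is a hypothetical callable wrapping the judge model/API.
    raw = judge_generate(JUDGE_PROMPT.format(
        question=question, reference=reference, answer=answer))
    try:
        scores = json.loads(raw)          # strict-JSON scores, as described above
        return scores["accuracy"], scores["clarity"]
    except (json.JSONDecodeError, KeyError):
        return None, None                 # unparseable judgments can simply be skipped
```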
Intended Use
- Use cases: FAQ-style guidance for Diia services (e.g., FOP/sole proprietor, documents, social assistance, vehicle services), step-by-step navigation, clarifying questions, courteous closings.
- Out of scope: Binding legal advice; case-specific decisions; storing personal data without proper safeguards.
Training Summary
- Base: `google/gemma-3-270m-it` (chat)
- Objective: SFT on Ukrainian dialogs from Diia support scenarios
- Formatting: Gemma-3 chat template via `tokenizer.apply_chat_template`
- Hardware: Colab GPU (L4/T4; bf16 preferred if available)
- Typical recipe (see the sketch after the data note below):
  - Context length: 2048
  - Small per-device batch (use gradient accumulation)
  - Epochs: 1–2 (monitor for overfitting)
  - LR: 2e-4 (cosine), warmup 3%
  - Gradient checkpointing: on
- Optional: LoRA (`q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`) for lower VRAM and safer iteration
Data note: Prefer single-turn instruction→answer pairs or curated multi-turn where context is essential. Avoid excessive chit-chat unless it reflects desired behavior.
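A hedged sketch of this recipe with TRL's `SFTTrainer` follows. The dataset file name and column layout are placeholders, the exact hyperparameters behind this checkpoint are not published, and some argument names (e.g. the max-sequence-length field) vary slightly across TRL versions.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: one chat per row, e.g.
# {"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
dataset = load_dataset("json", data_files="diia_sft.jsonl", split="train")

peft_config = LoraConfig(  # optional LoRA, per the recipe above
    r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

args = SFTConfig(
    output_dir="diia-sft",
    max_seq_length=2048,               # context length (field name differs in newer TRL)
    num_train_epochs=2,                # 1–2 epochs, watch for overfitting
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    per_device_train_batch_size=2,     # small batch + gradient accumulation
    gradient_accumulation_steps=8,
    gradient_checkpointing=True,
    bf16=True,                         # if the GPU supports it
)

trainer = SFTTrainer(
    model="google/gemma-3-270m-it",    # base model; TRL applies the Gemma-3 chat template
    args=args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```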
Quick Start (Inference)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

MODEL_ID = "dovcharenko/DiiaLLM-270m-it-sft-full-21-10"  # private repo

tok = AutoTokenizer.from_pretrained(MODEL_ID)
if tok.pad_token_id is None:
    tok.pad_token_id = tok.eos_token_id  # reuse EOS as pad if none is set

# Prefer bf16 on Ampere+ GPUs, fp16 on older GPUs, fp32 on CPU
if torch.cuda.is_available():
    dtype = torch.bfloat16 if torch.cuda.get_device_capability(0)[0] >= 8 else torch.float16
else:
    dtype = torch.float32

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=dtype,
    device_map="auto",
).eval()

# "How do I file a zero tax declaration as a FOP (sole proprietor)?"
chat = [{"role": "user", "content": "Як подати нульову декларацію ФОП?"}]
prompt = tok.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
inputs = tok(prompt, return_tensors="pt").to(model.device)

with torch.inference_mode():
    out = model.generate(
        **inputs,
        max_new_tokens=192,
        do_sample=False,  # deterministic for prod
        eos_token_id=tok.eos_token_id,
        pad_token_id=tok.pad_token_id,
        use_cache=True,
    )

# Decode only the newly generated tokens (skip the prompt)
answer = tok.decode(out[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True).strip()
print(answer)
```
Evaluation results
- Accuracy & Completeness (LLM-as-Judge) on diia_eval (internal)test set self-reported5.520
- Clarity & Fluency (LLM-as-Judge) on diia_eval (internal)test set self-reported7.390