This model is XLM-RoBERTa fine-tuned on the Mizo FiNERVINER dataset for Fine-grained Named Entity Recognition.
This model is part of the AWED-FiNER collection, as presented in the paper AWED-FiNER: Agents, Web applications, and Expert Detectors for Fine-grained Named Entity Recognition across 36 Languages for 6.6 Billion Speakers.
- GitHub Repository: https://github.com/PrachuryyaKaushik/AWED-FiNER
- Paper: https://huggingface.co/papers/2601.10161
MultiCoNER2 uses a fine-grained tagset. The fine-to-coarse mapping of the tags is as follows:
- Location (LOC) : Facility, OtherLOC, HumanSettlement, Station
- Creative Work (CW) : VisualWork, MusicalWork, WrittenWork, ArtWork, Software
- Group (GRP) : MusicalGRP, PublicCORP, PrivateCORP, AerospaceManufacturer, SportsGRP, CarManufacturer, ORG
- Person (PER) : Scientist, Artist, Athlete, Politician, Cleric, SportsManager, OtherPER
- Product (PROD) : Clothing, Vehicle, Food, Drink, OtherPROD
- Medical (MED) : Medication/Vaccine, MedicalProcedure, AnatomicalStructure, Symptom, Disease
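The mapping above can be expressed as a lookup table for collapsing fine-grained predictions to coarse ones. This is a minimal sketch: the tag names are copied from the list, while the `to_coarse` helper and the assumption that labels may carry `B-`/`I-` prefixes are illustrative and not confirmed by the model card.

```python
# Coarse-to-fine mapping, copied from the list above.
COARSE_TO_FINE = {
    "LOC": ["Facility", "OtherLOC", "HumanSettlement", "Station"],
    "CW": ["VisualWork", "MusicalWork", "WrittenWork", "ArtWork", "Software"],
    "GRP": ["MusicalGRP", "PublicCORP", "PrivateCORP", "AerospaceManufacturer",
            "SportsGRP", "CarManufacturer", "ORG"],
    "PER": ["Scientist", "Artist", "Athlete", "Politician", "Cleric",
            "SportsManager", "OtherPER"],
    "PROD": ["Clothing", "Vehicle", "Food", "Drink", "OtherPROD"],
    "MED": ["Medication/Vaccine", "MedicalProcedure", "AnatomicalStructure",
            "Symptom", "Disease"],
}

# Inverted view: fine-grained tag -> coarse tag.
FINE_TO_COARSE = {
    fine: coarse for coarse, fines in COARSE_TO_FINE.items() for fine in fines
}


def to_coarse(label: str) -> str:
    """Map a fine-grained label to its coarse tag (hypothetical helper).

    Assumes labels are either "O", a bare tag such as "Station", or a
    BIO-prefixed tag such as "B-Scientist". Unknown tags raise KeyError.
    """
    if label == "O":
        return label
    if label[:2] in ("B-", "I-"):
        return label[:2] + FINE_TO_COARSE[label[2:]]
    return FINE_TO_COARSE[label]


print(to_coarse("B-Scientist"))
print(to_coarse("Station"))
```

Keeping the table keyed by coarse tag mirrors the list in the card, and the inverted dictionary is derived from it so the two cannot drift apart.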
Model performance:
Precision: 80.33
Recall: 81.83
F1: 81.07
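As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported figure can be recomputed from the other two. The sketch below assumes the three scores are percentages computed over the same evaluation set, which the card does not state explicitly.

```python
# Reported scores from the model card (assumed to be percentages).
precision = 80.33
recall = 81.83

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 2))  # 81.07, matching the reported F1
```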
Training Parameters:
Epochs: 6
Optimizer: AdamW
Learning Rate: 5e-5
Weight Decay: 0.01
Batch Size: 64
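For readers who want to reproduce a similar setup, the parameters above could be passed to the Hugging Face `transformers` `TrainingArguments` class roughly as follows. This is a sketch under assumptions, not the authors' training script: the card does not say whether the batch size of 64 is per device or global, which AdamW implementation was used, or what output directory was chosen, so `per_device_train_batch_size`, `optim="adamw_torch"`, and the `output_dir` argument are illustrative choices.

```python
# Hyperparameters from the model card, keyed by TrainingArguments names.
HYPERPARAMS = {
    "num_train_epochs": 6,
    "optim": "adamw_torch",             # assumption: PyTorch AdamW
    "learning_rate": 5e-5,
    "weight_decay": 0.01,
    "per_device_train_batch_size": 64,  # assumption: 64 may be a global size
}


def build_training_args(output_dir: str):
    """Build TrainingArguments from the card's values (hypothetical helper)."""
    # Imported lazily so the dictionary above is usable without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```

Calling `build_training_args("finer-mizo-run")` (a hypothetical directory name) would return an object that can be handed to a `Trainer`.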
Contributors
Prachuryya Kaushik
Prof. Ashish Anand
FiNERVINER is a part of the AWED-FiNER collection. Please check: Paper | Agentic Tool | Interactive Demo
Sample Usage
The AWED-FiNER agentic tool can be used to interact with expert models trained using this framework. Below is an example:
pip install smolagents gradio_client

from tool import AWEDFiNERTool

tool = AWEDFiNERTool(
    space_id="prachuryyaIITG/AWED-FiNER"
)

result = tool.forward(
    text="Jude Bellingham joined Real Madrid in 2023.",
    language="English"
)

print(result)
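As an alternative to the agentic tool, the checkpoint might also be loaded directly with the `transformers` token-classification pipeline. This is a minimal sketch, assuming the repository `prachuryyaIITG/FiNERVINER_Mizo_XLM` (the model ID shown on this page) ships a token-classification head and tokenizer that the pipeline can load; `load_ner` and `format_entities` are hypothetical helper names, and no expected output is shown because none is documented.

```python
def load_ner(model_id: str = "prachuryyaIITG/FiNERVINER_Mizo_XLM"):
    """Create a token-classification pipeline (downloads the model)."""
    # Imported lazily so format_entities can be used without transformers.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def format_entities(entities):
    """Reduce pipeline output dicts to (text, label, score) tuples."""
    return [
        (e["word"], e["entity_group"], round(float(e["score"]), 3))
        for e in entities
    ]


# Usage (requires network access and the transformers package):
#   ner = load_ner()
#   print(format_entities(ner("<a Mizo sentence>")))
```

Separating the formatting step from model loading keeps the post-processing easy to check without downloading the checkpoint.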
Citation
If you use this model, please cite the following papers:
@inproceedings{kaushik2026finerviner,
title={FiNERVINER: Fine-grained Named Entity Recognition for Vulnerable languages of India's North Eastern Region},
author={Kaushik, Prachuryya and Anand, Ashish},
booktitle={Proceedings of the Fifteenth Language Resources and Evaluation Conference},
volume={15},
year={2026}
}
@inproceedings{kaushik-anand-2025-classer,
title = "{CLASSER}: Cross-lingual Annotation Projection enhancement through Script Similarity for Fine-grained Named Entity Recognition",
author = "Kaushik, Prachuryya and
Anand, Ashish",
booktitle = "Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics",
month = dec,
year = "2025",
address = "Mumbai, India",
publisher = "The Asian Federation of Natural Language Processing and The Association for Computational Linguistics",
url = "https://aclanthology.org/2025.ijcnlp-long.94/",
pages = "1745--1760",
ISBN = "979-8-89176-298-5",
}
@misc{kaushik2026awedfineragentswebapplications,
title={AWED-FiNER: Agents, Web applications, and Expert Detectors for Fine-grained Named Entity Recognition across 36 Languages for 6.6 Billion Speakers},
author={Prachuryya Kaushik and Ashish Anand},
year={2026},
eprint={2601.10161},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2601.10161},
}
@inproceedings{kaushik2026sampurner,
title={SampurNER: Fine-grained Named Entity Recognition Dataset for 22 Indian Languages},
author={Kaushik, Prachuryya and Anand, Ashish},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={40},
year={2026}
}
@inproceedings{fetahu2023multiconer,
title={MultiCoNER v2: a Large Multilingual dataset for Fine-grained and Noisy Named Entity Recognition},
author={Fetahu, Besnik and Chen, Zhiyu and Kar, Sudipta and Rokhlenko, Oleg and Malmasi, Shervin},
booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
pages={2027--2051},
year={2023}
}
Model: prachuryyaIITG/FiNERVINER_Mizo_XLM
Base model: FacebookAI/xlm-roberta-large