UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition
Paper: arXiv:2308.03279
Description: This model was trained on a combination of two data sources: (1) ChatGPT-generated Pile-NER-type data, and (2) 40 supervised datasets from the Universal NER benchmark (see Fig. 4 in the paper), with 10K instances randomly sampled from the train split of each dataset. Note that the CrossNER and MIT datasets are excluded from training for OOD evaluation.
Check our paper for more information, and our repo for instructions on how to use the model.
Inference instances follow a multi-turn conversation template between a user and a virtual assistant, as sketched below.
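As an illustration, here is a minimal sketch of building a prompt in this format and running greedy decoding with Hugging Face `transformers`. The exact template wording should be verified against our repo; the checkpoint ID (`Universal-NER/UniNER-7B-all`) and the example sentence are placeholders to adjust for your setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Conversation-style prompt template; verify the exact wording against the repo.
PROMPT_TEMPLATE = (
    "A virtual assistant answers questions from a user based on the provided text.\n"
    "USER: Text: {input_text}\n"
    "ASSISTANT: I've read this text.\n"
    "USER: What describes {entity_type} in the text?\n"
    "ASSISTANT:"
)

# Assumed checkpoint ID; substitute the released checkpoint you intend to use.
model_id = "Universal-NER/UniNER-7B-all"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = PROMPT_TEMPLATE.format(
    input_text="Barack Obama visited Microsoft headquarters in Redmond.",
    entity_type="organization",
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Greedy decoding; the model replies with a JSON list of extracted entities.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```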
This model and its associated data are released under the CC BY-NC 4.0 license and are intended primarily for research purposes.
@article{zhou2023universalner,
title={UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition},
author={Wenxuan Zhou and Sheng Zhang and Yu Gu and Muhao Chen and Hoifung Poon},
year={2023},
eprint={2308.03279},
archivePrefix={arXiv},
primaryClass={cs.CL}
}