crossroderick committed
Commit 0209b4e · 1 Parent(s): dcddc1a

Slight readme update

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -26,7 +26,7 @@ model-index:
  type: loss
  value: 0.2869
  ---
- # DalaT5 T5 Fine-Tuned on Cyrillic-to-Latin Kazakh 🇰🇿
+ # DalaT5 - T5 Fine-Tuned on Cyrillic-to-Latin Kazakh 🇰🇿
 
  > 'Dala' means 'steppe' in Kazakh - a nod to where the voice of this model might echo.
 
@@ -101,7 +101,7 @@ print(output)
 
  Тәуелсіз жоба болғанына қарамастан, DalaT5 өте маңызды үш деректер жиынтығын пайдаланады / Despite being an independent project, DalaT5 makes use of three very important datasets:
 
- - The first ~1.8 million records of the Kazakh subset of the CC100 dataset by [Conneau et al. (2020)](https://paperswithcode.com/paper/unsupervised-cross-lingual-representation-1)
+ - The first ~2 million records of the Kazakh subset of the CC100 dataset by [Conneau et al. (2020)](https://paperswithcode.com/paper/unsupervised-cross-lingual-representation-1)
  - The raw, Kazakh-focused part of the [Kazakh Parallel Corpus (KazParC)](https://huggingface.co/datasets/issai/kazparc) from Nazarbayev University's Institute of Smart Systems and Artificial Intelligence (ISSAI), graciously made available on Hugging Face
  - The Wikipedia dump of articles in the Kazakh language, obtained via the `wikiextractor` Python package
 