dalat5 / src / data

Commit History

Fix with the correct model files
96f0b49

crossroderick committed on

Major (v5) training update
bdd4daa

crossroderick committed on

Readme and tokeniser update
8a2143a

crossroderick committed on

Fourth iteration with 1.9 million training records
9fea118

crossroderick committed on

Pre-v4 readme and support files update
252a85f

crossroderick committed on

Major update with 1.6 million training records
a48965a

crossroderick committed on

Updated the readme and get_data.sh, and added a requirements file
6cbc4c0

crossroderick committed on

Delete src/data/clean_corpus.jsonl
c37d421
verified

crossroderick committed on

Delete src/data/kkwiki-latest-pages-articles.xml.bz2
bc6b470
verified

crossroderick committed on

Delete src/data/kazakh_latin_corpus.jsonl
d145d71
verified

crossroderick committed on

Minor update to the "get_data.sh" file
e1a03df

crossroderick committed on

Minor update to the get_data.sh file
42fea0f

crossroderick committed on

Training update with more data and 2 epochs
03c9e83

crossroderick committed on

Fixed character mapping, training with 8 epochs
508f442

crossroderick committed on

Upload folder using huggingface_hub
cb301d1
verified

crossroderick committed on