Paper: The first neural machine translation system for the Erzya language (arXiv:2209.09368)
How to use slone/fastText-LID-323 with fastText:
from huggingface_hub import hf_hub_download
import fasttext
# lid.323.ftz is the model file hosted in this repository
model = fasttext.load_model(hf_hub_download("slone/fastText-LID-323", "lid.323.ftz"))
This is a fastText-based language classification model from the paper "The first neural machine translation system for the Erzya language".
It supports 323 languages used in Wikipedia (as of July 2022) and has extended support for the Erzya (myv) and Moksha (mdf) languages.
Example usage:
import fasttext
import urllib.request
import os
model_path = 'lid.323.ftz'
url = 'https://huggingface.co/slone/fastText-LID-323/resolve/main/lid.323.ftz'
if not os.path.exists(model_path):
    urllib.request.urlretrieve(url, model_path)  # or just download it manually
model = fasttext.load_model(model_path)
languages, scores = model.predict("эрзянь кель", k=3) # k is the number of returned hypotheses
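The call returns a tuple of labels and a matching array of scores. Below is a minimal sketch for turning that pair into plain language codes, assuming the labels carry fastText's default `__label__` prefix (for example `__label__myv`). The helper name `parse_predictions` is hypothetical and is not part of the model or of the fastText API.

```python
def parse_predictions(labels, scores, prefix="__label__"):
    """Pair each predicted label with its score, dropping the label prefix.

    `prefix` is assumed to be fastText's default; labels without it are kept as-is.
    """
    result = []
    for label, score in zip(labels, scores):
        code = label[len(prefix):] if label.startswith(prefix) else label
        result.append((code, float(score)))
    return result


# Illustrative values only, not real model output.
example = parse_predictions(("__label__myv", "__label__mdf"), [0.9, 0.05])
print(example)
```

With a loaded model, this could be called as `parse_predictions(*model.predict("эрзянь кель", k=3))`.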
The model was trained on article texts randomly sampled from Wikipedia. It works better on sentences and longer texts than on single words, and may be sensitive to noise.
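Given that sensitivity, it may help to clean the input and to treat low-confidence predictions as unknown. The sketch below is one possible approach and not part of the model: the names `clean_text` and `pick_language` and the 0.5 threshold are illustrative assumptions. Collapsing whitespace also removes newlines, which `predict` rejects when given a single string. The sketch assumes the first returned label is the highest-scoring hypothesis.

```python
import re


def clean_text(text):
    """Collapse all whitespace runs (including newlines) into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def pick_language(labels, scores, min_score=0.5, prefix="__label__"):
    """Return the top language code, or None if its score is below min_score.

    The 0.5 default is an arbitrary illustration, not a tuned value.
    """
    if len(labels) == 0 or float(scores[0]) < min_score:
        return None
    top = labels[0]
    return top[len(prefix):] if top.startswith(prefix) else top
```

With a loaded model, usage could look like `pick_language(*model.predict(clean_text(raw_text), k=1))`, where `raw_text` is the input string.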