Language modelling

Issue #25: Improving Neural MT with Cross-Language Model Pretraining

Author: Dr. Rohit Gupta, Sr. Machine Translation Scientist @ Iconic

One of the reasons for the success of Neural MT, and of deep learning techniques in general, is its more effective and efficient use of large amounts of training data, without excessive overhead in inference time or in the size of the resulting models. This also opens the door to...
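
As a rough illustration of the pretraining objective behind methods like XLM, here is a minimal Python sketch of masked-token selection; the 80/10/10 masking recipe follows BERT's published setup, and the tiny vocabulary and sample sentence are purely illustrative placeholders, not part of the original post.

```python
import random

MASK_TOKEN = "[MASK]"
# Tiny stand-in vocabulary for random replacements; a real system
# would sample from the full subword vocabulary.
VOCAB = ["the", "cat", "sleeps", "le", "chat", "dort"]

def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """BERT/XLM-style masking: select ~15% of positions as prediction
    targets; of those, 80% become [MASK], 10% a random token, and 10%
    are left unchanged. The model is trained to recover the targets."""
    rng = random.Random(seed)
    inputs, targets = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok
            r = rng.random()
            if r < 0.8:
                inputs[i] = MASK_TOKEN
            elif r < 0.9:
                inputs[i] = rng.choice(VOCAB)
            # else: keep the original token as-is
    return inputs, targets

# XLM's translation language modelling (TLM) variant applies the same
# masking to a concatenated source/target sentence pair.
print(mask_tokens("the cat sleeps on the mat".split(), seed=7))
```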

Issue #24: Exploring Language Models for Neural MT

Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic

Monolingual language models were a critical component of phrase-based Statistical Machine Translation systems. They are also used in unsupervised Neural MT systems ("unsupervised" meaning that no parallel data is available to supervise training; only monolingual data is used). However, they are not used in standard supervised Neural MT engines, and training language...
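
To make concrete what a monolingual language model does, below is a minimal sketch of a bigram model with add-one smoothing that scores word sequences using monolingual text alone; the three-sentence corpus is a toy placeholder. (SMT systems used much larger smoothed n-gram models of this kind.)

```python
import math
from collections import Counter

def train_bigram_lm(sentences):
    """Count unigrams and bigrams over tokenised sentences,
    adding <s>/</s> boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        toks = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    return unigrams, bigrams

def log_prob(sentence, unigrams, bigrams):
    """Add-one-smoothed bigram log-probability of a sentence."""
    toks = ["<s>"] + sentence.split() + ["</s>"]
    vocab_size = len(unigrams)
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
        for prev, cur in zip(toks, toks[1:])
    )

uni, bi = train_bigram_lm(["the cat sleeps", "the dog sleeps", "the cat eats"])
print(log_prob("the cat sleeps", uni, bi))  # fluent order scores higher
print(log_prob("sleeps cat the", uni, bi))  # scrambled order scores lower
```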

Issue #23: Unbiased Neural MT

Author: Raj Patel, Machine Translation Scientist @ Iconic

A recent topic of conversation and interest in the area of Neural MT - and Artificial Intelligence in general - is gender bias. Neural models are trained on large text corpora that inherently contain social biases and stereotypes and, as a consequence, translation models inherit these biases. In this article, we’ll try to understand how gender...

Issue #3: Improving Vocabulary Coverage

Author: Raj Nath Patel, Machine Translation Scientist @ Iconic

Machine Translation typically operates with a fixed vocabulary, i.e. it knows how to translate a finite number of words. This is clearly a limitation, because translation is an open-vocabulary problem: we might want to translate any possible word! It is a particular issue for Neural MT, where the vocabulary needs to be limited at the...
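
One widely used remedy for this is subword segmentation via byte pair encoding (BPE; Sennrich et al., 2016). Below is a minimal, self-contained sketch of the BPE merge-learning step; the toy word list stands in for a real training corpus.

```python
from collections import Counter

def merge_pair(word, pair):
    """Merge every occurrence of `pair` in a space-separated symbol string."""
    syms, out, i = word.split(), [], 0
    while i < len(syms):
        if i + 1 < len(syms) and (syms[i], syms[i + 1]) == pair:
            out.append(syms[i] + syms[i + 1])
            i += 2
        else:
            out.append(syms[i])
            i += 1
    return " ".join(out)

def learn_bpe(words, num_merges):
    """Byte pair encoding: repeatedly merge the most frequent adjacent
    symbol pair into a new subword unit."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(" ".join(list(w) + ["</w>"]) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            syms = word.split()
            for p in zip(syms, syms[1:]):
                pairs[p] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        vocab = Counter({merge_pair(w, best): f for w, f in vocab.items()})
    return merges

# Frequent fragments like "low" become single units, while rare or unseen
# words can still be segmented into smaller known pieces, so nothing is
# strictly out-of-vocabulary.
print(learn_bpe(["low", "low", "lower", "lowest", "newer", "wider"], 5))
```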
