Terminology / Vocabulary coverage

NMT Issue 49: Representation Bottleneck in Neural MT

Author: Raj Patel, Machine Translation Scientist @ Iconic

In Neural MT, lexical features are fed to the first layer of the encoder as lexical representations (aka word embeddings) and refined as they propagate through the deep network of hidden layers. In this post we’ll try to understand how the lexical representation is affected as it goes deeper in the network and investigate whether it affects...

Read More
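The pipeline the excerpt describes, embeddings entering the first encoder layer and being progressively refined, can be sketched with a toy model. This is purely illustrative: the random weights, the simple feed-forward layers with residual connections, and the drift measure are stand-ins for a real encoder, not the post's actual experiment.

```python
import numpy as np

np.random.seed(0)
vocab_size, emb_dim, num_layers = 100, 8, 4

# Lexical representations (word embeddings) for the vocabulary.
embeddings = np.random.randn(vocab_size, emb_dim)
token_ids = [12, 47, 3]            # a toy source sentence
h = embeddings[token_ids]          # input to the first encoder layer

# Each hidden layer refines the representation. A feed-forward
# transform with a residual connection stands in for a real encoder
# layer (attention, normalisation, etc. omitted).
layers = [np.random.randn(emb_dim, emb_dim) * 0.1 for _ in range(num_layers)]
states = [h]
for W in layers:
    h = h + np.tanh(h @ W)         # refined representation
    states.append(h)

# One way to probe how lexical information changes with depth:
# how far each layer's output drifts from the input embeddings.
for i, s in enumerate(states):
    drift = np.linalg.norm(s - states[0])
    print(f"layer {i}: distance from input embeddings = {drift:.3f}")
```

Probing each layer's distance from the original embedding is one simple diagnostic; the post's analysis of the "bottleneck" may use different measures.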

Author: Dr. Raj Patel, Machine Translation Scientist @ Iconic

As has been covered a number of times in this series, Neural MT requires good data for training, and acquiring such data for new languages can be costly and not always feasible. One approach in the Neural MT literature for improving translation quality for low-resource languages is transfer learning. A common practice is to reuse the model...

Read More
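The "reuse the model" idea can be sketched as follows: a child model for the low-resource pair is initialised from a parent model trained on a high-resource pair, with only the vocabulary-specific parameters re-initialised. All names and shapes below are illustrative assumptions, with random arrays standing in for trained weights.

```python
import numpy as np

np.random.seed(0)
emb_dim, hidden_dim = 8, 16

# Hypothetical "parent" model trained on a high-resource language pair.
parent = {
    "encoder":        np.random.randn(emb_dim, hidden_dim),
    "decoder":        np.random.randn(hidden_dim, emb_dim),
    "src_embeddings": np.random.randn(1000, emb_dim),  # parent vocabulary
}

# Transfer learning: the low-resource "child" model reuses the parent's
# encoder/decoder parameters and re-initialises only the embeddings,
# since the child language has a different vocabulary.
child = {
    "encoder":        parent["encoder"].copy(),
    "decoder":        parent["decoder"].copy(),
    "src_embeddings": np.random.randn(300, emb_dim),   # child vocabulary
}

# Training then continues (fine-tunes) on the low-resource data,
# starting from the transferred weights rather than from scratch.
print(child["encoder"].shape, child["src_embeddings"].shape)
```

Which parameters to transfer and which to re-initialise varies across the literature; this sketch shows one common choice.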

Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic

In many commercial MT use cases, being able to use custom terminology is a key requirement for translation accuracy. The ability to guarantee the translation of specific input words and phrases is conveniently handled in Statistical MT (SMT) frameworks such as Moses. Because SMT is performed as a sequence of distinct...

Read More
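As a concrete illustration of the SMT-side mechanism the excerpt alludes to, the Moses decoder accepts XML markup on the input to force the translation of a marked span, enabled with the `-xml-input` flag (the tag name is arbitrary; the `translation` attribute carries the required output). A sketch, assuming a configured `moses.ini`:

```shell
# Force "Luftkissenfahrzeug" to be translated as "hovercraft"
echo 'mein <term translation="hovercraft">Luftkissenfahrzeug</term> ist voller Aale' \
  | moses -f moses.ini -xml-input exclusive
```

In `exclusive` mode the decoder uses only the supplied translation for the marked span; other modes let the markup compete with the phrase table.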

Author: Raj Nath Patel, Machine Translation Scientist @ Iconic

Machine Translation typically operates with a fixed vocabulary, i.e. it knows how to translate a finite number of words. This is obviously an issue, because translation is an open vocabulary problem: we might want to translate any possible word! This is a particular issue for Neural MT where the vocabulary needs to be limited at the...

Read More
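The standard escape from the fixed-vocabulary limit is to translate subword units rather than whole words. As a toy illustration, here is a greedy longest-match segmenter in the spirit of BPE/WordPiece: any word, even one never seen in training, can be broken into units from a fixed subword vocabulary. The vocabulary below is invented for the example; real systems learn it from data.

```python
def segment(word, subword_vocab):
    """Greedy longest-match segmentation of a word into known subwords.

    A toy stand-in for BPE/WordPiece segmentation: the model never
    sees a truly out-of-vocabulary token, because every word can be
    decomposed into subwords (falling back to single characters).
    """
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest match first
            if word[i:j] in subword_vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append(word[i])          # fall back to a character
            i += 1
    return pieces

vocab = {"trans", "lat", "ion", "un", "known", "s"}
print(segment("translations", vocab))   # ['trans', 'lat', 'ion', 's']
print(segment("unknown", vocab))        # ['un', 'known']
```

Real BPE learns its vocabulary by iteratively merging frequent character pairs and segments by replaying those merges; the greedy matcher above only illustrates the open-vocabulary effect.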