Data creation


Author: Dr. Patrik Lambert, Machine Translation Scientist @ Iconic

“Garbage in, garbage out”: noisy data is a big problem for all machine learning tasks, and MT is no different. By noisy data, we mean bad alignments, poor translations, misspellings, and other inconsistencies in the data used to train the systems. Statistical MT systems are more robust and can cope with up to 10% noise in...

