It has always been fun to play with automatic translators and see what they come up with. Kids are even doing it with Siri and other voice recognition systems. We have come an incredible distance, but still: Google is (well, was until now) translating Russia as “Mordor” and the surname of the Russian Foreign Minister Sergey Lavrov as “sad little horse.” This interesting and slightly humorous information came to us from the BBC in their article, “Google translated Russia to ‘Mordor’ in ‘automated’ error.”
The surname is more properly translated as “Laurels,” except of course in the context of the minister, where it is simply his name. The press is calling this a bug, but it isn’t really. It is the nature of automatic translation generated from the content of millions of documents. The source materials drew heavily on Ukrainian press documents from the time of the annexation of Crimea. The Ukrainian press was full of anger toward Russia, and the printed word carried that slant.
The same is true when we use automatic translation to capture information from an enemy chat stream or press. Those sources are full of innuendo and special word usage. That makes the systems very easy to “trick” or manipulate whenever the subject knows they are being tracked. If they understand the algorithms used to auto-generate the text (Bayesian, neural net, and statistical vectors), then it is easy to lead the systems astray. The intelligence and information communities should remember that the bad guys know how the systems work and have a vested interest in not being tracked. Code words have been used for many years to get around those who would break a system and understand secret missives.
What is the answer? To my mind, controlled vocabularies, cultivated with rule bases, are much more reliable than the automatic systems. Translators backed by large dictionaries are also more reliable than those that depend only on text chatter to infer the definition of a term.
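The contrast can be made concrete with a toy sketch. It is not how Google Translate or any production system works; it only assumes a translator that picks whichever rendering appears most often in its source documents, and a hypothetical curated vocabulary that is consulted first. The corpus counts, the function names, and the English stand-ins for the Ukrainian and Russian terms are all invented for illustration.

```python
from collections import Counter, defaultdict


def train(pairs):
    """Count how often each source term was seen with each translation."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def statistical_translate(table, term):
    """Pick the most frequent translation seen in the corpus, if any."""
    if term not in table:
        return None
    return table[term].most_common(1)[0][0]


def controlled_translate(table, vocabulary, term):
    """Prefer the curated vocabulary; fall back to corpus statistics."""
    if term in vocabulary:
        return vocabulary[term]
    return statistical_translate(table, term)


# Invented corpus: most documents use the slanted rendering.
skewed_corpus = (
    [("Russian Federation", "Mordor")] * 7
    + [("Russian Federation", "Russian Federation")] * 3
)

# Hypothetical controlled vocabulary maintained by human editors.
controlled_vocabulary = {"Russian Federation": "Russian Federation"}

table = train(skewed_corpus)
print(statistical_translate(table, "Russian Federation"))
print(controlled_translate(table, controlled_vocabulary, "Russian Federation"))
```

With these made-up counts, the purely statistical lookup returns the slanted term because it dominates the corpus, while the vocabulary-backed lookup returns the curated one. Anyone who can add enough documents to the corpus can steer the first function; steering the second requires changing the vocabulary itself.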
Google has fixed a bug in the online tool after it began translating “Russian Federation” to “Mordor”. Mordor is the name of a fictional region nicknamed “Land of Shadow” in JRR Tolkien’s Lord of the Rings books. In addition, “Russians” was translated to “occupiers” and the surname of Sergey Lavrov, the country’s Foreign Minister, to “sad little horse”. The errors had been introduced to Google Translate’s Ukrainian-to-Russian service automatically, Google said. The terms mirror language used by some Ukrainians following Moscow’s annexation of Crimea in 2014. Screenshots of the erroneous translations have appeared on social networks in recent days.
‘Automatic translator’
“Google Translate is an automatic translator – it works without the intervention of human translators, using technology instead,” said Google in its statement. Although translations are managed automatically, it is possible for users to suggest alternative translations manually.
However, the BBC understands that this was not how the errors were introduced. Google said that Translate worked by looking for patterns in hundreds of millions of documents but translation remained difficult as the meaning of words was tied to the context in which they were used. “This means that not all translations are perfect, and there will sometimes be mistakes or mistranslations,” the statement added. “We always work to correct these as quickly as possible when they are brought to our attention.” The bug now appears to have been fixed.
Marjorie Hlava, President
Access Innovations, Inc.