By Sandra Bologna
Long ago the world had one language and few words. One day, a group of architects decided to write a manual containing sensitive information on the design of a tower they were building in their city. The tower was to reach the sky and would ultimately determine their greatness. Their pride and confidence took over and they soon ignored their boss. As punishment, their boss scattered the architects across the entire earth and made them all speak different languages. This created much confusion, and so the city was named Babel. Many years passed and no-one could unlock the secrets of the tower, at least until the birth of the great Babel Fish.
What is Babel Fish and why is it so great? Babel Fish belongs to a larger category of translation called Machine Translation. Machine Translation will give you a rough translation of that German document that’s been sitting on your desk baffling you, in less than one minute. How’s that for great?
As amazing as that sounds, Machine Translation is not perfect, and it does have its drawbacks. So how do you know if Machine Translation is right for you? Researching MT software and reading feedback from actual users will help you get the full picture. For starters, I’ve outlined the major points below.
For those of you new to this term, Machine Translation (MT) is the automatic translation of text from one language (source language) into another language (target language) without human intervention. In general, MT use is grouped into two categories. Figuring out which of these two categories best suits your needs is a first step in determining if MT is right for you:
- MT-enabled (Unassisted MT): the automatic translation of text with no human post-editing. This can produce a translation that is unpolished, but is extremely useful for material that would be impossible or inconvenient for human translation due to overwhelming volume, time-consuming nature, immediate turn-around requirements, and/or the expense of human translators.
- MT-enhanced (Assisted MT): the automatic translation of text with the intent of using a human translator for post-editing. Used in the form of Computer-Aided Translation, Assisted MT is useful for creating a base translation for proofreaders, which drastically decreases the amount of time they have to spend translating.
When is MT useful?
MT works well for translations where source documents are controlled, such as technical documents. Controlled authoring avoids ambiguity; clear and concise source text produces clear and concise machine translation. Documents to be machine translated should feature both of these traits. Please see Basic Controlled Authoring Methods: Getting Ready for Machine Translation.
Weather reports and stock market data use controlled authoring. According to Steve Silberman, “The classic example of MT that works is the Météo system, developed in Montreal, which has been translating Canada’s weather bulletins between English and French on a daily basis since 1977. In the world of Météo discourse, ‘front’ always means a weather system.”
Large Repetitious Documents
Large volumes of documents, particularly those with much repetition, are ideal for MT use. Machine Translation systems usually contain terminology dictionaries that can be tailored to fit the subject material and updated and modified as needed. This is a good thing, because constantly updating highly repetitious documents leads to translator attrition. According to Steve Silberman, “The translation of forecasts was so boring that before Météo took over, the Canadian government had a hard time keeping translators on the job for more than a couple of months.”
When Human Translation is Impossible
Extremely large volumes of material with impractical turn-around times where translations must be updated frequently make human translation impossible. As one member of webmasterworld.com wrote, “I run a site full time for a company and we use the machine translation service …90% of our content is dynamically generated each week from a database of about 12000 new products each week so it would be a huge translation job where we’d need full time staff on doing it. The machine translation works quite well for us and gets customers who have no clue of English. We also use the machine translation type text in box for a translation for all email contact with them – even though the translation is vague!”
What is MT used for?
Using MT to get a rough idea of the content of a source text is called ‘gisting’ (from the phrase ‘get the gist of it’). Individuals or corporations who must obtain information from documents in a foreign language use MT for gisting when they do not need an official translation, or to determine whether an official translation is necessary. Gisting is the most popular use of MT today.
Depending on the language, a translator can translate approximately 250 words per hour. Let’s say that you outsource your weather report indicating a sunny forecast to a French translator. Two hours later you receive the translation, but now it’s raining. You outsource again. Let’s face it – data is constantly changing. MT provides translation of real-time data, such as weather reports and stock prices quickly. For real-time information, delays are not acceptable, and the cost of human translation would again be enormous due to the high volume of data.
Think about the dozens of emails you receive and send in one day. Now think about a US company that receives hundreds of emails weekly from an international client in Italy who doesn’t know English. This is only one situation where human translation would be out of the question.
Emails, instant messaging, and chat all require extremely fast turnaround. Translation needs to be immediate and needs to be available 24/7. Since translators cannot produce immediate translation, are not free, and live in different time zones, it is impossible to have these forms of communication translated by human translators. MT is available 24 hours a day regardless of multiple time zones and can produce the high-volume automatic translations necessary for real-time communication. MT for communication purposes also increases privacy of confidential information by eliminating third-parties such as translators and editors. It is ideal for companies working with international vendors who receive emails and data in foreign languages.
Assimilation refers to translating material from a variety of languages into one target language. Translating foreign text into your language is necessary for intelligence gathering. MT allows you to identify what information is relevant in documents written in a foreign language with little to no delay. MT can automatically translate large volumes of material that would be impossible, time-consuming, or prohibitively expensive for human translators.
Dissemination is the need to transform material in one language into several other languages. The traditional process of localization is a prime example. MT for this purpose is used as human-assisted MT. It can speed up the localization process by providing a draft translation for human translators to edit instead of requiring them to start from scratch. Since MT automatically maintains consistency of terminology, it also saves translators time in having to research and check terminology.
Right now you’re probably wondering why you should still bother using human translators; MT easily replaces them, right?
No. MT will not replace human translators. As I mentioned before, MT works well for technical documents because they use controlled authoring, and the MT dictionary can be tailored to their specific terminology. MT does not work as well for literary works. The machine translation of Romeo and Juliet would produce a trainwreck of text, leaving Shakespeare that much more difficult to understand. It is difficult for MT to properly translate such documents because literary texts are not structured and often use word play, metaphors or other non-literal phrases. Human translators, on the other hand, have the ability to grasp the message of the text, and can properly translate the material even if it is conveyed imprecisely.
This is not to say that human translators always create perfect translations, for even the best-qualified translator will not know the source text better than the author. Still, using highly qualified, professional translators will produce better translations than MT software. MT systems have a more limited knowledge of grammar and vocabulary than human translators and MT dictionaries are limited to what developers were able to implement, which is generally much less than what is necessary. It is important to determine what your needs are and what you plan to accomplish with an MT system.
What are the costs of MT?
When you purchase your MT system, the initial costs will be in the license, customization, annual fees, and maintenance fees. Initially, the cost is high, but using MT regularly for repetitious, large volume documents pays off quickly.
For five languages, the initial cost and maintenance could be close to $154,000, but let’s look at the long-term cost. Let’s say that in one year you translated 1,000,000 words. After only the second year of using MT, the total cost for 1,000,000 words would be $116,450 ($100,000 for revision, $7,000 for maintenance, $9,450 for the annual fee) and would take about 250 days to complete. The cost to have the same 1,000,000 words translated by human translators into five languages at a rate of $0.10 per word would be $500,000 and would take about 400 days to complete.
MT runs at a fixed cost independent of volume; this means you can end up saving money over time due to reduced translation cost, reduced delivery time, around the clock availability, and consistency in terminology.
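The cost figures above can be checked with a few lines of arithmetic. This is a minimal sketch that reads the $116,450 as the MT cost for one year's 1,000,000 words; the function and parameter names are illustrative and do not come from any pricing tool.

```python
def mt_cost(revision, maintenance, annual_fee):
    """MT cost in dollars for one year's volume, using the article's breakdown."""
    return revision + maintenance + annual_fee


def human_cost(words, languages, cents_per_word):
    """Human translation cost in dollars (integer cents avoid float rounding)."""
    return words * languages * cents_per_word // 100


mt = mt_cost(revision=100_000, maintenance=7_000, annual_fee=9_450)
human = human_cost(words=1_000_000, languages=5, cents_per_word=10)

print(mt)          # 116450
print(human)       # 500000
print(human - mt)  # 383550
```

On these numbers the yearly difference is $383,550, which is why the initial outlay of roughly $154,000 can be recovered quickly at this volume.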
Most commercial MT systems are Transfer-based MT systems. This type of MT lets linguists build grammar rules for the system. The system can then analyze the source language text, map grammatical structures to the target language, and then generate the translation.
However, Transfer-based systems are time-consuming and expensive to develop. When the rules have not yet been developed, poor analysis of sentences will result. Also, this approach can take up to two years to develop since it is knowledge-intensive.
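The analyze, map, and generate steps described above can be illustrated with a toy English-to-French example. This is only a sketch under strong assumptions: the five-word lexicon, the single reordering rule (adjective before noun becomes noun before adjective), and all function names are invented for illustration, and real systems need far more rules, including adjective agreement and adjectives that precede the noun in French.

```python
# Toy lexicon: English word -> (part of speech, French word, noun gender).
LEXICON = {
    "the": ("DET", None, None),
    "cat": ("NOUN", "chat", "m"),
    "car": ("NOUN", "voiture", "f"),
    "red": ("ADJ", "rouge", None),
    "yellow": ("ADJ", "jaune", None),
}
DETERMINERS = {"m": "le", "f": "la"}


def analyze(sentence):
    """Analysis: tag each source word; fail if no rule covers a word."""
    tokens = []
    for word in sentence.lower().split():
        if word not in LEXICON:
            raise KeyError(f"no lexicon entry for {word!r}")
        tokens.append((word, LEXICON[word][0]))
    return tokens


def transfer(tokens):
    """Transfer: map English ADJ NOUN order to French NOUN ADJ order."""
    out = list(tokens)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "ADJ" and out[i + 1][1] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2
        else:
            i += 1
    return out


def generate(tokens):
    """Generation: emit French words; 'the' agrees with the next noun's gender."""
    words = []
    for i, (word, pos) in enumerate(tokens):
        if pos == "DET":
            gender = next(LEXICON[w][2] for w, p in tokens[i + 1:] if p == "NOUN")
            words.append(DETERMINERS[gender])
        else:
            words.append(LEXICON[word][1])
    return " ".join(words)


def translate(sentence):
    return generate(transfer(analyze(sentence)))


print(translate("the red car"))     # la voiture rouge
print(translate("The yellow cat"))  # le chat jaune
```

A word outside the lexicon makes `analyze` fail outright, which mirrors the point that sentences are handled poorly until the relevant rules have been written.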
Another type of MT system is Data-driven MT. Only a few commercial MT systems use this method. This method uses statistical methods to calculate which parts of the source and target languages match by gathering large numbers of example translations. The dictionary and translation correspondences are built automatically since text can range from single words to entire sentences. This method may only take a few weeks to develop, but the output is generally of lesser quality.
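To show how a dictionary might be built automatically from example translations, here is a minimal sketch that scores word pairs by how often they appear together in aligned sentences (the Dice coefficient). The four-sentence corpus and the function name are assumptions made for illustration; commercial data-driven systems use much larger corpora and more elaborate statistical models.

```python
from collections import Counter
from itertools import product


def build_dictionary(pairs):
    """Pick, for each source word, the target word it co-occurs with most strongly."""
    src_count, tgt_count, both = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        both.update(product(s_words, t_words))

    dictionary = {}
    for s in src_count:
        def dice(t):
            # Dice coefficient: 2 * joint count / (count of s + count of t).
            return 2 * both[(s, t)] / (src_count[s] + tgt_count[t])
        dictionary[s] = max(tgt_count, key=dice)
    return dictionary


# Hypothetical example translations (English, French).
examples = [
    ("the cat", "le chat"),
    ("the dog", "le chien"),
    ("a cat", "un chat"),
    ("a dog", "un chien"),
]
dictionary = build_dictionary(examples)
print(dictionary["cat"])  # chat
print(dictionary["the"])  # le
```

No linguist wrote a rule here; the correspondences come entirely from the examples, which is why this approach is faster to develop and also why its quality depends on how much example data is available.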
It is also important to realize that MT systems cannot handle every language combination. Generally, MT systems can translate common language combinations such as French to German or English to French. But rarer language combinations such as Japanese to Swahili have not been developed.
Basic Controlled Authoring Methods: Getting Ready for Machine Translation
Have you decided to buy a Machine Translation system, but can’t produce good translations from your new purchase? MT requires a controlled authoring writing style. Here are a few points on using MT efficiently.
- The most important rule for MT writing is: limit sentence length. Sentences longer than 25 words often become ambiguous and too complex for MT to translate correctly. Keeping sentences short will improve the quality of the output.
- Avoid metaphors, jokes, slang, puns, idiomatic expressions and regional or national expressions. Since these are often translated literally, they tend to lose their meaning, creating an unintelligible translation. The literal translation of ‘break a leg’, for example, will not make sense to the target reader.
Instead of: “You say that your sales will increase by 10 times by the end of this year? Don’t count your chickens before they hatch.”
Use: “You say that your sales will increase by 10 times by the end of the year? Do not be too confident. Wait until you get the final results.”
Instead of: “Don’t get me wrong; I love sports, but I hate basketball.”
Use: “Do not misunderstand me; I love sports, but I hate basketball.”
- Avoid abbreviations, acronyms, contractions, and common Latin terms (etc., i.e., e.g.) as these do not always have equivalents in different languages. Spell out the entire word instead. Machine Translation systems do not always recognize abbreviations and will leave them untranslated.
Instead of: Sr, Jr, FDA, TV, etc.
Use: Senior, Junior, Food and Drug Administration (FDA), Television, et cetera
- Keep pronouns to a minimum. The meaning of pronouns can be lost after translation because different languages use different word orders and gender-specific languages may use different genders for certain objects. For example, in French ‘il’ could mean ‘he’ or ‘it’, so your subject may be unclear to a French reader. Replace pronouns with nouns wherever possible.
Instead of: He is interesting. It is interesting.
When translated into French, this becomes: Il est intéressant. Il est intéressant.
Use: Marc is interesting. The book is interesting.
This avoids ambiguity when translated, becoming: Marc est intéressant. Le livre est intéressant.
- Use simple, direct sentences with basic grammatical construction. Ensure that the sentence structure is grammatically correct and do not omit words.
Instead of: Make sure you use grammatically correct sentence structure.
Use: Make sure that you use grammatically correct sentence structure.
- Avoid ambiguity. To produce a clear translation, reduce the number of words and sentences with multiple meanings.
Words: The word ‘right’ can mean ‘correct’ or ‘right’ in terms of direction (right or left).
Sentences: The sentence ‘They fed her dog biscuits’ can be understood as ‘she was fed dog biscuits by them’ or ‘her dog was fed biscuits by them’.
Instead of: ‘They fed her dog biscuits’ to mean ‘her dog was fed biscuits by them’
Use: They fed biscuits to her dog.
Instead of: ‘They fed her dog biscuits’ to mean ‘she was fed dog biscuits by them’
Use: They fed her some dog biscuits.
- Avoid compound verbs as they are often mistranslated. Use a thesaurus to simplify uncommon usages.
- Use the International Standard Date Format (ISO 8601) for writing dates. Date order varies from country to country, but the standard numerical year-month-day (YYYY-MM-DD) format will eliminate problems arising from translating dates.
- Use the infinitive form of the verb rather than present participles because present participles do not always have equivalents in all languages.Instead of: Click here for selecting the icons and viewing the images.
Use: Click here to select the icons and to view the images.
- Include a list for the translator of all words that should remain in the source language. These can be anything from proper names and titles to product or company names.
- After completing the source document, run a draft through the machine translation and back into the source language to see where problems may be occurring.
Following the above points will prevent many common translation problems from occurring.
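Some of these points can be checked mechanically before a document goes to MT. The following is a minimal sketch, not a complete controlled-authoring tool: it flags sentences over 25 words, contractions, and the Latin abbreviations named above, and formats a date as year-month-day. The function names and regular expressions are illustrative assumptions, the sentence splitter is naive, and forms ending in ’s are skipped because they are usually possessives.

```python
import re
from datetime import date

MAX_WORDS = 25  # sentence length limit suggested in the list above
CONTRACTION = re.compile(r"\b\w+['’](?:t|re|ve|ll|d|m)\b", re.IGNORECASE)
LATIN = re.compile(r"\b(?:etc|i\.e|e\.g)\.", re.IGNORECASE)


def check_text(text):
    """Return a list of controlled-authoring problems found in text."""
    problems = []
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        n = len(sentence.split())
        if n > MAX_WORDS:
            problems.append(f"long sentence ({n} words)")
    for match in CONTRACTION.finditer(text):
        problems.append(f"contraction: {match.group()}")
    for match in LATIN.finditer(text):
        problems.append(f"Latin abbreviation: {match.group()}")
    return problems


def iso_date(year, month, day):
    """Format a date as YYYY-MM-DD."""
    return date(year, month, day).isoformat()


print(check_text("Don't count your chickens before they hatch."))
# ["contraction: Don't"]
print(check_text("Do not be too confident."))
# []
print(iso_date(2005, 3, 9))
# 2005-03-09
```

A check like this catches only surface patterns. Ambiguity, idioms, and unclear pronouns still need a human reader, and the round-trip test from the last point remains worthwhile.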
Machine Translation, though useful in certain cases, is still not, and may never be, the one-size-fits-all solution for translation needs. Any translation used for commercial or professional purposes must be at the very least checked and double-checked by human translators, if not translated by human translators altogether. For those other cases where the benefits of using MT far outweigh the drawbacks, MT may be that key that unlocks the mystery of languages. And so, as the story goes, with a little help from the Fish, architects all across the globe were able to read and understand the secrets of the tower and climb to the top.