 

December 2004

   Babble Not – Machine Translation for the Technical Communicator

by Sandra Bologna

Long ago the world had one language and few words. One day, a group of architects decided to write a manual containing sensitive information on the structures of a tower they were building in their city. The tower was to reach the sky and would ultimately determine their greatness. Their pride and confidence took over and they soon ignored their boss. As punishment, their boss scattered the architects across the entire earth and made them all speak different languages where no one architect could understand the other. This created much confusion; and so the city was named Babel. Many years went by, but still confusion followed and since the manual was only written in one language, no one could unlock the secrets of the tower, until one day, the great Babel Fish was born.

What is Babel Fish and why is it so great? Babel Fish belongs to a larger category of translation called Machine Translation. Machine Translation can give you a rough translation of that German document that’s been sitting on your desk baffling you, in less than a minute. How’s that for great?

As amazing as that sounds, not everything is perfect, and Machine Translation does have its drawbacks. So how do you know if Machine Translation is right for you? Researching MT software and reading feedback from actual users from various sites on the web will help in getting the full picture on MT. For starters, I’ve outlined the major points below.

For those of you who are new to the term, Machine Translation (MT) is the automatic translation of text from one language (the source language) into another (the target language) without human intervention.

In general, MT use is grouped into two categories. Figuring out which of these two categories best suits your needs is a first step in determining if MT is right for you:


1) MT-enabled applications: Also referred to as Unassisted MT, this is the automatic translation of text with no human post-editing. The output can be unpolished, but it is extremely useful for material that would be impossible or impractical to translate by hand because of sheer volume, the time involved, the need for immediate turnaround, or the expense of human translators.

2) MT-enhanced applications: Also referred to as Assisted MT, this is the automatic translation of text with the intent of having a human translator post-edit the result. Used as a form of Computer-Aided Translation, it creates a base translation for proofreaders, which allows them to spend less time translating the document and more time fine-tuning it.


When is MT useful?

  • Controlled environment
    MT works well when source documents are controlled, as technical documents and data often are. Controlled authoring avoids ambiguity: clear and concise source text produces clear and concise machine translation. Documents intended for machine translation should be written using controlled English authoring. Please see Basic controlled authoring methods: Getting Ready for Machine Translation.

    Weather reports and stock market data use controlled English authoring. “The classic example of MT that works is the Météo system, developed in Montreal, which has been translating Canada's weather bulletins between English and French on a daily basis since 1977. In the world of Météo discourse, ‘front’ always means a weather system.” (Steve Silberman, “Talking to Strangers,” Wired, May 2000)

  • Large documents with repetitions
    Large volumes of documents, particularly those with repetition, are ideal for MT. You would benefit from using MT for documents that are consistent throughout, such as technical documents. Some MT systems include terminology dictionaries that can be tailored to the subject material and updated as needed. Highly repetitive documents that continually have to be updated are boring and monotonous for human translators to work on. “The translation of forecasts was so boring that before Météo took over, the Canadian government had a hard time keeping translators on the job for more than a couple of months.” (Steve Silberman, “Talking to Strangers,” Wired, May 2000)

  • When human translation is impossible
    Extremely large volumes of material, and text with impractical turnaround times where translations must be up to date and relatively ‘automatic’, make human translation impossible. As one member of webmasterworld.com wrote, “I run a site full time for a company and we use the machine translation service - I describe it on the site as a 'vague translation' - 90% of our content is dynamically generated each week from a database of about 12000 new products each week so it would be a huge translation job where we'd need full time staff on doing it. The machine translation works quite well for us and gets customers who have no clue of English. We also use the machine translation type text in box for a translation for all email contact with them - even though the translation is vague!”


What is MT used for?

  • Gisting
    Using MT to get a rough idea of what a source text says is referred to as ‘gisting’, and it is the most common form of translation in terms of volume. Individuals or corporations who must obtain information from documents in a foreign language, without needing an official translation, use MT to get the ‘gist’ of the source content. This gives them a general idea of what the document is about and lets them determine whether an actual human translation is necessary.

  • Real-Time Translation
    Depending on the language, a translator can translate approximately 250 words per hour. Suppose you outsource your weather report, indicating a sunny forecast, to a French translator. Two hours later you receive the translation, only now it’s raining. You outsource again. Let’s face it: data is constantly changing. MT provides automatic translation of real-time data, such as weather reports, data feeds, and stock prices. For real-time information, delays are not acceptable, and the cost of human translation would again be enormous due to the high volume of data.

  • Communication
    Think about the dozens of emails you receive and send in one day. Now think about a US company that receives hundreds of emails weekly from an international client in Italy who doesn’t speak its language. This is only one situation where human translation would be out of the question.

    Emails, instant messaging, and chat all require extremely fast turnaround. Translation needs to be immediate and available 24/7. Since translators cannot produce immediate translation, are not free, and live in different time zones, it is impossible to have these forms of communication translated by human translators. MT is available 24 hours a day regardless of time zone and can produce automatic high-volume translations, which is necessary for real-time communication. MT for communication purposes also increases the privacy of confidential information by eliminating third or fourth parties such as translators and editors. This makes it ideal for companies working with international vendors who receive emails and data in foreign languages.

  • Assimilation
    Assimilation refers to translation in which an individual or organization wants to convert material that is in a variety of languages into their own language. Translating foreign text into your language is necessary for intelligence gathering. MT allows you to identify what information is relevant in documents written in a foreign language with little to no delay. MT can automatically translate large volumes of material that would often be impossible, time-consuming, or expensive for human translators to handle.

  • Dissemination
    Dissemination is the need to communicate your own material, written in one language, to the world in a variety of other languages. The traditional process of localization is a prime example. MT for this purpose is used as human-assisted MT. It can speed up the localization process by providing a draft translation for human translators to edit instead of starting from scratch. Since MT automatically maintains consistency of terminology, it also saves translators the time of researching and checking terminology.

Right now you’re probably wondering why you should even bother with human translators. After all, MT easily replaces them, right?

No. MT will not replace human translators. As I mentioned before, MT works well for text such as technical documents because they are written using controlled authoring, and the MT dictionary can be tailored to their highly repetitive and specific terminology. MT does not work well for literary works, however. The machine translation of ‘Wherefore art thou Romeo’, for instance, would produce an entirely unintelligible rendering of Romeo and Juliet, leaving Shakespeare that much more complicated to understand. It is difficult for MT to translate such documents properly, since literary texts are not structured and often follow a free style of writing. Human translators, on the other hand, can grasp the message of the text and translate the material properly even if it is conveyed very poorly on paper.

This is not to say that human translators always create perfect translations; after all, the author knows best. However qualified the translator or however developed the MT system, no one will know the source text better than the author. Still, highly qualified, professional translators will produce better translations than MT software. MT systems have a more limited knowledge of grammar and vocabulary than a human translator does, and MT dictionaries are limited to what developers were able to implement, which is generally much less than what is necessary. It is important to determine what your needs are and what you plan to accomplish with an MT system.

What are the costs of MT?

When you purchase an MT system, the initial costs are the licence, customization, and annual and maintenance fees. The initial cost is large, but if you plan to use MT regularly for repetitious, large-volume documents, the cost per translation falls over the long term.

For five languages, the initial cost and maintenance could be close to $154,000, but let’s look at the long-term cost. Let’s say that you translate 1,000,000 words in one year. By the second year of using your MT system, the total cost of translating 1,000,000 words into 5 languages using MT software would be $116,450 ($100,000 for revision, $7,000 for maintenance, $9,450 for the annual fee), and the work would take about 250 days to complete. The cost to have the same 1,000,000 words translated by human translators into 5 languages at a rate of $0.10 per word would be $500,000, and the work would take about 400 days to complete.
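If you would like to check the arithmetic or plug in your own figures, the comparison can be sketched in a few lines of Python. The numbers are the ones quoted above; the variable names are only illustrative, and the sketch assumes that the human rate applies to every word in every target language.

```python
# Figures quoted in the article; all names here are illustrative.
WORDS_PER_YEAR = 1_000_000
TARGET_LANGUAGES = 5

# Yearly MT costs from the second year onward, in dollars.
MT_REVISION = 100_000
MT_MAINTENANCE = 7_000
MT_ANNUAL_FEE = 9_450

# Human translation rate, kept in cents to avoid floating-point rounding.
HUMAN_RATE_CENTS_PER_WORD = 10

mt_cost = MT_REVISION + MT_MAINTENANCE + MT_ANNUAL_FEE
human_cost = WORDS_PER_YEAR * TARGET_LANGUAGES * HUMAN_RATE_CENTS_PER_WORD // 100
yearly_saving = human_cost - mt_cost

print(f"MT:     ${mt_cost:,}")
print(f"Human:  ${human_cost:,}")
print(f"Saving: ${yearly_saving:,}")
```

With the article's figures, the yearly difference comes to $383,550. Change the word count or the rate to see how the comparison shifts for your own volume.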

MT runs at a fixed cost independent of volume. This means you can end up saving money in the long run through reduced translation cost, reduced delivery time, around-the-clock availability, consistent terminology, and improved documentation.

Most commercial MT systems are Transfer-Based MT systems. This type of MT lets linguists build grammar rules for the system. The system can then analyze the source language text, map grammatical structures to the target language and then generate the translation.

The drawbacks of this method are that it is time-consuming and expensive to develop this system. When the rules have not yet been developed, poor analysis of sentences will result. Also, this approach can take up to two years to develop since it is rule-based and knowledge-intensive.

Another type of MT system is Data-Driven MT. Only a few commercial MT systems use this approach. It gathers large numbers of example translations together and uses statistical methods to calculate which parts of the source language go with which parts of the target language. The dictionary and translation correspondences are built automatically, and the matched text can range from single words to entire sentences. This method may only take a few weeks to develop, but it is time-consuming and the output is generally not as good.

It is also important to realize that MT systems cannot handle every language combination. Generally, MT systems can translate specific language combinations such as French to German or English to French. Rarer language combinations, such as Japanese to Swahili, have not been developed.
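To make the idea of "calculating which parts go with which" a little more concrete, here is a toy sketch in Python. It is not how any commercial system works: it simply counts how often words appear together in a handful of made-up example translations and pairs each source word with the target word it is most strongly associated with (using the Dice coefficient, which is my choice for the illustration). The function name and the example sentences are assumptions, not taken from any real product.

```python
from collections import Counter
from itertools import product


def learn_word_pairs(examples):
    """Guess one target word per source word from example translations."""
    source_count, target_count, pair_count = Counter(), Counter(), Counter()
    for source_sentence, target_sentence in examples:
        source_words = set(source_sentence.lower().split())
        target_words = set(target_sentence.lower().split())
        source_count.update(source_words)
        target_count.update(target_words)
        # Every source word co-occurs with every target word in the pair.
        pair_count.update(product(source_words, target_words))

    best = {}
    for (source, target), together in pair_count.items():
        # Dice coefficient: 1.0 means the two words always appear together.
        score = 2 * together / (source_count[source] + target_count[target])
        if source not in best or score > best[source][1]:
            best[source] = (target, score)
    return {source: target for source, (target, _) in best.items()}


# Made-up English-French example translations.
examples = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
]
dictionary = learn_word_pairs(examples)
print(dictionary)
```

Three sentence pairs are enough here because the examples were chosen to be unambiguous. Real text is far messier, which is why this approach depends on very large collections of example translations.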


Basic controlled authoring methods: Getting Ready for Machine Translation.

So you’ve decided to buy an MT system, but you can’t seem to produce good translations from your new purchase? MT requires a controlled authoring writing style. Here are a few points on how to use MT efficiently.

  1. One of the most important rules for MT writing is to limit sentence length. Sentences longer than 25 words often become ambiguous and too complex for MT to correctly translate. Keeping sentences to a minimum word length will improve the quality of the output.

  2. Avoid metaphors, jokes, slang, puns, idiomatic expressions and regional or national expressions. Since these are often translated literally, they tend to lose their meaning, creating an unintelligible translation. To the target speaker, the literal translation of ‘break a leg’, for example, will not make sense.

    Examples:

    Instead of: “You say that your sales will increase by 10 times by the end of this year? Don’t count your chickens before they hatch.”
    Use: “You say that your sales will increase by 10 times by the end of this year? Do not be too confident; wait until you get the final results.”

    Instead of: “Don’t get me wrong; I love sports, but I hate basketball.”
    Use: “Do not misunderstand me; I love sports, but I hate basketball.”

  3. Try not to use abbreviations, acronyms, contractions, and common Latin terms (etc., i.e., e.g.), as these do not always have equivalents in different languages. Spell out the entire word instead. MT systems do not always recognize abbreviations and will leave them untranslated.

    Examples:

    Instead of: Sr, Jr, FDA, TV, etc.,
    Use: Senior, Junior, Food and Drug Administration (FDA), Television, et cetera

  4. Try to keep pronouns to a minimum. The meaning of pronouns can be lost after translation since different languages use different word orders and gender specific languages may use different genders for certain objects. For example, in French ‘il’ could mean ‘he’ or ‘it’, so it may be unclear to a French reader what you are referring to. Try replacing pronouns with nouns instead.

    Example:

    Instead of: He is interesting. It is interesting.
    Which when translated into French becomes: Il est intéressant. Il est intéressant.
    Use: Marc is interesting. The book is interesting.
    Which avoids ambiguity when translated, becoming: Marc est intéressant. Le livre est intéressant.

  5. Use plain English. Use sentence structure that is grammatically correct and do not use words that can be omitted. Use simple, direct sentences with basic grammatical construction.

    Example:

    Instead of: Make sure you use grammatically correct sentence structure.
    Use: Make sure that you use grammatically correct sentence structure.

  6. Avoid ambiguity. To produce a clear translation, reduce the number of words and sentences that can have multiple meanings.

    Examples:

    Words: The word ‘right’ can mean ‘correct’ or ‘right’ in terms of direction (right or left).
    Sentences: The sentence ‘They fed her dog biscuits’ can be understood as ‘she was fed dog biscuits by them’ or ‘her dog was fed biscuits by them’.

    Instead of: ‘They fed her dog biscuits’ to mean ‘her dog was fed biscuits by them’
    Use: They fed biscuits to her dog.

    Instead of: ‘They fed her dog biscuits’ to mean ‘she was fed dog biscuits by them’
    Use: They fed her some dog biscuits.

  7. Avoid compound verbs as they are often mistranslated. Use a thesaurus to simplify uncommon usages.

  8. Use the International Standard Date Format (ISO 8601) for writing dates. Since date order varies from country to country, the standard numerical year-month-day (YYYY-MM-DD) format avoids the problems that can arise from translating dates.

  9. Use the infinitive form of the verb rather than present participles since present participles do not always have equivalents in all languages.

    Example:

    Instead of: Click here for selecting the icons and viewing the images.
    Use: Click here to select the icons and to view the images.

  10. Include a list for the translator of all words that should remain in English. These can be anything from proper names and titles to product or company names.

  11. After completing the source document, run a draft through the machine translation and back into the source language to see where problems may be occurring.
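If your documents are generated by software, point 8 above is easy to follow. As a small sketch, Python's standard `datetime` module writes and reads the year-month-day format directly. The date used here is only an example.

```python
from datetime import date

# An example date; any date works the same way.
release = date(2004, 12, 31)

# Write the date in year-month-day order.
iso_text = release.isoformat()
print(iso_text)  # 2004-12-31

# Read the same format back into a date object.
parsed = date.fromisoformat("2004-12-31")
print(parsed == release)  # True
```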


Following these basic points will solve many of the problems before the document reaches the translation phase, which will greatly reduce time and stress!
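Some of these points can be checked automatically before a document goes anywhere near the MT system. The Python sketch below is a rough illustration, not a finished tool: it flags sentences longer than 25 words (point 1) and a few Latin abbreviations and common contractions (point 3). The function names, the sentence-splitting rule, and the list of contraction endings are my own assumptions, and the splitter is deliberately naive, so treat its warnings as hints to review by hand.

```python
import re

MAX_WORDS = 25
LATIN_ABBREVIATIONS = ("etc.", "i.e.", "e.g.")
# Common contraction endings only; possessives such as "Canada's" are not flagged.
CONTRACTION = re.compile(r"\b\w+(?:n[’']t|[’'](?:re|ve|ll|m|d))\b", re.IGNORECASE)


def split_sentences(text):
    """Naive split: a sentence ends at '.', '!' or '?' followed by whitespace."""
    return [part.strip() for part in re.split(r"(?<=[.!?])\s+", text) if part.strip()]


def check_text(text):
    """Return a list of warnings for the controlled-authoring points above."""
    warnings = []
    lowered = text.lower()
    for abbreviation in LATIN_ABBREVIATIONS:
        if abbreviation in lowered:
            warnings.append(f"Latin abbreviation: {abbreviation}")
    for match in CONTRACTION.finditer(text):
        warnings.append(f"contraction: {match.group(0)}")
    for sentence in split_sentences(text):
        word_count = len(sentence.split())
        if word_count > MAX_WORDS:
            warnings.append(f"long sentence ({word_count} words): {sentence[:40]}...")
    return warnings


# A made-up sample: one contraction, one Latin abbreviation, one 26-word sentence.
sample = "Don't delete the file, e.g. the log. " + " ".join(["word"] * 26) + "."
for warning in check_text(sample):
    print(warning)
```

Because the splitter also breaks at abbreviations such as "e.g.", word counts near the limit can be slightly off. Spelling those abbreviations out, as point 3 recommends, removes the problem.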


 

  Languages used in the U.S./Canada to access Internet websites


Many of the people who live in the U.S. or Canada access the Internet in languages other than English (that is, when they access it from home; at work they most likely access the Net in English). Here are some figures for Americans who spoke languages other than English at home (2000 U.S. Census figures):

· Spanish: 26.7 M
· Chinese: 2.0 M
· French: 1.4 M
· German: 1.2 M
· Tagalog: 1.1 M
· Korean: 894 K
· Italian: 880 K
· Russian: 684 K
· Polish: 654 K
· Arabic: 596 K
· Portuguese: 582 K
· Japanese: 468 K
· Greek: 309 K
· Farsi: 287 K
· Hebrew: 189 K
· Scandinavian languages: 139 K

 
  TRANSLATION JOKES - English butchery across the World


- In a Bucharest hotel lobby: The lift is being fixed for the next day. During that time we regret that you will be unbearable.

- In a Leipzig elevator: Do not enter the lift backwards, and only when lit up.

- In a Belgrade hotel elevator: To move the cabin, push button for wishing floor. If the cabin should enter more persons, each one should press number of wishing floor. Driving is then going alphabetically by national order.

- In a Paris hotel elevator: Please leave your values at the front desk.

- In a hotel in Athens: Visitors are expected to complain at the office between the hours of 9 and 11 A.M. daily.

- In the lobby of a Moscow hotel across from a Russian Orthodox monastery: You are welcome to visit the cemetery where famous Russian and Soviet composers, artists, and writers are buried daily except Thursday.

- On the menu of a Swiss restaurant: Our wines leave nothing to hope for.

- From a Japanese information booklet about using a hotel air conditioner: Cooles and Heates: If you want just condition of warm in your room, please control yourself.

 
 



About wintranslation.com: We are a technical translation bureau based in Canada.

 

 


copyright 2008 - wintranslation.com - Translation services