Babel Not: Machine Translation for the Technical Communicator
By Sandra Bologna
Long ago the world had one language and few words. One day,
a group of architects decided to write a manual containing sensitive
information on the design of a tower they were building in their
city. The tower was to reach the sky and would ultimately determine
their greatness. Their pride and confidence took over and they
soon ignored their boss. As punishment, their boss scattered
the architects across the entire earth and made them all speak
different languages. This created much confusion, and so the
city was named Babel. Many years passed and no one could unlock
the secrets of the tower, at least until the birth of the great
Babel Fish.
What is Babel Fish and why is it so great? Babel Fish belongs
to a larger category of translation called Machine Translation.
Machine Translation will give you a rough translation of that
German document that's been sitting on your desk baffling
you, in less than one minute. How's that for great?
As amazing as that sounds, Machine Translation is not perfect,
and it does have its drawbacks. So how do you know if Machine
Translation is right for you? Researching MT software and
reading feedback from actual users will help you get the full
picture. For starters, I've outlined the major points
below.
For those of you new to this term, Machine Translation (MT)
is the automatic translation of text from one language (source
language) into another language (target language) without
human intervention. In general, MT use is grouped into two
categories. Figuring out which of these two categories best
suits your needs is a first step in determining if MT is right
for you:
- MT-enabled (Unassisted MT): the automatic translation
of text with no human post-editing. This can produce a translation
that is unpolished, but is extremely useful for material
that would be impossible or inconvenient for human translation
due to overwhelming volume, time-consuming nature, immediate
turn-around requirements, and/or the expense of human translators.
- MT-enhanced (Assisted MT): automatically translating
text with the intent of using a human translator for post-editing.
Used in the form of Computer-Aided Translation, Assisted
MT is useful for creating a base translation for proofreaders,
which drastically decreases the amount of time they have
to spend translating.
MT works well for translations where source documents are
controlled, such as technical documents. Controlled authoring
avoids ambiguity; clear and concise source text produces clear
and concise machine translation. Documents to be machine translated
should feature both of these traits. Please see Basic
Controlled Authoring Methods: Getting Ready for Machine Translation.
Weather reports and stock market data use controlled authoring.
According to Steve Silberman, "The classic example of
MT that works is the Météo system, developed
in Montreal, which has been translating Canada's weather bulletins
between English and French on a daily basis since 1977. In
the world of Météo discourse, front
always means a weather system."
Large volumes of documents, particularly those with much
repetition, are ideal for MT use. MT systems usually
contain terminology dictionaries that can be tailored to fit
the subject material and updated and modified as needed. This
is a good thing, because constantly updating highly repetitious
documents leads to translator attrition. According to Steve
Silberman, "The
translation of forecasts was so boring that before Météo
took over, the Canadian government had a hard time keeping
translators on the job for more than a couple of months."
Extremely large volumes of material with impractical turn-around
times where translations must be updated frequently make human
translation impossible. As one member of webmasterworld.com
wrote "I run a site full time for a company and we use
the machine translation service
90% of our content is
dynamically generated each week from a database of about 12000
new products each week so it would be a huge translation job
where we'd need full time staff on doing it. The machine translation
works quite well for us and gets customers who have no clue
of English. We also use the machine translation type text
in box for a translation for all email contact with them -
even though the translation is vague!"

What is MT used for?
Using MT to obtain a rough idea of the source text
content is called "gisting" (from the phrase "get
the gist of it"). Individuals or corporations who must
obtain information from documents in a foreign language use
MT for gisting when they don't need an official
translation, or to determine whether an official translation is
necessary. Gisting is the most popular use of MT today.
Depending on the language, a translator can translate approximately
250 words per hour. Let's say that you outsource your
weather report indicating a sunny forecast to a French translator.
Two hours later you receive the translation, but now it's
raining. You outsource again. Let's face it: data is
constantly changing. MT quickly translates real-time
data, such as weather reports and stock prices. For
real-time information, delays are not acceptable, and the
cost of human translation would again be enormous due to the
high volume of data.
Think about the dozens of emails you receive and send in
one day. Now think about a US company that receives hundreds
of emails weekly from its international client in Italy
who doesn't know English. This is only one
case where human translation would be out of the question.
Emails, instant messaging, and chat all require extremely
fast turnaround. Translation needs to be immediate and needs
to be available 24/7. Since translators cannot produce immediate
translation, are not free, and live in different time zones,
it is impossible to have these forms of communication translated
by human translators. MT is available 24 hours a day regardless
of multiple time zones and can produce the high-volume automatic
translations necessary for real-time communication. MT for
communication purposes also increases privacy of confidential
information by eliminating third-parties such as translators
and editors. It is ideal for companies working with international
vendors who receive emails and data in foreign languages.
Assimilation refers to translating material from a variety
of languages into one target language. Translating foreign
text into your language is necessary for intelligence gathering.
MT allows you to identify what information is relevant in
documents written in a foreign language with little to no
delay. MT can automatically translate large volumes of material
that would be impossible, time-consuming, or prohibitively
expensive for human translators.
Dissemination is the need to transform material in one language
into several other languages. The traditional process of localization
is a prime example. MT for this purpose is used as human-assisted
MT. It can speed up the localization process by providing
a draft translation for human translators to edit instead
of requiring them to start from scratch. Since MT automatically
maintains consistency of terminology, it also saves translators
time in having to research and check terminology.
Right now you're probably wondering why you should
still bother using human translators; MT easily replaces them,
right?
No. MT will not replace human translators. As I mentioned
before, MT works well for technical documents because they
use controlled authoring, and the MT dictionary can be tailored
to their specific terminology. MT does not work as well for
literary works. The machine translation of Romeo and Juliet
would produce a trainwreck of text, leaving Shakespeare that
much more difficult to understand. It is difficult for MT
to properly translate such documents because literary texts
are not structured and often use word play, metaphors or other
non-literal phrases. Human translators, on the other hand,
have the ability to grasp the message of the text, and can
properly translate the material even if it is conveyed imprecisely.
This is not to say that human translators always create
perfect translations, for even the best-qualified translator
will not know the source text better than the author. Still,
using highly qualified, professional translators will produce
better translations than MT software. MT systems have a more
limited knowledge of grammar and vocabulary than human translators
and MT dictionaries are limited to what developers were able
to implement, which is generally much less than what is necessary.
It is important to determine what your needs are and what
you plan to accomplish with an MT system.
When you purchase your MT system, the initial costs will
be in the license, customization, annual fees, and maintenance
fees. Initially, the cost is high, but using MT regularly
for repetitious, large volume documents pays off quickly.
For five languages, the initial cost and maintenance could
be close to $154,000, but let's look at the long-term
cost. Let's say that in one year you translated 1,000,000
words. After only the second year of using MT, the total cost
for 1,000,000 words would be $116,450 ($100,000 for revision,
$7,000 for maintenance, $9,450 for the annual fee) and would
take about 250 days to complete. The cost to have the same
1,000,000 words translated by human translators into five
languages at a rate of $0.10 per word would be $500,000 and
would take about 400 days to complete.
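The arithmetic behind that comparison can be checked with a short sketch. The dollar figures are the ones quoted in this article; the function and constant names are only illustrative, and the rate is held in cents so the sums stay in whole numbers.

```python
# Figures quoted in the article; all names here are illustrative.
WORDS = 1_000_000
LANGUAGES = 5
HUMAN_RATE_CENTS_PER_WORD = 10   # $0.10 per word

MT_REVISION = 100_000            # human revision of MT output, in dollars
MT_MAINTENANCE = 7_000
MT_ANNUAL_FEE = 9_450


def human_cost(words, languages, rate_cents):
    """Cost in dollars of translating every word into every language."""
    return words * languages * rate_cents // 100


def mt_yearly_cost(revision, maintenance, annual_fee):
    """Recurring MT cost in dollars once the system is in place."""
    return revision + maintenance + annual_fee


human = human_cost(WORDS, LANGUAGES, HUMAN_RATE_CENTS_PER_WORD)
mt = mt_yearly_cost(MT_REVISION, MT_MAINTENANCE, MT_ANNUAL_FEE)

print(f"Human translation: ${human:,}")        # Human translation: $500,000
print(f"MT with revision:  ${mt:,}")           # MT with revision:  $116,450
print(f"Difference:        ${human - mt:,}")   # Difference:        $383,550
```

Swap in your own word count, language count, and vendor quotes to see where the break-even point falls for your documents.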
MT runs at a fixed cost independent of volume; this means
you can end up saving money over time due to reduced translation
cost, reduced delivery time, around the clock availability,
and consistency in terminology.
Most commercial MT systems are Transfer-based MT systems.
This type of MT lets linguists build grammar rules for the
system. The system can then analyze the source language text,
map grammatical structures to the target language, and then
generate the translation.
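To make the analyze, map, and generate steps concrete, here is a deliberately tiny sketch. It is a toy, not how any commercial system is built: the three-word lexicon, the single reordering rule, and every function name are assumptions made for illustration, and gender agreement is hard-coded rather than computed.

```python
# Toy lexicon: English word -> (part of speech, French translation).
# "la" is hard-coded because "voiture" is feminine; a real system
# would compute agreement instead.
LEXICON = {
    "the": ("DET", "la"),
    "red": ("ADJ", "rouge"),
    "car": ("NOUN", "voiture"),
}


def analyze(sentence):
    """Analysis: tag each source word with its part of speech."""
    return [(word, LEXICON[word][0]) for word in sentence.lower().split()]


def transfer(tagged):
    """Transfer: apply one structural rule, ADJ NOUN -> NOUN ADJ."""
    result = list(tagged)
    i = 0
    while i < len(result) - 1:
        if result[i][1] == "ADJ" and result[i + 1][1] == "NOUN":
            result[i], result[i + 1] = result[i + 1], result[i]
            i += 2  # skip past the pair that was just reordered
        else:
            i += 1
    return result


def generate(tagged):
    """Generation: emit the target-language word for each source word."""
    return " ".join(LEXICON[word][1] for word, _ in tagged)


def translate(sentence):
    return generate(transfer(analyze(sentence)))


print(translate("the red car"))  # la voiture rouge
```

Every rule and lexicon entry here had to be written by hand, which is exactly the effort a linguist puts into a Transfer-based system on a much larger scale.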
However, Transfer-based systems are time-consuming and expensive
to develop. When the rules have not yet been developed, poor
analysis of sentences will result. Also, this approach can
take up to two years to develop since it is knowledge-intensive.
Another type of MT system is Data-driven MT. Only a few
commercial MT systems use this method. This method uses statistical
methods to calculate which parts of the source and target
languages match by gathering large numbers of example translations.
The dictionary and translation correspondences are built automatically,
and the matched text can range from single words to entire sentences.
This method may only take a few weeks to develop, but the
output is generally of lesser quality.
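As a rough illustration of the data-driven idea, the sketch below builds a small word dictionary from three example translations by counting which words appear together. It uses the Dice coefficient as a simple association score; that choice, the three-sentence corpus, and the function name are my assumptions, and real statistical systems use far larger corpora and more elaborate models.

```python
from collections import Counter
from itertools import product

# A tiny, made-up set of example translations (English, French).
PAIRS = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
]


def learn_dictionary(pairs):
    """Pick, for each source word, the target word it co-occurs with most."""
    src_count, tgt_count, both = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        # Count every source/target word pair seen in the same example.
        both.update(product(src_words, tgt_words))

    def dice(s, t):
        # Dice coefficient: 1.0 when two words always appear together.
        return 2 * both[(s, t)] / (src_count[s] + tgt_count[t])

    return {s: max(tgt_count, key=lambda t: dice(s, t)) for s in src_count}


for english, french in sorted(learn_dictionary(PAIRS).items()):
    print(english, "->", french)
```

No one wrote a rule or a dictionary entry by hand here, which is why this kind of system is quick to build. The quality depends entirely on how many good example translations you can feed it.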
It is also important to realize that MT systems cannot handle
every language combination. Generally, MT systems can translate
common language combinations such as French to German or English
to French. But rarer language combinations such as Japanese
to Swahili have not been developed.
Have you decided to buy a Machine Translation system, but
can't produce good translations from your new purchase?
MT requires a controlled authoring writing style. Here are
a few points on using MT efficiently.
- The most important rule for MT writing is: limit sentence
length. Sentences longer than 25 words often become ambiguous
and too complex for MT to correctly translate. Keeping sentences
to a minimum word length will improve the quality of the
output.
- Avoid metaphors, jokes, slang, puns, idiomatic expressions
and regional or national expressions. Since these are often
translated literally, they tend to lose their meaning, creating
an unintelligible translation. The literal translation of
"break a leg," for example, will not make sense
to the target reader.
Instead of: "You say that your sales will increase
by 10 times by the end of this year? Don't count your
chickens before they hatch."
Use: "You say that your sales will increase
by 10 times by the end of the year? Do not be too confident.
Wait until you get the final results."
Instead of: "Don't get me wrong; I love
sports, but I hate basketball."
Use: "Do not misunderstand me; I love sports,
but I hate basketball."
- Avoid abbreviations, acronyms, contractions, and common
Latin terms (etc., i.e., e.g.) as these do not always have
equivalents in different languages. Spell out the entire
word instead. MT systems do not always recognize
abbreviations and will leave them untranslated.
Instead of: Sr, Jr, FDA, TV, etc.,
Use: Senior, Junior, Food and Drug Administration
(FDA), Television, et cetera
- Keep pronouns to a minimum. The meaning of pronouns can
be lost after translation because different languages use
different word orders and gender-specific languages may
use different genders for certain objects. For example,
in French "il" could mean "he" or "it,"
so your subject may be unclear to a French reader. Replace
pronouns with nouns wherever possible.
Instead of: He is interesting. It is interesting.
When translated into French, this becomes: Il est intéressant.
Il est intéressant.
Use: Marc is interesting. The book is interesting.
This avoids ambiguity when translated, becoming: Marc est
intéressant. Le livre est intéressant.
- Use simple, direct sentences with basic grammatical construction. Ensure
that the sentence structure is grammatically correct and
do not omit words.
Instead of: Make sure you use grammatically correct
sentence structure.
Use: Make sure that you use grammatically correct
sentence structure.
- Avoid ambiguity. To produce a clear translation, reduce
the number of words and sentences with multiple meanings.
Words: The word "right" can mean "correct"
or "right" in terms of direction (right or left).
Sentences: The sentence "They fed her dog biscuits"
can be understood as "she was fed dog biscuits by them"
or "her dog was fed biscuits by them."
Instead of: "They fed her dog biscuits"
to mean "her dog was fed biscuits by them"
Use: "They fed biscuits to her dog."
Instead of: "They fed her dog biscuits"
to mean "she was fed dog biscuits by them"
Use: "They fed her some dog biscuits."
- Avoid compound verbs as they are often mistranslated.
Use a thesaurus to simplify uncommon usages.
- Use the International Standard Date Format (ISO 8601)
for writing dates. Date order varies
from country to country, but the standard numerical year-month-day
(YYYY-MM-DD) format will eliminate problems
arising from translating dates.
- Use the infinitive form of the verb rather than present
participles because present participles do not always have
equivalents in all languages.
Instead of: Click here for selecting the icons and
viewing the images.
Use: Click here to select the icons and to view the
images.
- Include a list for the translator of all words that should
remain in the source language. These can be anything from
proper names and titles to product or company names.
- After completing the source document, run a draft through
the machine translation and back into the source language
to see where problems may be occurring.
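The date point in the list above is easy to demonstrate. The sketch below uses Python's standard `datetime` module to show how one numeric date reads differently under two national conventions, and how the year-month-day form (standardized as ISO 8601) removes the doubt. The sample date and the helper name `to_iso` are arbitrary choices for illustration.

```python
from datetime import date, datetime


def to_iso(year, month, day):
    """Write a date in the unambiguous year-month-day form."""
    return date(year, month, day).isoformat()


ambiguous = "03/04/2005"

# The same string read with month-first and day-first conventions.
month_first = datetime.strptime(ambiguous, "%m/%d/%Y").date()
day_first = datetime.strptime(ambiguous, "%d/%m/%Y").date()

print(month_first.isoformat())  # 2005-03-04
print(day_first.isoformat())    # 2005-04-03
print(to_iso(2005, 3, 4))       # 2005-03-04
```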
Following the above points will prevent many common translation
problems from occurring.
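Several of these points can be checked automatically before a document goes to the MT system. What follows is a minimal sketch of such a check, not a finished tool: the function names, the regular expressions, and the naive sentence splitter are all my assumptions. It covers only three of the points above (the 25-word limit, contractions, and common Latin abbreviations), and it will also flag possessives such as "Marc's".

```python
import re

MAX_WORDS = 25  # sentence-length limit suggested in the list above

# Words ending in an apostrophe plus a common contraction suffix.
CONTRACTION = re.compile(r"\b\w+['’](?:t|s|re|ve|ll|d|m)\b", re.IGNORECASE)
# The Latin abbreviations named in the list: etc., i.e., e.g.
LATIN = re.compile(r"\b(?:etc|i\.e|e\.g)\.", re.IGNORECASE)


def split_sentences(text):
    """Naive split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def check(text):
    """Return (sentence, problem) pairs for sentences that break a rule."""
    problems = []
    for sentence in split_sentences(text):
        if len(sentence.split()) > MAX_WORDS:
            problems.append((sentence, "longer than 25 words"))
        if CONTRACTION.search(sentence):
            problems.append((sentence, "contains a contraction"))
        if LATIN.search(sentence):
            problems.append((sentence, "contains a Latin abbreviation"))
    return problems


sample = "Don't count your chickens. Bring pens, paper, etc. Do not misunderstand me."
for sentence, problem in check(sample):
    print(f"{problem}: {sentence}")
```

A check like this catches only the mechanical problems. Metaphors, ambiguous sentences, and unclear pronouns still need a human reviewer.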
Machine Translation, though useful in certain cases, is
still not, and may never be, the one-size-fits-all solution
for translation needs. Any translation used for commercial
or professional purposes must be at the very least checked
and double-checked by human translators, if not translated
by human translators altogether. For those other cases where
the benefits of using MT far outweigh the drawbacks, MT
may be that key that unlocks the mystery of languages. And
so, as the story goes, with a little help from the Fish, architects
all across the globe were able to read and understand the
secrets of the tower and climb to the top.