Long ago the world had one language and few words. One day,
a group of architects decided to write a manual containing
sensitive information on the structures of a tower they were
building in their city. The tower was to reach the sky and
would ultimately determine their greatness. Their pride and
confidence took over and they soon ignored their boss. As
punishment, their boss scattered the architects across the
entire earth and made them all speak different languages, so
that no architect could understand another. This created
much confusion, and so the city was named Babel. Many years
went by, but still confusion followed and since the manual
was only written in one language, no one could unlock the
secrets of the tower, until one day, the great Babel Fish
was born.
What is Babel Fish and why is it so great? Babel Fish belongs
to a larger category of translation called Machine Translation.
Machine Translation will give you a rough translation of that
German document that’s been sitting on your desk baffling
you, in less than one minute. How’s that for great?
As amazing as that sounds, not everything is perfect, and
Machine Translation does have its drawbacks. So how do you
know if Machine Translation is right for you? Researching
MT software and reading feedback from actual users on various
websites will help you get the full picture of
MT. For starters, I’ve outlined the major points below.
For those of you who are new to the term, Machine Translation
(MT) is the automatic translation of text from one language
(source language) into another language (target language)
without human intervention.
In general, MT use is grouped into two categories. Figuring
out which of these two categories best suits your needs is
a first step in determining if MT is right for you:
1) MT enabled applications: Also referred to as
Unassisted MT, this is the automatic translation of text with
no human post-editing. This can produce translation that
is unpolished, but it is extremely useful for material that
would be impossible or inconvenient to translate by hand
because of high volume, lengthy documents, the need for
immediate turn-around, or the expense of human translators.
2) MT enhanced applications: Also referred to
as Assisted MT, this is the automatic translation of text with
the intent of using a human translator for post-editing. Used
in the form of Computer-Aided Translation, this is useful
for creating a base translation for proofreaders, which
allows them to spend less time on translating the document
and more time on fine-tuning it.
When is MT useful?
- Controlled environment
MT works well for translations where source documents are
controlled, such as technical documents or data. Controlled
authoring will avoid ambiguity. Clear and concise source
text will produce clear and concise machine translation.
Documents intended for machine translation
should be written using controlled English authoring. Please
see Basic controlled authoring methods: Getting Ready for
Machine Translation.
Weather reports and stock market data use controlled
English authoring. “The classic example of MT that
works is the Météo system, developed in
Montreal, which has been translating Canada's weather
bulletins between English and French on a daily basis
since 1977. In the world of Météo discourse,
‘front’ always means a weather system.”
(Steve Silberman, “Talking to Strangers,”
Wired, May 2000)
- Large documents with repetitions
Large volumes of documents, particularly those with repetition,
are ideal for MT use. You would benefit from using MT for
documents that have a degree of consistency throughout,
such as technical documents. Some MT systems include
terminology dictionaries that can be tailored to fit the
subject material and updated and modified as needed. Highly
repetitive documents that continually have to be
updated are also boring and monotonous for human translators
to work on. “The translation of forecasts was so boring
that before Météo took over, the Canadian
government had a hard time keeping translators on the job
for more than a couple of months.” (Steve Silberman,
“Talking to Strangers,” Wired, May 2000)
- When human translation is impossible
Extremely large volumes of material and impractical
turn-around times, where translations must be up to date
and relatively ‘automatic’, make human translation
impossible. As one member of webmasterworld.com wrote: “I
run a site full time for a company and we use the machine
translation service - I describe it on the site as a 'vague
translation' - 90% of our content is dynamically generated
each week from a database of about 12000 new products each
week so it would be a huge translation job where we'd need
full time staff on doing it. The machine translation works
quite well for us and gets customers who have no clue of
English. We also use the machine translation type text in
box for a translation for all email contact with them -
even though the translation is vague!”
What is MT used for?
- Gisting
Using MT to get a rough idea of the source text content
is the most common use of MT in terms of volume
and is referred to as ‘gisting’. Individuals
or corporations who must obtain information from documents
in a foreign language, without the need to have the document
officially translated, use MT to get the ‘gist’
of what the source content says. This allows them to get
a general idea of the content of the document and to determine
whether an actual human translation is necessary.
- Real-Time Translation
Depending on the language, a translator can translate approximately
250 words per hour. Suppose you outsource your weather report,
which forecasts sun, to a French translator. Two hours later
you receive the translation, only now it’s raining, so you
outsource again. Let’s face it: data is constantly
changing. MT provides automatic translation of
real-time data such as weather reports, data feeds, and
stock prices. For real-time information, delays
are not acceptable, and the cost of human translation would
again be enormous due to the high volume of data.
- Communication
Think about the dozens of emails you receive and send in
one day. Now think about a US company that receives hundreds
of emails weekly from an international client in Italy
who does not speak English. This is only one situation
in which human translation would be out of
the question.
Email, instant messaging, and chat all require extremely
fast turnaround. Translation needs to be immediate and
needs to be available 24/7. Since translators cannot produce
immediate translation, are not free, and live in different
time zones, it is impossible to have these forms of communication
translated by human translators. MT is available 24 hours
a day regardless of time zone and can produce
automatic high-volume translations, which is necessary
for real-time communication. MT for communication purposes
also increases the privacy of confidential information by
eliminating third or fourth parties such as translators
and editors. This makes it ideal for companies working with
international vendors who receive emails and data in foreign
languages.
- Assimilation
Assimilation refers to translation in which an individual
or organization wants to convert material that is in a variety
of languages into their own language. Translating foreign
text into your language is necessary for intelligence gathering.
MT allows you to identify what information is relevant in
documents written in a foreign language with little to no
delay. MT can automatically translate large volumes of material
that would often be impossible, time-consuming, or expensive
for human translators to handle.
- Dissemination
Dissemination is the need to communicate your own material,
written in one language, to the world in a variety of other
languages. The traditional process of localization
is a prime example. MT for this purpose is used as human-assisted
MT. It can speed up the localization process by
providing a draft translation for human translators to edit
instead of starting from scratch. Since MT automatically
maintains consistency of terminology, it also saves translators
the time needed to research and check terminology.
Right now you’re probably wondering why you should
even bother to use human translators. After all, MT easily
replaces them, right?
No. MT will not replace human translators. As I mentioned
before, MT works well for text such as technical documents
because they are written using controlled authoring,
and the MT dictionary can be tailored to their highly repetitive
and specific terminology. MT does not work well for literary
works, however. The machine translation of ‘Wherefore art
thou Romeo’, for instance, would produce an entirely
unintelligible translation of Romeo and Juliet, leaving Shakespeare
that much more complicated to understand. It is difficult
for MT to translate these documents properly, since literary
texts are not structured and often follow a free style of writing.
Human translators, on the other hand, have the ability to grasp
the message of the text and translate the material properly
even if it is conveyed very poorly on paper.
This is not to say that human translators always create perfect
translations; after all, the author knows best. However qualified
the translator or however developed the MT system, no one
will know the source text better than the author. Still, using
highly qualified, professional translators will produce better
translation than MT software. MT systems have a more
limited knowledge of grammar and vocabulary than a human translator
does, and MT dictionaries are limited to what developers were
able to implement, which is generally much less than what is
necessary. It is important to determine what your needs are
and what you plan to accomplish with an MT system.
What are the costs of MT?
When you purchase your MT system, the initial costs will be the
licence, customization, and annual and maintenance fees. Initially,
the cost is large, but if you plan to use MT regularly
for your repetitious, large-volume documents, then the cost
will come down over the long term.
For five languages, the initial cost and maintenance could
be close to $154,000, but let’s look
at the long-term cost. Let’s say that in one year you
translate 1,000,000 words. By the second year of
using your MT system, the total cost of translating 1,000,000
words into 5 languages using MT software would be $116,450 ($100,000
for revision, $7,000 for maintenance, $9,450 for the annual
fee), and the work would take about 250 days to complete. The cost to
have the same 1,000,000 words translated by human translators
into 5 languages at a rate of $0.10 per word would be $500,000,
and the work would take about 400 days to complete.
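The arithmetic behind these two totals can be checked with a
short Python sketch. All figures come from the paragraph above;
the variable names are only illustrative, and amounts are kept
as whole dollars or cents to avoid rounding. Note that the
initial $154,000 is not part of the second-year figure.

```python
# Figures quoted in the article; variable names are illustrative only.
WORDS_PER_YEAR = 1_000_000
LANGUAGES = 5

# Second-year MT costs, in dollars, for all five languages combined.
mt_revision = 100_000     # human revision of the MT output
mt_maintenance = 7_000
mt_annual_fee = 9_450
mt_total = mt_revision + mt_maintenance + mt_annual_fee

# Human translation is billed per word and per target language.
HUMAN_RATE_CENTS_PER_WORD = 10  # $0.10 per word
human_total = WORDS_PER_YEAR * LANGUAGES * HUMAN_RATE_CENTS_PER_WORD // 100

difference = human_total - mt_total

print(f"MT, second year:   ${mt_total:,}")
print(f"Human translation: ${human_total:,}")
print(f"Difference:        ${difference:,}")
```

On these figures the yearly difference is $383,550 in favour
of MT, before the initial purchase cost is taken into account.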
MT runs at a fixed cost independent of volume. This means
you can end up saving money in the long run through reduced
translation cost, reduced delivery time, around-the-clock
availability, consistency in terminology, and improved documentation.
Most commercial MT systems are Transfer-Based MT systems.
This type of MT lets linguists build grammar rules for the
system. The system can then analyze the source language text,
map grammatical structures to the target language and then
generate the translation.
The drawbacks of this method are that the system is time-consuming
and expensive to develop. Where rules have
not yet been developed, sentences will be analyzed poorly.
Also, because the approach is rule-based and knowledge-intensive,
a system can take up to two years to develop.
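The three steps of a transfer-based system (analyze the source
text, map its structure to the target language, generate the
translation) can be illustrated with a toy English-to-French
sketch in Python. This is not how any commercial product is
implemented: the five-word lexicon, the single reordering rule,
and all function names are hypothetical and exist only to show
the shape of the pipeline.

```python
# Toy English-to-French transfer pipeline. The lexicon and the single
# reordering rule are hypothetical; a real system needs thousands of rules.
LEXICON = {
    "the": ("DET", "le"),
    "red": ("ADJ", "rouge"),
    "black": ("ADJ", "noir"),
    "book": ("NOUN", "livre"),
    "cat": ("NOUN", "chat"),
}

def analyze(sentence):
    """Analysis: tag each source word with its part of speech."""
    return [(word, LEXICON[word][0]) for word in sentence.lower().split()]

def transfer(tagged):
    """Transfer: map English ADJ NOUN order to French NOUN ADJ order."""
    result = list(tagged)
    i = 0
    while i < len(result) - 1:
        if result[i][1] == "ADJ" and result[i + 1][1] == "NOUN":
            result[i], result[i + 1] = result[i + 1], result[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return result

def generate(tagged):
    """Generation: replace each source word with its target word."""
    return " ".join(LEXICON[word][1] for word, _ in tagged)

def translate(sentence):
    return generate(transfer(analyze(sentence)))

print(translate("the red book"))   # le livre rouge
print(translate("the black cat"))  # le chat noir
```

A word that is missing from the lexicon raises a KeyError here,
which is a crude picture of what happens when the rules for a
construction have not yet been written.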
Another type of MT system is Data-Driven MT, which only a few
commercial MT systems use. This method gathers large
numbers of example translations and uses statistical
methods to calculate which parts of the source language go
with which parts of the target language. The dictionary and
translation correspondences are built automatically, and the
matched text can range from single words to entire sentences. This
method may take only a few weeks to develop, but it is still
time-consuming, and the output is generally not as good.
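As a rough illustration of how correspondences can be built
automatically from example translations, the Python sketch
below counts which words appear together in a tiny, made-up
set of sentence pairs and scores each word pair with the Dice
coefficient. The corpus, the function name, and the choice of
the Dice coefficient are assumptions made for this example;
real statistical systems use far larger corpora and more
elaborate models.

```python
from collections import Counter
from itertools import product

# Hypothetical example translations (English, French).
PAIRS = [
    ("the book", "le livre"),
    ("the cat", "le chat"),
    ("red book", "livre rouge"),
    ("black cat", "chat noir"),
]

def learn_dictionary(pairs):
    """Pick, for each source word, the target word it co-occurs with most."""
    src_count, tgt_count, both = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = set(src.split()), set(tgt.split())
        src_count.update(src_words)
        tgt_count.update(tgt_words)
        # Count every source/target word pair seen in the same example.
        both.update(product(src_words, tgt_words))

    best = {}
    for (s, t), n in both.items():
        # Dice coefficient: 1.0 when the two words always appear together.
        score = 2 * n / (src_count[s] + tgt_count[t])
        if s not in best or score > best[s][1]:
            best[s] = (t, score)
    return {s: t for s, (t, _) in best.items()}

for source, target in sorted(learn_dictionary(PAIRS).items()):
    print(source, "->", target)
```

With only four examples the pairs come out cleanly. With real
text the counts are much noisier, which is one reason the
output of such systems can be uneven.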
It is also important to realize that MT systems cannot handle
every language combination. Generally, MT systems translate
specific language combinations such as French to German or
English to French. Rarer language combinations, such
as Japanese to Swahili, have not been developed.
Basic controlled authoring methods: Getting Ready for
Machine Translation.
So you’ve decided to buy an MT system, but
you can’t seem to produce good translations from your
new purchase? MT requires a controlled authoring writing style.
Here are a few points on how to use MT efficiently.
- One of the most important rules for MT writing is to
limit sentence length. Sentences longer than 25 words often
become ambiguous and too complex for MT to translate correctly.
Keeping sentences short will improve
the quality of the output.
- Avoid metaphors, jokes, slang, puns, idiomatic expressions
and regional or national expressions. Since these are often
translated literally, they tend to lose their meaning, creating
an unintelligible translation. To the target speaker, the
literal translation of ‘break a leg’, for example,
will not make sense.
Examples:
Instead of: “You say that your sales will
increase by 10 times by the end of this year? Don’t
count your chickens before they hatch.”
Use: “You say that your sales will increase
by 10 times by the end of the year? Do not be too confident;
wait until you get the final results.”
Instead of: “Don’t get me wrong; I love sports,
but I hate basketball.”
Use: “Do not misunderstand me; I love sports, but
I hate basketball.”
- Try not to use abbreviations, acronyms, contractions,
and common Latin terms (etc., i.e., e.g.), as these do not
always have equivalents in other languages. Spell out
the entire word instead. MT systems do not always
recognize abbreviations and will leave them untranslated.
Examples:
Instead of: Sr, Jr, FDA, TV, etc.,
Use: Senior, Junior, Food and Drug Administration (FDA),
Television, et cetera
- Try to keep pronouns to a minimum. The meaning of pronouns
can be lost after translation since different languages
use different word orders and gender specific languages
may use different genders for certain objects. For example,
in French ‘il’ could mean ‘he’ or
‘it’, so it may be unclear to a French reader
what you are referring to. Try replacing pronouns with nouns
instead.
Example:
Instead of: He is interesting. It is interesting.
Which when translated into French becomes: Il est intéressant.
Il est intéressant.
Use: Marc is interesting. The book is interesting.
Which avoids ambiguity when translated, becoming: Marc
est intéressant. Le livre est intéressant.
- Use plain English. Use sentence structure that is grammatically
correct, and do not leave out words that English allows you to
omit, such as ‘that’. Use simple, direct sentences with basic
grammatical construction.
Example:
Instead of: Make sure you use grammatically correct sentence
structure.
Use: Make sure that you use grammatically correct sentence
structure.
- Avoid ambiguity. To produce a clear translation, reduce
the number of words and sentences that can have multiple
meanings.
Examples:
Words: The word ‘right’ can mean ‘correct’
or ‘right’ in terms of direction (right or left).
Sentences: The sentence ‘They fed her dog biscuits’
can be understood as ‘she was fed dog biscuits by
them’ or ‘her dog was fed biscuits by them’.
Instead of: ‘They fed her dog biscuits’ to
mean ‘her dog was fed biscuits by them’
Use: They fed biscuits to her dog.
Instead of: ‘They fed her dog biscuits’ to
mean ‘she was fed dog biscuits by them’
Use: They fed her some dog biscuits.
- Avoid compound verbs as they are often mistranslated.
Use a thesaurus to simplify uncommon usages.
- Use the International Standard Date Format (ISO 8601)
for writing dates. Since date order
varies from country to country, the standard numerical year-month-day
(YYYY-MM-DD) format will avoid the problems
that can arise from translating dates.
- Use the infinitive form of the verb rather than present
participles since present participles do not always have
equivalents in all languages.
Example:
Instead of: Click here for selecting the icons and viewing
the images.
Use: Click here to select the icons and to view the images.
- Include a list for the translator of all words that should
remain in English. These can be anything from proper names
and titles to product or company names.
- After completing the source document, run a draft through
the MT system and then translate the output back into the source
language to see where problems may be occurring.
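Several of the points above are mechanical enough to check
automatically before a document is sent for translation. The
Python sketch below flags sentences over 25 words, common
contractions, Latin abbreviations, and slash-separated dates.
The function name, the patterns, and the sample text are all
assumptions made for this example. The checks are rough
heuristics: the sentence splitter is naive, and possessives
such as "Canada's" are deliberately not treated as contractions.

```python
import re

MAX_WORDS = 25  # sentence-length limit from the first point in the list

# Each pattern is a rough heuristic, not a complete grammar check.
CHECKS = [
    ("contraction", re.compile(r"\b\w+(?:n[’']t|[’'](?:re|ve|ll|d|m))\b")),
    ("Latin abbreviation", re.compile(r"\b(?:etc|i\.e|e\.g)\.")),
    ("ambiguous date", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
]

def check(text):
    """Return a list of warnings for text that is about to be machine translated."""
    problems = []
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        word_count = len(sentence.split())
        if word_count > MAX_WORDS:
            problems.append(f"long sentence ({word_count} words)")
    for label, pattern in CHECKS:
        for match in pattern.finditer(text):
            problems.append(f"{label}: {match.group(0)}")
    return problems

sample = "Don't send the report on 03/04/2005. Use short sentences, e.g. this one."
for problem in check(sample):
    print(problem)
```

A checker like this only catches surface patterns. Metaphors,
ambiguous sentences, and unclear pronouns still need a human
reader.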
Following these basic points will solve many of the problems
before the document reaches the translation phase, which will
greatly reduce time and stress!