Translation word count – why do word counts vary from
agency to agency?
By Sandra Bologna, June 2005
Why is it that when you submit a request for translation
to a number of translation agencies, you don’t always
get the same word count from each agency? Do these agencies
just pick the numbers out of thin air, or is there logic behind
the varying word counts?
Since the cost and timeframe for translation are based on
the volume of your document, it’s important to obtain
the most accurate estimate possible. As there is no set system
or rule for which tools should be used to produce
word counts, you should at the very least be familiar with
what might be going on and why you can’t seem to get
two agencies to produce the same word count.
An agency may generate a word count with the Word Count tool in
MS Word, with a counting tool such as PractiCount and
Invoice, or with a translation tool such as Trados or Wordfast.
It’s hard to produce similar results when each of these
tools has its own way of counting. Whichever of these systems
an agency uses, a word count generated from Trados will not be
the same as a count generated from MS Word.
Microsoft Word
Microsoft Word assumes that everything between spaces is a
word. This means that it will count numbers and symbols. Translation
tools, on the other hand, will not include these characters
when generating a word count, since it is assumed that they
don’t actually require translation. There is, however,
some debate on whether numbers or symbols should be counted.
Since they don’t require translation, some think that
they shouldn’t be included, while others argue that
the translator still has to check and revise each number and
symbol, which would warrant including them in the count,
especially for documents containing a lot of data.
For example, ‘1 2 3 @# + 4’ would be considered
six words by Microsoft Word and zero words by Trados.
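The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration of the counting rules described above, not the actual algorithm of MS Word or Trados; the function names are made up for this example, and "skip tokens without letters" is an assumed stand-in for how a translation tool might exclude numbers and symbols.

```python
def count_whitespace_tokens(text: str) -> int:
    """Count every run of characters between spaces as one word."""
    return len(text.split())


def count_alphabetic_tokens(text: str) -> int:
    """Count only tokens that contain at least one letter.

    Assumption: numbers and symbols are treated as non-translatable
    and therefore left out of the count.
    """
    return sum(1 for token in text.split() if any(ch.isalpha() for ch in token))


sample = "1 2 3 @# + 4"
print(count_whitespace_tokens(sample))   # 6
print(count_alphabetic_tokens(sample))   # 0
```

For a data-heavy document, the gap between these two numbers gives a rough idea of how far two agencies' counts could drift apart.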
Also, MS Word does not include text from text boxes, auto-shapes,
headers, footers and comments.

For a document containing only a few text boxes with little
text, this wouldn’t be a great problem, but for documents
with many text boxes, it would have a serious effect on
the word count. We’ve received projects in the past
where, due to the nature of the project, the documents had
been written entirely in separate text boxes. Since these
were considerably large documents, not using an appropriate
method for the word count could have proven detrimental.
MS Word also does not include text in embedded objects (OLE,
Object Linking and Embedding), such as an Excel worksheet
embedded in a Word document, or diagrams with text. As with
text boxes, for a longer document with a lot of data in
many complex embedded Excel sheets, the word count
generated by MS Word would be significantly too low.

For HTML files, MS Word would not count the text of drop-down
options, for example in files that contain a form with predefined
options for a drop-down box. The HTML page title, button text,
and text in meta tags would also not be included in an MS
Word count.
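To get a feel for how much translatable text can sit in these places, the following Python sketch collects it with the standard library's `html.parser` module. It is a rough illustration, not how any of the tools in this article work: the class name `HiddenTextCollector` and the sample page are invented for the example, and it assumes well-formed markup in which every `<option>`, `<button>`, and `<title>` has an explicit closing tag.

```python
from html.parser import HTMLParser


class HiddenTextCollector(HTMLParser):
    """Collect text from page titles, drop-down options, buttons, and meta tags."""

    TEXT_TAGS = {"title", "option", "button"}
    META_NAMES = {"description", "keywords"}

    def __init__(self):
        super().__init__()
        self._open_tags = []   # tags of interest that are currently open
        self.found = []        # collected text fragments, in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.TEXT_TAGS:
            self._open_tags.append(tag)
        elif tag == "meta" and attrs.get("name") in self.META_NAMES:
            if attrs.get("content"):
                self.found.append(attrs["content"])
        elif tag == "input" and attrs.get("type") in {"submit", "button"}:
            # Button labels can also live in the value attribute.
            if attrs.get("value"):
                self.found.append(attrs["value"])

    def handle_endtag(self, tag):
        if self._open_tags and self._open_tags[-1] == tag:
            self._open_tags.pop()

    def handle_data(self, data):
        # Keep text only while inside a title, option, or button.
        if self._open_tags and data.strip():
            self.found.append(data.strip())


page = """
<html><head><title>Order form</title>
<meta name="description" content="Order our products online"></head>
<body><p>Visible paragraph text.</p>
<select><option>Small size</option><option>Large size</option></select>
<button>Send order</button></body></html>
"""

collector = HiddenTextCollector()
collector.feed(page)
hidden_words = sum(len(fragment.split()) for fragment in collector.found)
print(collector.found)
print(hidden_words)  # 12
```

In this small sample, the twelve words found outside the visible paragraph outnumber the three words inside it, which is the kind of gap that leads to an underestimate.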
PractiCount and Invoice
PractiCount and Invoice is a word count analysis tool that
allows you to produce a line and character count in single
files, folders, and subfolders. PractiCount and Invoice counts
the text in text boxes, headers, footers, footnotes, endnotes,
comments, and OLE objects. It also supports various file formats,
such as DOC, RTF, XLS, CSV, PPT, PPS, WPD, PDF, HTM, HTML,
SHTML, XML, SGML, ASP, and PHP. Using a word count analysis
tool such as PractiCount and Invoice will automatically eliminate
some of the problems that could occur with MS Word’s
Word Count tool.
Translation Tools
Wordfast is a translation tool that counts
footnotes, endnotes, headers, and footers. Trados,
another translation tool, goes one step further and also counts
non-grouped text fields, something Wordfast does not do. These
tools don’t count text in grouped text fields or embedded
objects.
Then there is the issue of tags. MS Word counts each tag
as a word, which would produce an inaccurate word count, whereas
Trados ignores all tags. Wordfast counts each group of internal
tags as one word, which some say is accurate because the translator
has to work with the tag. A benefit of using a translation
tool to produce a word count is that a translation tool also
allows you to count the number of repetitions in a document,
which can result in a quote with a reduced rate for the
repeated text.
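The three ways of treating tags described above can be compared with a short Python sketch. It only mirrors the behaviour as this article describes it and is not the real implementation of MS Word, Trados, or Wordfast; the policy names `"each"`, `"ignore"`, and `"group"` are hypothetical labels, and a "group" is assumed here to mean a run of adjacent tags.

```python
import re

TAG = re.compile(r"<[^>]+>")                # a single tag
TAG_RUN = re.compile(r"(?:<[^>]+>\s*)+")    # one or more adjacent tags


def count_words(text: str, tag_policy: str) -> int:
    """Count words in tagged text under one of three tag policies.

    "each"   : every tag counts as one word
    "ignore" : tags are not counted at all
    "group"  : each run of adjacent tags counts as one word
    """
    # Replace tags with spaces so the surrounding words stay separate.
    words = len(TAG.sub(" ", text).split())
    if tag_policy == "ignore":
        return words
    if tag_policy == "each":
        return words + len(TAG.findall(text))
    if tag_policy == "group":
        return words + len(TAG_RUN.findall(text))
    raise ValueError(f"unknown tag policy: {tag_policy}")


segment = "Click <b>here</b> to <i><u>continue</u></i> now"
for policy in ("each", "ignore", "group"):
    print(policy, count_words(segment, policy))
# each 11
# ignore 5
# group 9
```

The same five-word sentence yields three different totals, so heavily tagged files are one more place where quotes can diverge.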
On average, MS Word produces a lower word count than either
Trados or Wordfast. Wordfast, like PractiCount and Invoice,
produces higher word counts.
Agencies will generate your word count depending on the document.
If it is a simple Word file with only straight, non-repetitive
text, a count using MS Word’s word counting tool works
fine. For other documents that are more complex and require
further analysis, either a word count analysis tool or
a translation tool should be used. Still, with such varying
methods of producing a word count, it’s not unusual for
quotes received from different agencies to contain different
word counts.