Translation word count – why do word counts vary from agency to agency?

By Sandra Bologna

Why is it that when you submit a request for translation to a number of translation agencies, you don’t always get the same word count from each agency? Do these agencies just pick the numbers out of thin air, or is there logic behind the varying word counts?

Since the cost and timeframe for translation are based on the volume of your document, it’s important to obtain the most accurate estimate possible. As there is currently no set rule for which tools should be used to produce word counts, you should at the very least be familiar with what might be going on and why two agencies rarely seem to produce the same word count.

Agencies may generate a word count with the Word Count tool in MS Word, with a counting tool such as PractiCount and Invoice, or with a translation tool such as Trados or Wordfast. It’s hard to produce similar results when each of these tools has its own way of counting: a word count generated from Trados will not be the same as a count generated from MS Word.

Microsoft Word
Microsoft Word assumes that everything between spaces is a word. This means that it counts numbers and symbols. Translation tools, on the other hand, do not include these characters when generating a word count, since it is assumed that they don’t actually require translation. There is, however, some debate about whether numbers and symbols should be counted. Since they don’t require translation, some think that they shouldn’t be included. Others argue that the translator still has to check and revise each number and symbol, which would justify including them in the count, especially for documents containing a lot of data.

For example, ‘1 2 3 @# + 4’ would be considered six words by Microsoft Word and zero words by Trados.
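The difference between the two counting rules can be sketched in a few lines of Python. This is only an illustration of the rules described above, not the actual logic of MS Word or Trados. The function names are hypothetical, and the rule "a token counts only if it contains at least one letter" is an assumption standing in for whatever filter a translation tool really applies.

```python
def count_space_delimited(text: str) -> int:
    """Count every run of characters between spaces as a word."""
    return len(text.split())


def count_translatable(text: str) -> int:
    """Count only tokens that contain at least one letter.

    Assumption: tokens made up purely of digits or symbols are
    treated as needing no translation and are skipped.
    """
    return sum(
        1 for token in text.split() if any(ch.isalpha() for ch in token)
    )


sample = "1 2 3 @# + 4"
print(count_space_delimited(sample))  # 6
print(count_translatable(sample))     # 0
```

On ordinary prose the two functions agree closely; the gap opens up in data-heavy documents, where most tokens are numbers.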

Also, MS Word does not include text from text boxes, auto-shapes, headers, footers and comments.
For a document containing only a few text boxes with little text, this wouldn’t be a great problem, but for documents with many text boxes, it would have a serious effect on the word count. We’ve received projects in the past where, due to the nature of the project, the documents had been written entirely in separate text boxes. Since these were considerably large documents, not using an appropriate method for the word count could have badly skewed the estimate.
MS Word also does not include text in embedded objects (also called OLE: Object Linking and Embedding), such as an Excel worksheet embedded in a Word document, or diagrams with text. As with text boxes, for a longer document with a lot of data and many complex embedded Excel sheets, the word count generated by MS Word would be significantly too low.

For HTML files, MS Word would not count the text of drop-down options, which matters most in files that contain a form with predefined options for a drop-down box. The HTML page title, button text, and text in meta tags would also not be included in an MS Word count.
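As a rough illustration of where this text lives in an HTML file, the sketch below uses Python's standard `html.parser` module to gather the page title, drop-down options, button text, and meta tag content before counting. The class name, the function name, and the choice of which attributes to read are all assumptions made for this example; it is not how any of the tools mentioned in this article work internally.

```python
from html.parser import HTMLParser


class TranslatableTextCollector(HTMLParser):
    """Hypothetical helper: collect text a translator would need to handle."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        # Meta description/keywords keep their text in the content attribute.
        if tag == "meta" and attrs.get("name") in ("description", "keywords"):
            self.chunks.append(attrs.get("content") or "")
        # <input type="submit" value="Send"> keeps button text in value.
        if tag == "input" and attrs.get("type") in ("submit", "button", "reset"):
            self.chunks.append(attrs.get("value") or "")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Title, option, button and paragraph text all arrive here.
        if not self._skip_depth:
            self.chunks.append(data)


def count_html_words(html: str) -> int:
    collector = TranslatableTextCollector()
    collector.feed(html)
    collector.close()
    return sum(len(chunk.split()) for chunk in collector.chunks)


page = """<html><head><title>Order form</title>
<meta name="description" content="Request a free quote">
</head><body>
<p>Choose a language</p>
<select><option>French</option><option>Simplified Chinese</option></select>
<button>Send request</button>
</body></html>"""
print(count_html_words(page))  # 14
```

Only 3 of the 14 words in this sample sit in the visible paragraph; the rest are in the title, meta tag, drop-down options, and button.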

PractiCount and Invoice
PractiCount and Invoice is a word count analysis tool that allows you to produce a line and character count for single files, folders, and subfolders. PractiCount and Invoice counts the text in text boxes, headers, footers, footnotes, endnotes, comments, and OLE objects. It also supports various file formats, such as DOC, RTF, XLS, CSV, PPT, PPS, WPD, PDF, HTM, HTML, SHTML, XML, SGML, ASP, and PHP. Using a word count analysis tool such as PractiCount and Invoice automatically eliminates some of the problems that can occur with MS Word’s Word Count tool.

Translation Tools
Wordfast is a translation tool that counts footnotes, endnotes, headers, and footers. Trados, another translation tool, goes one step further and also counts non-grouped text fields, something Wordfast does not do. These tools don’t count text in grouped text fields or embedded objects.

Then there is the issue of tags. MS Word counts each tag as a word, which produces an inaccurate word count, whereas Trados ignores all tags. Wordfast counts each group of internal tags as one word, which some say is accurate because the translator has to work with the tag. A benefit of using a translation tool to produce a word count is that it also lets you count the number of repetitions in a document, which can result in a quote with a reduced rate.
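The three tag policies described above can be compared with a short Python sketch. It assumes that a tag is anything of the form `<...>` and that a "group" is a run of tags with nothing between them; both are simplifications for illustration, and the function names are hypothetical rather than taken from any of the tools.

```python
import re

# Assumption: a "tag" is anything of the form <...>, as in HTML or XML.
TAG = re.compile(r"<[^<>]+>")
# Assumption: a "group" is a run of tags with nothing between them.
TAG_GROUP = re.compile(r"(?:<[^<>]+>)+")


def plain_words(text: str) -> int:
    """Tags ignored: count the words left once every tag becomes a space."""
    return len(TAG.sub(" ", text).split())


def count_each_tag(text: str) -> int:
    """Every tag counts as one word, on top of the ordinary words."""
    return plain_words(text) + len(TAG.findall(text))


def count_tag_groups(text: str) -> int:
    """Every run of adjacent tags counts as one word."""
    return plain_words(text) + len(TAG_GROUP.findall(text))


segment = "Click <b><i>here</i></b> to continue."
print(count_each_tag(segment))    # 8
print(plain_words(segment))       # 4
print(count_tag_groups(segment))  # 6
```

The same four-word sentence yields three different counts depending only on how its four tags are treated.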

On average, MS Word produces a lower word count than Trados or Wordfast. Wordfast and PractiCount and Invoice tend to produce higher word counts.

Agencies will choose how to generate your word count depending on the document. If it is a simple Word file with only straight, non-repetitive text, a count using MS Word’s word counting tool works fine. For more complex documents that require further analysis, either a word count analysis tool or a translation tool should be used. Still, with such varying methods of producing a word count, it’s not unusual for quotes from different agencies to contain different word counts.

Felicia Bratu

Felicia Bratu is the operations manager of wintranslation, in charge of quality delivery and client satisfaction. As a veteran who has worked in many roles at the company since 2003, Felicia oversees almost every aspect of the company operations from recruitment to project management to localization engineering. She recently received certification as a Localization Project Manager as well as Post-Editing Certification for Machine Translation. Felicia holds a BSc. in Industrial Robotics from the University of Craiova, Romania.