Richard Ishida

By Huiping Iler
Published on April 16, 2007 in Digital Web Magazine

Imagine you’ve hired an architect to help you plan your new home. As the walls go up, you realize that your architect has not planned for plumbing, electrical, heating, or cooling. This is a costly mistake. You immediately fire him and find one with a lot more experience.

In many ways, a world-savvy web designer is just like an experienced architect. S/he knows how to build an international website that can be easily translated into other languages without major reworking.

An English-centric web designer is like the architect who fails to plan ahead. He builds websites without considering a worldwide audience. Who pays for this ignorance? Your unsuspecting client—or you, if you’ve included it in your contract.

For example, text swell is one of the many things that can get in the way when you hire a web designer who doesn’t plan ahead. Text in some languages can run twenty to forty percent longer than its English equivalent, while in others it is much shorter. A web designer who does not plan for this can end up costing you more money than you bargained for. Not only will you have to pay him, but you will also have to pay the person who has to go in and fix all of the truncated sentences.

Reaching out to the global marketplace requires forethought. Finding a web designer who can see the big picture is one of the keys to tapping into the worldwide market.

Richard Ishida sees the big picture. He is a world-renowned expert on building world-ready websites. As the Internationalization (I18n) Activity Lead for the World Wide Web Consortium (W3C), his role is to help put the ‘world wide’ in the World Wide Web.

I interviewed Richard recently and asked him what he considers best practices for building international websites.

Digital Web: Can you tell us a little bit about your background?

RI: After living in France for a while, I took a practical degree course geared to translation and interpreting (in French and Spanish, with a Russian subsidiary). This was followed by a postgraduate course in Computer Speech and Computational Linguistics at Cambridge University in the UK. My thesis described a Lisp program I developed for translating Swiss-French weather reports into English. After that, I worked as a software engineer and technical writer, and in 1989 I joined the UK translation group of Xerox to work on their computer-based translation process.

DW: How would you describe internationalization to those who are unfamiliar with the subject?

RI: I’d start by explaining localization. Localization refers to the adaptation of a product, application, or document content to meet the language, cultural, and other requirements of a specific target market (a locale). This relates to more than just translation.

Internationalization, on the other hand, is an approach to the design and development of a product, application, or document content that enables easy localization for target audiences that vary in culture, region, or language. The W3C site has more detailed descriptions.

Let me see if I can give you an example. Suppose that at the W3C we decided to use PHP to add a line to our HTML Validator. Suppose that, if your document fails to validate, the validator tells you how many errors you had. You might see a message like this:

Failed validation. There were 268 validation errors in the file myFirst.html.

The number and the file name need to be slipped into this sentence structure at the time of the validation, and we could do this using PHP.

Now suppose we decided to make the validator available in German. We’d need a sentence in German like this:

Die Datei myFirst.html enthält 268 Gültigkeitsfehler.

Notice how the positions of the number and the file name have been reversed. I don’t want to frighten readers with the gory code details, but the fact is that, unless your PHP developer has properly internationalized their code, they are very likely to use an approach that prevents this reordering, making the translator’s life very complicated. (For more information about handling these composite messages, see these slides.)
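
The reordering problem described above can be sketched in a few lines. The example uses Python rather than PHP purely for illustration; the catalog layout, locale codes, and function names are assumptions made for this sketch, not the validator's actual code. Each locale stores a whole sentence with named placeholders, so a translator can place the number and the file name wherever the target language needs them.

```python
# Hypothetical message catalog: locale codes and structure are illustrative.
MESSAGES = {
    "en": "Failed validation. There were {count} validation errors in the file {filename}.",
    "de": "Die Datei {filename} enthält {count} Gültigkeitsfehler.",
}


def validation_message(locale: str, count: int, filename: str) -> str:
    """Fill a whole-sentence template; placeholder order belongs to the translator."""
    return MESSAGES[locale].format(count=count, filename=filename)


def concatenated_message(count: int, filename: str) -> str:
    """Anti-pattern: English word order is fixed in the code itself."""
    return (
        "Failed validation. There were "
        + str(count)
        + " validation errors in the file "
        + filename
        + "."
    )


if __name__ == "__main__":
    print(validation_message("en", 268, "myFirst.html"))
    print(validation_message("de", 268, "myFirst.html"))
```

With the concatenated version, the German sentence cannot be produced without changing the code, because the fragments before and after each value are fixed in English order. The sketch ignores plural forms, which a real message catalog would also need to handle.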

Here’s a very different example. Symbolism can be culture-specific. The check mark means correct or OK in many countries. In some countries, however, such as Japan, it can be used to mean that something is incorrect. Japanese localizers may need to convert check marks to circles (their symbol for correct) as part of the localization process.

If you have developed a marketing-type image containing check marks, have you considered how easy it will be for localizers to effect that change? If you want to make them happy, try using text for the check marks, or at least use separate layers in your master image file for the symbols and the background.

That’s what internationalization is about: anticipating the changes that need to take place in localization, and designing and developing in a way that minimizes fuss and cost for those critically tight schedules as you try to adapt your deliverable for additional markets or users.

DW: How did you become involved in internationalization?

RI: At Xerox, we usually had a struggle on our hands when it came to translating the user interfaces for the large Xerox printers and copiers. This was because the designers and developers of those products were, in the main, blissfully unaware of the ways in which their decisions could adversely affect the localization process.

For example, if you supply text for translation in random order, the translator may have to translate a word such as “activated” in isolation. In Spanish this could be translated in at least a dozen ways, depending on the context: activado, conectado, encendido, activada, etc. Unless context is provided up front, there’s no way a translator can figure out how to translate this term appropriately without experimenting, and that incurs huge cost and delays.
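
One common way to supply that context is to key each string by a context label as well as by the source text. The following is a minimal Python sketch of the idea; the context labels, and the pairing of each label with one of the Spanish renderings mentioned above, are assumptions chosen for illustration rather than actual Xerox strings.

```python
# Illustrative catalog: context labels and their pairing with each Spanish
# rendering are assumptions made for this sketch.
CATALOG_ES = {
    ("printer power status", "activated"): "encendido",
    ("network link status", "activated"): "conectado",
    ("status of a masculine noun, e.g. 'modo'", "activated"): "activado",
    ("status of a feminine noun, e.g. 'alarma'", "activated"): "activada",
}


def translate(context: str, source: str) -> str:
    """Look up a string by (context, source); fall back to the source text."""
    return CATALOG_ES.get((context, source), source)


if __name__ == "__main__":
    for (context, source), target in CATALOG_ES.items():
        print(f"{source!r} [{context}] -> {target!r}")
```

The same English word now appears four times in the catalog, once per context, so the translator sees what each occurrence refers to instead of guessing.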

This was a particular problem for Xerox as a whole because it, in turn, made an impact on time to market, cost, and quality in a large percentage of its market. And, every time a developer did something in his code or design that created an obstacle for translation, that delay and cost would be replicated for each of the languages we were dealing with.

So I began feeding back information about how their approach to design was making translation less efficient. That eventually became a full-time occupation. Since internationalization wasn’t something on many people’s radar in those days, this was eye-opening stuff, and before long I was delivering seminars and talks to people in many large multinational companies and organizations around the world.

DW: Can you tell us about the internationalization activities at the W3C?

RI: Two W3C staff members work on internationalization, myself and Felix Sasaki, but the work is shared with participants from W3C member organizations and invited experts in three working groups and an interest group.

The core working group works with other W3C working groups (such as HTML, SVG, CSS, and Voice Browser groups), and acts as a liaison with other organizations (such as the Unicode Consortium) to ensure that the W3C’s specifications will work for users with different languages, scripts, and cultures. We also produce articles and other resources to help people understand and apply the international aspects of web technology.

The Internationalization Tag Set Working Group is preparing a new W3C standard and a set of guidelines for people who design XML document formats. This will include suggestions on what tags they should provide to both enable international use of their document format and make translation more efficient. We have another working group developing specifications related to normalization, internationalized resource identifiers, and language tagging.

DW: What is available through W3C’s website?

RI: The W3C home page gives you access to the W3C specifications and information about the working groups. As specifications are developed, the groups regularly publish working drafts that allow the public to track and comment on the directions they are taking. Lately, there’s been an increase in tutorials, too.

In the I18n activity, we’ve been working for some time to provide easily understandable information for people using web technologies internationally. It’s still very much a work in progress, but there’s a lot of useful stuff there already. We’re also trying to find ways to make it easy for people to find the information when they need it.

From the Internationalization Activity section of the W3C site, you can find articles, tutorials, best practices, and even test pages. We’re also developing some Getting Started material for those who are new to the topic, to ease you in gently. We use blog software to publish news and RSS feeds when new items become available.

We also have a public mailing list which people can auto-subscribe to, and where they can ask questions and receive notification of newly available materials. Alternatively, you could subscribe to one or more of the RSS feeds we have set up to receive notification of new materials, translations, etc.

I’d also like to thank the volunteers, such as WTB, who have, as of today, provided translations of more than twenty of our articles in one or more of nineteen languages.

DW: Does W3C do any outreach activities, such as live workshops or webinars?

RI: We tend to show up at other organizations’ events. I18n Working Group participants have long been heavily involved in the Internationalization and Unicode Conferences, held twice a year, where we deliver tutorials and papers. I have also delivered tutorials at the yearly World Wide Web conference, though this year I’ll be speaking at the @media conferences in London and San Francisco instead, following on from Molly Holzschlag’s popular talk on I18n last year.

I’ve also been trying to spread the word a little farther, recently delivering talks in places such as India, Bhutan, Nepal, and China. These are the kinds of places where people have requirements that have to be taken into account by specifications such as CSS, SVG, and XSL-FO. I’m trying to encourage them to check out our specifications and bring those requirements to the table where necessary.

DW: When I originally thought of interviewing you, I asked a few web designers what kind of questions they would want to ask, but people knew so little about internationalization that they did not have many ideas. Do you have a perspective on why there is so little awareness?

RI: Well, it’s not uncommon. I think that much of it is down to the fact that people who develop content or code are rarely involved in the pain that is experienced when you have to port that to the international user, and rarely taken to task for it. Unlike the Web Accessibility Initiative, we don’t usually have the legal system on our side, either.

Probably the most oft-repeated comment I receive after an awareness-raising keynote is, “I had no idea!” This sort of thing isn’t usually taught in schools or universities, developers are often embedded in homogenous societies or find it hard to see past the large home user base, and people are kept really busy worrying about what’s right in front of their nose, much of the time.

If you are in one of those situations, you need to be able to step back and see the bigger picture to understand the need, but then you also need to find out what to do about it, and internationalization experts are still fairly specialized beasts.

DW: Our readers love practical examples. Can you think of any good internationalization tips that can be easily applied?

RI: We have recently introduced a quick-tips card that contains the following pointers for developing web content:

  • Encoding. Use Unicode wherever possible for content, databases, etc. Always declare the encoding of content in your documents.
  • Escapes. Use characters rather than escapes whenever you can. For example, instead of using &aacute;, &#xE1;, or &#225;, just use the character á. It makes non-English code much slimmer and easier to manage.
  • Language. Declare the language of content in your documents and indicate any internal language changes.
  • Presentation vs. content. Use stylesheets for presentational information. Restrict markup to semantics.
  • Images, animations, and examples. Check for translatability and inappropriate cultural bias.
  • Forms. Use an appropriate encoding on both form and server. Support local formats of names/addresses, times/dates, etc.
  • Text authoring. Use simple, concise text. Use care when composing sentences from multiple strings.
  • Navigation. On each page, include clearly visible navigation to localized pages or sites, using the target language.
  • Right-to-left text. For XHTML, add dir="rtl" to the html tag. Only re-use it to change directionality.

You can get explanations of these tips and link to further reading at our site’s quicktips page.
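
A few of these tips (Encoding, Escapes, Language, and Right-to-left text) can be illustrated with a short Python sketch. It shows that the named, hexadecimal, and decimal escapes for á all decode to the same single character, and it builds a minimal page shell that declares its encoding, language, and direction. The function names are hypothetical, and the shell uses the shorter meta charset form from later HTML versions, which is an assumption of this sketch and not part of the quick-tips card.

```python
import html

# Named, hexadecimal, and decimal escapes for the same character.
ESCAPES = ["&aacute;", "&#xE1;", "&#225;"]


def decode_all(escapes: list[str]) -> list[str]:
    """Decode HTML character references into the characters they stand for."""
    return [html.unescape(e) for e in escapes]


def page_shell(lang: str, body: str, rtl: bool = False) -> str:
    """Build a minimal page that declares encoding, language, and direction."""
    direction = ' dir="rtl"' if rtl else ""
    return (
        "<!DOCTYPE html>\n"
        f'<html lang="{lang}"{direction}>\n'
        '<head><meta charset="utf-8"><title>Demo</title></head>\n'
        f"<body>{html.escape(body)}</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    print(decode_all(ESCAPES))
    # The escape costs more bytes than the UTF-8 encoded character itself.
    print(len("&aacute;".encode("utf-8")), len("á".encode("utf-8")))
    print(page_shell("ar", "مرحبا", rtl=True))
```

The direction attribute is set once on the html element and left out entirely for left-to-right pages, in line with the advice to re-use it only where directionality changes.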

DW: If a web developer asks you where he/she can learn more about how to develop world-friendly websites, what resources or training do you recommend other than the W3C website?

RI: I’m concerned about recommending specific sites, because I’m sure to forget some important ones. I noted, however, that the Web Standards Project has just started a grassroots International Liaison Group. Then there’s the Internationalization and Unicode Conference. This is a great technical conference for people wanting to learn about internationalization, and we’ve had a web internationalization track running there for many years now. The next conference is planned for November 2007, in San Jose, California.

As W3C Internationalization Activity Lead, Richard Ishida’s job is to help ensure universal access to the Web, regardless of language, script or culture. In his scant spare time he enjoys learning about new writing systems, coding small but useful XHTML-based tools, reading about ancient/medieval world history, and taking photos.
