What does XML mean to you?
XML means a lot of different things to different people.
Some think it is "the Answer to Life, the Universe, and
Everything", others describe it as generation X's childhood
sandpit dreams coming true. And although it may already have
reached the mainstream, there are still plenty of urban myths floating
about when it comes to the question of what XML actually is.
So let us have a quick look at why everybody seems to love XML,
why more and more large corporations already use XML to save
a lot of money, and finally how XML is going to creep into
the lives of everyday computer users - and to be honest, it'll
be a relief when that happens!
XML has been a bit of a buzzword over the last few years
- but today it seems that the buzz has settled and the community
of blog writers has moved on (probably to something even more
fascinating) - but have they?
Actually no, they have not. The opposite is the case. It
seems that the initial buzz about XML brought us a great number
of experimental users and uses - and heaps of abbreviations
too. (One still wonders whether it was XSLT, SAX, DOM or TMX that
brought us RSS, CSS and XLIFF.) So the truth is that, besides
being a Petri dish for abbreviations, XML also went from fringe
to mainstream, and the number of applications and XML-based
extensions continues to grow. This is anything but surprising,
considering that XML stands for "Extensible Markup Language",
which is a bull's-eye of a name.
Without becoming too technical, let us look at what the language
and the mark-up are about. The main reason for the hype is
that XML makes a file truly "cross-platform". OK,
you are right - that sounds like yet another quote from people
who hardly ever get away from their computers. So let us come at it from
a different angle: the data recorded, for example, when men
landed on the moon, is stored in a format that today's computers
simply cannot read. If you want to find out how much fuel
was left after landing on the moon, you'd have to use the
original hardware (probably some kind of tape recorder the
size of an average 4-bedroom house). And even if you can find
the hardware, you will still have to look through millions
of cubic metres of unmarked data streams to find what you
are looking for. If this example sounds too absurd, then try
something yourself: go and find one of those floppy disks
with all your personal documents in the bottom drawer (written
in the early 90s) - and open those files with your current
word processor. The effect is the same. And if you cannot
find a prehistoric disk, then just try to remember the last
time your word processor told you with a
sardonic smile that the 20 MB Word document you wanted
to open was "corrupted" and asked if you were happy
just to go on without it. Now here is the buzz: with XML this
would not have happened (...well, of course it still happens,
but the effect is far less scary)!
The reason that XML copes much better in such situations
is twofold: first, XML allows you to store all data
as text. Second, it incorporates an ingenious "kind
of" filing system in which each single piece of information
or cluster of information is labelled in a more or less descriptive
way to explain what it is. So for example young Jane Doe's
name and birth date might be stored like this in XML:
<firstName>Jane</firstName>
<lastName>Doe</lastName>
<dateBirth>03-27-75</dateBirth>
With this kind of system, all you need is a tool with a fancy
abbreviation and you are set to do all you want to do with
it.
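To make that a little more concrete, here is a minimal sketch of such a "tool" in Python, using the standard library's xml.etree.ElementTree parser. The wrapping <person> element and the function name read_person are assumptions made for this illustration - the three fields above would need some single enclosing element to form a well-formed XML document.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper: a well-formed XML document needs exactly one root
# element, so the three fields from the article are placed inside <person>.
PERSON_XML = """<person>
  <firstName>Jane</firstName>
  <lastName>Doe</lastName>
  <dateBirth>03-27-75</dateBirth>
</person>"""


def read_person(xml_text):
    """Return a dict mapping each child element's label to its text."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


person = read_person(PERSON_XML)
print(person["firstName"], person["lastName"], person["dateBirth"])
```

Because every value carries its own label, the program never has to guess which piece of text is the first name and which is the birth date.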
So even if this example doesn't look like much, the principle
is impressive enough for XML to turn from being the favourite
file format of the "Open Source" community to becoming
the favourite file format of Microsoft. To prove this point:
with its next Office suite, Microsoft will say "goodbye"
to its proprietary file formats (like MS Word's
.doc and Excel's .xls) and "hello" to a tailor-made
XML format (with its own set of new abbreviations).
As translators, we face language and cross-cultural
challenges as well as numerous computer issues. Our goal is that
the struggle should be invisible to the end user, i.e. you,
the client. But technology has not always played along, and users of
translations could see our struggles. Think back to the days
(some of them quite recent) when more and more Asian web
pages popped up on the Internet seemingly overnight. Back
then, you might have come across garble like this when
viewing files that originated in Asian countries:
"ã€å¤§å ç²è
¨Šã€'ç¾?國
é§é¦ 尼拉å
¤§ä½¿é¤¨ç™¼è".
But what looks like a random string of characters at
first glance is actually (did you guess?) the following Chinese
sentence:
"【大公網訊】美國駐馬尼拉大使館發表聲明稱,由於收到「可能的威脅資訊」"
The problem is: it is displayed wrongly.
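For the curious, this kind of garble is easy to reproduce. The sketch below is only an illustration under stated assumptions: it takes part of the sentence above, stores it as UTF-8 bytes, and then reads those bytes back as if they were Latin-1 (a Western European character set picked here as a plausible wrong guess; the article does not say which one was actually involved).

```python
# Part of the Chinese sentence quoted above.
original = "美國駐馬尼拉大使館"

# The text is stored correctly, as UTF-8 bytes...
raw_bytes = original.encode("utf-8")

# ...but the displaying application wrongly assumes Latin-1.
garbled = raw_bytes.decode("latin-1")

# Nothing has been lost: reversing the wrong guess restores the sentence.
recovered = garbled.encode("latin-1").decode("utf-8")

print(repr(garbled))
print(recovered)
```

The bytes in the file are fine all along; only the assumption about how to read them is wrong.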
I should explain that you just came across one of a translation
company's common nightmares: we've translated a sentence into
Chinese, but the application in which it is being used is
not able to display our translation correctly. The technical
issue is that every computer application (like Word, Internet
Explorer and even your accounting software) can display
only a certain range of character sets. Historically, there
were fewer character sets, and the most common one was the
American set called "ASCII". It contained all the characters
necessary to display English sentences and punctuation. But
because it didn't contain, for example, French accents or
German umlauts (or any Asian characters for that matter),
ASCII was very limiting when it came to displaying translations.
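A quick way to see that limitation for yourself is sketched below. The helper name fits_in_ascii and the sample phrases are made up for this illustration; the check itself simply asks Python to encode each phrase as ASCII and notes whether that fails.

```python
def fits_in_ascii(text):
    """Return True if every character of text exists in ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


# Sample phrases chosen for illustration only.
samples = ["Hello, world!", "déjà vu", "Grüße", "大使館"]
for sample in samples:
    print(sample, "->", fits_in_ascii(sample))
```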
The first, uncoordinated solution was the emergence of
a raft of character sets - almost one for each language. Some
were free and others were proprietary and had to be purchased.
Out of deep dissatisfaction with this situation, Unicode was developed
and agreed on as a standard. Unicode was originally designed around
65,536 code points and has since grown to a code space of more than
a million, enough to cover the scripts of almost every written language
in the world. Of course 'our' XML
is built on Unicode (although a file may be saved in other character
encodings too). The ingenious part of this solution is that XML allows
and even asks you to declare which character encoding the file
content is stored in. We love it for that, because we
can tell at a glance whether you are likely to encounter
any problems with the translation and, if so, we can help
you to solve them before you even notice. The reason we can promise
this has to do with the other issue we want to highlight today: portability.
Portability doesn't necessarily mean that you can take your
XML file with you on holiday (although you actually can) but
rather that XML can be exchanged between applications rather
easily. The beauty of this is not only that different software
applications can do lots of different things with the data;
XML also became something of a killer application because
it can be used across all kinds of platforms - be it Windows,
Linux, Apple or even your mobile phone. The other aspect of
this is "what" you store: XML is so flexible that
you can use it to "parallel store" source text with
the translated target text in the same file.
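Both ideas, declaring the character set and storing source and target text side by side, can be sketched in a few lines of Python. The element names <segment>, <source> and <target> and the two helper functions are assumptions invented for this example, not a real translation file format. The same segment is written out twice, once as UTF-8 and once as ISO-8859-1, each time with a matching declaration on the first line of the file.

```python
import io
import xml.etree.ElementTree as ET


def build_segment(source_text, target_text):
    """Store a source sentence and its translation side by side."""
    # <segment>, <source> and <target> are hypothetical element names.
    segment = ET.Element("segment")
    ET.SubElement(segment, "source").text = source_text
    ET.SubElement(segment, "target").text = target_text
    return segment


def serialize(segment, encoding):
    """Write the segment as bytes, starting with an encoding declaration."""
    buffer = io.BytesIO()
    ET.ElementTree(segment).write(buffer, encoding=encoding, xml_declaration=True)
    return buffer.getvalue()


segment = build_segment("Greetings", "Grüße")
as_utf8 = serialize(segment, "utf-8")
as_latin1 = serialize(segment, "iso-8859-1")

# The bytes differ, but the parser reads each declaration and
# recovers the same translated text from both files.
for data in (as_utf8, as_latin1):
    parsed = ET.fromstring(data)
    print(data.splitlines()[0].decode("ascii"), "->", parsed.findtext("target"))
```

Because the declaration travels with the file, the receiving application does not have to guess how the bytes should be read.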
So not only do you have a file that is quite likely to survive
version changes and hardware development, but you also have
options as to what you want to store. If you wanted to, you
could, for example, store the content in 23 languages in the
same file and publish your content in 23 languages from that
one source to every application you like. File management
becomes a breeze and your publications can be easily managed
across languages.
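As a rough sketch of what "one source, many languages" could look like, the Python snippet below keeps three language versions of one message in a single XML document and pulls out whichever language is requested. The <content>, <message> and <text> elements, the id attribute and the function texts_for are all hypothetical names chosen for this illustration; only the xml:lang attribute is a standard part of XML.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the standard xml:lang attribute under this name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical layout: three languages here, but it could just as well be 23.
CONTENT_XML = """<content>
  <message id="greeting">
    <text xml:lang="en">Welcome</text>
    <text xml:lang="de">Willkommen</text>
    <text xml:lang="fr">Bienvenue</text>
  </message>
</content>"""


def texts_for(xml_text, lang):
    """Return {message id: text} for a single language."""
    root = ET.fromstring(xml_text)
    result = {}
    for message in root.findall("message"):
        for text in message.findall("text"):
            if text.get(XML_LANG) == lang:
                result[message.get("id")] = text.text
    return result


for lang in ("en", "de", "fr"):
    print(lang, texts_for(CONTENT_XML, lang))
```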
Christof Schneider
Lingo24 Translation Services