Localization and Translation
Reinhard Schäler
1. Perspectives
Localization is the linguistic and cultural adaptation of digital content to the
requirements and the locale of a foreign market; it includes the provision of
services and technologies for the management of multilingualism across the
digital global information flow. Thus, localization activities include
translation (of digital material as diverse as user assistance, websites and
videogames) and a wide range of additional activities. Contrary to
definitions provided by the Localization Industry Standards Association,
LISA (2010), or Dunne (2006), this definition explicitly focuses on digital
content and includes the management of multilingualism as one of the
important localization activities.
The localization industry as it is known today emerged in the mid
1980s with the advent of personal computing. North American multinational
software publishers were scouting for new markets for products that had
already been proven highly successful in the USA. They identified these
new markets in Europe, concentrating their efforts initially on the richest
countries in the region: France, Italy, Germany and Spain – the so-called
FIGS countries. The localization service industry subsequently organised
itself into Single Language Vendors (SLVs) and Multi Language Vendors
(MLVs). In the mid 1990s, a dedicated localization tools industry emerged.
Following a continued period of growth, the worldwide language services
market is estimated by Beninatto and Kelly (2009) to be worth US$25 billion
by 2013. Many digital publishers, including companies such as Microsoft
and Oracle, now generate more than 60% of their overall revenues from
their international business divisions. Localization is an instrument for the
unlocking of global market opportunities for these companies and an
instrument of their globalization efforts. It is, therefore, not surprising that
their localization decision is never based on the number of speakers of a
particular language, but on the Gross National Product (GNP) of the market
they target. While publishers localize their digital content into Danish (5m
speakers approx.), they do not do so for Amharic (17m speakers approx.) and
rarely if ever for Bengali (100m speakers approx.).
Translators working in the localization industry are among the most
innovative in their profession. In the early 1990s, they were the first to use
computer-assisted translation tools for large-scale projects, as both the
characteristics of the material to be translated (very repetitive, large
volumes, often of a technical nature) and the environment in which it was
translated (highly computerised, experimenting with new technologies as
they emerged) were highly conducive to the progressive introduction of
advanced technologies such as electronic terminology databases and
translation memories.
In more recent years, Central Europe, China and India have become
the central hubs for the worldwide localization industry, mainly because of
the lower cost of employment in these regions (Niode 2009). It can
reasonably be expected that India and China will become more than just
cheap localization hubs for large foreign multinationals; they will very soon
become major publishers of digital content in their own right. According to
a report by Barboza (2008) for the New York Times, China has surpassed the
USA in internet use. With a penetration rate of under 20%, China already had
253 million internet users, more than the USA, which had reached saturation
point at a penetration rate of 70%. This
development will soon lead to fundamental changes in the localization
industry, which today still works with English as the default source
language.
2. Localization: more than just translation
In an attempt to make the concept more accessible to the lay person,
localization is often defined as “like translation, but more than that”. As
translation technologies and digital content have become almost ubiquitous,
the difference between translation and localization has become clouded and
somewhat difficult to define.
2.1. Characteristics
Today’s localization projects are far from being homogeneous. They can
deal with anything from relatively static, large-scale enterprise applications
such as database systems and applications, to rapidly changing web-based
content such as customer support information, and to small but very
frequent, ad hoc, personal and perishable consumer-type content.
A typical enterprise localization project, for example, can involve
the translation of three million words, stored in 10,000 files to be translated
into up to one hundred languages, all to be made available within a very
short period of time (Schäler 2004). Content is often multimodal: it can
come as text, graphics, audio, or video, and can be stored in a large variety
of file formats. Content can be highly repetitive and is often leveraged from
previous versions of the same core product.
As digital publishers struggle with the ever increasing demand on
their capacities, they focus on standards, interoperability and process
improvements, introducing sophisticated translation management systems
(TMS). They also resort to internationalization and reuse of previously
translated material to achieve the required increase in efficiencies.
2.2. Internationalisation and reuse: prerequisites for on-time localization
Publishers often approached localization as an afterthought. Deltas of nine
months, i.e. the time between the release of the original version of the
software and that of its localized version, were the norm. As the type of
digital content published changed (from applications to multimedia to web
content) so did its distribution to consumers and, subsequently, the demands
for on-time localization: customers now demand this content become
available in their own language without delay.
The two developments that made on-time localization, or simship (the
simultaneous shipment, or release, of digital content in a number of
different languages and locales), possible for the first time in the early
1990s were internationalization and the reuse of previously localized material.
Internationalization, meaning the preparation of digital content for
use in different languages as well as for easy localization, dramatically
reduced the localization effort, which publishers ideally wanted to reduce to
translation, eliminating as far as possible costly software re-engineering,
re-building and testing activities. Digital publishers had learned the hard
way about the high cost of “localization as an afterthought”, so the most
advanced of them decided to take localization “upstream”, closer to the
design and development teams, starting with a “smart” localization-friendly
design and development of that content. Typical localization issues, such as
the restricted or inappropriate encoding of characters, hard-coded strings or
concatenated strings, or ill-advised programmatic dependencies on specific
strings – such as the infamous “Y” in the message “Press ‘Y’ to continue”
found in many software products – could thus be eliminated, not just for one
but for all language versions of that product and ahead of localization.
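The issues named above can be illustrated with a minimal sketch. It assumes a hypothetical in-memory message catalogue (MESSAGES) and hypothetical helper names; a real product would keep such strings in external resource files. Each message is stored as a whole sentence with named placeholders rather than concatenated fragments, and the confirmation key is read from the locale's own resources instead of being hard-coded as “Y”.

```python
# Hypothetical message catalogues; in practice these would live in external
# resource files that translators can access, not in the source code.
MESSAGES = {
    "en": {
        "confirm_prompt": "Press '{key}' to continue",
        "confirm_key": "Y",
        "files_copied": "{count} files copied to {target}",
    },
    "de": {
        "confirm_prompt": "Drücken Sie '{key}', um fortzufahren",
        "confirm_key": "J",
        "files_copied": "{count} Dateien nach {target} kopiert",
    },
}


def message(locale: str, msg_id: str, **values: object) -> str:
    """Look up a whole-sentence template and fill in its named placeholders."""
    return MESSAGES[locale][msg_id].format(**values)


def confirm_prompt(locale: str) -> str:
    """Build the prompt from the locale's own confirmation key."""
    return message(locale, "confirm_prompt", key=MESSAGES[locale]["confirm_key"])


def is_confirmed(locale: str, typed: str) -> bool:
    """Compare the input with the localized key instead of a hard-coded 'Y'."""
    return typed.strip().upper() == MESSAGES[locale]["confirm_key"]
```

Because the placeholders are named, a translator can move them to wherever the target language's word order requires, which string concatenation in the code would not allow.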
Reuse of previous translations became the main strategy to cut down
on translation cost and time. Repetition processing, both within one single
version as well as across versions of the same core content, started in the
early 1990s when translation memory technologies were first introduced to
large-scale enterprise localization projects (Schäler 1994). In some projects,
reuse rates of 60% and higher can now be achieved, significantly cutting
down on translation cost and time.
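The principle of repetition processing can be sketched as follows. This is an illustration only, not the method of any particular translation memory product: the sample memory (TM), the similarity threshold of 0.75 and the function names are assumptions, and the similarity measure is simply the one offered by Python's standard difflib module.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: source segments with stored translations.
TM = {
    "Click OK to save the file.": "Klicken Sie auf OK, um die Datei zu speichern.",
    "The file could not be opened.": "Die Datei konnte nicht geöffnet werden.",
}


def lookup(segment: str, threshold: float = 0.75):
    """Return (match_type, stored_translation) for the best TM hit, or None."""
    if segment in TM:
        return "exact", TM[segment]
    # Find the stored source segment most similar to the new segment.
    best_source = max(TM, key=lambda s: SequenceMatcher(None, segment, s).ratio())
    if SequenceMatcher(None, segment, best_source).ratio() >= threshold:
        return "fuzzy", TM[best_source]
    return None


def reuse_rate(segments: list[str]) -> float:
    """Share of segments for which the TM offers an exact or fuzzy match."""
    hits = sum(1 for s in segments if lookup(s) is not None)
    return hits / len(segments)
```

An exact match can be reused as it stands, whereas a fuzzy match is only a proposal that a translator still has to check and adapt.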
2.3. Generic enterprise localization process
While each localization project presents its own particular challenges,
requiring the localization process to be fine-tuned, most processes have
core aspects in common.
Analysis
Prior to localization, a number of key questions need to be answered in
relation to the project on hand: Can the digital content be localized? – Some
digital content is so specific to its original market that localization would
require significant re-development that would make it financially not viable.
Is the content internationalized? – Some digital content does not support the
features of other languages and writing systems. Is the content to be localized
accessible? – If localizable strings are hard-coded, i.e. embedded in the
original code or in an image, they cannot be accessed by standard
localization tools.
It is standard practice as part of the analysis to carry out a so-called
pseudo translation, i.e. the automatic replacement of strings within digital
content with strings containing characters of the target language. Pseudo
translation can demonstrate in an easy, low-cost way the effect localization
will have on the digital content in hand. The outcome of this phase is a
report summarizing the results of the analysis and containing
recommendations to the project teams on how to proceed.
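A pseudo translation of this kind can be sketched in a few lines. The sketch rests on several assumptions: accented Latin vowels stand in for the characters of the target language, curly-brace placeholders such as {count} are taken to be the content's variable syntax, and an expansion of 30% is an arbitrary illustrative value. The function name is hypothetical.

```python
import re

# Hypothetical mapping from ASCII vowels to accented look-alikes; a real
# project would choose characters from the target language's script.
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")
PLACEHOLDER = re.compile(r"(\{[^{}]*\})")  # e.g. {count}; must stay untouched


def pseudo_translate(text: str, expansion: float = 0.3) -> str:
    """Accent the letters, pad for text expansion, and add visible brackets."""
    parts = PLACEHOLDER.split(text)
    # With a capturing group, split() puts the placeholders at odd indices.
    body = "".join(
        part if index % 2 else part.translate(ACCENTED)
        for index, part in enumerate(parts)
    )
    padding = "~" * round(len(text) * expansion)
    return f"[{body}{padding}]"


print(pseudo_translate("Save {count} files"))  # [Sávé {count} fílés~~~~~]
```

When the pseudo-translated content is displayed, a missing closing bracket points to truncation, garbled accents point to encoding problems, and any string that still appears in plain English is probably hard-coded.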
Preparation
Following the successful completion of the analysis phase, project managers,
engineers and language leads prepare the localization kit for translators and
engineers containing all the original source material, reference material such
as terminology databases, translation memories, style guides, and test
scripts, as well as a task outline, milestones, and financial plans. The
localization kit includes a description of all the deliverables, the
responsibilities of the stakeholders, and all contact details.
Translation
While translation is at the centre of this activity, not all of the translation is
necessarily done by translators. Some, or indeed all, of it can be delivered
(semi-)automatically by sophisticated computer-aided translation
technologies, including terminology database, translation memory (TM),
and machine translation (MT) systems. In cases where all of the source
material is pre-translated using, for example, a hybrid automated translation
system, it is not translation but post-editing that is required.
Translators also need to support computer-assisted translation tools
and their associated language resources. This involves maintaining multiple
large terminology databases and TMs across products, versions and clients,
and tuning and using MT systems. While some platforms
and localization tools provide a visual translation environment allowing
translators to see the context and appearance of the strings that are being
translated, this is not always the case. Strings might have to be translated out
of context. Combined with a significant pressure to produce high-quality
translations within short time frames, this is a very stressful, “alienated”,
highly automated and technical translation environment for which
specialised training is required (Schäler 2007).
Engineering and testing
Following translation, digital content must always be re-assembled and
tested (or quality assured) for functionality, layout and linguistic
correctness. While properly internationalized digital content significantly
helps to cut down on the engineering and testing (QA) effort necessary,
translation can have an unexpected effect on the functionality and
appearance of the content (Jiménez-Crespo 2009). Even strings that have
been translated correctly can be corrupted when used by an application or a
browser for reasons not always apparent to translators, localization
engineers and testers, and can require significant efforts to be rectified
before the final product can be released.
Review
Following each localization project, a thorough review is conducted by the
localization teams involving both the client and the vendor site. The aim of
this review is to reinforce successful strategies and to avoid mistakes when
dealing with similar projects in the future.
3. The future of localization and translation
Discussions about localization and translation have for a long time orbited
around a rather predictable set of issues with the role of technology,
automation, standards, interoperability and efficiencies in translation and
localization featuring prominently (Genabith 2009). This is so because the
discussion about, as well as the research into, localization-related issues has
been dominated by the pragmatic, commercial agenda of the localization
industry, an industry driven almost exclusively by the desire to maximise
the short-term financial return on investment of multinational digital
publishers in the development of their digital content. This rather narrow
focus of current mainstream localization activities is beginning to expand.
This development is driven by people and organisations who have
recognised that localization and translation are important not just for
commercial, but also for social, cultural and political reasons; they can keep
people out of prison, enhance their standards of living, improve their health
and, in extreme cases, even save their lives.
A recent, though rather short-lived, example of such activity was the
reaction to the Haiti disaster in early 2010 when a large number of
localization service providers as well as an even larger number of
individuals volunteered their services to help the people of Haiti. The
reaction to this catastrophe drove truly innovative efforts in disaster relief
involving translation and localization, such as the 4636 multilingual
emergency text service reported by Ushahidi and Envisiongood. Still, there
is a clear urgency to explore more sustainable and long-term alternatives to
current mainstream localization and translation, going beyond those that
react in an immediate and often uncoordinated and unsustainable way to
disasters.
Access to information and knowledge in your language using media
such as the world wide web is not a “nice to have” anymore, not an option;
it is a human right and should be recognised as such, as De Varennes (2001)
points out. Initiatives to make localization and translation technologies and
services available to all, including to those who currently do not have access
to them because of geographical, social or financial reasons, have shown
very promising results. One of the most prominent examples is that of the
IDRC, the Canadian Government’s development agency, which has been
funding both the South East Asian (IDRC 2003) and the African (IDRC
2008) networks for localization. Another, more recent example is The Rosetta
Foundation.
Perhaps it is not surprising and should have been expected that the
hottest and most promising topics in the current localization debate –
crowdsourcing, collaborative translation and wikification – are again about
to be taken over by industry interests rather than by those of society, at a
time when they could start to support the educational, health, justice, and
financial information requirements of those most in need.
References
Barboza, David. 2008. “China Surpasses U.S. in Number of Internet Users.”
The New York Times. 26 July 2008.
http://www.nytimes.com/2008/07/26/business/worldbusiness/
26internet.html [Accessed 27 April 2010]
Beninatto, Renato S. and Kelly, N. 2009. Ranking of Top 30 Language
Services Companies.
http://www.commonsenseadvisory.com/Research/All_Users/
090513_QT_2009_top_30_lsps/tabid/1692/Default.aspx?
zoom_highlight=ranking [Accessed 27 April 2010]
De Varennes, F. 2001. “Language Rights as an Integral Part of Human
Rights.” IJMS: International Journal on Multicultural Societies. 3 (1):
15-25. http://unesdoc.unesco.org/images/0014/001437/143789m.pdf#143762
[Accessed 10 May 2010]
Dunne, Keiran J. 2006. “Putting the Cart Behind the Horse - Rethinking
Localization Quality Management.” In Perspectives on Localization, Keiran
J. Dunne (ed.), 95-117. Amsterdam & Philadelphia: John Benjamins.
Genabith, Josef van. 2009. “Next Generation Localisation.” In Localisation
Focus – The International Journal of Localisation 8 (1): 4-10.
http://www.localisation.ie/resources/locfocus/vol8issue1.htm [Accessed 6
May 2010]
IDRC. 2003. PAN Localization: Building Local Language Computing
Capacity in Asia. http://www.idrc.ca/panasia/ev-51828-201-1-
DO_TOPIC.html [Accessed 27 April 2010]
IDRC. 2008. African Network for Localisation (Anloc).
http://www.idrc.ca/acacia/ev-122243-201-1-DO_TOPIC.html [Accessed 27
April 2010]
Jiménez-Crespo, M. A. 2009. “The evaluation of pragmatic and
functionalist aspects in localization: towards a holistic approach to Quality
Assurance.” In The Journal of Internationalisation and Localisation (IJIAL)
1: 60-93.
LISA. 2010. Localization. http://www.lisa.org/Localization.61.0.html
[Accessed 27 April 2010]
Niode, Pricilla. 2009. “Assessing the Southeast Asian Markets.” In
Multilingual Computing. September 2009: 49-52.
Schäler, R. 1994. “A Practical Evaluation of an Integrated Translation Tool
during a Large Scale Localisation Project.” In Proceedings of the 4th
Conference on Applied Natural Language Processing (ANLP-94). Stuttgart,
Germany (October 13-15).
Schäler, R. 2004. “Language Resources and Localisation.” In Proceedings
of the II International Workshop on Language Resources for Translation
Work, Research and Training. A satellite event of COLING (28 August
2004). http://www.mt-archive.info/Coling-2004-Schaler.pdf [Accessed 27
April 2010].
Schäler, R. 2007. “Translators and Localization.” In The Interpreter and
Translator Trainer 1: 119-135.