Linguistic Purism and Software Localization

People who translate MediaWiki and other pieces of software with which I am involved to languages of India, Philippines, African countries, and other places, often ask me: Should we translate to pure native neologisms, or reuse English words?

Linguistic purism is quite common in translation in general, and not only in software. I heard of a study that compared a corpus of texts written originally in Hebrew with a corpus of translated texts. The translated texts had considerably fewer loanwords. This may be surprising at first: how can translated texts have fewer loanwords?

But actually this makes sense: A translator is not creating a story, but re-telling a story that already exists. A good translator is supposed to understand the original story well first, and after understanding it, the translator is supposed to retell it in the target language. Many translators use the time that they don’t invest in creating the story itself to make the translation “purer” than the target language actually is.

A text that is originally written in Hebrew expresses how Hebrew-speaking people actually talk. Despite a history of creating many neologisms, some of which were successful, Hebrew speakers also use a lot of loanwords from English, Arabic, German, Russian, French, and other languages.

And that’s totally OK. Loanwords don’t “degrade” or “kill” a language, as some people say. Certainly not by themselves. English has many, many words from French, Latin, Norwegian, Greek, and other languages, and they didn’t kill it. Quite the contrary: English is one of the most successful languages in the world.

A good original writer creates verisimilitude, naturally or intentionally, by using actual living language. And actual living language has loan words. More in some languages, fewer in others, but it happens everywhere.

Software localization is a bit different from books, however. Books, both fiction and non-fiction, are art. At least to some degree, the language itself is an expressive tool there.

Software user interfaces are less about art and more about function. A piece of software is supposed to be usable and functional, and as easy and obvious to learn and use as possible. The less people need to learn it, the closer it is to perfection. And localized software is no different: it must, above all, be functional.

Everything else is secondary to usability. If the translation is beautiful, but the software can’t be used, the job is not done.

And this is the thing that is supposed to guide you when choosing whether to translate a term as a native word, possibly a neologism, or to transliterate a foreign term and make it a loanword: Will the user understand it and be able to use the software?

The choice is not as obvious as some people may think, however. Some people may think that loaning a word makes more sense because it’s already familiar, and this will make the software familiar.

But always ask yourself: Familiar to whom?

The translator, pretty much by definition, is supposed to know the source language, and to be able to use the software in the source language. Most often the source language is English. So most likely the translator is familiar with the source terminology.

But will the user be familiar with it?

The translated piece of software is supposed to be usable by people who aren’t familiar with that software in the source language, and, very importantly, by people who don’t know the source language at all.

So if you translate words like “log in”, “account”, “file”, “proxy”, “feed”, and so on, by simply transliterating them into the target language because they are “familiar” in this form, ask yourself: are they also familiar to people who don’t know English and who aren’t experienced with using computers?

Some Hebrew software localizers translate “proxy” as something like “intermediary server” (שרת מתווך), and some just write “proxy” in transliteration (פרוקסי). The rationale for “proxy” is usually this: “everyone knows what ‘proxy’ is, and no one knows what an intermediary server is”.

But is it really everyone? Or maybe it’s just you and your software developer friends?

To people who aren’t software developers, the function of “proxy” is pretty much as obscure as the function of “intermediary server”… or is it? Because the fully translated native term actually says something about what this thing does in familiar words.

Of course, if you are really sure that a foreign term is widely familiar to all people, then it’s OK to use, and often it’s better than using a “pure” neologism.

And that’s why I put “pure” in double quotes: The “purity” itself is not important. Functionality and usability are above all. Sometimes “purity” makes usability better. Sometimes it doesn’t. It’s not an exact science.

I’ll go even further: More often than many people would think, pondering the meaning and choosing a correct translation for a software user interface term may start fruitful conversations about the design of the original software and uncover usability flaws that affect everyone, including the people who use the software in English.

There are thousands of useful bug reports in MediaWiki and in other software projects in which I am involved that were filed by localizers who had translation difficulties. Many of these bugs were fixed, improving the experience for users in English and in all languages.

To sum up:

  • Purism shouldn’t be the most important goal, but it should be one of the optional tools that the translator uses.
  • Purism is neither absolutely bad nor absolutely good. It can be useful when used wisely in context, case by case.
  • Usability should be the most important goal of software localization.
  • Usability means usability for all, not just for your colleagues.
  • Localizers can improve the whole software in more ways than just translating strings.