Archive for the 'Russian' Category

Turkic Wikimedia Conference 2012, Almaty: Master Class, Kazakh in China and Developers’ Workshop

The translatewiki.net “master class”

On the morning of the second day of the Turkic Wikimedia Conference 2012 I held a translatewiki.net workshop. The participants called it a “master class” and I didn’t object :)

People sitting on benches. Amir Aharoni operating a notebook and a projector

Doing a "master class" in translatewiki.net

In the master class I demonstrated how to translate Wikimedia software. People opened accounts and started translating MediaWiki and the Wikipedia Mobile app. During the master class several issues were raised. Some of them turned out to be technical issues of translatewiki.net. I intend to find solutions soon.

Language support for Kazakh speakers in China

After the translatewiki.net master class I had a relatively short, but really fantastic meeting with Akytbek, a Kazakh speaker from North-Western China. He told me that two million Chinese Kazakhs are well-connected to the Internet and that they vigorously use the Kazakh language online. (According to official Chinese data, there are 1.25 million Kazakhs in China, but whatever the number is, it’s a lot of people.) That is good, of course, but they do it only in the Arabic alphabet, and not in the Cyrillic one, which is used in Kazakhstan. He said that there is great potential for having many Chinese Kazakh contributors to Wikipedia, and that even though the Kazakh Wikipedia already supports the Arabic script, some improvements are needed to realize this potential.

People sitting together on benches and looking at a laptop computer

Working with Akytbek from China on Arabic script support for the Kazakh Wikipedia

I showed Akytbek our current language tools – the automatic script conversion, WebFonts and the Narayam typing tool, and we decided to work together to adapt them better for the needs of Chinese Kazakhs.

By the way, Akytbek didn’t speak any Russian and he knew little English, so another Kazakh speaker who knew Russian acted as an interpreter. This is yet another proof of the importance of never assuming anything about languages and people.

MediaWiki development workshop

According to the schedule, the same morning I was also supposed to hold a workshop for programmers that would introduce them to MediaWiki development. The workshop did not take place at its scheduled time – network problems spoiled the opportunity. However, as it is so important, we did not give up and held it later at the hotel where we were staying.

It was intense, and intensely good, too: Talented and experienced people from Turkmenistan, Kyrgyzstan, Bashkortostan and Kazakhstan sat and listened to me talking for two hours or so about MediaWiki configuration, special pages, i18n files, installation procedures, extensions, preferences, templates, bots, source control and so on. Because of the quality of the questions, I am sure that my presentation was understood. What made me really happy is that several people asked how they could contribute patches and new features.

To be continued…

Keyboards, Firefox, Chrome and Privacy

I hardly ever used Google Chrome because of a bug that made the Ctrl-arrow keyboard shortcut work incorrectly in right-to-left languages. This shortcut makes the cursor jump a word to the left or to the right. In Hebrew and Arabic it would jump to the left when the right arrow was pressed. It works well in most other programs, but since Chrome doesn’t use the operating system’s text editing capabilities, it worked incorrectly there.

I write a lot of email, blog posts and Wikipedia articles and this keyboard shortcut is essential for me, so if it doesn’t work correctly in a program, i simply cannot use it and will use the competitor, in my case Firefox. Since i love Firefox anyway, it was not really a problem for me.

It took more than two years to do it, but this bug is more or less solved now and the fix will probably be released soon. I am now trying a preliminary version and the Ctrl-arrow shortcut seems to work correctly. However, as i expected, i quickly found other problems because of which i cannot use Google Chrome. Long story short, i cannot write Russian there. It’s not that it’s impossible – it’s just way too hard for me.

I could enable the Russian keyboard layout in my operating system, but it would be very hard to use for me. Keyboards sold in my country usually come with Latin and Hebrew letters printed on the keys and not Russian. It’s possible to buy a keyboard with Russian letters on it, and i did it once, but it didn’t help me much. You see, i write Russian several times a day, but less often than i write Hebrew or English, and the Russian layout is very different from the Latin layout, so i type in it very slowly even if i have the letters in front of my eyes.

Since 2006 my solution for this issue has been the Transliterator add-on for Firefox, created by Alex Benenson (thank you so much, Alex). It was first called “ToCyrillic”, because it only helped with the Cyrillic alphabet, but later it was adapted to many other languages. It allows me to type Russian phonetically, so the Latin ‘b’ is automatically converted to Cyrillic ‘б’, ‘sh’ becomes ‘ш’ etc. It works everywhere in Firefox – websites’ input fields, the address bar, the dialog windows etc.
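To make the idea of phonetic typing concrete, here is a minimal sketch in Python. It is not the Transliterator add-on's code, and the mapping table below is a small hypothetical subset chosen for illustration. The only technique it shows is longest-match replacement, so that ‘sh’ becomes ‘ш’ instead of being read as ‘s’ followed by ‘h’.

```python
# Hypothetical, partial Latin-to-Cyrillic table; a real tool covers the full
# alphabet, upper case and many edge cases.
TRANSLIT_TABLE = {
    "shch": "щ", "sh": "ш", "ch": "ч", "zh": "ж",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е",
    "z": "з", "i": "и", "k": "к", "l": "л", "m": "м", "n": "н",
    "o": "о", "p": "п", "r": "р", "s": "с", "t": "т", "u": "у", "f": "ф",
}

# Try longer keys first so that multi-letter combinations win.
KEYS_BY_LENGTH = sorted(TRANSLIT_TABLE, key=len, reverse=True)


def transliterate(text: str) -> str:
    """Convert Latin input to Cyrillic using greedy longest-match lookup."""
    result = []
    position = 0
    while position < len(text):
        for key in KEYS_BY_LENGTH:
            if text.startswith(key, position):
                result.append(TRANSLIT_TABLE[key])
                position += len(key)
                break
        else:
            # Characters without a mapping (spaces, digits, ...) pass through.
            result.append(text[position])
            position += 1
    return "".join(result)
```

A real add-on has to do this incrementally, on every keystroke, which is harder than converting a finished string: after typing ‘s’ it cannot yet know whether ‘h’ will follow.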

I couldn’t find anything like it for Chrome. It’s possible that i didn’t look well enough, but the add-ons i did find that claimed to do transliteration, phonetic typing or keyboard emulation either did something completely different or asked me to allow the add-on to access my data on all websites and my tabs and browsing activity. I don’t understand why such an add-on would need access to my data and browsing activity – it is only supposed to translate the characters i type into other characters and forget them.

It’s possible that the message that tells me about these privacy implications is over-zealous and the add-ons in question don’t actually breach my privacy, but it is still weird to see such a message, so i didn’t install them.

So there – i still have a strong reason not to move to Google Chrome. It’s not really Google’s fault. In fact, i could develop an extension that does what i want myself – the source and the API are open and it’s probably not a lot of work. But why would i waste even a minute of my time doing such a thing if i already have Firefox and its Transliterator add-on that work perfectly well? You could say that Google Chrome is faster and uses less memory; it is not quite true in the first place, and even if it were true, i wouldn’t care about it, because being able to write the language i want is far more important than minor differences in performance.


As a side note, in some Google websites it’s possible to type in transliteration. However, it works only on these particular sites and needs the machine to be online, because it uses a web service to translate every word. That is weird software design and has rather unacceptable privacy implications.

Wikipedia already has phonetic typing support in Malayalam, Tamil and other languages and soon it is going to be deployed to other languages. It works in-place – it translates the text immediately in the browser letter by letter. Of course, it only works in one website; it would be better to help people to enable their native keyboard layouts rather than do it in only one website, but apparently doing it this way helps people start writing and searching immediately. More details on that soon.

Arab Inventors in Wikipedia

The famous provocative Russian designer and blogger Artemy Lebedev wrote in his blog today (my translation from Russian):

European (Christian) consciousness is built differently than the Eastern (Muslim).

The main unique property of the European culture is the ability to invent and create new things, technologies, items and products. Arab peoples are absolutely unable to invent something. Do we know anything Arabic? A television? A telephone? A car? At least one thing? My main complaint towards Islam is this – as a culture it is so egotistic, that I feel suffocated there.

Though very provocative in his use of language and in his criticism against ugly design, Lebedev is usually very secularist and anti-nationalistic. Sometimes, though, he does make some shocking and scathing remarks about ethnic and religious groups, such as this one.

It did make me think, however. Everybody knows that in the Middle Ages Arabs made many important advances in literature, medicine, astronomy, mathematics and other fields, but i really couldn’t think of an Arab inventor from the recent centuries. So i went to Wikipedia, opened Category:Inventors and descended to Category:Inventors by nationality.

There was only one Arab country listed: United Arab Emirates. Other prominent Muslim countries were Pakistan, Afghanistan, Iran and Turkey. Hmm. So i went to the page List of inventors, hoping that it would be more inclusive and easy to search. It didn’t help much – i found very few Arabs there, and they were mostly medieval characters.

And then i recalled that it’s the English Wikipedia. So i went to Category:Inventors by nationality in the Arabic Wikipedia. There i found several sub-categories for Arab countries: Saudi Arabia, Tunisia, Algeria, Lebanon and Egypt. There was no category for UAE, even though one existed in the English Wikipedia, and none of the categories i found in Arabic had an English counterpart; the one that existed for Algerian inventors was deleted a few months ago, because it was empty.

I went over the articles in these categories in the Arabic Wikipedia. Most of them didn’t have an English counterpart. There was an article in English about Hassan Kamel Al-Sabbah, a Lebanese engineer, so i created Category:Lebanese inventors for him and now there are two Arab countries under Category:Inventors by nationality in English.

There was also an article in English about Ahmed Zewail, an Egyptian chemist, and a couple of other scientists. All of them are probably great people, but reading the articles about them in English it seemed to me that even though it’s correct to call them “scientists” and maybe “discoverers”, they probably aren’t inventors. Of course, it’s possible that i misunderstood something, but it may also mean that for the people who tagged these people as “inventors”, this word had a somewhat different meaning. This may or may not mean that the Arabic word used in the category name, مخترع, covers both inventions and discoveries. The Al-Mawrid Arabic-English dictionary, which i use most of the time, says that this word means “inventor, creator, originator, innovator, maker, author”.


So, there’s a little lesson in cultural divide to be learned here. No, i don’t agree with Artemy Lebedev – i am certain that Arabs can and do invent things and the existence of articles about alleged inventors from Arab countries in the Arabic Wikipedia probably means that this is true. But currently chauvinistic people can take a look in the English Wikipedia, see that it has almost no Arab inventors and keep being sure that Arabs are, indeed, stupid and incapable of invention. Since Wikipedia is so easily available, they probably won’t bother to search for information elsewhere.

Unfortunately, my understanding of the Arab culture and language is too limited, but surely there must be an Arab who will take this challenge and improve the coverage of Arab inventors in the Wikipedia in English and other languages.

One way to do this would be to run the script that i wrote for finding and categorizing articles without interlanguage links; if you know Arabic and Perl, please contact me and i’ll gladly help you to set it up for the Arabic Wikipedia.
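For readers who wonder what such a check involves, here is a rough sketch in Python. It is not the Perl script mentioned above. It assumes the MediaWiki web API's `action=query` with `prop=langlinks`, in which a page that has no interlanguage links simply comes back without a `langlinks` entry. The sample response and its titles are invented for illustration, and no network request is made.

```python
def langlinks_query_params(titles):
    """Build query parameters asking the MediaWiki API for interlanguage links."""
    return {
        "action": "query",
        "prop": "langlinks",
        "titles": "|".join(titles),
        "lllimit": "max",
        "format": "json",
    }


def titles_without_langlinks(response):
    """Return the titles of existing pages that have no interlanguage links."""
    pages = response.get("query", {}).get("pages", {})
    return sorted(
        page["title"]
        for page in pages.values()
        # Pages flagged "missing" do not exist at all, so skip them.
        if "missing" not in page and not page.get("langlinks")
    )


# Invented sample in the shape of the API's default JSON output.
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "101": {"pageid": 101, "title": "Example inventor A",
                    "langlinks": [{"lang": "en", "*": "Example inventor A"}]},
            "102": {"pageid": 102, "title": "Example inventor B"},
            "-1": {"title": "No such page", "missing": ""},
        }
    }
}
```

With this sample, `titles_without_langlinks(SAMPLE_RESPONSE)` gives only "Example inventor B". A real run would also have to follow the API's continuation mechanism, because a long result may be split across several responses and a page could look link-less only because its links arrive in a later batch.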

Who is Albert Sánchez Piñol?

Who is Albert Sánchez Piñol? Let’s look at Wikipedias in different languages, translated into English, ordered by the English name of the language:

Basque: Albert Sánchez Piñol is a Catalan writer and anthropologist.

Catalan: Albert Sánchez Piñol is a Catalan anthropologist and writer who wrote the known works “The Cold Skin” (2002) and “Pandora in Congo” (2005).

Dutch: Albert Sánchez Piñol is a Spanish anthropologist and employee of the Center for African Studies of the University of Barcelona. (The rest of the article describes his work in the field of anthropology. The last sentence says that he writes in Catalan.)

English: Albert Sánchez Piñol (Catalan pronunciation: [əɫˈβɛrt ˈsantʃeθ piˈɲɔɫ]) is a Catalan Spanish author and anthropologist writing in the Catalan language.

German: Albert Sánchez Piñol is a Spanish anthropologist and writer. (Catalan is not mentioned in the article, but the article is included in the category “Literature (Catalan)”).

Italian: Albert Sánchez Piñol is a Spanish writer and anthropologist. (The fact that “The Cold Skin” was written in Catalan is mentioned towards the end.)

Norwegian: Albert Sánchez Piñol is a Spanish author and social anthropologist, writing in Catalan.

Polish: Albert Sánchez Piñol, a Spanish writer, a prosaist writing in the Catalan language. By education he is an anthropologist.

Russian: Albert Sánchez Piñol – a Catalan anthropologist and writer.

Spanish: Albert Sánchez Piñol is a Spanish writer and anthropologist. His literary work is written in Catalan.

(All articles say that he was born in Barcelona in 1965. Only English has an IPA transcription of the name, although it’s probably wrong.)

Japanese, Germans and Israelis of the world

Through i-iter i came upon this interesting post: Tamil, Kannada and the middle path. Tamil and Kannada are two important languages spoken in the south of India and their speakers are quite proud of their identity.

The article complains that not enough is being done for the linguistic normalization of non-Hindi languages in India. It was very interesting to read it and, being Israeli, i was surprised to see the compliments to “Japanese, Germans and Israelis of the world who aren’t wasting time tom-toming about antiquity, beauty or originality, but are instead investing their time, money and energy in using their languages for almost all known purposes”.

I was curious – why did they choose these three? Why not Russians and French, who use their languages for everything because many of them openly consider them to be better than all the others? Why not Catalans, whose language is in a political situation which is much more similar to that of Tamil and Kannada?

And why Israelis? Sure, we use Hebrew a lot; Hebrew Wikipedia, for example, is our pride. But i don’t think that we use Hebrew enough. For example, a lot of people (not all) write email in English. They write email in English even if they don’t know English well. They write email in English even though practically all the technical problems with encoding and bi-directionality were solved years ago. And they write email in English even if the email is about a topic for which Hebrew is perfectly suitable: one could argue that English is more convenient for writing about software or physics, but quite a lot of people write email in English just to tell recent family news or to make an appointment.

I used to do that, too, but i made a conscious decision to stop writing email in English unless it is absolutely necessary. I tell all my friends about it. Some of them are indifferent and some of them – especially those in the software industry – say that Israel should have adopted English and not Hebrew as its language. Shame on them. Students think that i know English well, so they often ask me what is the most polite way to make an appointment with their professors in English, and i always tell them: “If your professor can read Hebrew, just write the email in Hebrew!”

Of course, there’s also the matter of university papers. In physics, for example, even though Hebrew is used in the classroom, it is taken for granted that papers at M.A.-level and higher are written only in English. The need for an English version is understandable, because on a world scale very few people would be able to read a paper in Hebrew, but i would imagine that it’s much better to write the paper in Hebrew and translate it. Yes, it would take time and probably money, but it is nevertheless useful and not just for the honor of the Hebrew language: it would actually advance science and education, because this way people would express themselves in their own language and think about physics instead of thinking about English.

Finally, there’s Facebook. For some reason many Israelis still use Facebook with the English interface – again, even though they don’t know English well, and even though they never read or write anything in English there. The translation of Facebook into Hebrew is terrible, and what’s especially frustrating is that i would gladly fix it, but i can’t, because the interface for submitting translation corrections is absolutely unusable. I nevertheless use Facebook in Hebrew, because it solves the bi-directionality problems – for example, the notorious problem with the punctuation marks appearing at the wrong end of the sentence. There was a newspaper report saying that Facebook influences Israeli children so much that they got used to writing the question mark at the beginning of the sentence – and that’s how they submit their homework! Some Israelis develop weird tricks to make the punctuation appear on the correct side of the sentence, for example by adding a letter after the period – compare “אתה בא לכדורגל בערב?י” and “אתה בא לכדורגל בערב?” – notice the placement of the question mark and the redundant letter in the first sentence. But they could simply switch to Hebrew. (And one day i will write an email to Facebook offices and tell them that they really should improve the translation.)
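The trick with the extra letter works because a question mark has no direction of its own: at the end of Hebrew text in a left-to-right interface it follows the interface's direction, and a right-to-left character placed after it pulls it back to the Hebrew side. Unicode has an invisible character for exactly this purpose, the RIGHT-TO-LEFT MARK (U+200F). The following Python sketch, with a hypothetical function name, appends it only when the text ends with such direction-less characters after a right-to-left letter.

```python
import unicodedata

RLM = "\u200f"  # RIGHT-TO-LEFT MARK: invisible, but strongly right-to-left

STRONG_RTL = {"R", "AL"}  # Hebrew letters are "R", Arabic letters are "AL"
STRONG_LTR = {"L"}        # Latin, Cyrillic, Greek letters and so on


def add_rlm_after_trailing_punctuation(text: str) -> str:
    """Append RLM if right-to-left text ends with punctuation or other
    characters that have no strong direction of their own."""
    trailing_non_strong = 0
    for character in reversed(text):
        bidi_class = unicodedata.bidirectional(character)
        if bidi_class in STRONG_RTL:
            # The last strong character is RTL; a mark is needed only if
            # something direction-less follows it.
            return text + RLM if trailing_non_strong else text
        if bidi_class in STRONG_LTR:
            # The text ends in left-to-right content; leave it alone.
            return text
        trailing_non_strong += 1
    return text  # no strongly directional characters at all
```

Unlike the redundant letter, the mark is invisible, so the sentence looks the same to the reader. Running the function twice adds nothing the second time, because the mark itself counts as a right-to-left character.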

It’s quite pleasing to see that speakers of Kannada look up to us, but it doesn’t mean that we already did all we could to normalize Hebrew.

(And why am i writing this in English? Because i started writing it as a comment for that blog and it grew into a post by itself.)

Chronia Polla!

I received spam in Greek for the first time. I already received a lot of spam in Hebrew, Chinese, Korean, Japanese, Hindi, Russian, Arabic, Farsi and Armenian. It’s a shame really – in ancient times Greek was far more important. Χρόνια Πολλά – Καλές Γιορτές!

You

YouTube may be a competitor to Wikipedia as one of the most massively multilingual sites on the web.

Many people who comment there don’t seem to care that English is the lingua franca of the web. They just write in Russian, Portuguese, Indonesian, Catalan and Croatian and it creates a soup of languages. And that is a Very Good Thing. It makes languages seen and promotes tolerance. Variety and tolerance are mighty good.

More Soviet Animation

Contact—This should be Ian Brown’s favorite animated film. The creators stole the theme from “Godfather” and based on it a short film about contact between an earthman and an alien. Avantgarde and funny.

Hedgehog in the Fog—The title says it all. If you think that it doesn’t say much, you are right. This beautiful short film has a lot of dialogue in Russian and it is translated, but don’t try too hard to understand the plot. Just enjoy the visuals.

There Once Was a Dog—before Russians and Ukrainians hated each other they made lovely colorful movies about each other. The fat wolf breaking through the fence is one of the most unforgettable moments of my childhood.

Film Film Film (part 1), Film Film Film (part 2)—Like “Ograblenie po…”, this is another masterpiece of “film about film”, a love poem to the cinema industry. The director running on the ceiling at 8:20 of part 2 is another unforgettable childhood moment.

Ograblenie po…

OK, good—run and watch this on YouTube before it is removed for copyright reasons!

This is one of the best Soviet animation films. It wonderfully parodies the styles of film in the respective countries. You don’t need to know any Russian to watch it. (There’s very little dialogue in it, and the version on YouTube has translation for all of it! Thank you, whoever you are.) I don’t need to add much more, you can read the rest in Wikipedia.

This film was shown very often on Soviet TV. After i had seen it many times, the censorship was quietly lifted from the final part, which parodies Soviet crime comedies, and i remember very well how surprised i was when i saw that part for the first time.

Swiss, part 2

OK, i couldn’t resist, here are a few more comments about languages from that Slashdot article about languages:


[Learn] Girlspeak.

I’m currently living with four (4) girls (three daughters, wife) all of which are able to speak in riddles and conundrums that they themselves understand, while leaving me completely at a loss of any valuable information.

Interestingly enough, this Girlspeak language transcends cultural boundaries! It is simply amazing how two girls can communicate without actually knowing the native tongue of the other.


adding German to my curriculum tacked one extra semester onto my studies. To say it was not encouraged is understating the case: I was told not to waste my time. Years have passed and the rest of my studies are some vague blur involving plumbing; but I can still speak German.


I learned German for three years, thinking it might be good for science. I even stayed with a German family for six weeks one summer. What I discovered: The Germans mostly speak better English than 3 years worth of German, and they’re usually eager to practice it. Had I learned Spanish instead, at least I could converse with the gardeners around here.


Germany is the only place where I’ve asked a question in english to someone off the street and have the person turn around and walk away. Sure the french may berate you, but I’d rather like that. Choose your poison.


Russian is rarely spoken outside of Kaliningrad and Karlovy Vary, but is widely understood (though rarely very welcome.)


