The Case for Localizing Names, part 3

I love music.

In particular, I love Israeli music.

In the last few years, I usually have some files of Israeli music with me when I leave my home, or my country – on my laptop or on my phone (ripped from CDs that I own, which is legit as far as my interpretation of copyright law goes).

And sometimes people from other countries are curious about it and ask me to copy some files for them. This is a copyright issue, but I justify it by the fact that they hardly have a chance to purchase it where they live, so they aren’t really hurting the relevant market. But there’s something bigger: a technical issue with the artist and song names.

Hebrew is written in the Hebrew alphabet. CDs have artist names and song titles in Hebrew, with English translations or transliterations added only occasionally. When I rip CDs, I give the files names in Hebrew letters. Most people around the world don’t know the Hebrew alphabet, so looking for a song they like using these files will be impossible for them. They would only be able to enjoy them if they don’t mind listening to everything in a shuffle. And though the newest phones are able to display Hebrew correctly, some devices that people have are still unable to do that.

I actually recall renaming files en masse to let friends from other countries listen to some Israeli music and know the artists’ names.

I’m not sure how to resolve this robustly, but much like with email and social networks and with legal forms, songs could use titles in different languages or scripts. Maybe MusicBrainz or Wikidata could add a structured property for transliterated song titles, and music files could be identified like that. Maybe each music track could have multiple fields for titles in different languages.
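To make the "multiple fields for titles" idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `LocalizedName` and `Track` classes, the `best` and `filename` methods, and the use of BCP 47-style tags such as `he` and `he-Latn` as keys are hypothetical and do not correspond to any existing MusicBrainz, Wikidata or audio-tag format.

```python
from dataclasses import dataclass, field


@dataclass
class LocalizedName:
    """A name stored once per language/script, keyed by a BCP 47-style tag.

    Hypothetical structure; assumes at least one variant is always present,
    and that the first one stored is the original name.
    """
    variants: dict[str, str] = field(default_factory=dict)

    def best(self, preferred: list[str]) -> str:
        """Return the first variant that matches the listener's preferences."""
        for tag in preferred:
            if tag in self.variants:
                return self.variants[tag]
        # No preferred variant exists: fall back to the original name.
        return next(iter(self.variants.values()))


@dataclass
class Track:
    title: LocalizedName
    artist: LocalizedName

    def filename(self, preferred: list[str]) -> str:
        """Build a file name the listener can actually read."""
        return f"{self.artist.best(preferred)} - {self.title.best(preferred)}.mp3"


track = Track(
    title=LocalizedName({
        "he": "ירושלים של זהב",
        "he-Latn": "Yerushalayim Shel Zahav",
        "en": "Jerusalem of Gold",
    }),
    artist=LocalizedName({"he": "נעמי שמר", "he-Latn": "Naomi Shemer"}),
)

# A listener who prefers English, then transliterated Hebrew:
print(track.filename(["en", "he-Latn"]))
```

With something like this, the same file could show Hebrew letters on my phone and a Latin-script name on a friend's device, without anybody renaming files by hand.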

It’s good not just for international exchange between friends, but for marketing, too – some cultures only listen to music in English and maybe in their own language, but some are OK with listening to music in a lot of languages, because they are all equally foreign.

Long story short, song names must be more easily localizable than they are today.

How to make hummus

Preparation
Get a big food processor. A stick blender will work, but a big sturdy strong food processor that can work uninterrupted for a few minutes is better.
A cup of chickpeas

Get small chickpeas. (Big ones work, too, but the smaller they are, the softer they get, and it’s important.)
Wash them under running water and remove the bad ones (black, stale, etc.).

Chickpeas in water

Put the chickpeas in water for at least 24 hours. Keep them in a refrigerator. Change the water every six hours or so. I usually keep them in the water for two or three days. They will at least double in size during this time, so use a large receptacle.

Peeling

Optionally, you may peel your chickpeas. It may make the final paste slightly smoother, but it’s very time-consuming.

Peeled vs unpeeled chickpeas

Boiling
Boil the chickpeas in a pot on a small stove until they are soft. “Soft” means that you can crush them with your fingers or teeth as easily as a boiled green pea. This may take a few hours, depending on weather, water quality, type of pot, fire intensity, and of course the chickpeas themselves. Usually it takes me somewhere between two and four hours. I begin in the morning and it’s ready by lunch time. (Arabs frequently do it overnight and have it as breakfast.)

I’ve been told that using a pressure cooker can shorten the time a lot, but I’ve never tried it. Covering the pot while boiling is certainly a good idea, though.
Mixing
For one cup of chickpeas you’ll need:
Salt, cumin, pepper, olive oil, tahini, lemon, garlic

  • Half a cup or more of tahini. Try to get something produced in Israel or an Arab country – Palestine, Lebanon, Egypt. In Israel, tahini from Nablus is very highly regarded. Uzbek or Turkish tahini may be OK, but I’m not sure. Get raw tahini: it should have nothing but sesame in the ingredients (and maybe oil, but even that is unnecessary). Don’t use “tahini salads”, “seasoned tahini”, or “tahini spreads” if they have anything except sesame.
  • Half a cup of olive oil.
  • Fresh cold water. Some people use the water in which the chickpeas were boiled, and it’s OK, but fresh cold water gives the final product a brighter color. For the amount see below.
  • Squeezed lemon juice. Half a lemon may be enough, but it can go up to a whole lemon or even more if you like it.
  • A clove of garlic. Some people don’t use it – a matter of taste.
  • A pinch of cumin. Just a tiny little pinch – it gives enough taste. Too much of it won’t ruin the taste, but will darken the color.
  • Salt and black pepper to taste. Small pinches should be enough.
Put the garlic, the cumin and a couple of spoons of chickpeas (without water) in the food processor and grind for about a minute. Add olive oil, lemon juice, and a bit of tahini. Grind for a minute more. Check the consistency. It will still be far from the final product, but should start looking like a paste.
Let's start!!!

Add a quarter of a cup of water and grind a bit more. From here on, keep adding chickpeas, tahini, water, salt and pepper. Be especially careful with water – too much of it will make the whole thing too liquid, so add it little by little until the consistency looks right and it tastes good. Adding a lot of tahini is usually a good thing, but it also depends on your taste.

Adding tahini and pepper

It may be a good idea not to grind all the chickpeas, but to keep some boiled ones and add them as a topping. In fact, many hummus restaurants serve plates of hummus with lots of non-ground chickpeas in the middle, but do make sure that they are very soft.

Grind, grind, grind, grind, grind!

Serving
Most commonly, it’s spread on a plate and “wiped” with a pita, but knock yourself out and serve it any way that is tasty to you :)
Basic: with whole boiled chickpeas, parsley, olive oil, cumin and paprika

Very often it is spread on the plate using a spoon in a few rounds so that most of it is close to the edges and the middle of the plate is mostly empty and filled with additions, such as:
  • Boiled soft chickpeas
  • Fried mushrooms
  • Fava beans
  • Hard-boiled egg
  • Baked eggplant
The universal toppings are a bit of olive oil, black pepper, paprika and turmeric.
Another version - with fried mushrooms and the chickpeas mixed in

Variations
  • A lot of people suggest adding a spoon of baking soda while boiling. They say that it makes the chickpeas softer. I tried it a few times, and it doesn’t hurt, but it’s not really necessary either.
  • It’s OK to cheat by buying a can of preserved whole chickpeas if they are sold in your area. They are already soft, so you only need to boil them for a few minutes. It saves you a lot of time and the taste is fine.

The first ten or so times that I tried to do it, it was very far from brilliant. It can take years to become good at it. Don’t let it discourage you :)

The Stupidest Sentence I’ve Ever Read

The stupidest sentence I’ve ever read was not written by a child. Not by a religious demagogue. Not by a YouTube user. Not by a politician and not by a political opinion blogger. Not by somebody who discovered a fun folk etymology.

All such people are expected to write stupid sentences, but they are all understandable in their context. Even the religious demagogue. I just don’t expect anything smart there.

No, the single stupidest sentence that I’ve ever read was written by a Harvard Medical School professor.

“We all know that exercise makes us feel better, but most of us have no idea why.”

This is the opening sentence of a book called Spark!: How exercise will improve the performance of your brain by John Ratey and Eric Hagerman.

The rest of this book may well be good, but I just couldn’t get past this. Seriously? Seriously? Opening a book that purports to be scientific, even if popular, with a sentence that is so easily falsified is a complete non-starter for me.

Exercise doesn’t make me feel better. And I damn well know why. It makes me feel like I’m tired and bored. It makes my body hurt. It makes me think that I’m investing time and effort in something exceptionally pointless and negative while I could do something useful. It does not make me feel anything positive at all.

This book, which is supposed to convince me to do exercise, does precisely the opposite with its opening sentence: It makes me hate the thought of exercise even more.

I first read that sentence a couple of years ago. Today I saw the book on the shelf, and I am still convinced that it’s the stupidest one I’ve ever read. I don’t care about “setting the mood”. I don’t care that that’s how book marketing works. I like things that have meaning, and sadly this book throws meaning out the window right from the start.

Feel free to call me a lazy ass, but you’ll be missing the point.

Continuous Translation and Rewarding Volunteers

In November I gave a talk about how we do localization in Wikimedia at a localization meetup in Tel-Aviv, kindly organized by Eyal Mrejen from Wix.

I presented translatewiki.net and UniversalLanguageSelector. I quickly and quite casually said that when you submit a translation at translatewiki.net, the translation will be deployed to the live Wikipedia sites in your language within a day or two, after one of the translatewiki.net staff members synchronizes the translations database with the MediaWiki source code repository and a scheduled job copies the new translation to the live site.

Yesterday I attended another of those localization meetups, in which Wix developers themselves presented what they call “Continuous Translation”, by analogy with “Continuous Integration”, a popular software deployment methodology. Without going into deep details, “Continuous Translation” as described by Wix is pretty much the same thing as what we have been doing in the Wikimedia world: Translators’ work is separated from coding; all languages are stored in the same way; the translations are validated, merged and deployed as quickly and as automatically as possible. That’s how we’ve been doing it since 2009 or so, without bothering to give this methodology a name.

So in my talk I mentioned it quickly and casually, and the Wix developers did most of their talk about it.

I guess that Wix are doing it because it’s good for their business. Wikimedia is also doing it because it’s good for our business, although our business is not about money, but about making end users and volunteer translators happy. Wikimedia’s main goal is to make useful knowledge accessible to all of humanity, and knowledge is more accessible if our website’s user interface is fully translated; and since we have to rely on volunteers for translation, we have to make them happy by making their work as comfortable and rewarding as possible. Quick deployment is one of the things that provide this rewarding feeling.

Another presentation in yesterday’s meetup was by Orit Yehezkel, who showed how localization is done in Waze, a popular traffic-aware GPS navigator app. It is a commercial product that relies on advertisement for revenue, but for the actual functionality of mapping, reporting traffic and localization, it relies on a loyal community of volunteers. One thing that I especially loved in this presentation is Orit’s explanation of why it is better to get the translations from the volunteer community rather than from a commercial translation service: “Our users understand our product better than anybody else”.

I’ve always been saying the same thing about Wikimedia: Wikimedia project editors are better than anybody else at understanding the internal lingo, the functionality, the processes, and hence the context of all the details of the interface and the right way to translate them.

Link Wikipedia Articles in Different Languages

OK THIS IS AWESOME, and “awesome” is not a word that I use lightly.

As a gift for the second birthday of the Wikidata project, nice people at Google created a tool that helps people link articles in different languages that are not linked yet. They prepared a list with thousands of pairs of articles in different languages that are supposed to be about the same subject according to their automatic guesswork. The tool only shows such articles, and a human editor must check whether they actually match, and if they do—make the linking automatically.

There were thirty-six such articles for the Hebrew–English pair. About four of them were unrelated, and I fixed the linking between the rest of them. Some of them required manual intervention, because there were interfering links to unrelated subjects. For some simple cases it took me just a few seconds, and for a few complicated ones, a few minutes.

I also tried doing the same for Russian–English, but there are over a thousand article pairs there, so I only did a few. I also did a few for Catalan and Greek, and I finished all ten pairs for Bengali, even though I don’t actually know Greek or Bengali. I just used a bit of healthy intuition and Google Translate, and I’m pretty sure that I did it well.

You can help!

Here are my suggested instructions for doing this.

Preparation:

  1. Log in to mediawiki.org. This account is also used for the tool.
  2. Now go to the tool’s site. Click Login, and allow the tool to use your mediawiki.org account.
  3. Go to settings, and choose your pair of languages.
  4. Go to “Check by list” and you’ll see a list of article pairs. If there are no suggested article pairs for the language pair you selected, go back to step 3 and choose some other languages. As I wrote above, from my experience, you don’t need to know a language thoroughly to perform this useful work ;)

Now click a link to a pair of articles that looks reasonable. Articles in both languages will open side by side.

  1. If the articles are definitely not about the exact same subject, click “No” in the list and find another pair.
  2. If the articles are about the same subject and one of them doesn’t have any interlanguage links, click “Add links” in the interlanguage area. In the box that will open, write the name of the other language in the first field and the title of the article in the other field, and then click the “Link with page” button. A list of articles in other languages will be shown. If it looks reasonable, click “Confirm”, and then “Close dialog and reload page”. That’s it, the pages are linked! Click “Yes” in the list in the linking tool and proceed to another article pair.
  3. If the articles are about the same subject, but both of them appear to have links to other languages, it’s possible that explicit interlanguage links are written in the source code of the articles. To resolve this, do the following:
    1. Open both articles for editing in source mode.
    2. Scroll all the way down and find whether they have explicit interlanguage links.
    3. If these are correct links to articles about the same subjects in other languages, go to those articles, and link them using Wikidata. Note that it often happens in such cases that these are links to redirects, so the actual current title may be different.
    4. If these are links to articles about other subjects, even if they are related, remove those links. For example, if the article in Bengali is about an island, and the article in Dutch is about a city on that island, remove the link – these subjects are distinct enough. Ditto if the article in English is about an American human rights organization and the article in French is about a French human rights organization.
    5. If you were able to remove all the explicit links from the source, go back to point 2 above and link the articles using Wikidata.
    6. If it’s too complicated to remove these links for any reason, feel free to go to another article, but it would be nice to leave a note about this on the articles’ talk pages so that other editors can clean this up later.

That’s it. It may get a tad complicated for some cases, but if you ask me, it’s a lot of fun.

Where to read about the Elections in India?

There is an election process going on in India, which is frequently called “the world’s largest democracy” and an “upcoming world power”. Both descriptions are quite true, so elections in such a country should be pretty important, shouldn’t they?

Because of my work I have a lot of Facebook friends in India, and they frequently write about it. Mostly in English, and sometimes in their own languages: Hindi, Kannada, Malayalam and others. Even when it’s in English I hardly understand anything, however, because it is coming from people who are immersed in Indian culture.

It is similar with Indian English-language news sites, such as The Times of India: The language is English, but to me it feels like information overload, and there are too many words that are known to Indians, but not to me.

With English-language news sites outside of India, such as CNN, BBC and The Guardian it’s the opposite: they give too little attention to this topic. I already know pretty much everything that they have to say: a huge number of people are voting, Narendra Modi from the BJP is likely to become the new prime minister and the Congress party is likely to become weaker.

Russian and Hebrew sites hardly mention it at all.

What’s left? Wikipedia, of course. Though far from perfect, the English Wikipedia page Indian general election, 2014 gives a good summary of the topic for people who are not Indians. It links terms that are not known to foreigners, such as “Lok Sabha” and “UPA” to their Wikipedia articles, so learning about them requires just one click. When they are mentioned in The Times of India, I have to open Wikipedia and read about them, so why not do it in Wikipedia directly?

This also happens to be the first Google result for “india elections”. And if you go to the page “Elections in India” in Wikipedia, a note on the top conveniently sends you directly to the page about the ongoing election process. Compare this to the Britannica website: searching it for “india elections” yields results that are hardly useful – there’s hardly anything about elections in India in general, let alone about the current one.

One thing that I didn’t like is the usage of characteristic Indian words such as “lakh” and “crore”, which mean, respectively, “a hundred thousand” and “ten million”. I replaced most of their occurrences in the article with the usual international numbers, and I think that I found a calculation mistake on the way.
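The conversion itself is simple arithmetic: one lakh is 100,000 and one crore is 10,000,000. Here is a small Python sketch of it. The `UNITS` table and the `to_international` function are hypothetical names made up for this illustration, not part of any Wikipedia tool, and the sample amounts are not taken from the article.

```python
# Values of the two Indian numbering units mentioned above.
UNITS = {"lakh": 100_000, "crore": 10_000_000}


def to_international(amount: float, unit: str) -> str:
    """Convert e.g. (2.5, "crore") to a plain number with thousands separators."""
    # Round to the nearest whole number to hide floating-point noise.
    value = round(amount * UNITS[unit.lower()])
    return f"{value:,}"


print(to_international(5, "lakh"))     # 500,000
print(to_international(2.5, "crore"))  # 25,000,000
```

Doing the conversion mechanically like this is also a cheap way to catch the kind of calculation mistake that is easy to make when moving between the two systems by hand.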

So while Wikipedia is, again, far from perfect, its “wisdom of the crowds” system works surprisingly well time after time.

WikiAcademy Kosovo 2014 or, “This Israeli geek made a joke about hobbits. You won’t believe what happened next.”

In late February 2014 I attended WikiAcademy in Kosovo.

What do most people know about Kosovo? That it’s a place somewhere… um… they kinda heard about in the news some time ago.


What is it actually? It’s a partially recognized country, which was formerly part of Serbia and Yugoslavia. It is mostly populated by Albanians, with small minorities of Serbs, Turks and others.

The ethnic difference between Kosovo and the rest of Serbia caused many tensions. In the 1990s and 2000s the area experienced a lot of violence. NATO and much of Europe supported Kosovo’s independence, Serbia and Russia objected, and after the Kosovo war the region emerged as a de facto independent state. Some countries recognized it and some didn’t.

Sadly, as happens very often, what most people hear about such places is a lot of news about violence and very few stories about anything else – history, culture, architecture, language, music. I definitely care a lot about these positive things, and not much about the wars.


I flew via Istanbul, and the lady at the boarding gate didn’t quite know what to do with my passport: she looked for a visa and when she couldn’t find one, she just asked me whether I needed one. I said that I didn’t and she let me on the plane. The passport control guy on arrival also didn’t know whether Israelis need a visa and had to check a table. I guess that not many Israelis come there, which is a shame, really.

Right from the airport my hosts took me to a different event: a BarCamp in Prizren. Prizren is absolutely beautiful. The Byrek—what we call “Burekas” in Israel—is delicious, the beer is fantastic, the streets are beautiful and the buildings are magnificent. Magnificent not like in France, but like in… Kosovo. In the Balkans. Down to Earth and human.

Prizren, Stone bridge and Sinan Pasha Mosque. Tobias Klenze, CC-BY-SA 3.0

In the BarCamp there were three talks: two in Albanian, which I sadly don’t know, about… some Open Source projects. The third one was mine, about that-website-that-we-all-know-and-love. I used just one “slide” – xkcd’s famous protester, which, to my surprise, a lot of people in the audience didn’t recognize. I invited people to contribute, of course, and I enjoyed answering a question about how concepts such as “love” can be referenced and fact-checked. The bar in which the event was held is called “Hobbiton”, and it is appropriately adorned with multiple Tolkien-themed posters. The Albanian Wikipedia, however, didn’t have an article about Hobbits, which I mentioned in my talk, in the hope that it would be written. Was it? Stay tuned.


My second and third day in Kosovo were dedicated to WikiAcademy itself—first in the town of Gjakova and then in the capital, Prishtina.

So what is the WikiAcademy in Kosovo? It’s an event organized by IPKO Foundation, a local organization that promotes modern telecommunications in Kosovo. This event is different from what we call “WikiAcademy” in Israel, which is more like an academic conference with talks—in Kosovo it’s more like a Wikipedia editing workshop for newcomers, but a very large one. The 2014 edition of WikiAcademy is the second edition of this event, and it was held over three weekends in two cities—Gjakova and the capital Prishtina, with participants from several more cities. Over two hundred people participated.

The organizers, some of whom are experienced Wikipedians themselves, prepared the event very well. The logistics were great—working wifi, tasty food and comfortable transportation—but more importantly, the participants were very well-prepared for their task of writing Wikipedia articles: they received clear topics and instructions about writing with correct encyclopedic style and citing sources.

The articles were written in the English Wikipedia. The topics of the articles were all about cities of Kosovo: their architecture and monuments, education, events, festivals, culture, history and nature. Most people have probably never heard of cities with unusual names such as Peja (a.k.a. Peć), Ferizaj (a.k.a. Uroševac) and Štrpce, and, well, now they are not just mentioned in the English Wikipedia by name, but there are several detailed articles about different topics related to each of them.

I’ll reiterate this: It was fantastic to see people sitting with reference books and encyclopedias to be able to cite sources. So often this is the biggest challenge in Wikipedia editing workshops, and the organizers prepared the participants very well. It was also great that everybody knew which articles they were working on.

My role was to give three talks, about Wikipedia’s encyclopedic writing style, about good practices for talk pages, and about translating articles, and other than that—to help people write, cite sources correctly, insert images, and make sure that they don’t violate any policies. It was challenging and tiring, but oh so fun. Hackathons and Wikipedia meet-ups are possibly the only kinds of events where I’m continuously so energized and talkative.

People sitting with laptops in a room

WikiAcademy Prishtina. Katie Chan, CC-BY-SA 4.0.

I also did my best to showcase Wikimedia’s newest software: VisualEditor, which the newbies just loved, and the Content Translation prototype.

During my talk about translation, I created, as a demo, two articles in Albanian: Hobbit (as promised above!), and Haifa, the city that hosted Wikimania 2011.


After I came back, the event continued. I was one of the judges that chose the best articles and photos for awarding prizes. The big winner was the well-deserving Historical monuments in Prishtina, although many others were wonderful: Flaka e Janarit, Rugova Mountains, Health Care in Kosovo, Water in Prishtina. The awards ceremony was held a few days after that.

There was also a bit of a dark side to the contest: Because most of the writers were newbies, and none were native English speakers, there were many little innocent mistakes in spelling, referencing and writing style, which the English Wikipedia editors took very seriously. Some articles were even proposed for deletion, although all (or almost all) were kept. This, again, raises the well-known dilemma: it’s important to keep Wikipedia’s standards high, but it’s just as important to remain nice in the process and not “bite the newcomers”.


My thanks go out to the excellent organizers: Arianit Dobroshi, Gent Thaçi, Abetare Gojani, Rineta Hoxha, Altin Ukshini, Lis Balaj and many others. I enjoyed every minute, and learned a lot.


This post mostly uses Albanian spellings of place names. I do that simply because that’s what I saw during my visit. Don’t consider this post authoritative with regard to the preferred English spellings. Wikipedia may use different spellings, and they are not so consistent.


