Wikipedia, a Jamaican Jew, and Yak Shaving

For me, writing in Wikipedia is very often a story, within a story, within a story.

I am a member of the Language committee, which examines and approves the creation of editions of Wikipedia in new languages.

Recently we approved the new edition in the Jamaican language—an English-based creole commonly heard in reggae, in which books were published, and into which “the usual suspects” were translated: The New Testament, Alice’s Adventures in Wonderland, The Little Prince—and now, Wikipedia.

Since the draft “incubator” Wikipedia in this language conformed to the requirements for creating a full-fledged new domain, I supported the domain’s creation. My work as a language committee member could end here—and I’m a volunteer there to begin with—but I nonetheless decided to shave a yak.

Normal people, when they need a sweater, buy one in a store. I consider shaving a yak.

Some time after a Wikipedia in a new language is created, all the draft articles from the incubator are imported. When that is completed, I go over the list of imported articles and try to see whether there are any that aren’t linked to their counterparts in other languages. With some topics it’s easy by guessing the name of the topic or by looking at the images, and with some others it’s hard. With an English-based creole it’s of course very easy.

And that’s how the Jamaican Wikipedia ended up with only one article that doesn’t have a version in any other language: Aizak Mendiz Belisario.

It was easy enough to understand that this was a Jewish artist who lived in Jamaica in the 19th century. He was already mentioned a couple of times in the English Wikipedia, but there was no whole article about him. So I thought: Jamaican is similar enough to English and I can understand what most of the article is about, and the artist seems notable enough for an encyclopedia, because he was one of the pioneers of art in Jamaica, and because an anthology about him was published recently. And, of course, I am in a team that develops Content Translation—a translation tool for Wikipedia articles. So I decided to translate it to English.

As soon as I started the translation process, I noticed a bug. So I filed it, and because it was so easy to fix, I just fixed it.

Then I started actually translating the article. On the way I learned about the John Canoe festival, and added another spelling variant to the article about it in English; I verified that the book about the artist was actually published (you know, hoaxes happen), and googled for some more information about the artist with the hope of improving the English article further.

Normal people could just say “Fine, that language looks legit, let’s start a Wikipedia in it”. But I actually had to read all the articles in it, and then write a new one, improve another one, fix a bug, and write a blog post about all of it.

So here you go: Isaac Mendes Belisario, in English.

There is a story like this one behind every one of the millions and millions of articles in Wikipedia in all of its languages.

Pop Bookmark

A couple of years ago, when Facebook was still using me, a person whose opinion I respect very much posted a Facebook status praising a certain musician I didn’t know. The description made me think that I might like the music, but I didn’t have time to check it back then, so I made a browser bookmark to remind myself to do it.

Today I finally did it… and found out that it’s just a pop singer of the kind that doesn’t interest me very much. At least now I have one bookmark less, which is a good thing.

I still respect that person very much.

I Deleted My Facebook Account

I used Facebook quite a lot. I posted lots of things, I got to know a lot of people, I learned about things that I wouldn’t learn anywhere else, I shared experiences.

But the feeling that I am the product and Facebook is the user got stronger and stronger as time passed. It happens with many other companies and products, but with Facebook it’s especially strong.

In February 2015 I stopped posting, sharing and liking, and I deleted Facebook apps from all my other devices. I continued occasionally reading and exchanging private messages in a private browser window.

Then I noticed that a few times things were shared in my name, and people liked them and commented on them. I am sure that I didn’t share them, and I am also quite sure that it wasn’t a virus (are there viruses that do such things on GNU/Linux?). Also, a few people told me that they received messages from me, and I’m sure that I didn’t send them. It’s possible that they saw something else under my name and took it for a message, but in any case, nobody is supposed to think such a thing. That’s not how people are supposed to interact.

I am not a bug, not an A/B test, not a robot, not an integer in a database. I am Amir Aharoni and from today Facebook doesn’t use me. There are other and better ways to communicate with people.

Stop saying that “everybody is on Facebook”. I am not. I don’t feel exceptionally proud or special. I am not the only one who does this; a few of my friends did the same and didn’t write any blog posts or make any fuss about it.

You should delete your Facebook account, too.

Amir Aharoni’s Quasi-Pro Tips for Translating the Software That Powers Wikipedia

As you probably already knew, Wikipedia is a website. A website has content (the articles) and a user interface (the menus around the articles and the various screens that let editors edit the articles and communicate with each other).

Another thing that you probably already knew is that Wikipedia is massively multilingual, so both the content and the user interface must be translated.

Translation of articles is a topic for another post. This post is about getting all of the user interface translated to your language, as quickly and efficiently as possible.

The most important piece of software that powers Wikipedia and its sister projects is called MediaWiki. As of today, there are 3,335 messages to translate in MediaWiki. “Messages” in the MediaWiki jargon are strings that are shown in the user interface, and that can be translated. In addition to core MediaWiki, Wikipedia also has dozens of MediaWiki extensions installed, some of them very important—extensions for displaying citations and mathematical formulas, uploading files, receiving notifications, mobile browsing, different editing environments, etc. There are around 3,500 messages to translate in the main extensions, and over 10,000 messages to translate if you want to have all the extensions translated. There are also the Wikipedia mobile apps and additional tools for making automated edits (bots) and monitoring vandalism, with several hundreds of messages each.

Translating all of it probably sounds like an enormous job, and yes, it takes time, but it’s doable.

In February 2011 or so—sorry, I don’t remember the exact date—I completed the translation into Hebrew of all of the messages that are needed for Wikipedia and projects related to it. All. The total, complete, no-excuses, premium Wikipedia experience, in Hebrew. Every single part of the MediaWiki software, extensions and additional tools was translated to Hebrew, and if you were a Hebrew speaker, you didn’t need to know a single English word to use it.

I wasn’t the only one who did this of course. There were plenty of other people who did this before I joined the effort, and plenty of others who helped along the way: Rotem Dan, Ofra Hod, Yaron Shahrabani, Rotem Liss, Or Shapiro, Shani Evenshtein, Inkbug (whose real name I don’t know), and many others. But back then in 2011 it was I who made a conscious effort to get to 100%. It took me quite a few weeks, but I made it.

Of course, the software that powers Wikipedia changes every single day. So the day after the translations statistics got to 100%, they went down to 99%, because new messages to translate were added. But there were just a few of them, and it took me a few minutes to translate them and get back to 100%.

I’ve been doing this almost every day since then, keeping Hebrew at 100%. Sometimes it slips because I am traveling or ill. It slipped for quite a few months because in late 2014 I became a father, and a lot of new messages happened to be added at the same time, but Hebrew is back at 100% now. And I keep doing this.

With the sincere hope that this will be useful for translating the software behind Wikipedia to your language, let me tell you how.

Preparation

First, let’s do some work to set you up.

  • Get a translatewiki.net account if you haven’t already.
  • Make sure you know your language code.
  • Go to your preferences, to the Editing tab, and add languages that you know to Assistant languages.
  • Familiarize yourself with the Support page and with the localization guidelines for MediaWiki.
  • Add yourself to the portal for your language. The page name is Portal:Xyz, where Xyz is your language code.

Priorities, part 1

The translatewiki.net website hosts many projects to translate beyond stuff related to Wikipedia. Among other things it hosts such respectable Free Software projects as OpenStreetMap, Etherpad, MathJax, Blockly, and others. Also, not all the MediaWiki extensions are used on Wikimedia projects; there are plenty of extensions, with many thousands of translatable messages, that are used only on other sites and not by Wikimedia, and that still use translatewiki.net as the platform for translating their user interface.

It would be nice to translate all of them, but because I don’t have time for that, I have to prioritize.

On my translatewiki.net user page I have a list of direct links to the translation interface of the projects that are the most important:

  • Core MediaWiki: the heart of it all
  • Extensions used by Wikimedia: the extensions
  • MediaWiki Action API: the documentation of the API functions, mostly interesting to developers who build tools around Wikimedia projects
  • Wikipedia Android app
  • Wikipedia iOS app
  • Installer: MediaWiki’s installer, not used in Wikipedia because MediaWiki is already installed there, but useful for people who install their own instances of MediaWiki, in particular new developers
  • Intuition: a set of different tools, like edit counters, statistics collectors, etc.
  • Pywikibot: a library for writing bots—scripts that make useful automatic edits to MediaWiki sites.

I usually don’t work on translating other projects unless all of the above projects are 100% translated to Hebrew. I occasionally make an exception for OpenStreetMap or Etherpad, but only if there’s little to translate there and the untranslated MediaWiki-related messages are not very important, for example because they are unlikely to be seen by anybody except a few software developers. But I translate those, too.

Priorities, part 2

So how can you know what is important among more than 15,000 messages from the Wikimedia universe?

Start from MediaWiki most important messages. If your language is not at 100% in this list, it absolutely must be. This list is automatically created periodically by counting which 600 or so messages are actually shown most frequently to Wikipedia users. This list includes messages from MediaWiki core and a bunch of extensions, so when you’re done with it, you’ll see that the statistics for several groups improved by themselves.

Now, if the translation of MediaWiki core to your language is not yet at 18%, get it there. Why 18%? Because that’s the threshold for exporting your language to the source code. This is essential for making it possible to use your language in your Wikipedia (or Incubator). It will be quite easy to find short and simple messages to translate (of course, you still have to do it carefully and correctly).
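To make the threshold concrete, here is a small arithmetic sketch. It assumes that "getting to 18%" means translating at least 18% of the messages, and it uses the 3,335 core messages mentioned earlier in this post. The function name and the sample progress figure of 250 are made up for illustration.

```python
def messages_needed(total_messages: int, translated: int, threshold_percent: int = 18) -> int:
    """Return how many more messages must be translated to reach the threshold."""
    # Ceiling division in integer arithmetic avoids floating-point rounding.
    required = -(-total_messages * threshold_percent // 100)
    return max(0, required - translated)


# Hypothetical progress: 250 of the 3,335 core messages are already translated.
print(messages_needed(3335, 250))
```

The message count changes every day, so treat the result as a rough target rather than an exact number.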

Getting Things Done, One by One

Once you have the most important MediaWiki messages at 100% and at least 18% of MediaWiki core is translated to your language, where do you go next?

I have surprising advice.

You need to get everything to 100% eventually. There are several ways to get there. Your mileage may vary, but I’m going to suggest the way that worked for me: Complete the piece that is the easiest to get to 100%! For me this is an easy way to strike an item off my list and feel that I accomplished something.

But still, there are so many items at which you could start looking! So here’s my selection of components that are more user-visible and less technical, sorted not by importance, but by the number of messages to translate:

  • Cite: the extension that displays footnotes on Wikipedia
  • Babel: the extension that displays boxes on userpages with information about the languages that the user knows
  • Math: the extension that displays math formulas in articles
  • Thanks: the extension for sending “thank you” messages to other editors
  • Universal Language Selector: the extension that lets people select the language they need from a long list of languages (disclaimer: I am one of its developers)
    • jquery.uls: an internal component of Universal Language Selector that has to be translated separately for technical reasons
  • Wikibase Client: the part of Wikidata that appears on Wikipedia, mostly for handling interlanguage links
  • ProofreadPage: the extension that makes it easy to digitize PDF and DjVu files on Wikisource
  • Wikibase Lib: additional messages for Wikidata
  • Echo: the extension that shows notifications about messages and events (the red numbers at the top of Wikipedia)
  • WikiEditor: the toolbar for the classic wiki syntax editor
  • ContentTranslation: the extension that helps translate articles between languages (disclaimer: I am one of its developers)
  • Wikipedia Android mobile app
  • Wikipedia iOS mobile app
  • UploadWizard: the extension that helps people upload files to Wikimedia Commons comfortably
  • MobileFrontend: the extension that adapts MediaWiki to mobile phones
  • VisualEditor: the extension that allows Wikipedia articles to be edited in a WYSIWYG style
  • Flow: the extension that is starting to make talk pages more comfortable to use
  • Wikibase Repo: the extension that powers the Wikidata website
  • Translate: the extension that powers translatewiki.net itself (disclaimer: I am one of its developers)
  • MediaWiki core: the software itself!

I put MediaWiki core last intentionally. It’s a very large message group, with over 3000 messages. It’s hard to get it completed quickly, and to be honest, some of its features are not seen very frequently by users who aren’t site administrators or very advanced editors. By all means, do complete it, try to do it as early as possible, and get your friends to help you, but it’s also OK if it takes some time.
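The "easiest first" ordering above can be sketched in a few lines of Python. This is only an illustration: the `MessageGroup` type is hypothetical, and every number in the sample data is invented except the 3,335 total for MediaWiki core. In real life you would read the numbers from your language's statistics page.

```python
from typing import NamedTuple


class MessageGroup(NamedTuple):
    name: str
    total: int       # messages in the group
    translated: int  # messages already translated to your language

    @property
    def remaining(self) -> int:
        return self.total - self.translated


def easiest_first(groups):
    """Return unfinished groups, fewest untranslated messages first."""
    unfinished = [g for g in groups if g.remaining > 0]
    return sorted(unfinished, key=lambda g: g.remaining)


# Invented sample data, for illustration only.
groups = [
    MessageGroup("MediaWiki core", 3335, 900),
    MessageGroup("Cite", 60, 45),
    MessageGroup("Thanks", 40, 40),
    MessageGroup("Echo", 300, 120),
]

for group in easiest_first(groups):
    print(group.name, group.remaining)
```

Groups that are already complete drop out of the list, and the one with the fewest messages left comes first, so MediaWiki core naturally ends up last.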

Getting All Things Done

OK, so if you translate all the items above, you’ll make Wikipedia in your language mostly usable for most readers and editors.

But let’s go further.

Let’s go further not just for the sake of seeing pure 100% in the statistics everywhere. There’s more.

As I wrote above, the software changes every single day. So do the translatable messages. You need to get your language to 100% not just once; you need to keep doing it continuously.

Once you make the effort of getting to 100%, it will be much easier to keep it there. This means translating some things that are used rarely (but used nevertheless; otherwise they’d be removed). This means investing a few more days or weeks into translating-translating-translating.

But you’ll be able to congratulate yourself on the accomplishments along the way, and on the big accomplishment of getting everything to 100%.

One strategy to accomplish this is translating extension by extension. This means going to your translatewiki.net language statistics: here’s an example with Albanian, but choose your own. Click “expand” on MediaWiki, then again “expand” on “MediaWiki Extensions”, then on “Extensions used by Wikimedia” and finally, on “Extensions used by Wikimedia – Main”. Similarly to what I described above, find the smaller extensions first and translate them. Once you’re done with all the Main extensions, do all the extensions used by Wikimedia. (Going to all extensions, beyond Extensions used by Wikimedia, helps users of these extensions, but doesn’t help Wikipedia very much.) This strategy can work well if you have several people translating to your language, because it’s easy to divide work by topic.

Another strategy is quietly competing with other languages. Open the statistics for Extensions Used by Wikimedia – Main. Find your language. Now translate as many messages as needed to pass the language above you in the list. Then translate as many messages as needed to pass the next language above you in the list. Repeat until you get to 100%.

For example, here’s an excerpt from the statistics for today:

MediaWiki translation stats example

Let’s say that you are translating to Malay. You only need to translate eight messages to go up a notch. Then six messages more to go up another notch. And so on.

Once you’re done, you will have translated over 3,400 messages, but it’s much easier to do it in small steps.

Once you get to 100% in the main extensions, do the same with all the Extensions Used by Wikimedia. It’s over 10,000 messages, but the same strategies work.
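The notch-by-notch strategy can also be written down as a small sketch. It assumes that the statistics table is ranked by the number of translated messages and that "passing" a language means having strictly more translated messages than it has. The function name, the language codes and the counts below are all hypothetical.

```python
def notches_to_climb(my_language, translated_counts, total):
    """List (language_to_pass, extra_messages_needed) pairs, nearest rival first."""
    mine = translated_counts[my_language]
    # Languages at or above my current count, lowest first.
    ahead = sorted(
        (count, lang)
        for lang, count in translated_counts.items()
        if lang != my_language and count >= mine
    )
    steps = []
    for count, lang in ahead:
        target = min(count + 1, total)  # nobody can go beyond 100%
        if target > mine:
            steps.append((lang, target - mine))
            mine = target
    return steps


# Hypothetical language codes and counts, for illustration only.
counts = {"xx": 100, "yy": 107, "zz": 113}
steps = notches_to_climb("xx", counts, total=3400)
print(steps)
```

Each step is small on its own, which is the whole point: you never look at the full distance to 100%, only at the next language above you.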

Good Stuff to Do Along the Way

Never assume that the English message is perfect. Never. Do what you can to improve the English messages.

Developers are people just like you are. They may know their code very well, but they are not necessarily brilliant writers or designers. Some messages are written by professional user experience designers, but some are written by the developers themselves, and the English that they write may not be perfect. Keep in mind that many, many MediaWiki developers are not native English speakers: a lot of them are from Russia, the Netherlands, India, Spain, Germany, Norway, China, France and many other countries. English is foreign to them, and they may make mistakes.

So report problems with the English messages to the translatewiki Support page. (Use the opportunity to help other translators who are asking questions there, if you can.)

Another good thing is to do your best to try running the software that you are translating. If there are thousands of messages that are not translated to your language, then chances are that it’s already deployed in Wikipedia and you can try it. Actually trying to use it will help you translate it better.

Whenever relevant, fix the documentation displayed near the translation area. Strange as it may sound, it is possible that you understand the message better than the developer who wrote it!

Before translating a component, review the messages that were already translated. It’s useful for learning the current terminology, and you can also improve them and make them more consistent.

After you gain some experience, create a localization guide in your language. There are very few of them, and there should be more. Here’s the localization guide for French, for example. Create your own with the title “Localisation guidelines/xyz” where “xyz” is your language code.

As in Wikipedia, Be Bold.

OK, So I Got to 100%, What Now?

Well done and congratulations.

Now check the statistics for your language every day. I can’t emphasize enough how important it is to do this every day.

The way I do this is having a list of links on my translatewiki.net user page. I click them every day, and if there’s anything new to translate, I immediately translate it. Usually there is just a small number of new messages to translate; I didn’t measure, but usually it’s fewer than 20. Quite often you won’t have to translate from scratch, but only to update the translation of a message that changed in English, which is usually even faster.

But what if you suddenly see 200 new messages to translate? It happens occasionally. Maybe several times a year, when a major new feature is added or an existing feature is changed.

Basically, handle it the same way you got to 100% before: step by step, part by part, day by day, week by week, notch by notch, and get back to 100%.

But you can also try to anticipate it. Follow the discussions about new features, check out new extensions that appear before they are added to the Extensions Used by Wikimedia group, and consider translating them when you have a few spare minutes. In the worst case, they will never be used by Wikimedia, but they may be used by somebody else who speaks your language, and your translations will definitely feed the translation memory database that helps you and other people translate more efficiently and easily.

Consider also translating other useful projects: OpenStreetMap, Etherpad, Blockly, Encyclopedia of Life, etc. The same techniques apply everywhere.

What Do I Get for Doing All This Work?

The knowledge that, thanks to you, people who speak your language can use Wikipedia without having to learn English. Awesome, isn’t it?

Oh, and enormous experience with software localization, which is a rather useful job skill these days.

Is There Any Other Way in Which I Can Help?

Yes!

If you find this post useful, please translate it to other languages and publish it in your blog. No copyright restrictions, public domain (but it would be nice if you credit me). Make any adaptations you need for your language. It took me years of experience to learn all of this, and it took me about four hours to write it. Translating it will take you much less than four hours, and it will help people be more efficient translators.

The Case for Localizing Names, part 3

I love music.

In particular, I love Israeli music.

In the last few years, I usually have some files of Israeli music with me when I leave my home, or my country – on my laptop or on my phone (ripped from CDs that I own, which is legit as far as my interpretation of copyright law goes).

And sometimes people from other countries are curious about it and ask me to copy some files for them. This is a copyright issue, but I justify it by the fact that they hardly have a chance to purchase it where they live, so they aren’t really hurting the relevant market. But there’s something bigger: a technical issue with the artist and song names.

Hebrew is written in the Hebrew alphabet. CDs have artist names and song titles in Hebrew, with English translations or transliterations added only occasionally. When I rip CDs, I give the files names in Hebrew letters. Most people around the world don’t know the Hebrew alphabet, so looking for a song they like using these files will be impossible for them. They would only be able to enjoy them if they don’t mind listening to everything in a shuffle. And though the newest phones are able to display Hebrew correctly, some devices that people have are still unable to do that.

I actually recall renaming files en masse to let friends from other countries listen to some Israeli music and know the artists’ names.

I’m not sure how to resolve this robustly, but much like with email and social networks and with legal forms, songs could use titles in different languages or scripts. Maybe MusicBrainz or Wikidata could add a structured property for transliterated song titles, and music files could be identified like that. Maybe each music track could have multiple fields for titles in different languages.
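Here is a rough sketch of what "multiple fields for titles" might look like as a data structure. It is not how any existing tagging format or database works; the `Track` class is hypothetical, the titles are placeholders, and keying the titles by language tags such as `he` and `he-Latn` (Hebrew in Latin script) is my assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    original_language: str
    # Titles keyed by language tag, e.g. "he", "he-Latn", "en".
    titles: dict = field(default_factory=dict)

    def title_for(self, preferred_tags):
        """Return the first available title from the listener's preferences."""
        for tag in preferred_tags:
            if tag in self.titles:
                return self.titles[tag]
        # Fall back to the title in the original language and script.
        return self.titles[self.original_language]


# Placeholder titles, for illustration only.
track = Track(
    original_language="he",
    titles={"he": "שיר לדוגמה", "he-Latn": "Shir le-dugma", "en": "Example Song"},
)

# A listener who prefers Russian but can read Latin letters.
print(track.title_for(["ru", "he-Latn"]))
```

A music player could build the preference list from the device's language settings, so my friends would see Latin letters while I keep seeing Hebrew.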

It’s good not just for international exchange between friends, but for marketing, too – some cultures only listen to music in English and maybe in their own language, but some are OK with listening to music in a lot of languages, because they are all equally foreign.

Long story short, song names must be more easily localizable than they are today.

How to make hummus

Preparation
Get a big food processor. A stick blender will work, but a big sturdy strong food processor that can work uninterrupted for a few minutes is better.

A cup of chickpeas

Get small chickpeas. (Big ones work, too, but the smaller they are, the softer they get, and it’s important.)
Wash with flowing water, and remove bad ones (black, stale, etc.)

Chickpeas in water

Put chickpeas in water for at least 24 hours. Keep them in a refrigerator. Change the water every six hours or so. I usually have them in the water for two or three days. They will at least double in size during this time, so use a large receptacle.

Peeling

Optionally, you may peel your chickpeas. It may make the final paste slightly smoother, but it’s very time-consuming.

Peeled vs unpeeled chickpeas

Boiling
Boil the chickpeas in a pot on a small stove until they are soft. “Soft” means that you can crush them with your fingers or teeth as easily as a boiled green pea. This may take a few hours, depending on weather, water quality, type of pot, fire intensity, and of course the chickpeas themselves. Usually it takes me somewhere between two and four hours. I begin in the morning and it’s ready by lunch time. (Arabs frequently do it overnight and have it as breakfast.)

I’ve been told that using a pressure cooker can shorten the time a lot, but I’ve never tried it. Covering the pot while boiling is certainly a good idea, though.
Mixing
For one cup of chickpeas you’ll need:

Salt, cumin, pepper, olive oil, tahini, lemon, garlic

  • Half a cup or more of tahini. Try to get something produced in Israel or an Arab country – Palestine, Lebanon, Egypt. In Israel, Tahini from Nablus is very highly regarded. Uzbek or Turkish tahini may be OK, but I’m not sure. Get raw tahini: it should have nothing but sesame in the ingredients (and maybe oil, but even that is unnecessary). Don’t use “tahini salads”, “seasoned tahini”, or “tahini spreads” if they have anything except sesame.
  • Half a cup of olive oil.
  • Fresh cold water. Some people use the water in which the chickpeas were boiled, and it’s OK, but fresh cold water gives the final product brighter color. For the amount see below.
  • Squeezed lemon juice. Half a lemon may be enough, but it can go up to a whole lemon or even more if you like it.
  • A clove of garlic. Some people don’t use it – a matter of taste.
  • A pinch of cumin. Just a tiny little pinch – it gives enough taste. Too much of it won’t ruin the taste, but will darken the color.
  • Salt and black pepper to taste. Small pinches should be enough.
Put the garlic, the cumin and a couple of spoons of chickpeas (without water) in the food processor and grind for about a minute. Add olive oil, lemon juice, and a bit of tahini. Grind for a minute more. Check the consistency. It will still be far from the final product, but should start looking like a paste.

Let’s start!!!

Add a quarter of a cup of water and grind a bit more. From here on, keep adding chickpeas, tahini, water, salt and pepper. Be especially careful with water – too much of it will make the whole thing too liquid, so add it little by little until the consistency looks beautiful and it tastes good. Adding a lot of tahini is usually a good thing, but it also depends on your taste.

Adding tahini and pepper

It may be a good idea not to grind all the chickpeas, but to keep some boiled ones and add them as a topping. In fact, many hummus restaurants serve plates of hummus with lots of non-ground chickpeas in the middle, but do make sure that they are very soft.

Grind, grind, grind, grind, grind!

Serving
Most commonly, it’s spread on a plate and “wiped” with a pita, but knock yourself out and serve it any way that is tasty to you :)

Basic: with whole boiled chickpeas, parsley, olive oil, cumin and paprika

Very often it is spread on the plate using a spoon in a few rounds so that most of it is close to the edges and the middle of the plate is mostly empty and filled with additions, such as:
  • Boiled soft chickpeas
  • Fried mushrooms
  • Fava beans
  • Hard-boiled egg
  • Baked eggplant
The universal toppings are a bit of olive oil, black pepper, paprika and turmeric.

Another version – with fried mushrooms and the chickpeas mixed in

Variations
  • A lot of people suggest adding a spoon of baking soda while boiling. They say that it makes the chickpeas softer. I tried it a few times, and it doesn’t hurt, but it’s not really necessary either.
  • It’s OK to cheat by buying a can of preserved whole chickpeas if they are sold in your area. They are already soft, so you only need to boil them for a few minutes. It saves you a lot of time and the taste is fine.

The first ten or so times that I tried to do it, it was very far from brilliant. It can take years to become good at it. Don’t let it discourage you :)

The Stupidest Sentence I’ve Ever Read

The stupidest sentence I’ve ever read was not written by a child. Not by a religious demagogue. Not by a YouTube user. Not by a politician and not by a political opinion blogger. Not by somebody who discovered a fun folk etymology.

All such people are expected to write stupid sentences, but they are all understandable in their context. Even the religious demagogue. I just don’t expect anything smart there.

No, the single stupidest sentence that I’ve ever read was written by a Harvard Medical School professor.

“We all know that exercise makes us feel better, but most of us have no idea why.”

This is the opening sentence of a book called Spark!: How exercise will improve the performance of your brain by John Ratey and Eric Hagerman.

The rest of this book may well be good, but I just couldn’t get past this. Seriously? Seriously? Opening a book that purports to be scientific, even if popular, with a sentence that is so easily falsified is a complete non-starter for me.

Exercise doesn’t make me feel better. And I damn well know why. It makes me feel like I’m tired and bored. It makes my body hurt. It makes me think that I’m investing time and effort in something exceptionally pointless and negative while I could do something useful. It does not make me feel anything positive at all.

This book, which is supposed to convince me to do exercise, does precisely the opposite with its opening sentence: It makes me hate the thought of exercise even more.

I first read that sentence a couple of years ago. Today I saw the book on the shelf, and I am still convinced that it’s the stupidest one I’ve ever read. I don’t care about “setting the mood”. I don’t care that that’s how book marketing works. I like things that have meaning, and sadly this book throws meaning out the window right from the start.

Feel free to call me a lazy ass, but you’ll be missing the point.

