Archive for the 'software' Category

Weird GMail Habit: Removing Control Characters

GMail has a weird little feature that probably very few people other than me know about. When using it with a Hebrew user interface, invisible control characters (LRM, RLM, RLE, LRE and the like) are added to some strings to make them appear correctly in a mixed-direction interface.

Most notably, they are added to email addresses. I sometimes want to copy these email addresses as text, and my mouse pointer picks up the control characters as well. These control characters are invisible to humans, but very much visible to computers, and an email address that contains them is not correct, even if it looks the same to human eyes.

It has already become a habit for me to carefully delete and manually retype the first and last characters of an email address to make sure that the control characters are removed.
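For the record, the cleanup I do by hand is trivial to do in code. Here is a rough sketch in Python; which characters GMail actually inserts is my guess, so it strips all the common Unicode bidi controls:

```python
# Strip invisible Unicode bidirectional control characters
# (marks, embeddings, overrides and isolates) from a string
# copied out of a bidi-aware web interface.
BIDI_CONTROLS = {
    '\u200e',  # LRM - left-to-right mark
    '\u200f',  # RLM - right-to-left mark
    '\u202a',  # LRE - left-to-right embedding
    '\u202b',  # RLE - right-to-left embedding
    '\u202c',  # PDF - pop directional formatting
    '\u202d',  # LRO - left-to-right override
    '\u202e',  # RLO - right-to-left override
    '\u2066',  # LRI - left-to-right isolate
    '\u2067',  # RLI - right-to-left isolate
    '\u2068',  # FSI - first strong isolate
    '\u2069',  # PDI - pop directional isolate
}

def strip_bidi(text):
    """Remove invisible bidi control characters from text."""
    return ''.join(ch for ch in text if ch not in BIDI_CONTROLS)

# An address copied from the interface with stray RLM/LRM around it:
copied = '\u200fexample@gmail.com\u200e'
print(strip_bidi(copied))  # example@gmail.com
```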

It would be better if GMail just used the <bdi> element or CSS bidi isolation. They are fairly well supported in modern browsers and provide a better experience.

Serbian Spam

I always celebrate when I receive spam in a language in which I haven’t yet received spam. I just received spam in Serbian for the first time. It was in the Cyrillic alphabet; Serbian can also be written in the Latin alphabet, and frequently is in Serbia, possibly even more frequently than in Cyrillic, even though the government prefers Cyrillic.

This makes me wonder: is Serbian in Cyrillic popular and important enough to spam in, or did the silly spammer just use Google Translate to translate into Serbian and get the result in Cyrillic, because that’s what Google Translate does?

If you know Serbian, can you please tell me whether it looks real or machine-translated? Words like “5иеарс” and the spaces before the punctuation marks give me a strong suspicion that it’s machine translation, but I might be wrong.

Молим вас за попустљивост за нежељене природи овог писма , али је рођена из очаја и тренутног развоја . Молимо носе са мном . Моје име је сер Алекс Бењамин Хубертревизор Африке развојне банке открио постојећи налог за успавану 5иеарс .

Када сам открио да није било ни наставак ни исплате са овог рачуна на овог дугог периода и наши банкарских закона предвиђа да ће било неупотребљивим чине више од 5иеарс иду на банковни прихода као неостварен фонда .

Ја сам се распитивала за личне депонента и његове најближе , али нажалост ,депонент и његове најближе преминуо на путу до Сенегала за тајкун , а он је оставио иза себе нема тело за ову тврдњу само сам направио ову истрагу само да буде двоструко сигурни у ту чињеницу , а пошто сам био неуспешан у лоцирању родбину .

So, how does it look? And do you receive Serbian spam? Thanks.

The Fateful March of 1998 – my #webstory

I first connected to the web in the summer of 1997. I bought a new computer with Windows 95 and Microsoft Internet Explorer 2. For about a week I thought that that’s how the web is supposed to look, but I kept seeing messages saying “Your browser doesn’t support frames” on a lot of sites. And then I found that there’s this thing called Microsoft Internet Explorer 3. I went to microsoft.com and downloaded it. It was the first piece of software that I downloaded. It was about 10 megabytes and took about an hour on my dial-up connection.

Most notably, Microsoft Internet Explorer 3 supported frames and animated GIFs. I loved animated GIFs! I guess that it makes me quite a hipster.

A cat in headphones dancing to house music.

House cat. Sorry, it’s an anachronism: this animated GIF is from the mid-2000s. 1997’s animated GIFs were quite different.

And then Microsoft Internet Explorer 4 came out. I thought: “Well, if the move from IE2 to IE3 made such a big difference, then I guess I should try number 4, and it will be even cooler.” And I tried. And it was a disaster. The installation screwed up everything on my computer. I had no idea how to disable the dreaded Active Desktop, which it introduced. It didn’t work so well with my Hebrew version of Windows 95. So I did what a lot of people did back then: I formatted my hard drive and re-installed Windows.

And the question arose: which browser should I use? IE3 was stable, but I didn’t like that it was getting old. So I went to netscape.com to try that Netscape Navigator browser that I kept hearing everybody talking about.

And I loved it.

I loved its nifty toolbars and its bookmarks manager. I loved the crash reporting; it crashed quite often, actually, but I didn’t feel so bad about it, because Microsoft’s programs crashed often, too, and in the case of Netscape I felt good about reporting these crashes. Netscape’s email program, Netscape Messenger, was truly outstanding. I especially loved the green dot, which marked messages as read or unread in one click. Most of all, it said very clearly something that I came to realize only years later: “I am a program that lets you browse the web as well as possible. I am not trying to do anything else.”

Fast forward to March 1998. Netscape made the big announcement that the development of its browser was becoming an open source project code-named “Mozilla”. I had started hearing about “open source”, “free software” and Linux shortly before that, but mostly in the context of crazy geek hobbyists. And then suddenly a big, famous end-user product that I loved became open source. That felt really cool.

I have followed Mozilla news ever since. I heard about Bugzilla before its first version was released. I liked Mozilla’s decision to redo the whole rendering engine based on standards, even though many people criticized it. The thing that annoyed me the most in Mozilla’s early years was the lack of proper right-to-left text support, which was present in Internet Explorer. That’s why I, sadly, used mostly IE, and even became a bit of an IE power user. But I waited eagerly for Mozilla to do it and tried every alpha release.

"Are you fed up with your browser? You're not alone. We want you to know that there's an alternative... Firefox." The logo of Firefox is drawn with names of people.

The famous New York Times ad.

I was thrilled by the announcement of Firefox, the first stable version of Mozilla’s browser. I gave $10 toward the famous 2004 New York Times Firefox advertisement, and I still have the poster of that advertisement at home.

A long list of names, including Amir Elisha Aharoni

And there’s my name. Third line in the middle.

It always seemed natural to me to follow Mozilla news so eagerly. I thought that everybody did it. I mean, how is it even possible to use the web in any way without being at least a bit curious about the technology that runs it?

And then in 2008 I wrote a little unimportant post in my Hebrew blog about a funny spelling correction. Tomer Cohen commented on it and suggested that I try the Hebrew spelling dictionary and Hebrew Firefox in general. And that’s how my big love story with software localization began.

I started sending corrections to the translation of Firefox’s interface. I started sending corrections to the Hebrew spelling dictionary. I got so curious about the way the spelling dictionary was built that I ended up doing a whole university degree in Hebrew Language. Really.

And in 2011 I started working on the Language Engineering team at the Wikimedia Foundation. I love it, and it probably wouldn’t have happened without my involvement with Mozilla. In the same year I also became a Mozilla Rep: a volunteer representative of Mozilla at conferences, blogs and forums.

Probably the most important thing that I learned from my Mozilla story is that loving the web and being curious about it is not something obvious. Most people just want something that works for checking the weather, news, Facebook friends’ updates, homework help and kitten videos. And for the most part, that is perfectly fine. But people’s freedom to read reliable and complete news on any electronic device cannot actually be taken for granted. Neither can their freedom and privacy when sharing their thoughts on social networks. Mozilla is among the most important organizations that care about these things, and it develops technologies that make them possible. Technologies that let you browse the web as well as possible and don’t try to do anything else.

We do it for one simple reason: We love the web.

Do you love it, too?

P.S. As I began writing this post, I realized that Microsoft’s Active Desktop was not so different from today’s operating systems, which are heavily based on web technologies: Firefox OS, Chrome OS and others. I can’t say that I love Microsoft, but as often happens, it was quite pioneering with ideas and not so good with their execution. Credit where credit’s due.

The Case for Localizing Names

I often help my friends and family members open email accounts. Sometimes they are starting to use the Internet and sometimes they move from old email services (Yahoo, Walla!, ISP) to something modern (like it or not, GMail).

At some point they have to fill in their name, which will appear in the “from” field. And then I have to suggest that they write it in Latin characters, even though most of them speak languages that aren’t written in Latin characters – mostly Hebrew and Russian. Chances are that some day they will send an email to somebody who cannot read Russian or Hebrew, and Latin is relatively better known.

Only relatively, though. It may seem obvious to you that everybody knows the Latin script, but in fact, a lot of people are not comfortable with it at all. There are also other complications: lossy and inconsistent transliteration rules (is Amir אמיר or עמיר?), potential right-to-left rendering problems, and more. And of course, everybody is happy to see their name in their own language.

And people are also happy to see their friends’ names in their own language rather than in a foreign or neutral language. I have, for example, a lot of friends in India. Most of them write their names in English, but some write them in Marathi or Malayalam. It’s certainly good for them, but in practice it’s much harder for me to find them this way, so English would be better – but Hebrew or Russian would be better yet.

Finally, there are a lot of people in the world who have more than one linguistic background. Mine are Russian, Hebrew and English, and I am really not such a special case. There are many millions of immigrants who have mixed backgrounds: Punjabi-Hindi-Urdu-English, Kurdish-Turkish-German, Kazakh-Russian-Norwegian, and others, and others and others. From each of these backgrounds they have friends, co-workers and family members, with whom they would love to communicate in the respective language. In each of these backgrounds they have friends who would want to find them using the name under which they know them there and using the appropriate language and writing system.

And sometimes people change their names, too. I did once, and so have many other people.

All this means that people’s names should be translatable, just like books, articles and software interfaces. Facebook and Google+ allow me to add only a very limited number of names in foreign languages. Why wouldn’t they let me write my name in four, five, ten languages? This would make it easier for people who speak these languages to find me and to communicate with me. I would go even further and allow people who speak languages that I don’t know well to write my name as they hear it in their language and to add it to my details. Yet again, this would make me easier to find for even more people.

Some degree of automation is possible. A lot of names are, after all, repetitive, so social networks could suggest to people with common names how their name would be written in other languages.

Wikipedia is actually quite good in this regard: Usually people have the same username across projects, and this username is not necessarily written in Latin letters, but people can customize the appearance of their signature in each project. I did it in a few languages, and people who speak those languages appreciate it.

I can only hope that social networks and email systems will allow as much flexibility as possible with this.

File System

Dear software industry,

Please stop forcing programs that organize the music on my computer down my throat. I already have a program that does it pretty well. It’s called a file system.

Instead of investing your time and effort in writing pointless software that gets in my way when I want to listen to a song, invest in an education program that will teach the human race how to do basic things with a computer. For example, to understand what a file is.

Thank you.

Firefox Aurora – Mozilla’s biggest breakthrough since Firefox itself

This post encourages you to be a little more adventurous. Please try doing what it says, even if you don’t consider yourself a techie person.

The release of Firefox 4 in March 2011 brought many noticeable innovations in the browser itself, but there was another important innovation that was overlooked and misunderstood by many: A new procedure for testing and releasing new versions.

Before Firefox 4, the release schedule of the Firefox browser was inconsistent and versions were released “when they were ready”. Beta versions were released at rather random dates and quite frequently they were unstable. Nightly builds were appropriately called “Minefield” – they crashed so often that it was impossible to use them for daily web browsing activities.

The most significant breakthrough with regards to the testing of the Firefox browser came a year ago: Mozilla decided on a regular six-week release schedule and introduced the “release channels”: Nightly, Aurora, Beta and Release. The “Release” version is what most people download and use. “Beta” could be called a “release candidate”: few, if any, changes are made to it before it becomes “Release”. Both “Aurora” and “Nightly” are updated daily. The differences between them are that “Nightly” has more experimental features that come straight from the developers’ laptops, and that “Aurora” is usually released with translations into all the languages that Firefox supports, while “Nightly” is mostly released in English.

Now here’s the most important part: I use Aurora and Nightly most of the time, and my own experience is that both of them are actually very stable and can be used for daily browsing. It’s possible to install all the versions side by side on one machine and to have them use the same add-ons, preferences, history and bookmarks. This makes it possible for many testers to use them fully for whatever they need a browser for, without going back to the stable version. There certainly are surprises and bugs in functionality, but I have yet to encounter one that would make me give up. In comparison, the old “Minefield” builds would often crash before a tester would even notice such bugs, so they were not so useful for testing.

This change is huge. Looking back at the year of this release schedule, this may be the biggest breakthrough in the world of web browsers since the release of Firefox 1.0 in 2004. In case you forgot, before Firefox was called “Firefox”, it was just “Mozilla”; it was innovative, but too experimental for the casual user: it had a clunky user interface and it couldn’t open many websites, which were built with only Microsoft Internet Explorer in mind. Consequently, it was frequently laughed at. “Firefox” was an effort to take the great innovative thing that Mozilla was, clean it up, and make it functional, shiny, inviting and easy to install and use. That effort was an earth-shaking success that revived competition and innovation in Internet technologies.

Aurora does for software testing what Firefox did for web browsing. It makes beta testing easy and fun for many people: it turns testing from a bug-hunting game that only nerds want to play into a fun and unobtrusive thing that anybody can do without even noticing. And it is yet another thing that the Mozilla Foundation does to make the web better for everybody, with everybody’s participation.

A few words about Mozilla’s competitors: the Google Chrome team does something similar with what they call “Canary builds”. I use them to peek into the future of Chrome, and I occasionally report bugs in them, but I find them much less stable than Firefox Nightly, so they aren’t as game-changing. Just like Minefield from Mozilla’s distant past, they crash too often to be useful as a daily web browser, so I keep going back to Firefox Aurora. Microsoft releases new versions of Microsoft Internet Explorer very rarely, and installing future test versions is way too hard for most people, so it’s not even in the game. Opera is in the middle: it releases new versions of its browser quite frequently and offers beta builds for download, but it doesn’t have a public bug tracking system, so I cannot really participate in the development process.

To sum things up: Download Firefox Aurora and start using it as your daily browser and report bugs if you find any. You’ll see that it’s easier than you thought to make the Web better.

Kim Jong Il, Tumblr, WebFonts and Firefox

Kim Jong Il died.

Then a humorous blog called “kim jong-il looking at things” surged in popularity.

I looked at it, too, and found it funny.

And then I looked at its about section and became sad. It said: “for a more beautiful experience use google chrome or safari. font-face seems to have an issue with firefox and will display a very bland arial instead of the exquisite amaranth.” Someone reading this may think that it’s a bug in Firefox, but as a matter of fact, Firefox is the browser that implements font-face correctly according to the CSS standard.

This Kim Jong Il blog is hosted on tumblr.com, a nice and stylish blog service. Among other features, tumblr gives its users an option to use web fonts to improve the appearance of their blogs. tumblr’s developers probably only tested this feature with Chrome and Safari, and when it didn’t work in Firefox, nobody cared. After all, as nice as it is, it’s just another English font.

tumblr.com has the same issue that Wikipedias in Indic languages had after we installed WebFonts there: it tries to load the font files from a different server, but Firefox, following the standard, doesn’t load a font from a different domain unless that domain is explicitly configured to allow cross-origin font loading. We at Wikimedia fixed it immediately after finding it, because for us, using web fonts is a way to make our website readable. For tumblr, as for most other English websites, using web fonts is just a way to make the website a little more beautiful.

tumblr.com should fix this bug. I reported this font problem at getsatisfaction.com, hoping that tumblr’s developers would notice it. It hasn’t been fixed yet, even though it’s a one-line fix.
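For the curious, the fix amounts to sending the right cross-origin header along with the font files. Something like this, shown here as an nginx configuration fragment purely for illustration (the server software, file extensions and allowed origin are my assumptions, not tumblr’s actual setup):

```nginx
# Illustrative sketch: allow cross-origin loading of font files,
# so browsers that follow the standard (like Firefox) accept them.
location ~* \.(woff2?|woff|ttf|otf|eot)$ {
    add_header Access-Control-Allow-Origin "*";
}
```

With a header like that on the font server, Firefox loads the fonts from the other domain just like Chrome and Safari do.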

tumblr webmasters! If you happen to read this post – please fix this issue. Thank you.

MozCamp Berlin 2011, part 2

Besides the general topic of Loving the Web, there was another important topic present in almost every time slot of MozCamp Berlin 2011, a topic that interests me more than anything else in software: localization. I attended most of the localization talks and gave one myself.

MozCamp Berlin 2011 WorldReady

  • Vito Smolej from Slovenia gave two important talks about Translation Memory, especially in OmegaT. Translation Memory is barely used in Mozilla localization projects, even though it could make things much more efficient, and Vito showed some ways in which it could be employed.
  • Jean-Bernard Marcon from France talked about the state of the BabelZilla site, which is used to translate Mozilla add-ons. Happily, I didn’t have to tell him that despite the impressive number of localizations done on that site, it is very problematic because of numerous technical issues; he said himself that he’s well aware of them and is going to replace the software completely Real Soon Now. I found it a little strange, however, that Jean-Bernard is happy with using the site for translating only Mozilla add-ons and doesn’t want to extend it to any other projects – say, Firefox itself. Oh well, as long as he maintains the add-ons site well, I’m happy.
  • Chris Hofmann and Jeff Beatty gave a great presentation about the present and future of organizing localization groups and communicating about localization. Frankly, it’s not everything I hoped to hear, but I’m really happy just to know that Mozilla, like Wikimedia, now has a person whose job is to communicate about localization.

And I gave a talk comparing the localization of Mozilla and MediaWiki, the software behind Wikipedia. The slides are here. Many people who attended said that it was bold of me to say these rather negative things about Mozilla. That is somewhat true: it is quite bold of me to use the first major Mozilla event I attended as a bully pulpit to promote my other project, but the talk was generally well received. I believe that I succeeded in making my point: both Mozilla and MediaWiki are leaders in the world of massively localized Free Software, and both projects have things to learn from each other. Mozilla can simplify its translation workflow and consider converging its currently sprawling tools and procedures, as MediaWiki has done, and MediaWiki can learn a lot from Mozilla about building localization teams as communities of people and about quality control.

Finally, I was very glad to meet Dwayne Bailey and Alexandru Szasz, developers of Pootle and Narro, two localization tools used in the Mozilla world. Talking to them was very interesting and inspiring; they both understand well the importance of localization and the shortcomings of the current tools, including the ones that they are developing, and they are keen on fixing them. As a result of this excellent meeting I completed the translation of Pootle itself into Hebrew. And there is more to come.

MozCamp Berlin 2011, part 1

On November 12–13 I participated in MozCamp Berlin. (I’m writing this late-ish, because a day after it I went to India to participate in a Wikimedia conference and not one, but two hackathons. That was a crazy month.)


In the past I participated in small events of the Israeli Mozilla community, but this was my first major Mozilla-centric event.

MozCamp Berlin 2011 group photo

MozCamp Berlin 2011 group photo. Notice the fox on the left and yours truly on the right.

The biggest thing that I take from this event is the understanding that I belong to this community of people who love the web. I never properly realized it earlier; I somehow thought that loving the web is a given. It is not.

Johnathan Nightingale, director of Firefox Engineering, repeated the phrase “we <3 the web” several times in his keynote speech. And this is the thing that makes the Mozilla community special.

Firefox is not the only good web browser. Opera and Google Chrome are reasonably good, too. Frankly, they are even better than Firefox in some features, though I find those features less essential.

Firefox is not the only web browser that strives to implement web standards. Opera, Google Chrome and even recent versions of Microsoft Internet Explorer try to do that, too.

Firefox is not even the only web browser that is Free Software. So is Chromium.

But Firefox and the Mozilla community around it love the web. I don’t really have a solid way to explain it – it’s mostly a feeling. And with other browsers i just don’t have it. They help people surf the web, but they aren’t in the business of loving it.

And this is important, because the Internet is not just a piece of technical infrastructure that helps people communicate, do business and find information and entertainment. The Internet is a culture in itself – worthy of appreciation in itself and worthy of love in itself – and the Mozilla community is there to make it happen.

Some people would conclude from this that Firefox is for nerds who care about the technology more than they care about going out every once in a while. It isn’t. It’s not, in fact, just about a browser. It’s about the web: more and more, Mozilla is developing not just a great browser, but also technologies and trends that affect all users of all browsers, rather than target markets. By using Firefox you get as close as you can to the cutting edge, not just of cool new features, but of openness and equality. Some people may find this ideology boring and pointless; I find it important, because without it the Internet would not be where it is today. Imagine an Internet in which the main sites you visit every day are not Facebook, Wikipedia, Google and your favorite blogs, but msn.com… and nothing but msn.com. Without Mozilla, that’s probably how the Internet would look today. Without Mozilla, something like this may well happen in the future.


Thanks a lot to William Quiviger, Pierros Papadeas, Greg Jost and all the other hard-working people who produced this great event.

More about it in the next couple of posts very soon.

The Software Localization Paradox

Wikimania in Haifa was great. Plenty of people wrote blog posts about it; the world doesn’t need yet another post about how great it was.

What the world does need is more blog posts about the great ideas that grew out of the little hallway conversations there. One of the things that I discussed with many people at Wikimania is what I call The Software Localization Paradox. It’s an idea that has been bothering me for about a year. I tried to look for other people who wrote about it online and couldn’t find anything.

Like any other translation, software localization is best done by people who know well both the original language in which the software interface was written (usually English) and the target language. People who don’t know English strongly prefer to use software in a language they know. If the software is not available in their language, they will either not use it at all or will have to memorize lots of otherwise meaningless English strings and locations of buttons. People who do know English often prefer to use software in English even if it is available in their native language. The two most frequent explanations for that are that the translation is bad and that people who want to use computers should learn English anyway. The problem is that for various reasons lots of people will never learn English, even if it were mandatory in schools and useful for business. They will have to suffer the bad translations and will have no way to fix them.

I talked to people at Wikimania about this, especially people from India. (I also spoke to people from Thailand, Russia, Greece and other countries, but Indians were the biggest group.) All of them knew English and at least one language of India. The larger group of Indian Wikipedians to whom I spoke preferred English for most communication, especially online, even if they had computers and mobile phones that supported Indian languages; some of them even preferred to speak English at home with their families. They also preferred reading and writing articles in the English Wikipedia. The second, smaller group preferred the local language. Most of these people also happened to be working on localizing software, such as MediaWiki and Firefox.

So this is the paradox: to fix localization bugs, someone must notice them; to notice them, more people who know English must use localized software; but people who know English rarely use localized software. That’s why lately I’ve been evangelizing about it. Even people who know English well should use software in their language – not to boost their national pride, but to help the people who speak that language and don’t know English. They should use the software especially if it’s translated badly, because they are the only ones who can report bugs in the translation or fix the bugs themselves.

(A side note: Needless to say, Free Software is much more convenient for localization, because proprietary software companies are usually too hard to even approach about this matter; they only pay translators if they have a reason to believe that it will increase sales. This is another often overlooked advantage of Free Software.)

I am glad to say that I convinced most of the people to whom I spoke about this at Wikimania to at least try using Firefox in their native language, and I taught them where to report bugs about it. I also challenged them to write at least one article in the Wikipedia in their own language, such as Hindi, Telugu or Kannada. As useful as the English Wikipedia is to the world, the Telugu Wikipedia is much more useful for people who speak Telugu but no English. I have already seen some results.

I am now looking for ideas and verifiable data to develop this concept further. What are the best strategies to convince people that they should use localized software? For example: how economically viable is software localization? What is cheaper for a country’s education department – to translate software for schools or to teach all the students English? Or: how does the absence of localized software affect different geographical areas in Africa, India and the Middle East?

Any ideas about this are very welcome.


