It seems that the new GMail compose interface overrides Firefox’s Ctrl-Shift-X shortcut, which switches the writing direction. It also overrides the right-click->Switch writing direction function; it simply doesn’t do anything.
I can probably switch the direction by using rich text, but using rich text has its own issues, and I usually want to send my email in plain text.
Dear Google, please fix this. I tried the new compose interface several times and I complained about this problem in emails to my googler friends. Unfortunately this is still not fixed, and starting from today I can’t go back to the old compose interface.
I understand, of course, that GMail is a free service that doesn’t come with a warranty. Dear Google, I am asking you a favor. You did, in fact, contribute quite a lot to the development of support for right-to-left languages on the Web. I am only asking you to keep this support good.
P.S. Dear Google, please ask Google employees who speak right-to-left languages to use Google products in these languages, and to write email in these languages. Dog-fooding is the best testing. Thank you, again.
For the last couple of years I’ve been helping my parents to learn to use computers. Mostly very common and well-known things: GMail, Picasa, seraching Google, reading news websites, talking on Skype, the Russian social network Odnoklassniki, and not much more than that.
One of the most curious things that I found in my experiences with them is that emails and popups about new features are completely unhelpful to them. They always call me when they get them and ask me what to do now. It is awkward, because basically the emails tell them what to do, but instead of reading them and learning, they are reading them aloud to me:
— “It says: ‘Now you can find your friends more easily by typing their names in the search box’—so what do I do now?”
— “I don’t know… When you want to find somebody, type their names in the search box maybe?”
I am not saying that my parents are stupid; they aren’t. I am saying that these emails are not helpful. They appear to arrive from the helpful people in Google or Odnoklassniki, but the fact is that every time it happens, my parents are confused.
This makes me wonder: Is the effectiveness of these emails and popups and callouts researched? What are they good for? I don’t find them useful, because I actually like to find out things by myself; that’s my idea of user-friendliness: if it’s not self-explanatory, it is not user-friendly. My parents don’t find them useful, because they ask me what do the have to do. So is it useful for anybody?
PS 1: I know that Odnoklassniki is awful. They insisted.
PS 2: I know that Skype is not Free Software and that it doesn’t respect people’s privacy. Give me something properly Free that actually works. For what it’s worth, I did teach both of my parents to use Firefox and they hate other browsers, and on my mother’s laptop I installed Fedora, so except Skype, her online experience is almost completely Free.
In this post I shall use the Tower of Babel in a somewhat more relevant and specific way: It will speak about multilingualism and about Babel itself.
This is how most people saw the Wikipedia article about the Tower of Babel until today:
And this is how most people will see it from today:
Notice how the Akkadian name now appears as actual Akkadian cuneiform, and not as meaningless squares. Even if you, like most people, cannot actually read cuneiform, you probably understand that showing it this way is more correct, useful and educational.
This is possible thanks to the webfonts technology, which was enabled on the English Wikipedia today. It was already enabled in Wikipedias in some languages for many months, mostly in languages of India, which have severe problems with font support in the common operating systems, but now it’s available in the English Wikipedia, where it mostly serves to show parts of text that are written in exotic fonts.
The current iteration of the webfonts support in Wikipedia is part of a larger project: the Universal Language Selector (ULS). I am very proud to be one of its developers. My team in Wikimedia developed it over the last year or so, during which it underwent a rigorous process of design, testing with dozens of users from different countries, development, bug fixing and deployment. In addition to webfonts it provides an easy way to pick the user interface language, and to type in non-English languages (the latter feature is disabled by default in the English Wikipedia; to enable it, click the cog icon near “Languages” in the sidebar, then click “Input” and “Enable input tools”). In the future it will provide even more abilities, so stay tuned.
If you edit Wikipedia, or want to try editing it, one way in which you could help with the deployment of webfonts would be to make sure that all foreign strings in Wikipedia are marked with the appropriate HTML lang attribute; for example, that every Vietnamese string is marked as <span lang=”vi” dir=”ltr”>. This will help the software apply the webfonts correctly, and in the future it will also help spelling and hyphenation software, etc.
This wouldn’t be possible without the help of many, many people. The developers of Mozilla Firefox, Google Chrome, Safari, Microsoft Internet Explorer and Opera, who developed the support for webfonts in these browsers; The people in Wikimedia who designed and developed the ULS: Alolita Sharma, Arun Ganesh, Brandon Harris, Niklas Laxström, Pau Giner, Santhosh Thottingal and Siebrand Mazeland; The many volunteers who tested ULS and reported useful bugs; The people in Unicode, such as Michael Everson, who work hard to give a number to every letter in every imaginable alphabet and make massive online multilingualism possible; And last but not least, the talented and generous people who developed all those fonts for the different scripts and released them under Free licenses. I send you all my deep appreciation, as a developer and as a reader of Wikipedia.
This issue is laughably easy to avoid: If you are writing the content, you are supposed to know in what language it is written, so if it’s English, just write <html lang=”en” dir=”ltr”> even though these seem to be the defaults. Nineteen or so characters that ensure your content is readable and not displayed backwards. Please do it always and tell all your friends to do it.
The problem is that you don’t only have to explicitly set the language and the direction, but, as silly as it sounds, you have to set them correctly, too. A more subtle, but nevertheless quite frequent and disruptive bug is displaying presumably, but not actually, translated content in a different direction. This happens quite frequently when a website supports the browser language detection feature, known as Accept-Language:
The web server sees that the browser requests content in Hebrew.
The web server sends a response with <html lang=”he” dir=”rtl”>, but because the website is not actually translated, the text is shown in the fallback language, which is usually English.
The user sees the content just like this numbered list, which I intentionally set to dir=”rtl”: with the numbers and the punctuation on the wrong side, and possibly invisible, because English is not a right-to-left language.
Of course, it can go even worse. Arrows can point the wrong way and buttons and images can overlap and hide each other, rendering the page not just hard to read, but totally unusable.
This bug is also an example of the Software Localization Paradox: It manifests itself when Accept-Language is not English, but most developers install English operating systems and don’t bother to change the preferred language settings in the browser, so they never see how this bug manifests itself. The site developers don’t bother to test for it either.
The solution, of course, is to set a different language and direction only if the site is actually translated, and not to pretend that it’s translated if it’s not.
Here are two examples of such brokenness. Both sites are important and useful, but hard to use for people whose Accept-Language is Hebrew, Persian or Arabic.
After showing an example of a web development bug from a site for, ahem, web developers, here is an even funnier example: The home page of Unicode’s CLDR. That’s right: Unicode’s own website shows text with incorrect direction:
The only words translated here are “Contents” (תוכן) and “Search this site” (חיפוש באתר זה), which is not so useful. The rest is shown in English, and the direction is broken: Notice the strange alignment of the content and the schedule table. A few months ago that table was so broken that its content wasn’t visible at all, but that was probably patched.
One thing that I will not do is switch my Accept-Language to English. Whenever I can, I don’t just want to see the website correctly, but to try to help my neighbor: see the possible problems that can affect other users who use different language. Somebody has to break the Software Localization Paradox.
I first connected to the web in the summer of 1997. I bought a new computer with Windows 95 and Microsoft Internet Explorer 2. For about a week I thought that that’s how the web is supposed to look, but I kept seeing messages saying “Your browser doesn’t support frames” on a lot of sites. And then I found that there’s this thing called Microsoft Internet Explorer 3. I went to microsoft.com and downloaded it. It was the first piece of software that I downloaded. It was about 10 megabytes and took about an hour on my dial-up connection.
Most notably, Microsoft Internet Explorer 3 supported frames and animated GIFs. I loved animated GIFs! I guess that it makes me quite a hipster.
And then Microsoft Internet Explorer 4 came out. I thought—”well, if the move from IE2 to IE3 made such a big difference, then I guess that I should try number 4, and it will be even cooler”. And I tried. And it was a disaster. The installation screwed up everything on my computer. I had no idea how to disable the dreaded Active Desktop, which it introduced. It didn’t work so well with my Hebrew version of Windows 95. So I did what a lot of people did very often back then and formatted my hard drive and re-installed Windows.
And the question arose—which browser should I use? IE3 was stable, but I didn’t like that it was getting old. So I went to netscape.com, to try that Netscape Navigator browser that I kept hearing everybody talking about it.
And I loved it.
I loved its nifty toolbars and its bookmarks manager. I loved the crash reporting; it crashed quite often, actually, but I didn’t feel so bad about it, because Microsoft’s programs crashed often, too, and in case of Netscape I felt good about reporting these crashes. Netscape’s email program, Netscape Messenger, was truly outstanding. I especially loved the green dot, which marked messages as read and unread in one click. Most of all, it said very clearly something that I came to realize only years later: “I am a program that lets you browse the web as well as possible. I am not trying to do anything else.”
Fast forward to March 1998. Netscape made the big announcement that the development of its browser becomes an open source project code-named “Mozilla”. I started hearing about “open source”, “free software” and Linux shortly before that, but it was mostly in the context of crazy geek hobbyists. And then suddenly a big famous end-user product that I love becomes open source—that felt really cool.
I followed Mozilla news since then. I heard about Bugzilla before its first version was released. I liked Mozilla’s decision to redo the whole rendering based on standards, even though many people criticized it. The thing that annoyed me the most in Mozilla’s early years was the lack of support for proper right-to-left text support, which was present in Internet Explorer. That’s why I, sadly, used mostly IE, and even became a bit of an IE power user. But I waited eagerly for Mozilla to do it and tried every alpha release.
I was thrilled about the announcement of Firefox, the first stable version of Mozilla’s browser. I gave 10$ to the famous 2004 New York Times Firefox advertisement, and I still have the poster of that advertisement at home.
It always seemed natural to me that I follow Mozilla news so eagerly. I thought that everybody does it. I mean, how is it even possible to use the web in any way without being at least a bit curious about the technology that runs it?
I started sending corrections to the translation of Firefox’s interface translation. I started sending corrections to the Hebrew spelling dictionary. I got so curious about the way the spelling dictionary was built that I ended up doing a whole university degree in Hebrew Language. Really.
And in 2011 I started working in the Language Engineering team in the Wikimedia Foundation. I love it, and it probably wouldn’t have happened without my involvement with Mozilla. In the same year I also became a Mozilla Rep—a volunteer representative of Mozilla at conferences, blogs and forums.
Probably the most important thing that I learned from my Mozilla story is that loving the web and being curious about it is not something obvious. Most people just want something that works for checking weather, news, Facebook friends updates, homework help and kitten videos. And for the most part, that is perfectly fine. But the people’s freedom to read reliable and complete news on any electronic device cannot actually be taken for granted. Neither the people’s freedom and privacy to share their thoughts in social networks. Mozilla is among the most important organizations that care for these things and it develops technologies that make them possible. Technologies that let you browse the web as well as possible and don’t try to do anything else.
P.S. As I began writing this post, I realized that Microsoft’s Active Desktop was not so different from today’s devices, which are heavily based on web technologies: Firefox OS, Chrome OS and others. I can’t say that I love Microsoft, but as it often happens, it was quite pioneering with ideas, and not so good with their execution. Credit where credit’s due.
I’m in an Internet cafe in Mumbai. I tried to install Firefox with the Marathi interface, but on the computers here fonts for languages of India are not installed. That’s right – on computers in India fonts for languages of India are not installed. Hence, installing Firefox in Marathi failed at the very first stage, because the fonts are needed for the installation wizard.
Actually, I’m not surprised that these fonts are not installed, because it’s not my first time in India. I know that it happens a lot in this country. I would install them, but I don’t have a permission.
I find it incredibly weird – and tragic – that so many people in India don’t even try to use computers in any language except English. The one curious thing that I did find was an “English typing computer” shop. It’s just a place where you can use a computer to write Word documents in Hindi or Marathi, but using an English-based transliteration keyboard rather than the standard Indian Devanagari InScript keyboard, because they find transliteration keyboards easier. Of course, they could just install such a keyboard layout on their computers… but they prefer to go to an “English typing computer” shop.
We, software internationalization people, have so much more work to do.
The email is written in English, but notice how the text is aligned unusually to the right. Notice also that the punctuation marks appear at the wrong end of the sentence. I used Firefox developer tools to apply the correct direction, and saw it correctly:
This happens because I use GMail with the Hebrew interface. GMail has to guess the direction of the emails that I receive, because in plain text there’s no easy way to specify the direction (I hope to discuss it in a separate post soon). Usually GMail guesses correctly. Ironically, for HTML-formatted emails like this one, GMail often guesses incorrectly, even though in HTML, unlike in plain text, it’s quite easy to specify the direction by simply adding dir=”ltr” to the root element of the email.
Unfortunately a lot of HTML authors don’t bother to specify explicit direction. Many are not even aware of this exotic dir attribute. Others think that because “ltr” is the default, they don’t have to specify it. They are wrong: As this email shows, the left-to-right HTML content is embedded in a right-to-left environment, and the “rtl” definition propagates to the embedded content.
You could blame GMail, of course, but it’s much more practical to always define the direction of your HTML content, even if it’s the default. You can never know where will your content end up.
P.S.: I read this post before publishing and suddenly realized that its style is quite similar to “Best Practices” books, such as Damian Conway’s classic “Perl Best Practices” – it tells you to do something that is not obviously needed, and explains why it is needed nevertheless. I like to acknowledge sources of inspiration. Thank you, Damian.
Software localization and language tools are poorly understood by a lot of people in general. Probably the most misunderstood language tool, despite its ubiquity, is spell checking.
Here are some things that most people probably do understand about spelling checkers:
Using a spelling checker does not guarantee perfect grammar and correctness of the text. False positives and false negatives happen.
Spelling checkers don’t include all possible words – they don’t have names, rare technical terms, neologisms, etc.
And here are some facts about spelling checkers that people often don’t understand. Some of them are are so basic that they seem ridiculous, but nevertheless i heard them more than once:
Spelling checkers can exist for any language, not just for English.
At least in some programs it is possible to check the spelling of several languages at once, in one document.
Some spelling checkers intentionally omit some words, because they are too rare to be useful.
The same list of words can be used in several programs.
Contrariwise, the same language can have several lists of words available.
But probably the biggest misunderstanding about spelling checkers is that they are software just like any other: It was created by programmers, it has maintainers, and it has bugs. These bugs can be reported and fixed. This is relatively easy to do with Free Software like Firefox and LibreOffice, because proprietary software vendors usually don’t accept bug reports at all. But in fact, even with Free Software it is easy only in theory.
The problem with spelling checkers is that almost any person can easily find lots of missing words in them just by writing email and Facebook updates (and dare i mention, Wikipedia articles). It’s a problem, because there’s no easy way to report them. When the spell checker marks a legitimate word in red, the user can press “Add to dictionary”. This function adds the word to a local file, so it’s useful only for that user on that computer. It’s not even shared with that user’s other computers or mobile devices, and it’s certainly not shared with other people who speak that language and for whom that word can be useful.
The user can report a missing word as a bug in the bug tracking system of the program that he uses to write the texts, the most common examples being Firefox and LibreOffice. Both of these projects use Bugzilla to track bugs. However, filling a whole Bugzilla report form just to report a missing word is way too hard and time-consuming for most users, so they won’t do it. And even if they would do it, it would be hard for the maintainers of Firefox and LibreOffice to handle that bug report, because the spelling dictionaries are usually maintained by other people.
Now what if reporting a missing word to the spelling dictionary maintainers would be as easy as pressing “Add to dictionary”?
The answer is very simple – spelling dictionaries for many languages would quickly start to grow and improve. This is an area that just begs to be crowd-sourced. Sure, big, important and well-supported languages like English, French, Russian, Spanish and German may not really need it, because they have huge dictionaries already. But the benefit for languages without good software support would be enormous. I’m mostly talking about languages of Africa, India, the Pacific and Native American languages, too.
There’s not much to do on the client side: Just let “Add to dictionary” send the information to a server instead of saving it locally. Anonymous reporting should probably be the default, but there can be an option to attach an email address to the report and get the response of the maintainer. The more interesting question is what to do on the server side. Well, that’s not too complicated, either.
When the word arrives, the maintainer is notified and must do something about it. I can think of these possible resolutions:
The word is added to the dictionary and distributed to all users in the next released version.
The word is an inflected form of an existing word that the dictionary didn’t recognize because of a bug in the inflection logic. The bug is fixed and the fix is distributed to all users in the next released version.
The word is correct, but not added to the dictionary which is distributed to general users, because it’s deemed too rare to be useful for most people. It is, however, added to the dictionary for the benefit of linguists and other people who need complete dictionaries. Personal names that aren’t common enough to be included in the dictionary can receive similar treatment.
The word is not added to the dictionary, because it’s in the wrong language, but it can be forwarded to the maintainer of the spelling dictionary for that language. (The same can be done for a different spelling standard in the same language, like color/colour in English.)
The word is not added to the dictionary, because it’s a common misspelling (like “attendence” would be in English.)
The word is not added to the dictionary, because it’s complete gibberish.
Some of the points above can be identified semi-automatically, but the ultimate decision should be up to the dictionary maintainer. Mistakes that are reported too often – again, “attendence” may become one – can be filtered out automatically. The IP addresses of abusive users who send too much garbage can be blocked.
The same system for maintaining spelling dictionaries can be used for all languages and reside on the same website. This would be similar to translatewiki.net – one website in which all the translations for MediaWiki and related projects are handled. It makes sense on translatewiki.net, because the translation requirements for all languages are pretty much the same and the translators help each other. The requirements for spelling dictionaries are also mostly the same for all languages, even though they differ in the implementation of morphology and in other features, so developers of dictionaries for different languages can collaborate.
I already started implementing a web service for doing this. I called it Orthoman – “orthography manager”. I picked Perl and Catalyst for this – Perl is the language that i know best and i heard that Catalyst is a good framework for writing web services. I never wrote a web service from scratch before, so i’m slowish and this “implementation” doesn’t do anything useful yet. If you have a different suggestion for me – Ruby, Python, whatever -, you are welcome to propose it to me. If you are a web service implementation genius and can implement the thing i described here in two hours, feel free to do it in any language.
After writing this post I found out that Google Chrome, in fact, does support vertical Mongolian text.
The title of this post is designed to catch the eye. Microsoft Internet Explorer is not better than Firefox, Chrome and Opera – it’s worse than them in every imaginable regard.
Except one: the support for Mongol Bichig, the vertical Mongolian script.
Mongolian script is unique: its letters are connected, similarly to Arabic and its lines are written vertically. About three million Mongols in the independent republic of Mongolia use this script mostly for historical purposes, and use the Cyrillic script in their daily life, but the classical vertical script is the regular script for nearly six million Mongols in China – that’s about twice as much people.
The only browser that is able to display the vertical Mongolian script is Microsoft Internet Explorer. I don’t really know why Microsoft bothered to do it; maybe because the government of the People’s Republic of China demanded it. If that is true, then i salute the government of the People’s Republic of China. And i definitely salute Microsoft. I don’t like Microsoft’s insistence on keeping their code proprietary, but pioneering the support for this script, or any other, is praiseworthy.
I am very sad that at this time i cannot recommend my Mongolian friends to use my favorite browser, Firefox, or other modern browsers such as Google Chrome and Opera. For all their modernity, speed, feature richness and standards compliance, they are useless to over six million people who want to read and write in the vertical Mongolian script. At most, these browsers can display the script horizontally and with some letters incorrectly rendered. This also means that the only useful operating system for these people is Microsoft Windows.
One explanation that i heard for not supporting the vertical Mongolian script is that the CSS writing modes standard is not completely defined. This is actually a good and even noble reason, but when the most basic ability to read a language is in question, experimental support is better than no support.
So, which modern free browser will be the first to support the Mongolian script? I guess that it will be Firefox, given its excellent track record in supporting Unicode, and that Google Chrome will follow it after three years or so. But if Chrome developers surprise me and get there first, i’ll be just as happy. In any case, i am waiting impatiently, along with more than six million Mongols.
Gerv is a very well-known and very talented Mozilla programmer, and also a devout Christian. His blog is called “Hacking for Christ”. There’s nothing weird or wrong about it – there are many other excellent Christian hackers, like Perl’s Larry Wall and Jonathan Worthington and Mozilla’s Jonathan Kew. Gerv’s comment wasn’t particularly hateful; as it often goes, it focused on the legal side of things. Gerv is also an unusually charming person; i had the pleasure to meet him in Berlin.
All that said, i support gay marriage, i don’t support Gerv’s comment and i think that he shouldn’t have post it that way. But once he did, hey – water under the bridge. I care much more about his contributions to Mozilla’s code than about his social, legal and religious opinions.
And the loveliest part of it all is that in one the many comments to his post, i found a link to the play “8”, about the fight for recognizing gay marriage in California. On one hand, it’s a very well played PR stunt, with the highest league stars such as like Brad Pitt, George Clooney, Martin Sheen, Jamie Lee Curtis, Kevin Bacon, Yeardley Smith, John C. Reilly and George Takei. On the other hand, it’s actually worth watching. If this is what came out of that poorly placed blog post, then i’m not complaining.
Liron Dorfman, a Wikimedia Israel activist, periodically lectures in a library in the north of the country, helping librarians contribute their vast knowledge and experience with reference works to the Free Encyclopedia.
At one of these lectures he called me in panic and asked for urgent help: He was trying to teach the librarians how to upload images to the Commons, Wikipedia’s images repository, and the Upload Wizard got stuck.
My first question, of course, was “Which browser are you using?”
The answer was, non-surprisingly, “Microsoft Internet Explorer”.
So i told him to try another browser. There wasn’t one installed, so i told him to download one. He wanted to download Chrome, but i insisted on Firefox, and he agreed.
So he installed Firefox, tried the Upload Wizard there, and it worked. Win.
It was a nice demonstration of how Firefox can save the day. It would probably work in Google Chrome, too; it has many bugs that make it almost unusable to me, but for this matter, Firefox was just a matter of personal preference.
Of course, uploading should work in Microsoft Internet Explorer, too; about 30% of Wikipedia readers still use it, and about half of them use the old Internet Explorer 8, which is the newest version available on the still-popular Windows XP. The fact, however, is that for better or worse MediaWiki developers mostly use GNU/Linux and Mac, on which Microsoft Internet Explorer doesn’t run at all, so we don’t even open it unless they have a reason. We usually test new features on it, but it is rare for us to actually use it for browsing the web, and that is essential for noticing bugs that would otherwise go unnoticed.
We all wish that all our users would stop using the old, proprietary and non-compliant Microsoft Internet Explorer, but we cannot convert millions of people overnight; even the giants Google and Facebook tried to do that and until now it was not a great success. Until then, we hope that the people who still use it will at least be able to read and contribute text and media. We can only fix problems if we know about them, so if you use Internet Explorer and encounter a problem in Wikipedia or the websites related to it, please report it at Wikimedia’s bug reporting site.
But if you just stop using Internet Explorer and move to a modern browser, we’ll be quite happy, too.
And to get back to the opening point – never be shy to introduce your friends who still use Microsoft Internet Explorer to Firefox! They’ll thank you in any case, but it works especially well when things break. If you find yourself doing that a lot, then you are already very cool and you should consider going further by becoming a Mozilla Rep.