Always define the language and the direction of your HTML documents, part 01

I received this email from Safari Books Online:

Email in English from Safari Books, oriented like Hebrew

The email is written in English, but notice how the text is aligned unusually to the right. Notice also that the punctuation marks appear at the wrong end of the sentence. I used Firefox developer tools to apply the correct direction, and the email then displayed correctly:

The same email, with corrected left-to-right formatting using Firefox developer tools

This happens because I use Gmail with the Hebrew interface. Gmail has to guess the direction of the emails that I receive, because in plain text there’s no easy way to specify the direction (I hope to discuss it in a separate post soon). Usually Gmail guesses correctly. Ironically, for HTML-formatted emails like this one, Gmail often guesses incorrectly, even though in HTML, unlike in plain text, it’s quite easy to specify the direction by simply adding dir="ltr" to the root element of the email.

Unfortunately a lot of HTML authors don’t bother to specify explicit direction. Many are not even aware of this exotic dir attribute. Others think that because “ltr” is the default, they don’t have to specify it. They are wrong: As this email shows, the left-to-right HTML content is embedded in a right-to-left environment, and the “rtl” definition propagates to the embedded content.

You could blame Gmail, of course, but it’s much more practical to always define the direction of your HTML content, even if it’s the default. You can never know where your content will end up.
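
If you generate your emails from a script, this can look roughly like the sketch below. It assumes Python and its standard email package; the addresses, the subject and the body text are made up for illustration. The only parts that matter are the lang and dir attributes on the root html element.

```python
from email.message import EmailMessage

# Hypothetical newsletter body. The language and the direction are both
# stated explicitly on the root element, even though "ltr" is the default.
HTML_BODY = """\
<!DOCTYPE html>
<html lang="en" dir="ltr">
  <head><meta charset="utf-8"><title>Your weekly update</title></head>
  <body>
    <p>Hello! Here is what is new this week.</p>
  </body>
</html>
"""


def build_message(sender, recipient, subject):
    """Build a multipart email with a plain-text part and an HTML part."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Plain-text fallback; plain text has no way to carry a direction.
    msg.set_content("Hello! Here is what is new this week.")
    # HTML alternative, with the explicit lang and dir attributes.
    msg.add_alternative(HTML_BODY, subtype="html")
    return msg


# Example addresses only.
message = build_message("news@example.com", "reader@example.org", "Your weekly update")
html_part = message.get_body(preferencelist=("html",))
print(html_part.get_content())
```

Whether a particular mail client honors these attributes is a separate question that this sketch does not test; the point is only that the content itself states its language and direction instead of leaving them to a guess.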

P.S.: I read this post before publishing and suddenly realized that its style is quite similar to “Best Practices” books, such as Damian Conway’s classic “Perl Best Practices” – it tells you to do something that is not obviously needed, and explains why it is needed nevertheless. I like to acknowledge sources of inspiration. Thank you, Damian.


Why I Don’t Plan To Use Any Apple Products

Well, basically, because of this. If that page offends you, then you deserve to be offended.

And seriously, I have so many completely practical reasons not to use any Apple products:

  1. I don’t want to waste a second of my life on getting used to the weird Alt, Control, Command and Option keys, or whatever they are called there. I’m efficient with using keyboard shortcuts, which are similar in Windows and graphical desktop GNU/Linux environments with Windows-style keyboards. Every time I try to use a Mac, I immediately start climbing up the walls, because the shortcuts don’t work. If you tell me that once I learn them, it gets really natural, then you are defeating the whole Mac idea of “it just works”. Not that I ever seriously thought that it’s true.
  2. I love right-clicking and I hate control-clicking. I know that I can connect a normal mouse with two or three buttons, but the very idea that by default the mouse has only one button because I’m apparently too stupid to understand the difference between right-clicking and left-clicking offends me. And the Mac touchpads come with one button. Mac lovers tell me that I can use gestures to achieve the effect of a right click, but I hate gestures with a passion. Call me old-fashioned if you will.
  3. I’ll have to buy Mac OS X even though I’m not going to use it. I once spent an hour with an experienced Mac user trying to understand how to write Hebrew from right to left properly. I suppose that it’s possible to do it there somehow, but in 2012 I don’t want to waste a second of my time on an operating system in which it’s so hard to figure out how to do such a simple thing.
  4. I do not want documents to scroll the other way. I do not want documents to scroll the other way. I do not want documents to scroll the other way.

And all that – even before I get to the ideological points. For example, that Apple wants to kill the open web with walled-garden apps, that it forces app developers to get approval for everything, that its licenses are among the most obnoxiously proprietary.

And that everything made by Apple is unnecessarily expensive just because it’s supposed to be more fashionable. Yes, they probably invested a bit more in design. Yes, they probably invested a bit more in the right alloy. But the main reason for their high prices is not the quality of the product, and not even the fact that they are stylish: the high price itself is the thing that makes them more fashionable. This is preposterous and I am not cooperating with that.

Well, yes, Macs have certain positive points. A Mac can run all the development tools that I need – it comes with a usable Unix-style terminal and programming languages, such as PHP, Python and Ruby (I didn’t check, but probably Perl, too). It has a high-quality screen. On average, MacBooks are thinner and lighter. But there are no Mac features that are compelling enough for me to bother to reconsider the above points.

What I really fail to understand is why so many Free Software developers use Macs – but that’s a topic for a separate post.

MozCamp Berlin 2011, part 3 – Hackasaurus

One especially awesome project i learned about at MozCamp Berlin is Hackasaurus. (Big thanks to Alina for convincing me to attend the talk about it.)

The Hackasaurus mascot: a girl with a dinosaur tail, wearing goggles and holding a laptop

Hackasaurus is a set of software tools and workflows to teach young people web programming. Its technical centerpiece, the “X-Ray Goggles”, is a tool that works similarly to Firebug and the Google Chrome Developer Tools: it helps the user examine and change, or “remix”, the inner workings of a web page – the structure of HTML elements and CSS styles. It has fewer features than the above tools, but it is designed to have just enough to get average people started with understanding web technologies. It is also laughably easy to install: it’s not even an add-on – you only need to add a bookmark.

According to the Hackasaurus creators Jess Klein and Atul Varma, even though the tool was intended for children, it is being used for learning about web technologies by people of all ages who were curious about web development, but found other HTML tutorials too hard.

And it works not just in Firefox, but in other browsers, too. That is one more example of how the Mozilla movement is not just about Firefox, but about Loving the web.

Hackasaurus can be easily translated to other languages using Pootle. I already translated most of it to Hebrew. Special thanks to Atul for creating the page which is frequently updated with the translations in progress – it is essential for testing the localized version. For example, i can see that the right-to-left directionality of Hackasaurus in Hebrew must still be fixed – i hope to find the time to do it myself as soon as possible.

And most importantly, i am thinking of using the tool to start teaching web development in a fun way in the schools in my area. This has been done successfully in Barcelona, New York, Brighton, Nairobi and other places and i plan to add Jerusalem and Haifa to this list soon.


Israeli programmers use many words of English origin when they speak Hebrew. (Many of them prefer to write only in English instead of Hebrew, which is a separate issue.)

When they use these English words, they tend to adapt them to Hebrew pronunciation. Some adaptations are simple, for example “router” is pronounced with an Israeli, rather than English [r] sound (some people – not necessarily purists! – use the Hebrew word נַתָּב [natav] for that). “SQL” is rarely pronounced as “sequel” – usually it’s “ess cue el”, and the same goes for MySQL.

But some are harder to explain. For example, “component” is often pronounced [kompoˈnenta]. I heard it in several companies and i don’t quite understand why. Note the [a] at the end and the stress, too: in English it’s supposed to be something in the area of [kʌmˈpoʊnənt] – on the second syllable, not the third. I have never heard an Israeli programmer pronounce it with correct stress when speaking in English – i always hear it as [ˈkomponənt] – with stress on the first syllable and with [o] in the first two syllables.

The only languages available on Google Translate in which this word is anywhere near [kompoˈnenta] are Serbian (компонента), German (Komponente), Romanian (componentă), and Spanish and Italian (componente). It may have something to do with them, but the solution is probably more complicated. Does anyone have any idea?

Git glossary lacunae, part 1

If you work with linguistics, philology, texts or editing, you probably know what a “lacuna” is. If you don’t, then any dictionary will tell you that a lacuna is something that is supposed to be somewhere, but is missing, and it is usually said of texts in which words or whole passages are missing for some reason.

I already wrote here about how much i hate the source code management system called Git (Git sucks 1, Git sucks 2). Actually, Git itself is probably a good piece of software, but learning it is terribly hard. I’ve been trying to do it for years and i still understand almost nothing. Learning Git is hard because every piece of documentation that discusses it is full of cryptic jargon. The solution to this problem is supposed to be in the man page called gitglossary, but it is very incomplete; in philologists’ jargon, it has lacunae.

I compiled a list of Git terms which i found hard to understand and which i could not find in gitglossary. At some point i thought that i would try to understand what these weird words mean myself and send patches with definitions to the maintainers of that file. Unfortunately, i am too busy to do that. The least i can do is to post that list here. If you are a Git expert, consider writing definitions for them and sending them as a patch to Git’s maintainers.

  • add
  • author
  • bisect
  • clone
  • committer
  • diff
  • grep
  • log
  • packed ref
  • remote (as a noun)
  • repo (it means “repository”, of course, but the glossary should mention this abbreviation)
  • reset
  • staging, staging area (a synonym for “index”, if i understand correctly)
  • status
  • treeish
  • working copy (this one seems simple, but it’s not)

These words are defined in the glossary, but the definitions are unclear:

  • parent – I couldn’t understand a word of that definition.
  • reflog – The definition says that this thing “can tell you what the 3rd last revision in _this_ repository was”. It is unclear whether the number 3 here is just an example or whether it always refers to the 3rd last revision.
  • checkout – This should be defined very clearly and carefully, because the usage of this term in Git is quite different from its usage in other version control systems. The current definition is unclear and circular: a checkout is “the action of updating all or part of the working tree with a tree object”; to understand it one needs to know what a “working tree” is – and it is defined as “the tree of actual checked out files”.

So, i’m sincerely sorry for only bringing up the problem without providing a solution. I hope that it’s better than just doing nothing.

(By the way, i would gladly post it as a bug in Git’s online bug tracking database… except that for some strange reason last time i checked Git developers don’t have one.)

Advocacy for the Uncool: SVN vs. git and Cygwin vs. the World

There are two Free Software packages that many Free Software people love to hate: Cygwin and Subversion.

Cygwin is a Unix-like environment on Windows. It gives the user a shell, and it’s possible to install Perl, Python, Ruby, GNU make, gcc, vim and many other familiar tools from the GNU world there. It’s even possible to run the X Window System using it.

I mostly use it for running Perl on Windows. There are two other major versions of Perl for Windows: ActiveState and Strawberry. Every now and then i try using them and i get immediately frustrated: from my experience, Cygwin is much more stable and predictable. Failure to install a CPAN module on Cygwin is much more rare than on ActiveState and Strawberry. Maybe i install the wrong modules, but for modules that i need Cygwin did the job better.

Cygwin is not without problems. But all too often it does the job more readily than ActiveState, Strawberry and GNU/Linux. Nevertheless, Free Software people tend to call me names when i tell them that i use Cygwin. “You should expect problems when you run an emulator instead of running real Linux!”, they say. Well, what do you know – sometimes i have to run Windows, that’s a fact of life, and there are stupid problems with Linux, too.

Another stupid holy war in the Free Software community is Git vs. Subversion (SVN for short). Both are source code management (SCM) systems. The “cool” Free Software people say that git is better, because git lets you create your own repositories, because git is faster, because git is easier.

I can see the principal advantage in having a local repository, which is the way git works. I can work offline and make as many commits as i like. In SVN i need to go online for every commit. But that, in practice, is the only disadvantage that SVN has. People say that SVN sucks at branching and merging. They like to quote Linus Torvalds: “Did you ever try to merge using SVN? Did you enjoy the experience?” Well, i have news for them: I tried branching and merging using Perforce, Mercurial, ClearCase, SVN and git – and i didn’t enjoy the experience in any of them. So git also sucks at branching and merging, but the difference is that with git i lost data, too. Every single time i tried to branch and merge using git, i cursed the hell out of it, copied the files i wanted to change to a backup directory, deleted the repository, recreated it, and did the merge manually. Every single time.

Besides, every time i try to use git, i feel like a fucking scientologist, forced to look up every single word in the help files: how the hell am i supposed to remember the difference between “pull” and “fetch” or between “branch”, “clone” and “checkout”? To understand what “fetch” is, i need to understand what the fuck “head”, “tag”, “object” and “ref” are. Go on and tell me that i should sit down and learn git properly, but i didn’t have to sit down and learn SVN. It just worked without forcing me to understand things.

Call me stupid and old-fashioned, but SVN didn’t give me a headache. Ever.

So, cool kids, go on, keep being cool, keep telling people that Cygwin and SVN suck. But every now and then do a reality check, please. You find it fun to use git? Great. Just don’t force it on other people.

To the developers of Cygwin and SVN i want to say: Thank you. You deserve far more appreciation than you get.

is perl still worth learning

Someone entered “is perl still worth learning” into a search engine and found my blog.

The answer is Yes.

Python and Ruby are not inherently bad, but Perl is at least as useful and modern as they are. It has – arguably – a wonderful community of programmers, and it has an amazing library of reusable modules called CPAN.

My wife Hadar is starting serious work on her PhD in physics in the Technion. The guys in the lab in which she will be working wrote some calculation software in Fortran on Windows. The first thing that Hadar is doing is deciphering this Fortran code. She asked me for some help, and i couldn’t provide much, because i don’t really know Fortran. I suggested that she advise those lab guys to consider porting their software, at least for the future, to Perl, because it is portable and because it is quite possible that it has the same capabilities for mathematical and scientific work as Fortran has. She told this to one of the researchers there and he replied that it should not be done, because “Perl is just a language for network servers.”

Saying that “Perl is just a language for network servers” is pretty much like saying that all Russian women are prostitutes. It’s a sad and silly prejudice. Here’s an article that dispels it: Ten Perl Myths.

So Hadar learned a little Perl and PDL – the Perl library for advanced mathematics. She picked up the basics very quickly. I was pleasantly surprised that she found it natural that Perl’s main data types are scalars ($drug = 'caffeine') and arrays (@drugs = ('marijuana', 'quaalude', 'paracetamol')), because in math it works the same way (we didn’t discuss hashes yet). I was even more surprised to learn that it seemed perfectly fine to her that @drugs is an array, but to access 'quaalude' you need to write $drugs[1] and not @drugs[1]. We tried searching CPAN for various mathematical functions, such as eigenvalue, matrix diagonal and linear algebra, and found everything.

So she’s gonna try that.

If she can’t convince them to migrate to Perl, i’ll have to learn Fortran and try to help them migrate from a Windows version of Fortran to GNU Fortran.


I program for a living, but i’ve never received proper formal education in serious algorithms.

Here’s a very simple problem: Take an array of length n and fill it with zeros. Every array member represents a binary digit. Now, using this array and not using the usual math for binary conversion, print all binary numbers from zero to the maximum binary number with n digits. For example, with n == 3 this should be printed:

0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1

Here’s what i wrote. Is it OK or is it an embarrassment?

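use strict;
use warnings;

# That's right, upgrade to Perl 5.10.
# The // (defined-or) operator below needs it.
use 5.010;

my $digits = $ARGV[0] // 3;

# If you don't have Perl 5.10, remove the two lines above and use this:
# my $digits = defined $ARGV[0] ? $ARGV[0] : 3;

my @matrix = ();
my @number = map { 0 } (1 .. $digits);
my $last = 0;

NUMBER:
while (not $last) {
    # Store a copy of the current number.
    push @matrix, [ @number ];

    # Assume that this was the last number (all ones)
    # until a zero digit is found.
    $last = 1;

    # Add one: walk from the rightmost digit to the left.
    my $digit_index = $digits;
    while ($digit_index) {
        $digit_index--;

        if ($number[$digit_index]) {
            # A one becomes zero, and the carry moves left.
            $number[$digit_index] = 0;
        }
        else {
            # A zero becomes one; the addition is done.
            $last = 0;
            $number[$digit_index] = 1;
            next NUMBER;
        }
    }
}

foreach my $number (@matrix) {
    print "@{$number}\n";
}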
Oh (edit): The real embarrassment – in WordPress the sourcecode presentation cannot display Perl properly. But if i put ‘ruby’ instead of ‘perl’ in the language attribute, it works mostly fine …