Always define the language and the direction of your HTML documents, part 01

I received this email from Safari Books Online:

Email in English from Safari Books, oriented like Hebrew

The email is written in English, but notice that the text is aligned to the right, which is unusual for English. Notice also that the punctuation marks appear at the wrong end of each sentence. I used Firefox developer tools to apply the correct direction, and then the email displayed correctly:

The same email, with corrected left-to-right formatting using Firefox developer tools

This happens because I use GMail with the Hebrew interface. GMail has to guess the direction of the emails that I receive, because in plain text there’s no easy way to specify the direction (I hope to discuss it in a separate post soon). Usually GMail guesses correctly. Ironically, for HTML-formatted emails like this one, GMail often guesses incorrectly, even though in HTML, unlike in plain text, it’s quite easy to specify the direction: simply add dir="ltr" to the root element of the email.
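A minimal sketch of what that looks like, assuming the email body is a complete HTML document. The title and paragraph text are placeholders, not the contents of the actual Safari Books email, and the lang attribute is included because the language is the other half of this post's title:

```html
<!DOCTYPE html>
<!-- dir states the base direction explicitly; lang states the language -->
<html lang="en" dir="ltr">
  <head>
    <meta charset="utf-8">
    <title>Placeholder subject line</title>
  </head>
  <body>
    <p>Hello, here are this week's new titles.</p>
  </body>
</html>
```

Some webmail clients may discard or rewrite the html and body elements before displaying a message, so repeating the same attributes on an outermost wrapper element inside the body is a reasonable precaution.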

Unfortunately, a lot of HTML authors don’t bother to specify an explicit direction. Many are not even aware of this exotic dir attribute. Others think that because “ltr” is the default, they don’t have to specify it. They are wrong: as this email shows, when left-to-right HTML content is embedded in a right-to-left environment, the “rtl” setting of the environment is inherited by the embedded content.
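The inheritance can be reproduced with a small fragment. This is only a sketch: the outer div stands in for a right-to-left webmail interface, and the class name and message text are hypothetical. It is not how GMail actually structures its pages.

```html
<!-- Host page: a right-to-left interface, such as a Hebrew webmail UI -->
<div dir="rtl">

  <!-- No dir on the embedded message: it inherits rtl from the host,
       so the English text is right-aligned and the final period
       is displayed at the left end of the line. -->
  <div class="message">
    <p>Your subscription will renew soon.</p>
  </div>

  <!-- Explicit dir on the embedded message: it is laid out
       left-to-right regardless of the host's direction. -->
  <div class="message" dir="ltr" lang="en">
    <p>Your subscription will renew soon.</p>
  </div>

</div>
```

Opening this fragment in a browser should show the first message with the same symptoms as the screenshot above, and the second one displayed normally.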

You could blame GMail, of course, but it’s much more practical to always define the direction of your HTML content, even if it’s the default. You can never know where your content will end up.

P.S.: I read this post before publishing and suddenly realized that its style is quite similar to “Best Practices” books, such as Damian Conway’s classic “Perl Best Practices” – it tells you to do something that is not obviously needed, and explains why it is needed nevertheless. I like to acknowledge sources of inspiration. Thank you, Damian.

6 Responses to “Always define the language and the direction of your HTML documents, part 01”


  1. 1 Júda 2012-08-07 at 18:45

    U+202B allows you to specify the direction as RTL. I use it when writing emails (I write them in Vim, and Mutt sends them using msmtp).

  2. 4 Lina 2013-06-19 at 16:28

    hi Amir,

    You mentioned that
    >> GMail has to guess the direction of the emails that I receive, because in
    >> plain text there’s no easy way to specify the direction (I hope to discuss
    >> it in a separate post soon).

    Could you link me to that post or maybe describe here how GMail guesses the direction of the plain text?
    My impression is that it derives it from the direction of majority of the characters in the mail.
    E.g. suppose a mail consists of 20 characters, 15 of which are RTL. Then GMail sets the direction of all paragraphs in that mail to RTL.
    Correct?

    • 5 aharoni 2013-06-20 at 16:52

      It’s actually a mystery. I don’t know Google’s whole algorithm. GMail chat does something like what you describe, although I don’t know the exact numbers. I’m not sure about emails.


  1. 1 Always define the language and the direction of your HTML documents, part 02: Backwards English | Aharoni in Unicode, ya mama Trackback on 2013-05-18 at 14:14
