Differences Between Things

The search box in Wikipedia suggests auto-completion when you start typing. For example, if you type “je” in the English Wikipedia search box, you’ll get the suggestions “Jews”, “Jewish”, “Jerusalem”, “Jesus”. (Jews kick ass!)

[Image: "Jews Kick Ass", with Henry Winkler, Albert Einstein, Sammy Davis Jr., Jesus, William Shatner and Bob Dylan]

If you search for “differences between”, you’ll get this list:

[Screenshot: auto-suggestions at Wikipedia for "differences between"]

The top spot belongs to “Differences between editions of Dungeons & Dragons” and that shouldn’t be surprising: the article “List of Advanced Dungeons & Dragons 2nd edition monsters” only recently lost its first place in the list of the longest English Wikipedia articles by number of bytes to “2011 ITF Men’s Circuit” (it’s something in tennis).

Out of ten suggestions, six are related to languages. American and British English are considered one language, but everybody admits that it has many variations in pronunciation, spelling, vocabulary and many other parameters, and lots of people love to bicker about the spelling of “meter” and “aluminum”. Bosnian, Croatian and Serbian are one language that has different names for reasons that are more political than linguistic. Something similar can probably be said about Malaysian and Indonesian, Norwegian Bokmål and Standard Danish, and Scottish Gaelic and Irish, but i know very little about these pairs.

Spanish and Portuguese are related, but definitely separate and mostly mutually unintelligible languages. It’s been said that it is easier for Portuguese speakers to understand Spanish speakers than the other way around, which is interesting, but it doesn’t really justify an encyclopedic article, as in the other cases. In fact, i am somewhat surprised that “Differences between Brazilian and European Portuguese dialects” is not in the list, given the huge number of arguments about it in the Portuguese – sorry, Lusophone – Wikipedia.

“Butterflies and moths” is probably the most serious article in this list, but that’s probably because i’m not a biologist.

And the last two articles are about movies (James Bond – movies vs. novels) and religion (Codex Sinaiticus vs. Vaticanus), which is also very Wikipedia, the encyclopedia about which someone said that it has more stamp collectors than good writers. (Citation needed; I can’t find the original quote.)


Houaiss Unicode: Portuguese vs. Hebrew

I bought the Houaiss dictionary of the Portuguese language.

It is very good, with some features that i haven’t seen in any other dictionary. For example, if you search for “gato” (cat), you’ll find a list of collective nouns for cats – bichanada, gataria. I am not familiar with an English dictionary that points me from “cat” to “pack”. It also lists the sounds that cats make – berrar, miar, roncar, ronronar, miada, miado, miau, mio, rom-rom, roufenho and many others. This feature exists for other animals, too.

It also has etymologies, synonyms, paronyms, antonyms, date of first usage, similar-sounding words, and many other lovely features.

I bought a paper edition with a CD-ROM. To install it from the CD-ROM i need to type an obnoxious serial number, but i can live with that. It also works only on Windows, but i can live with that, too, even though i am terribly ashamed of it.

But it does have one particularly obnoxious mis-feature: it doesn’t support Unicode. So i sent them this email:


I am only a student of the Portuguese language and i don’t write it so well yet. Feel free to reply in Portuguese.

I bought the Houaiss dictionary, versão monousuário 1.0 junho de 2009. I installed it on my Windows XP PC and i was very disappointed to find out that most of this program doesn’t support Unicode. You probably programmed the strings in some kind of an ANSI encoding and not in Unicode.

I live in Israel and my computer is set to display non-Unicode programs in Hebrew. If you don’t know what i am talking about – in Windows XP, take a look at Control Panel -> Regional and Language Options -> Advanced -> Language for non-Unicode programs. Unfortunately i still have to use some old non-Unicode programs for my work, and these programs need to display Hebrew. To change this setting, i need to reboot the computer, which is very inconvenient, and since i use this computer for work most of the time, i am forced to see Hebrew letters instead of the special Portuguese characters ã, õ, ç etc. in the Houaiss program. Take a look at the attached image to see how it looks on my machine.

Strangely enough, the central pane, where the dictionary article appears, works correctly. For example, the word “Derivação” appears with the right letters. But all the rest is broken: the word list on the left, the Parפnimos (Parônimos) tab at the bottom and the Acepחץes (Acepções) tab at the top appear with Hebrew characters. Hebrew characters also appear in the About box (Ajuda->Sobre) and in the installation program. In the menu itself question marks appear instead of special Portuguese characters: “Conjuga??o” instead of “Conjugação”. They appear as question marks even if i change the setting of “Language for non-Unicode programs” to “Portuguese (Brazil)”!

Note also this: Since the wordlist on the left doesn’t work correctly, i can’t easily search for words which include special characters. For example, if i want to search for the word “parônimo”, i try to type “p-a-r-o-n…”, and the program doesn’t get anywhere near “parônimo”, because you treat ‘o’ and ‘ô’ as different characters. So i need to scroll to it manually.

Besides this very annoying Unicode bug, i am very happy about the dictionary itself, so can you please fix this, so that my satisfação with it would be complete? In 2010 there are no more reasons to produce non-Unicode software. Besides, Windows 2000, which supports Unicode, is listed as a technical requirement to run the program.

Thanks in advance!

I sent this email to producao@objetiva.com.br and immediately received three identical replies from three different email addresses with human names, asking me to confirm that i am a person and not a spam robot by replying. I replied to one of them and received a confirmation that i am not a spam robot. Good to know. Now please fix the Unicode support in your dictionary. It will take one day, including cafezinho breaks and a sword-fight.
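
For the curious, here is a rough Python sketch of what is probably going on. It assumes that the program stores its strings in the Western European code page (cp1252) and that Windows, being set to Hebrew, reads the same bytes as cp1255. That is a guess based on the symptoms, not something the publisher confirmed. The question marks in the menu are modeled as a plain fallback to ASCII, which is also only a guess. The last two functions show one possible way to make the word list find “parônimo” when somebody types “paron”. All the function names are made up for this example.

```python
import unicodedata


def misread_as_hebrew(text: str) -> str:
    """Encode text as cp1252 and decode the same bytes as cp1255.

    Assumption: this mimics a non-Unicode Portuguese program running on
    a Windows machine whose non-Unicode language is set to Hebrew.
    """
    return text.encode("cp1252").decode("cp1255")


def menu_fallback(text: str) -> str:
    """Replace every non-ASCII character with '?'.

    Assumption: the menu text goes through some conversion that cannot
    represent accented letters at all.
    """
    return text.encode("ascii", errors="replace").decode("ascii")


def fold_accents(text: str) -> str:
    """Strip diacritics so that 'ô' and 'o' compare as equal."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


def prefix_search(words, typed):
    """Return the words whose accent-folded form starts with what was typed."""
    key = fold_accents(typed)
    return [word for word in words if fold_accents(word).startswith(key)]


if __name__ == "__main__":
    print(misread_as_hebrew("Parônimos"))
    print(misread_as_hebrew("Acepções"))
    print(menu_fallback("Conjugação"))
    print(prefix_search(["parolar", "parônimo", "parque"], "paron"))
```

The first two calls should reproduce the broken tab names quoted in the email, because the bytes for ô, ç and õ in cp1252 happen to be Hebrew letters in cp1255. Stripping accents before comparing is just one option; a real dictionary would probably want proper locale-aware collation.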


YouTube may be a competitor to Wikipedia as one of the most massively multilingual sites on the web.

Many people who comment there don’t seem to care that English is the lingua franca of the web. They just write in Russian, Portuguese, Indonesian, Catalan and Croatian and it creates a soup of languages. And that is a Very Good Thing. It makes languages visible and promotes tolerance. Variety and tolerance are mighty good.

Gender Studies

In most Hebrew language courses a significant majority of students are female. The only exception is the course “Medieval Hebrew: Piyyut and Spanish Poetry”, in which 70% of the students are male. Calling this course “the hardest” wouldn’t be very objective, but it is safe to say that the Even-Shoshan Dictionary is not very useful for understanding the texts that we read there.

In the Linguistics courses i took, the ratio of male to female students was pretty much even. The same goes for “Spanish for beginners”.

However, in the “Advanced Portuguese” course all students are male.

(Hi, Jane.)