Archive for the 'Internet' Category

Twitter Must Make it Easy to Mass-Report Spam Bots

I found a network of Russian female bots. Twitter spam bots.

They are not actually female. They just have Russian female names and female photos.

Most of those that I found were created in September 2016, although some were created at other times.

They all have similar taglines:

  • “In my opinion, everything is wonderful. I wonder what else” (“По-моему всё прекрасно. Интересно что ещё”)
  • “Right now absolutely everything is excellent. I wonder how else” (“Сейчас вообще всё отлично. Интересно как там ещё”)
  • “It looks like absolutely everything is wonderful. I’ll see what will happen next” (“Вроде вообще всё прекрасно. Посмотрю что будет дальше”)

… And so forth, with minor variations, which are very easy to detect for a human who knows Russian, although I’m less sure about software. (This reminds me of how I was interviewed for several natural language processing positions around 2011. All of them were about optimizing site text for Google ads, and all of them specifically targeted only English. When you only target English, other languages are used to spam you.)
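For what it's worth, even a very naive program can notice this kind of near-duplicate text. Here is a minimal sketch in Python that uses the standard library's `difflib` to compare a tagline against the three quoted above. The function names and the 0.5 threshold are my own assumptions for illustration; this is not how Twitter's software works, and a real detector would need far more than this.

```python
from difflib import SequenceMatcher

# The three taglines quoted above, used as known examples of the network.
KNOWN_TAGLINES = [
    "По-моему всё прекрасно. Интересно что ещё",
    "Сейчас вообще всё отлично. Интересно как там ещё",
    "Вроде вообще всё прекрасно. Посмотрю что будет дальше",
]


def similarity(a: str, b: str) -> float:
    """Return a rough character-level similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resembles_known_tagline(tagline: str, threshold: float = 0.5) -> bool:
    """Check whether a tagline is close to any known one.

    The threshold is an arbitrary assumption, not a tuned value.
    """
    return any(similarity(tagline, known) >= threshold for known in KNOWN_TAGLINES)
```

A tagline that swaps one word for a synonym still scores high, while an unrelated English bio scores close to zero. It is crude, but it works on Russian text just as well as on English text.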

Their usernames are all almost random and end with two digits: flowoghub90, viotrondo86, chirowsga88 (although “90” seems to be the most frequent pair of digits). As their location, they all indicate one of the large cities of Russia: Moscow, Krasnoyarsk, Perm, Saint-Petersburg, Rostov-on-Don, etc.
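The username shape described above can also be checked mechanically. This is only a sketch: the pattern (six to twelve lowercase Latin letters followed by exactly two digits) is my own guess based on the three examples, and plenty of real people have usernames like this, so it could only ever be one weak signal among several.

```python
import re

# Assumed shape: 6-12 lowercase letters followed by exactly two digits.
# The length bounds are a guess based on the three examples in the post.
USERNAME_SHAPE = re.compile(r"[a-z]{6,12}\d{2}")


def has_bot_like_shape(username: str) -> bool:
    """Return True if the whole username matches the assumed shape."""
    return USERNAME_SHAPE.fullmatch(username) is not None
```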

All of them post nothing but retweets of other accounts popular in Russia:

Curiously, all their names are typical only of ethnic Russians. Names of real women from Russia would be much more varied: there would be a lot of typical Armenian, Ukrainian, Jewish, Georgian, and Tatar names that reflect Russia’s diversity, such as Melikyan, Petrenko, Rivkind, Gamkrelidze, and Khamitova. But these spam bot accounts only have names such as Kuznetsova, Romanova, Ershova, Medvedeva, Kiseleva. If you aren’t familiar with Russian culture, let me make a comparison to the U.S.: It’s like having a lot of people named Smith, Harris, Anderson, and Roberts, and nobody named Gonzalez, Khan, O’Connor, Rosenberg, or Kim. Maybe the spammers wanted to be more mainstream than mainstream, and maybe it is just overt racism.

I found them when I noticed that a lot of unfamiliar accounts with Russian female names were retweeting something by Pavel Durov in which I was mentioned. Durov is the founder of VK and Telegram, and I guess that he can be classified under “major internet businesses” in the list above. I noticed the similar taglines of the “women”, and immediately understood they are all spam bots.

These accounts are active. Some of them retweeted stuff while I was writing this post. I also keep getting retweet notifications, more than two weeks after Durov’s original tweet was posted.

When I look at any of these accounts, Twitter suggests similar ones to me, and they are all in the same network: Russian female names, similar “everything is wonderful” taglines, similar content. So Twitter’s software understands that they are similar, but doesn’t understand that they are spam bots that should be utterly banned. I also noticed that some of them are still suggested to me after I blocked them, which goes against the whole point of blocking.

I don’t know how many of them there are in this network. Likely thousands. I reported thirty or so, and I wonder whether it has any effect.

I also don’t know what their purpose is. Boost the popularity of other Russian accounts? But those that they retweet are popular already. Waste the time of people who try to use Twitter productively? Maybe; at least that’s the effect in my case. Function as bot followers in “pay to follow” networks? Possibly, but they have existed for a year, and they don’t follow that many people.

I’m probably not discovering anything very new in this post. But if I’m not, that makes me wonder all the more why this problem hasn’t been addressed already. At the very least it should be possible to report them more efficiently with one click or tap. And Twitter should also provide a form for mass-reporting; currently, Twitter’s guides about spam only suggest this: “The most effective way to report spam is to go directly to the offending account profile, click the drop-down menu in the upper right corner, and select ‘report account as spam’ from the list.” It’s OK for one account, but it requires five clicks, and it doesn’t scale for something as systematic as what I am describing in this post.

I do hope that somebody from Twitter will read this and do something about it. This is obvious systematic abuse, and I have no better way to report it.


I Deleted My Facebook Account

I used Facebook quite a lot. I posted lots of things, I got to know a lot of people, I learned about things that I wouldn’t learn anywhere else, I shared experiences.

But the feeling that I am the product and Facebook is the user got stronger and stronger as time passed. It happens with many other companies and products, but with Facebook it’s especially strong.

In February 2015 I stopped posting, sharing and liking, and I deleted Facebook apps from all my other devices. I continued occasionally reading and exchanging private messages in a private browser window.

Then I noticed that a few times things were shared in my name, and people liked them and commented on them. I am sure that I didn’t share them, and I am also quite sure that it wasn’t a virus (are there viruses that do such things on GNU/Linux?). Also, a few people told me that they received messages from me, and I’m sure that I didn’t send them. It’s possible that they saw something else under my name and mistook it for a message, but in any case, nobody is supposed to think such a thing. That’s not how people are supposed to interact.

I am not a bug, not an A/B test, not a robot, not an integer in a database. I am Amir Aharoni and from today Facebook doesn’t use me. There are other and better ways to communicate with people.

Stop saying that “everybody is on Facebook”. I am not. I don’t feel exceptionally proud or special. I am not the only one who does this; a few of my friends did the same and didn’t write any blog posts or make any fuss about it.

You should delete your Facebook account, too.

Weird GMail Habit: Removing Control Characters

GMail has a weirdish feature that probably very few people except me know about. When using it with a Hebrew user interface, invisible control characters—LRM, RLM, RLE, LRE and the like—are added to some strings to make them appear correctly in a mixed-direction interface.

Most notably, they are added to email addresses. I sometimes want to copy these email addresses as text, and my mouse pointer picks the control characters as well. Of course, these control characters are by themselves invisible to humans, but very much visible to computers, and an email address with these characters is not correct, even if it appears to be the same to human eyes.

It has already become a habit for me to carefully delete and manually retype the first and the last characters of an email address to make sure that the control characters are removed.
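The manual cleanup described above is easy to automate if the copied address passes through a script. Here is a minimal Python sketch that removes the Unicode bidirectional control characters from a string. The function name is my own, and the list covers the standard marks, embeddings, overrides and isolates; I am assuming these are the ones GMail inserts, which I haven't verified character by character.

```python
# Unicode bidirectional control characters:
# ALM, LRM, RLM, then LRE, RLE, PDF, LRO, RLO, then LRI, RLI, FSI, PDI.
BIDI_CONTROLS = (
    "\u061c\u200e\u200f"
    "\u202a\u202b\u202c\u202d\u202e"
    "\u2066\u2067\u2068\u2069"
)

# Translation table that maps every control character to None (delete it).
_DELETE_BIDI = dict.fromkeys(map(ord, BIDI_CONTROLS))


def strip_bidi_controls(text: str) -> str:
    """Return the text with all bidi control characters removed."""
    return text.translate(_DELETE_BIDI)
```

For example, an address copied as `"\u202auser@example.com\u202c"` looks fine on the screen, but only becomes a usable `user@example.com` after it goes through `strip_bidi_controls`.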

It would be better if GMail just used the <bdi> element or CSS bidi isolation. They are fairly well supported in modern browsers and provide a better experience.

The Fateful March of 1998 – my #webstory

I first connected to the web in the summer of 1997. I bought a new computer with Windows 95 and Microsoft Internet Explorer 2. For about a week I thought that that’s how the web is supposed to look, but I kept seeing messages saying “Your browser doesn’t support frames” on a lot of sites. And then I found that there’s this thing called Microsoft Internet Explorer 3. I went to microsoft.com and downloaded it. It was the first piece of software that I downloaded. It was about 10 megabytes and took about an hour on my dial-up connection.

Most notably, Microsoft Internet Explorer 3 supported frames and animated GIFs. I loved animated GIFs! I guess that it makes me quite a hipster.

A cat in headphones dancing to house music.

House cat. Sorry, it’s an anachronism: this animated GIF is from the mid-2000s. 1997’s animated GIFs were quite different.

And then Microsoft Internet Explorer 4 came out. I thought: “Well, if the move from IE2 to IE3 made such a big difference, then I guess that I should try number 4, and it will be even cooler.” And I tried. And it was a disaster. The installation screwed up everything on my computer. I had no idea how to disable the dreaded Active Desktop, which it introduced. It didn’t work so well with my Hebrew version of Windows 95. So I did what a lot of people did very often back then: I formatted my hard drive and re-installed Windows.

And the question arose: which browser should I use? IE3 was stable, but I didn’t like that it was getting old. So I went to netscape.com to try that Netscape Navigator browser that I kept hearing everybody talk about.

And I loved it.

I loved its nifty toolbars and its bookmarks manager. I loved the crash reporting; it crashed quite often, actually, but I didn’t feel so bad about it, because Microsoft’s programs crashed often, too, and in case of Netscape I felt good about reporting these crashes. Netscape’s email program, Netscape Messenger, was truly outstanding. I especially loved the green dot, which marked messages as read and unread in one click. Most of all, it said very clearly something that I came to realize only years later: “I am a program that lets you browse the web as well as possible. I am not trying to do anything else.”

Fast forward to March 1998. Netscape made the big announcement that the development of its browser would become an open source project code-named “Mozilla”. I had started hearing about “open source”, “free software” and Linux shortly before that, but it was mostly in the context of crazy geek hobbyists. And then suddenly a big famous end-user product that I loved became open source, and that felt really cool.

I have followed Mozilla news ever since. I heard about Bugzilla before its first version was released. I liked Mozilla’s decision to redo the whole rendering based on standards, even though many people criticized it. The thing that annoyed me the most in Mozilla’s early years was the lack of proper right-to-left text support, which was present in Internet Explorer. That’s why I, sadly, used mostly IE, and even became a bit of an IE power user. But I waited eagerly for Mozilla to do it and tried every alpha release.

"Are you fed up with your browser? You're not alone. We want you to know that there's an alternative... Firefox." The logo of Firefox is drawn with names of people.

The famous New York Times ad.

I was thrilled about the announcement of Firefox, the first stable version of Mozilla’s browser. I gave $10 to the famous 2004 New York Times Firefox advertisement, and I still have the poster of that advertisement at home.

A long list of names, including Amir Elisha Aharoni

And there’s my name. Third line in the middle.

It always seemed natural to me that I follow Mozilla news so eagerly. I thought that everybody does it. I mean, how is it even possible to use the web in any way without being at least a bit curious about the technology that runs it?

And then in 2008 I wrote a little unimportant post in my Hebrew blog about a funny spelling correction. Tomer Cohen commented on it and suggested that I try the Hebrew spelling dictionary and Hebrew Firefox in general. And that’s how my big love story with software localization began.

I started sending corrections to the translation of Firefox’s interface. I started sending corrections to the Hebrew spelling dictionary. I got so curious about the way the spelling dictionary was built that I ended up doing a whole university degree in Hebrew Language. Really.

And in 2011 I started working on the Language Engineering team at the Wikimedia Foundation. I love it, and it probably wouldn’t have happened without my involvement with Mozilla. In the same year I also became a Mozilla Rep, a volunteer representative of Mozilla at conferences, blogs and forums.

Probably the most important thing that I learned from my Mozilla story is that loving the web and being curious about it is not something obvious. Most people just want something that works for checking weather, news, Facebook friends’ updates, homework help and kitten videos. And for the most part, that is perfectly fine. But people’s freedom to read reliable and complete news on any electronic device cannot actually be taken for granted. Neither can people’s freedom and privacy to share their thoughts in social networks. Mozilla is among the most important organizations that care for these things, and it develops technologies that make them possible. Technologies that let you browse the web as well as possible and don’t try to do anything else.

We do it for one simple reason: We love the web.

Do you love it, too?

P.S. As I began writing this post, I realized that Microsoft’s Active Desktop was not so different from today’s devices, which are heavily based on web technologies: Firefox OS, Chrome OS and others. I can’t say that I love Microsoft, but as often happens, it was quite pioneering with ideas, and not so good with their execution. Credit where credit’s due.

Always define the language and the direction of your HTML documents, part 01

I received this email from Safari Books Online:

Email in English from Safari Books, oriented like Hebrew

The email is written in English, but notice how the text is aligned unusually to the right. Notice also that the punctuation marks appear at the wrong end of the sentence. I used Firefox developer tools to apply the correct direction, and saw it correctly:

The same email, with corrected left-to-right formatting using Firefox developer tools

This happens because I use GMail with the Hebrew interface. GMail has to guess the direction of the emails that I receive, because in plain text there’s no easy way to specify the direction (I hope to discuss it in a separate post soon). Usually GMail guesses correctly. Ironically, for HTML-formatted emails like this one, GMail often guesses incorrectly, even though in HTML, unlike in plain text, it’s quite easy to specify the direction by simply adding dir="ltr" to the root element of the email.

Unfortunately, a lot of HTML authors don’t bother to specify an explicit direction. Many are not even aware of this exotic dir attribute. Others think that because “ltr” is the default, they don’t have to specify it. They are wrong: As this email shows, the left-to-right HTML content is embedded in a right-to-left environment, and the “rtl” definition propagates to the embedded content.

You could blame GMail, of course, but it’s much more practical to always define the direction of your HTML content, even if it’s the default. You can never know where your content will end up.
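If your emails are generated by a program, the advice above costs one line. Here is a minimal Python sketch that wraps an HTML fragment in an element with explicit lang and dir attributes before it is sent. The function name is my own invention, and wrapping in a div is only an approximation of “the root element”; how this fits into a real mailing pipeline is an assumption on my part.

```python
from html import escape

# The values that the HTML dir attribute accepts.
VALID_DIRECTIONS = {"ltr", "rtl", "auto"}


def wrap_with_direction(html_body: str, direction: str = "ltr", lang: str = "en") -> str:
    """Wrap an HTML fragment in a div that states its language and direction.

    The explicit dir attribute keeps the content from inheriting the
    direction of whatever page or mail client it gets embedded in.
    """
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"unsupported direction: {direction!r}")
    return f'<div lang="{escape(lang, quote=True)}" dir="{direction}">{html_body}</div>'
```

Calling `wrap_with_direction("<p>Hello.</p>")` produces `<div lang="en" dir="ltr"><p>Hello.</p></div>`, which stays left-to-right even inside a right-to-left interface.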

P.S.: I read this post before publishing and suddenly realized that its style is quite similar to “Best Practices” books, such as Damian Conway’s classic “Perl Best Practices” – it tells you to do something that is not obviously needed, and explains why it is needed nevertheless. I like to acknowledge sources of inspiration. Thank you, Damian.

The International Union of Wikis

People who work with Wikipedia quickly run into the interlanguage links – links to other language versions of the same article. In Wikipedia lingo they are also frequently called “interwiki links”, although that’s not quite right: interwiki links are a much wider concept.

Wikis existed long before Wikipedia was the most popular wiki of them all. They were a strange idea – websites that anyone could edit. They tried various ways of creating an inter-wiki community, in which different wiki communities would exchange ideas and reuse content and skills. Various schemes to do that were proposed, but none of them ever caught on – the old-days wikis were respectable, but small, and the web was too large and free-form.

And then Wikipedia came. Wikipedia started as yet another wiki, so it tried to blend into the wiki community. At some point it got interwiki links – easy ways to link to other websites. It is easy to link to another page inside the same wiki by adding square brackets, and it is only slightly harder to link to another wiki: Instead of writing a whole URL with http and all that, you would just write a short prefix and a name of a page, and that’s it.
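The prefix mechanism described above is simple enough to sketch in a few lines of Python. MediaWiki stores each prefix with a URL in which `$1` stands for the page name; the rest of this example is illustrative. The `examplewiki` prefix and its URL are made up, the function name is mine, and real MediaWiki does more title normalization than this.

```python
from typing import Optional
from urllib.parse import quote

# A tiny stand-in for the interwiki table: prefix -> URL with a $1 placeholder.
# "examplewiki" and its URL are hypothetical.
INTERWIKI_MAP = {
    "examplewiki": "https://wiki.example.org/wiki/$1",
    "he": "https://he.wikipedia.org/wiki/$1",
}


def expand_interwiki(link: str) -> Optional[str]:
    """Turn a 'prefix:Page name' link into a full URL, or None if unknown."""
    prefix, separator, title = link.partition(":")
    template = INTERWIKI_MAP.get(prefix.strip().lower())
    if not separator or template is None or not title.strip():
        return None
    # Spaces become underscores; other unsafe characters are percent-encoded.
    page = quote(title.strip().replace(" ", "_"), safe="/:")
    return template.replace("$1", page)
```

So `expand_interwiki("examplewiki:Solar cooker")` gives `https://wiki.example.org/wiki/Solar_cooker`, and a link with an unknown prefix gives `None`.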

But which wikis is it possible to link to? Thanks to the popularity of Wikipedia, MediaWiki and other wiki engines, there are thousands of them now, and you don’t have prefixes for all of them. The prefixes for Wikimedia projects were managed in the internals of the database by a small group of developers. The list was exported to the Wikimedia Interwiki map. And actually… it wasn’t used that much. The old dream of having a network of wikis which are not just Wikipedia hasn’t come true yet. But this may change now, because recently the process became more open and user-friendly: The Interwiki extension was installed on Wikimedia wikis.

This extension allows displaying all the available interwiki prefixes in a dedicated table. It also allows users with appropriate permissions to edit them. Take a look at the Interwiki table for the English Wikipedia and you’ll see all the prefixes. Many of them are language codes – these are the interlanguage links. But there are many others: wiki communities of city residents, scientists, programmers, librarians, enthusiasts of countries etc. If you try the URLs in the list, you’ll see that some target sites are sadly dead, so they should probably be removed from the list. But others can be quite promising – for example Appropedia, a knowledge base of collaborative solutions in sustainability, appropriate technology and poverty reduction. That’s a very positive thing, not just because sustainability is a nice thing, but because it’s great to have many specialized information sources and not just one huge Wikipedia.

Now Wikimedia wiki communities can add their own interwiki prefixes to link to other websites that may interest them. An example off the top of my head: the Slovak Wikipedia community could add a prefix for easy linking to a site with information about Slovak culture. Of course, the language and the topic can be just about anything.

This feature, just like all other MediaWiki extensions, is translatable to all languages in translatewiki.net. For example, here’s the translation of the Interwiki extension to Hebrew. The translation of the Interwiki extension to the Slovak language, which I mentioned earlier, is not complete yet. If you are interested in translating the extension or any other component of MediaWiki into your language, open an account at that website and just start translating.

Why I Don’t Plan To Use Any Apple Products

Well, basically, because of this. If that page offends you, then you deserve to be offended.

And seriously, I have so many completely practical reasons not to use any Apple products:

  1. I don’t want to waste a second of my life on getting used to the weird Alt, Control, Command and Option keys, or whatever they are called there. I’m efficient with using keyboard shortcuts, which are similar in Windows and graphical desktop GNU/Linux environments with Windows-style keyboards. Every time I try to use a Mac, I immediately start climbing up the walls, because the shortcuts don’t work. If you tell me that once I learn them, it gets really natural, then you are defeating the whole Mac idea of “it just works”. Not that I ever seriously thought that it’s true.
  2. I love right-clicking and I hate control-clicking. I know that I can connect a normal mouse with two or three buttons, but the very idea that by default the mouse has only one button because I’m apparently too stupid to understand the difference between right-clicking and left-clicking offends me. And the Mac touchpads come with one button. Mac lovers tell me that I can use gestures to achieve the effect of a right click, but I hate gestures with a passion. Call me old-fashioned if you will.
  3. I’ll have to buy Mac OS X even though I’m not going to use it. I once spent an hour with an experienced Mac user trying to understand how to write Hebrew from right to left properly. I suppose that it’s possible to do it there somehow, but in 2012 I don’t want to waste a second of my time on an operating system in which it’s so hard to figure out how to do such a simple thing.
  4. I do not want documents to scroll the other way. I do not want documents to scroll the other way. I do not want documents to scroll the other way.

And all that – even before I get to the ideological points. For example, that Apple wants to kill the open web with walled-garden apps, that it forces app developers to get approval for everything, that its licenses are among the most obnoxiously proprietary.

That everything made by Apple is unnecessarily expensive just because it’s supposed to be more fashionable. Yes, they probably invested a bit more in design. Yes, they probably invested a bit more in the right alloy. But the main reason for their high prices is not the quality of the product and not even the fact that they are stylish, but because the high price is the thing that makes them more fashionable. This is preposterous and I am not cooperating with that.

Well, yes, Macs have certain positive points. A Mac can run all the development tools that I need – it comes with a usable Unix-style terminal and programming languages, such as PHP, Python and Ruby (I didn’t check, but probably Perl, too). It has a high-quality screen. On average, MacBooks are thinner and lighter. But there are no Mac features that are compelling enough for me to bother to reconsider the above points.

What I really fail to understand is why so many Free Software developers use Macs – but that’s a topic for a separate post.

