Archive for the ‘Gimpy internet nonsense’ Category

Pharma hackers gonna pharma hack, 2013 edition

I was Googling for an old Banditry post yesterday, as part of a discussion about that new ‘people lie about their drinking’ study. Eventually I found it, only to discover that I’d linked to a (London) Times article, and that therefore the paywall had ruined the whole thing (curiously, even though the Times now shows unregistered users the headline, lede and first sentence for new articles, it completely screws up on old ones). So I more or less gave up on the post [*].

While Googling, I was rather surprised to discover the amount of content that I’d apparently written about the availability, acquisition and applications of various medicinal substances (link will hopefully die in a few weeks as Google updates itself). I briefly considered the possibility that in a fit of poverty and/or drunkenness I’d decided to set up my own online pharmacy, then remembered that I’m based in the country with some of the tightest controls on prescription drugs in the world so that would be rather silly. Rather, I’d been hacked.

I’ve been blogging for more than a decade now, so this isn’t the first pharmaceutical spam I’ve experienced: but it is the most insidious.

Creepy crawling

The hacked pages are tainted only to Google’s crawler – if you or I or anyone in the world who isn’t Google’s crawler click through to them, then they appear as originally intended, both in the browser and in the source code. So the spam-merchant gets to benefit from my PageRank without doing suspicious things to my traffic stats or making suspicious links appear on my actual site, which has been the giveaway for previous hacks. They also, cleverly, didn’t go for an out-and-out hack of all pages, so if you google for “johnband.org” or search the site for a specific term that isn’t drug-related, then you’ll get the correct result, with no indication that some of the pages (mostly tag pages, category pages, and monthly archives) exist to Google only as pharmaceutical billboards.

Conveniently, Google has a funky-cool Fetch As Google tool, described here by their engineer Matt Cutts, which allows you to see exactly what the Googlebot sees when it crawls any page on your site. Sticking the affected pages into the tool confirmed that Google was still seeing them as pharmaceutically compromised. And that they’d been this way since last July-August.
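For a rough do-it-yourself version of the same check, here is a minimal Python sketch that requests a page twice, once claiming to be a browser and once claiming to be Googlebot, and reports any spammy keywords that only the crawler version contains. The keyword list and function names are my own placeholders, not anything from the hack itself. It also assumes the cloaking keys off the User-Agent header; if the hack checks Google's IP addresses instead, this will show nothing and Fetch As Google remains the only reliable test.

```python
import re
import urllib.request

# Hypothetical keyword list: substitute the terms from your own search results.
SPAM_KEYWORDS = ("viagra", "cialis", "pharmacy")

# The User-Agent string Googlebot identifies itself with, plus a generic browser one.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"


def fetch(url, user_agent):
    """Fetch a page while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def spam_hits(html, keywords=SPAM_KEYWORDS):
    """Return the set of keywords that occur as whole words in the page."""
    words = set(re.findall(r"[a-z]+", html.lower()))
    return {keyword for keyword in keywords if keyword in words}


def cloaked_keywords(browser_html, crawler_html, keywords=SPAM_KEYWORDS):
    """Keywords served to the crawler but absent from the browser version."""
    return spam_hits(crawler_html, keywords) - spam_hits(browser_html, keywords)
```

To use it, call `cloaked_keywords(fetch(url, BROWSER_UA), fetch(url, GOOGLEBOT_UA))` on a suspect tag or archive page; a non-empty result suggests the page is being cloaked.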

So, I junked my evening plans and settled in for a night of Fun With WordPress, PHP, MySQL, Unix Permissions And Google. Which is my favourite sort of fun, obviously.

Hope, cruelly dashed

The top Google hit on the pharma hack, from blogger Chris Pearson, was an extremely well-written summary which described an identical problem to mine. “Result!”, I thought. So I followed Chris’s steps, only to discover that absolutely none of them worked. The trouble is, the pharma spammers are cleverer bastards than I’d thought: once the tricks of your trade are readily visible with a quick Google, you’re at a disadvantage. And Chris’s post dates from April 2010. Three years of malware evolution later, although his macro-level points are still worth a read, the actual techniques described were way obsolete.

Bugger.

So I Googled a bit more, mostly finding sites that repeated Chris’s solution, but eventually happening upon a couple of write-ups that were closer to my problem – at least, in the sense that they also found none of the things Chris describes, nor any of the obvious hacks I’ve experienced before like a doctored .htaccess file or dodgy-sounding access permissions, nor any changes to the main WordPress database… at least, none of the changes that anyone has noted online.

The most comprehensive, although perhaps the least comprehensible unless you’re ultra-techie, was a post from Shaun Green from February 2012. Short version: the current version of the hack creates php files with names that sound like they should be real WordPress files, and distributes them throughout your WordPress install but especially in the wp-includes folder so that they’re almost impossible to find and tell apart from real WordPress files without doing extremely nerdy things.

I’m not really a deep-level coder, so following all of Shaun’s steps sounded rather painful. And my install didn’t contain the specific filenames he lists (https.php and class-sftp.php), so I would have had to literally retrace his steps rather than just following his conclusions.

Instead, I went for a slightly lower-tech option. Everything in the wp-includes folder is a standard WordPress file, which shouldn’t have changed since installation. The same is true for everything in the wp-admin folder, and for everything in the WordPress root folder except for wp-config.php (which I’d already checked to make sure it wasn’t compromised). So I downloaded a vanilla version of WordPress 3.5.1, deleted everything from my install except for the wp-content folder (where themes, plugins and pictures are stored) and wp-config.php, and then copied the untainted files across.
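If you would rather see what was planted before wiping it, the same idea can be sketched as a comparison: hash every file in the live install and in a vanilla download, then list anything that is extra or different. This is a minimal sketch rather than what I actually ran, and it assumes the vanilla copy is exactly the same WordPress version as the live one (otherwise every file will look modified). The function names and the skip list are my own.

```python
import hashlib
from pathlib import Path


def file_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    }


def compare_core(installed_root, vanilla_root, skip=("wp-content", "wp-config.php")):
    """Return (unexpected, modified) core files in the live install.

    Top-level entries named in `skip` are ignored, since they are
    expected to differ from a fresh download.
    """
    installed = file_digests(installed_root)
    vanilla = file_digests(vanilla_root)

    def skipped(rel_path):
        return rel_path.split("/")[0] in skip

    # Files that exist in the live install but not in the clean download.
    unexpected = sorted(
        rel for rel in installed if rel not in vanilla and not skipped(rel)
    )
    # Files present in both trees whose contents differ.
    modified = sorted(
        rel for rel in installed
        if rel in vanilla and installed[rel] != vanilla[rel] and not skipped(rel)
    )
    return unexpected, modified
```

Anything in the first list is a file WordPress never shipped; anything in the second is a core file that has been altered. Either way, replacing the lot with clean copies is still the fix.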

One quick check on Fetch As Google later and – hurrah! – the pharmaceuticals had all disappeared. Now all I need to do is wait for Google to update its cache, and everything should be back to normal.

Gone forever?

While the problem was solved in the short term, it clearly wasn’t solved in the long term: I’d started with an uncorrupted WP installation, and someone had managed to corrupt it. So – after doing the basic password changing things, obviously – I installed Wordfence and Better WP Security. If you host your own WordPress blog (anything that isn’t on wordpress.com), then so should you. Wordfence is the equivalent of an antivirus program for your WordPress install; Better WP Security automates a whole bunch of handy lockdown and obfuscation tricks. Wordfence threw up a few vaguely suspicious files associated with some of the themes that were installed, so I deleted them; everything was then fine.

I’ve also set up Google Alerts that notify me if any new content appears on johnband.org containing various spammy keywords (the usual suspects), which obviously won’t be much use until the current spam-buggered content is removed, but will then allow me to kill any future infections before they’ve completely ruined my search results. I’ll update this post in the event that anything else occurs. If I remember, I’ll update it in a couple of months if nothing else has occurred, since zero is sometimes a helpful data point.

TL/DR: Was quite painful, could have been much worse. If this happens to you I definitely recommend the “for every folder which shouldn’t have changed since WP was installed, delete the folder and reinstall” approach, although do check the database and fix any issues there first. And set up the security things even if this hasn’t happened to you yet, because it probably will.

[*] Short version of post I was going to write: epidemiological studies into alcohol-related harm are also based on self-reported consumption, so while it’s likely that everyone drinks more than they say, it’s also likely that alcohol is correspondingly less bad for you than those studies have shown, by about the same margin – unless we can come up with valid reasons why people would underestimate in one sort of study but not the other. Also, News Corporation are still unimaginably bad at digital strategy.

An open letter to Roy Wood

Dear Mr Wood -

For your information, in all known dialects of English, the phrase ‘snowman’ refers to a figure of a man which is made from snow (something like a statue, sculpture or perhaps golem), rather than a man who delivers snow.

I understand there is room for confusion here, when considering examples such as ‘milkman’, ‘postman’ and ‘pizza boy’. However, ‘snowman’ is very much in the style of ‘wicker man’ (which is not a term for a gentleman who delivers baskets) or ‘iron man’ (which is not a term for either a gentleman who delivers ferrous metals, or a gentleman who works in a laundry).

Before writing a song that seeks to summarise the Anglophone world’s Christmas traditions, one might have thought you would have done the most basic research into what these traditions actually were. I can promise you that none of them involve a snow delivery gentleman “bringing the snow”.

I look forward to this error being corrected in future releases of your popular chart hit.

Yours sincerely,
John B

PS Happy Christmas!

Artificial John B Impersonator

The text below is the output of my phone’s AI autocomplete app (SwiftKey), based on its knowledge of the things I say and do. My only interaction was to choose between the three suggestions it offered.

I have been made to the park. I am not going anywhere, and I will not receive the item is faulty, but the most important things. The only way I could have a good idea. The problem isn’t with the loan companies, who is a good film, but the subway. I am a real Zeus, and I am not sure what to expect.

The problem is that it is not a good idea. The only thing I have been a bit of fun. I have been a while, but I can confirm that you can get round to the zoo tomorrow, zoo tomorrow. And I know that analgesics, and I am not sure what to expect. I have a glass of wine. I have a glass of wine. The only way I could have a great day and the Machine, but it is. It has an appointment. The quality is the most of my own.

We have to do it. I have a glass of wine.. I don’t vividly the best. If I identified and then I will have the same. Federalism the best. The only problem is the first place to find a job, but the best way of organizing emails into the address below and click the Help section for the next few days ago.

It was a bit of a fine example of the most important thing. But it was a very long time to get the latest version, and the Democrats are total pansies who are not responsible for any reason. I have a glass of wine.

The quality of life and I am not sure what to expect. I don’t think it’s a good time to do it. It is a bit of a new member. I don’t vividly recall his own words. The problem isn’t a problem with your friends, but I guess the queen plays by her own rules.

K is technically right here at the moment. The quality is not a good day and night and then the next day or night. The only way to get a quote from you soon as possible to make a decision on the phone and the other side of things to come and see if you are looking for a few weeks, but it doesn’t matter how much I love you so that.

I have a glass of wine.

Slightly worried that in the reasonably near future, my phone will replace me.

Shot by how many sides again?

Everything retro is coming round again, and so on. In that vein, I’ve dragged some offensive old blog or other from the depths of my spare hard drive. My foreword to the reprint may provide context. Dive in, if that’s your thing.

You’d probably do better just to watch this, though:

Fans of Sharpeners will like this

All the content from the long-defunct Sharpener group blog (formerly at thesharpener.net, before pirates stole the domain name) is now available at sharpener.johnband.org. The formatting’s basic, and categories have been lost; this may improve in future.

That was the easy-ish task, building a new WordPress 3.3.1 site based on a fairly arbitrary selection of obsolete MySQL databases (while junking all actual blog skins etc because they were compromised by virus-injecting malware types over the years). The next task, which will be super-exciting for fans of masochism, will be to set up a WordPress 3.3.1 blog and then import a whole bunch of tables from a non-standard, custom-built Access database into it.

Fans of controversy and excellence, and/or readers of my last post, may be able to guess which particular Holy Grail of magazine-titled Internet history will be revived as if by Dr Frankenstein at the end of this process.

That worked remarkably well, all things considered

If you’re seeing this, then my server migration was absolutely gangbusters-awesome, God’s in his heaven, and all’s right with the world. The Sharpener and SBBS projects may be slightly more challenging, but they are on the way. If you don’t know what the last sentence means, then I salute your wisdom in spending the mid-to-late 2000s on worthwhile pursuits.

Because I AM the Queen of the Zulus

Shannon, who is aces, just came up with the best mashup concept ever.

Civ + sex devices + Lulu + Lady Popular = “No, fuck YOU. This bling does look fabulous against my fur, because I am the queen of the Zulus, and you’re still fucking an analog blow-up doll.”

Blogging is dead and no-one cares?

My riot policing piece yesterday attracted 600 unique visitors in 24 hours. That isn’t exactly Perez Hilton, but is about six times my current normal run rate (I think the biggest this blog has ever been is about 1000 daily visitors, for some of the global financial crisis articles).

The fact that the piece had quite a few visitors isn’t too surprising, I suppose – it was a take on a newsworthy and important topic that dissented somewhat from the conventional wisdom, based on hours and hours of discussion with people who were on the scene across different English cities and/or who really understand counterinsurgency strategy. And it was pleasing to see strategy/COIN experts talking about it favourably.

The odd thing, though, is that whenever I’ve written a piece in the past that has gained masses of attention, it’s been through links from bigger blogs, news sources, or occasionally forums. This time, as far as I can see from my logs, there haven’t been *any* blog links to the piece. All the traffic is coming from retweets and reshares on Twitter and Facebook.

I wouldn’t go quite as far as to say that blogs are dead as a medium: the existence of a self-publishing platform with a fairly powerful off-the-shelf CMS, and that isn’t restricted to a particular social network, remains useful.

But it’s looking like the sense in which we’ve traditionally understood blogs – roughly, a community of people who link to each other’s posts, comment on them, and write pieces that track back to them – no longer really applies. Facebook and Twitter have killed it, in favour of something flatter and much less based on the blogger’s personal brand.

Have Google ever met any foreigners?

The World’s Unsurprisingly Fastest-Growing Networking Platform, Google+, is getting stick from various corners for its naming policy. This formally restricts you to “Use your full first and last name in a single language”.

The idea behind it is sensible. G+ aims to be a combination of a professional network like LinkedIn, and a personal network like Facebook, with ‘circles’ ensuring your clients can’t see your Tequila Night photos and that your girlfriend’s mates don’t get spammed with your articles on social media marketing.

In both those cases, the connections and relationships that people have become meaningful *because* they use their real names. It’s one of the reasons why Facebook, despite now having 750 million users encompassing many utter idiots, hasn’t descended into the kind of horrible pseudonymous anarchy found on MySpace or Bebo. So banning people from calling themselves things like HotBloke1988 or BieberFan1997 is probably a good thing.

Similarly, and also sensibly, Google wants to have proper segmentation between users, interests, brands. This is a model which Facebook took some years to implement properly, leading to the occasional whinge and/or viral petition from silly people when their inappropriately-set-up page gets taken down because it’s using a personal profile to advertise a product or political cause. Part of the reason for Google to be so hardcore about enforcing real names in the initial roll-out is to make sure that people understand from Day 1 that You Can’t Do That, and need to set up the proper sort of page for whatever you’re trying to spruik.

While I understand that this annoys some pseudonymous writers, I think it’s a sacrifice worth making in the short term to ensure that Google+ starts and continues as a place based around actual relationships and trust, like Facebook and LinkedIn. In the long term, there’s no reason why they shouldn’t adopt brand identities and share in G+ that way – there’s no real difference between ‘Skud’ and ‘TechCrunch’, in the sense that they’re both content sources defined entirely by what they publish online.

However, the real problem with this part of the G+ roll-out is the massively ham-fisted way in which it deals with anyone whose name doesn’t fit the Anglo-Saxon convention of Firstname Middlename1 Middlename2 Familyname. Which accounts for, erm, almost everyone in China, a sizeable proportion of the population of India, and everyone in Spanish-speaking and Russian-speaking countries. And would have been completely avoidable if even *one* developer from *one* of these countries had worked on the G+ project.

If you’re starting a new social network, it’s straightforward to build a database that has 12 name fields instead of 2. This allows you to account for any combination of names in any language, while also allowing your users to select which of those names are displayed in the default profile, and in which order.

So a Chinese person with a Western nickname could write their name as Lee (Familyname) Wan-Wing (Firstname) Robert (Nickname), and then choose to display their name as “Lee Wan-Wing” or “Robert Lee” depending on their preferred convention. The default to display would be Firstname Familyname, but any others would count. Similarly, a Spanish person could enter their name as Javier (Firstname) Garcia (Familyname) Lopez (Matronymic), while a Russian would be Mikhail (Firstname) Sergeyevich (Patronymic), and a South Indian would be Prashant (Name) Kumar (Patronymic). This would make all names traceable and transparent, while also ensuring that everyone gets the opportunity to pick something that’s culturally appropriate.
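As a minimal sketch of that idea in Python: store each name part against a role, and let the user pick which roles are displayed and in what order. The role labels ("first", "family", "nickname" and so on) and the class itself are purely illustrative assumptions on my part, not a description of how Google or anyone else actually stores names.

```python
from dataclasses import dataclass, field


@dataclass
class PersonName:
    """A name stored as labelled parts, plus the user's chosen display order."""

    # Maps a role label (illustrative, e.g. "first", "family", "patronymic") to a name part.
    parts: dict = field(default_factory=dict)
    # Default display is Firstname Familyname; users can override it.
    display_order: tuple = ("first", "family")

    def display(self):
        """Join the chosen parts in the chosen order, skipping any that are absent."""
        return " ".join(
            self.parts[role] for role in self.display_order if role in self.parts
        )


lee = PersonName(
    {"family": "Lee", "first": "Wan-Wing", "nickname": "Robert"},
    display_order=("family", "first"),
)
print(lee.display())  # Lee Wan-Wing

lee.display_order = ("nickname", "family")
print(lee.display())  # Robert Lee
```

Every part stays on record whichever ones are displayed, which is what keeps names traceable while still letting people pick a culturally appropriate form.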

Given that Google employs 10,000 staff outside of the US, including many Indians and many Chinese people, it seems bizarre that this concern doesn’t appear to even have arisen during the G+ roll-out. The ways in which database design needs to differ for English-speaking and non-English-speaking users have been a massive concern in Internet circles for decades, as highlighted most obviously by the time taken to allow non-ASCII characters for domain names. Any multinational company has to deal with the “names don’t map onto English names” problem for its own staff, even if its customers are largely based in the west (surely there can’t be a software company in the world that doesn’t employ South Indians?).

The only explanation I can think of is that it simply didn’t occur to the senior managers in charge of Google+ that different people worldwide might have different naming concepts. And that none of the less senior foreigners raised the concept. God Bless America!

Baffling or flattering?

As if to add ammo to the fervent Marxists who’ve been criticising me for my slavish adherence to neoliberal economics lately [*], I’m going to admit that I’m a fan of The Economist on Facebook.

Not because it’s my favourite paper – I subscribe to the New Yorker, Private Eye and Crikey, and would subscribe to the Grauniad if it went PPV – but because it’s interesting, shapes debate, has a good Facebook presence, and the Facebook comments mechanism gives a better view of “what people think” than the “solely for ubergeeks and psychopaths” den of web comments.

One of the things that I’m looking at right now, both academically and professionally, is the challenge presented by dealing with things that have historically been marketed and customised territory-by-territory in a social media environment that’s global. The Economist provides an excellent example, since every week, it lists its covers on the Web.

Now, if you don’t commute far too often between the US and Other Places, you’re probably not aware that the Economist has covers in the plural: both in the US and outside the US, it purports to be a global newspaper (and, compared to US newspapers, it has a fair point). But it isn’t: there’s a US edition with specifically US-focused content, ads and cover, whereas the global edition only has a US cover if the most exciting thing occurring is actually in the US.

If the Economist admitted to its US readers “yes, actually, we do realise you’re a bunch of insular tits just as much as the rest of your countrymen; stop pretending you’re some kind of cosmopolitan international relations knowall just because you read a paper written by slightly-right-wing people in London instead of raging-right-wing fanatics at home; and we all know we only bother printing international news at all in the US version because otherwise we’d lose our USP; we know perfectly well – and it’s clear from our ad placings – that none of you lot read it”, then it might just about risk losing some of its mystique as an international oracle. Which would kill its whole point.

So for the Economist’s Facebook presence, where discriminating between visitors from different countries is hard, it definitely wouldn’t want to show a separate “US Edition” and “World Edition”. That would break the spell.

The way it has dealt with this is ABSOLUTELY FUCKING BRILLIANT. Every week, it adds a “Worldwide Excluding the UK, Europe & Asia Edition” and a “UK, Europe & Asia Edition”. That way, Americans – who are sufficiently geographically disendowed to realise that the world, in any meaningful sense, consists of North America, Europe and Asia – can keep the illusion that they’re reading the World Edition, unlike those silly Europeans and Asians who’ve got a customised edition to suit their own parochial concerns. And we (Asia edition is sold in Aus and NZ, obviously) can work out the conceit and laugh at the Americans.

Overall, this is a great win. Except for the poor sods in Canada, South America and Africa, who presumably have to make do with the lobotomised edition containing news that’s irrelevant. Although I suppose for the South Americans it might help them understand when they’ll next be invaded by CIA-backed guerrillas.

[*] my slavish adherence consisted of making the claim that “pretending that basic economics and tax are hard, if you’re someone who purports to understand postmodernist literary criticism, is embarrassing”. This isn’t because I rate one over the other, but simply because both neoclassical and Keynesian economics are Very Easy To Follow, whilst Derrida and Deleuze are The Opposite.