Tyler Moore and I have been looking into how phishing attackers locate insecure websites on which to host their fake webpages, and our paper is being presented this week at the Financial Cryptography conference in Barbados. We found that compromised machines accounted for 75.8% of all the attacks, “free” web hosting accounts for 17.4%, and the rest is various specialist gangs — albeit those gangs should not be ignored; they’re sending most of the phishing spam and (probably) scooping most of the money!
Sometimes the same machine gets compromised more than once. Now this could be the same person setting up multiple phishing sites on a machine that they can attack at will… However, we often observe that the new site is in a completely different directory, which strongly suggests that a different attacker has broken into the same machine, but in a different way. We looked at all the recompromises where there was a delay of at least a week before the second attack and found that in 83% of cases a different directory was used… and using this definition of a “recompromise” we found that around 10% of machines were recompromised within 4 weeks, rising to 20% after six months. Since there are a lot of vulnerable machines out there, there must be something slightly different about the machines that get attacked again and again.
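To make the “recompromise” test concrete, here is a minimal Python sketch. It assumes each phishing report is a (date, URL) pair for the same host, that the one-week gap is the only criterion, and that “different directory” means the directory part of the URL path differs. The function names and the example URLs are hypothetical and are not taken from the paper.

```python
from datetime import date, timedelta
from posixpath import dirname
from urllib.parse import urlsplit

# Assumption: "a delay of at least a week" is modelled as seven days.
MIN_GAP = timedelta(days=7)


def phish_directory(url):
    """Return the directory part of a phishing URL's path."""
    return dirname(urlsplit(url).path)


def is_recompromise(first, second):
    """True if the second report is at least a week after the first.

    Each report is a (date, url) pair for the same host, in time order.
    """
    return second[0] - first[0] >= MIN_GAP


def different_directory(first, second):
    """True if the two phishing pages sit in different directories."""
    return phish_directory(first[1]) != phish_directory(second[1])


# Hypothetical reports for one host.
first = (date(2009, 1, 1), "http://example.com/shop/images/bank/login.html")
later = (date(2009, 1, 10), "http://example.com/forum/cache/bank/login.html")
print(is_recompromise(first, later), different_directory(first, later))
```

A pair that passes both checks is the kind of case that suggests a second, independent attacker.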
For 2486 sites we also had summary website logging data from The Webalizer, because those sites had left their daily visitor statistics world-readable. One of the things The Webalizer records is which search terms were used to locate the website (these are available in the HTTP “Referer” header, which reveals what was typed into search engines such as Google).
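As an illustration of how a search term can be recovered from a referring URL, here is a short Python sketch. It assumes the search engine puts the query in a `q` parameter, as Google does; other engines may use a different parameter name. The function name and the example URL are hypothetical, and this is not The Webalizer's own code.

```python
from urllib.parse import parse_qs, urlsplit


def search_terms(referrer, param="q"):
    """Return the decoded search phrase from a referring URL, or None.

    `param` is the query-string key holding the search phrase; "q" is an
    assumption that fits Google-style URLs.
    """
    query = urlsplit(referrer).query
    values = parse_qs(query).get(param, [])
    return values[0] if values else None


# Hypothetical referrer as a web server might log it.
print(search_terms("http://www.google.com/search?q=exampleapp+version+1.024&hl=en"))
```

Referrers that carry no such parameter simply yield `None`, which matches the point that searches made without clicking through leave no trace in the logs.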
We found that some of these searches were “evil” in that they were looking for specific versions of software that contained security vulnerabilities (“If you’re running version 1.024 then I can break in”); or they were looking for existing phishing websites (“if you can break in, then so can I”); or they were seeking the PHP “shells” that phishing attackers often install to help them upload files onto the website (“if you haven’t password protected your shell, then I can upload files as well”).
In all, we found “evil searches” on 204 machines that hosted phishing websites and, in the vast majority of cases, these searches corresponded in time to when the website was broken into. Furthermore, in 25 cases the website was compromised twice and we were monitoring the daily log summaries after the first break-in: here 4 of the evil searches occurred before the second break-in, 20 on the day of the second break-in, and just one after the second break-in. Of course, where people didn’t “click through” from Google search results, perhaps because they had an automated tool, we won’t have a record of their searches. Nevertheless, even at the 18% incidence we can be sure of, searches are an important mechanism.
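The timing comparison for the 25 twice-compromised sites amounts to bucketing each evil search by whether it fell before, on, or after the day of the second break-in. A minimal Python sketch follows, assuming each case is a (search day, break-in day) pair; the function names and the dates are hypothetical.

```python
from collections import Counter
from datetime import date


def search_timing(search_day, breakin_day):
    """Place a search relative to the day of the break-in."""
    if search_day < breakin_day:
        return "before"
    if search_day == breakin_day:
        return "same day"
    return "after"


def timing_counts(pairs):
    """Count how many (search_day, breakin_day) pairs fall in each bucket."""
    return Counter(search_timing(s, b) for s, b in pairs)


# Hypothetical dates, for illustration only.
cases = [
    (date(2008, 5, 1), date(2008, 5, 3)),
    (date(2008, 5, 3), date(2008, 5, 3)),
    (date(2008, 5, 4), date(2008, 5, 3)),
]
print(timing_counts(cases))
```

With the counts reported above (4 before, 20 on the day, 1 after), 20 of the 25 searches, or 80%, fell on the day of the second break-in.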
The recompromise rates for sites where we found evil searches were a lot higher: 20% recompromised after 4 weeks, nearly 50% after six months. There are lots of complicating factors here, not least that sites with world-readable Webalizer data might simply be inherently less secure. However, overall we believe that this clearly indicates that the phishing attackers are using search to find machines to attack, and that if one attacker can find the site, then it is likely that others will do so independently.
There’s a lot more in the paper itself (which is well worth reading before commenting on this article, since it goes into much more detail than is possible here)… In particular, we show that publishing URLs in PhishTank slightly decreases the recompromise rate (getting the sites fixed is a bigger effect than the bad guys locating sites that someone else has compromised); and we also have a detailed discussion of various mitigation strategies that might be employed, now that we have firmly established that “evil searching” is an important way of locating machines to compromise.
Somewhat off topic: I just read Willem Buiter’s wishlist for regulating the financial sector (http://blogs.ft.com/maverecon/2009/02/regulating-the-new-financial-sector/) and it occurred to me that they could do with the insight of security engineers (which I am not) into the problem.
For example, would it be a good idea to follow the standard way of keeping airport screeners awake – inserting artificial ‘tests’ which happen more frequently than real crises? Maybe rather than prescribing what capabilities a bank needs to withstand a crisis, the regulator should actually introduce defaults into the system and see what happens.
I know at least some of you guys are interested in economics. Could you get yourselves invited into this regulatory design activity?
Pardon my pedantry, but
Your use of quote marks around “evil” is unnecessary.
How is “evil” defined?
Like any word it is defined by classic usage, and the usage of the most influential users, which in this case, are the writers of classic fairy tales. In which case:
Evil means harm
Evil means intent to willfully do harm for inadequate reason.
Evil means harm intentionally caused by a person.
Evil is the act of doing harm
An evil person is a person who might likely do harm for no very compelling reason.
Any characteristics or behavior indicative of intent to do harm or propensity to do harm is evil.
In which case, these searches are evil, no quote marks required, because the searcher intends to do harm, and the searches are indicative of intent to do harm.
Your use of quote marks suggests that your usage of the word “evil” is unduly influenced by moral philosophers, but philosophers do not define correct usage of words. Actual non-ironic usage by influential speakers and writers defines correct usage of words.
Spammers are evil, phishers are evil, people who break in to other people’s web sites to use them for their own purposes at the expense of the rightful owner are evil, no quote marks are necessary. Most of the time it is not that hard to say who is evil and why.
Worst “I’m off to Barbados” post ever 😉
Considering all the other “word” words being quoted, I think the use of “evil” is helpful.