Category Archives: Authentication

Measuring password re-use empirically

In the aftermath of Anonymous’ revenge hacking of HBGary over the weekend, some enterprising hackers used one of the stolen credentials and some social engineering to gain root access at rootkit.com, which has been down for a few days since. There isn’t much novel about the hack, but the dump of rootkit.com’s SQL databases provides another password dataset for research, though with just 81,000 hashed passwords it is an order of magnitude smaller than the Gawker dataset.

More interestingly, due to the close proximity of the hacks, we can compare the passwords associated with email addresses registered at both Gawker and rootkit.com. This gives an interesting data point on the widely known problem of password re-use. This new data seems to indicate a significantly higher re-use rate than the few previously published estimates.
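The comparison described above can be sketched as a join on email address. This is a minimal illustration rather than the method used in the post: it assumes plaintext passwords have already been recovered for both sites (two sites need not hash passwords the same way, so raw hashes are not directly comparable), and the function name, variable names and sample accounts are all made up.

```python
def reuse_rate(site_a, site_b):
    """Fraction of accounts registered at both sites that use the same password.

    site_a and site_b map email address -> recovered plaintext password
    (an assumption; leaked dumps usually contain hashes, not plaintexts).
    """
    # Normalise email case so the same address matches across sites.
    a = {email.lower(): pw for email, pw in site_a.items()}
    b = {email.lower(): pw for email, pw in site_b.items()}
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    same = sum(1 for email in shared if a[email] == b[email])
    return same / len(shared)


# Hypothetical sample data, not taken from either leak.
gawker = {
    "alice@example.com": "hunter2",
    "bob@example.com": "letmein",
    "carol@example.com": "qwerty",
}
rootkit = {
    "Alice@example.com": "hunter2",
    "bob@example.com": "s3cret!",
    "dave@example.com": "123456",
}

print(reuse_rate(gawker, rootkit))  # 0.5 (alice re-uses, bob does not)
```

A rate computed this way only covers accounts whose passwords were recovered at both sites, so it is likely to be biased towards weaker passwords.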

Another Gawker bug: handling non-ASCII characters in passwords

A few weeks ago I detailed how Gawker lost a million of their users’ passwords. Soon after this I found an interesting vulnerability in Gawker’s password deployment involving the handling of non-ASCII characters. Specifically, they didn’t handle them at all until two weeks ago. Instead, they mapped every non-ASCII character to the ASCII ‘?’ prior to hashing. This not only greatly limited the theoretical space of passwords, but meant that any password consisting of n non-ASCII characters was equivalent to ‘?’^n. Native Telugu or Korean speakers with passwords like ‘రహస్య సంకేత పదం’ or ‘비밀번호’ were vulnerable to an attacker simply guessing a string of question marks. An attacker may in fact know in advance that some users are from non-Latin countries (for example by looking at their email addresses), potentially making this more easily exploitable.
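The effect of this kind of lossy mapping is easy to reproduce. The sketch below is not Gawker’s code: it uses Python’s built-in ASCII encoder with errors="replace" to imitate the behaviour described, and SHA-256 as a stand-in for whatever hash the site applied. Both function names are hypothetical.

```python
import hashlib


def lossy_normalise(password: str) -> str:
    # errors="replace" substitutes "?" for every code point that ASCII
    # cannot represent, imitating the mapping described above.
    return password.encode("ascii", errors="replace").decode("ascii")


def stand_in_hash(password: str) -> str:
    # SHA-256 is only a stand-in; the point is that hashing happens
    # after the lossy mapping, so distinct passwords collide.
    return hashlib.sha256(lossy_normalise(password).encode("ascii")).hexdigest()


korean = "비밀번호"
guess = "?" * len(korean)

print(lossy_normalise(korean))                       # one "?" per code point
print(stand_in_hash(korean) == stand_in_hash(guess))  # True
```

ASCII characters, including spaces, pass through unchanged, so a password that mixes scripts keeps only its ASCII characters as entropy.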


The Gawker hack: how a million passwords were lost

Almost a year to the day after the landmark RockYou password hack, we have seen another large password breach, this time of Gawker Media. While an order of magnitude smaller, it’s still probably the second largest public compromise of a website’s password file, and in many ways it’s a more interesting case than RockYou. The story quickly made it to the mainstream press, but the reported details are vague and often wrong. I’ve obtained a copy of the data (which remains generally available, though Gawker is attempting to block listing of the torrent files) so I’ll try to clarify the details of the leak and Gawker’s password implementation (gleaned mostly from the readme file provided with the leaked data and from reverse engineering MySQL dumps). I’ll discuss the actual password dataset in a future post.

Passwords in the wild, part IV: the future

This is the fourth and final part in a series on password implementations at real websites, based on my paper at WEIS 2010 with Sören Preibusch.

Given the problems associated with passwords on the web that we outlined over the past few days, it is no surprise that academics have searched for years for new technology to replace passwords. This thinking can at times be counter-productive, as no silver bullets have yet materialised and the search has distracted attention from fixing the most pressing problems associated with passwords. Currently, the trendiest proposed solution is to use federated identity protocols to greatly reduce the number of websites which must collect passwords (as we’ve argued would be a very positive step). Much focus has been given to OpenID, yet it is still struggling to gain widespread adoption. OpenID was deployed at less than 3% of websites we observed, with only Mixx and LiveJournal giving it much prominence.

Nevertheless, we optimistically feel that real changes will happen in the next few years, as password authentication on the web seems to be becoming increasingly unsustainable due to the increasing scale and interconnectivity of websites collecting passwords. We actually think we are already in the early stages of a password revolution, just not of the type predicted by academia.


Passwords in the wild, part III: password standards for the Web

This is the third part in a series on password implementations at real websites, based on my paper at WEIS 2010 with Joseph Bonneau.

In our analysis of 150 password deployments online, we observed a surprising diversity of implementation choices. Whilst sites can be ranked by the overall security of their password scheme, there is a vast middle group in which sites make seemingly incongruous security decisions. We also found almost no evidence of commonality in implementations. Examining the details of Web forms (variable names, etc.) and the format of automated emails, we found little evidence that sites are re-using a common code base. This lack of consistency in technical choices suggests that standards and guidelines could improve security.

Numerous RFCs concern themselves with one-time passwords and other relatively sophisticated authentication protocols. Yet traditional password-based authentication remains the most prevalent authentication protocol on the Internet, as the International Telecommunication Union (itself a United Nations specialized agency that standardises telecommunications on a worldwide basis) observes in its ITU-T Recommendation X.1151, “Guideline on secure password-based authentication protocol with key exchange.” Client PKI has not seen widespread adoption, and tokens or smart cards are prohibitively expensive or inconvenient for most websites. While passwords have many shortcomings, it is essential to deploy them as carefully and securely as possible. Formal standards and guidelines of best practices are essential to help developers.


Passwords in the wild, part II: failures in the market

This is the second part in a series on password implementations at real websites, based on my paper at WEIS 2010 with Sören Preibusch.

As we discussed yesterday, dubious practices abound within real sites’ password implementations. Password insecurity isn’t only due to random implementation mistakes, though. When we scored sites’ password implementations on a 10-point aggregate scale, it became clear that a wide spectrum of implementation quality exists. Many web authentication giants (Amazon, eBay, Facebook, Google, LiveJournal, Microsoft, MySpace, Yahoo!) scored near the top, joined by a few unlikely standouts (IKEA, CNBC). At the opposite end were a slew of lesser-known merchants and news websites. Exploring the factors which lead to better security confirms the basic tenets of security economics: sites with more at stake tend to do better. However, doing better isn’t enough. Given users’ well-documented tendency to re-use passwords, the varying levels of security may represent a serious market failure which is undermining the security of password-based authentication.


Passwords in the wild, part I: the gap between theory and implementation

Sören Preibusch and I have finalised our in-depth report on password practices in the wild, The password thicket: technical and market failures in human authentication on the web, presented in Boston last month for WEIS 2010. The motivation for our report was a lack of technical research into real password deployments. Passwords have been studied as an authentication mechanism quite intensively for the last 30 years, but we believe ours was the first large study into how Internet sites actually implement them. We studied 150 sites, including the most visited overall sites plus a random sample of mid-level sites. We signed up for free accounts with each site, and using a mixture of scripting and patience, captured all visible aspects of password deployment, from enrolment and login to reset and attacks.

Our data (which is now publicly available) gives us an interesting picture of the current state of password deployment. Because the dataset is huge and the paper is quite lengthy, we’ll be discussing our findings and their implications from a series of different perspectives. Today, we’ll focus on the preventable mistakes. In academic literature, it’s assumed that passwords will be encrypted during transmission, hashed before storage, and that attempts to guess usernames or passwords will be throttled. None of these is widely true in practice.
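Two of the three textbook assumptions, hashing before storage and throttling of guesses, can be sketched in a few lines. This is an illustrative outline only and not taken from any site we studied: the iteration count, failure limit and lockout window are arbitrary assumptions, the Throttle class is hypothetical, and encryption during transmission (TLS) is outside what a snippet like this can show.

```python
import hashlib
import hmac
import os
import time

ITERATIONS = 100_000   # illustrative work factor, an assumption
MAX_FAILURES = 5       # illustrative limit, an assumption
LOCKOUT_SECONDS = 300  # illustrative window, an assumption


def hash_password(password, salt=None):
    """Return (salt, digest) using salted PBKDF2-HMAC-SHA256."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected):
    _, digest = hash_password(password, salt)
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(digest, expected)


class Throttle:
    """Blocks further guesses for a while after repeated failures."""

    def __init__(self):
        self.failures = {}  # username -> (failure count, time of last failure)

    def allowed(self, username, now=None):
        now = time.time() if now is None else now
        count, last = self.failures.get(username, (0, 0.0))
        return count < MAX_FAILURES or now - last >= LOCKOUT_SECONDS

    def record(self, username, success, now=None):
        now = time.time() if now is None else now
        if success:
            self.failures.pop(username, None)
        else:
            count, _ = self.failures.get(username, (0, 0.0))
            self.failures[username] = (count + 1, now)
```

A login handler would call allowed() before verify_password() and record() afterwards. Throttling per username alone does not stop an attacker who tries one common password against many accounts, so a real deployment would probably also track failures per source address.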


PINs and the burden on customers

A survey by the Consumers’ Association shows that 10% of cardholders write down or share their PIN. This high proportion surely raises serious doubt about whether it’s fair for banks to claim that such people are “grossly negligent” even if the PIN is well disguised (for example, as part of a phone number in an address book with hundreds of other numbers). And if banks don’t want disabled people to share PINs with carers, they ought to come up with an alternative, or be held to account under disability discrimination laws.

Interestingly, Mark Bowerman (PR for the banks) says in this article that customers should not use the same PIN for multiple cards. We heard him on radio saying exactly the opposite a few years ago. Now he tells people to change PINs to something easy to remember (and easier for criminals to guess).

By giving customers contradictory and impractical advice, the banks are placing an unmeetable burden on them.

The banks also frequently give advice that is simply wrong. Look, for example, at this video by Barclays showing how to enter your PIN at a merchant terminal!

Evaluating statistical attacks on personal knowledge questions

What is your mother’s maiden name? How about your pet’s name? Questions like these were a dark corner of security systems for quite some time. Most security researchers instinctively think they aren’t very secure. But they have still gained widespread deployment as a backup to password-based authentication when email-based identification isn’t available. Free webmail providers, for example, may have no other choice. Unfortunately, because most websites rely on email when passwords fail, and email providers rely on personal knowledge questions, most web authentication is no more secure than personal knowledge questions. This risk has gotten more attention recently, with high profile compromises of Paris Hilton’s phone, Sarah Palin’s email, and Twitter’s corporate Google Documents occurring due to guessed personal knowledge questions.

There’s finally been a surge of academic research into the area in the last five years. It’s been shown, for example, that the answers to these questions are easy to look up online, often found in public records, and easy for friends and acquaintances to guess. In a joint work with Mike Just and Greg Matthews from the University of Edinburgh published this week in the proceedings of Financial Cryptography 2010, we’ve examined the more basic question of how secure the underlying answer distributions are to statistical guessing. Put another way, if an attacker wants to do no target-specific work, but just guess common answers for a large number of accounts using population-wide statistics, how well can she do?
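One simple way to make that question concrete is to ask what fraction of accounts fall if the attacker tries only the k most common answers against each one. The sketch below is an illustration of the idea, not the metric from the paper: it assumes the attacker knows the population-wide answer distribution exactly, and the surnames are a made-up sample.

```python
from collections import Counter


def top_k_success(answers, k):
    """Fraction of accounts broken by trying the k most common answers on each."""
    # Normalise case and whitespace, as a lenient answer checker might.
    counts = Counter(a.strip().lower() for a in answers)
    covered = sum(n for _, n in counts.most_common(k))
    return covered / len(answers)


# Hypothetical answers to "What is your mother's maiden name?"
sample = [
    "Smith", "smith", "Jones", "Smith", "Brown",
    "Taylor", "Jones", "Smith", "Khan", "Lee",
]

print(top_k_success(sample, 1))  # 0.4
print(top_k_success(sample, 3))  # 0.7
```

The more skewed the distribution, the faster this fraction grows with k, which is why a handful of guesses per account can pay off across a large user base.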
