Improve your password cracking with the rockyou archive

Last December rockyou.com was compromised and 32 million passwords were leaked. Imperva and others have published some basic statistics about these passwords, but much more can be learned from this archive. It gives very valuable information about how real users choose a password, and it can help you improve your cracking rules. For instance, here are a few observations obtained by simple grepping:


Rockyou.com users clearly love numbers: 54% of all passwords contain at least one digit and 16% of them are made entirely of digits. Out of all the passwords that contain at least one digit, 43% use either a single digit or two consecutive ones. 91% of these users put the one or two digits at the end of their password.
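If you want to reproduce this kind of count on your own list, here is a minimal Python sketch. The function name digit_stats and the exact regular expressions are my own assumptions about how to read the figures above (only ASCII digits are counted, and "one or two digits" is taken to mean a single run of one or two digits in the whole password); it is not the method used to produce the numbers in this post.

```python
import re

def digit_stats(passwords):
    """Return digit-related fractions for a list of passwords (hypothetical helper)."""
    def ratio(part, whole):
        return len(part) / len(whole) if whole else 0.0

    # Passwords containing at least one ASCII digit.
    with_digit = [p for p in passwords if re.search(r"[0-9]", p)]
    # Passwords made entirely of digits.
    all_digits = [p for p in with_digit if re.fullmatch(r"[0-9]+", p)]
    # Passwords whose only digits are one run of 1 or 2 digits.
    short_run = [p for p in with_digit
                 if re.fullmatch(r"[^0-9]*[0-9]{1,2}[^0-9]*", p)]
    # Among those, passwords where the run sits at the very end.
    at_end = [p for p in short_run if re.fullmatch(r"[^0-9]*[0-9]{1,2}", p)]

    return {
        "with_digit": ratio(with_digit, passwords),
        "all_digits": ratio(all_digits, passwords),
        "short_run_given_digit": ratio(short_run, with_digit),
        "at_end_given_short_run": ratio(at_end, short_run),
    }

sample = ["password1", "123456", "iloveyou", "abc12", "1abc", "a1b2"]
stats = digit_stats(sample)
```

On a real dump you would read the passwords line by line from the file instead of using the toy sample list.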


Punctuation characters are much less popular than numbers: fewer than 4% of all passwords contain at least one. When rockyou users chose a punctuation character, 21% of the time they picked a ‘!’ and 85% of the time they put it at the end.

Many more interesting patterns can be obtained from the rockyou archive, and all these patterns translate nicely into cracking rules for your favourite password cracker, which will greatly improve your cracking performance.

I believe the main problem with passwords is this: to select a password, most people first pick a word or an easy sequence they can remember, and then they modify it to comply with the local password policy. But these additional modifications are very similar across people: usually digits and punctuation go at the end, while capital letters come at the beginning. Other linguistic patterns are strong too: if people insert an opening parenthesis, most of the time they will close it within the same password (85% do in the rockyou archive).
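To make the "base word plus predictable modification" idea concrete, here is a small Python sketch of a candidate generator. The function name mangle and the choice of suffixes (one or two digits, optionally followed by a ‘!’) are illustrative assumptions drawn from the patterns above; it is not the rule syntax of any particular cracker.

```python
def mangle(word):
    """Yield candidates: optional leading capital, optional 1-2 digits, optional '!'."""
    bases = [word]
    capitalized = word[:1].upper() + word[1:]
    if capitalized != word:
        bases.append(capitalized)

    # No suffix, then single digits 0-9, then two-digit suffixes 00-99.
    tails = [""] + [str(d) for d in range(10)] + [f"{n:02d}" for n in range(100)]

    for base in bases:
        for tail in tails:
            yield base + tail
            yield base + tail + "!"

candidates = list(mangle("monkey"))
```

Each dictionary word expands into a few hundred guesses, which is tiny compared with brute force over the same length and character set.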

Also, people are lazy. If the password policy requires at least one capital letter, let’s use just one; after all, holding the shift key too long is kind of annoying. And if that painful administrator forces me to change my password every 2 months, why create a completely new password every time when I can just increment the last digit and pass the check?
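The "increment the last digit" habit is easy to exploit if you already know an old password. Here is a minimal Python sketch; the function name increment_trailing_number and the decision to treat the whole trailing digit run as one number (keeping its zero padding) are my own assumptions.

```python
import re

def increment_trailing_number(password, steps=3):
    """Return the next `steps` guesses obtained by bumping the trailing number."""
    match = re.search(r"[0-9]+$", password)
    if match is None:
        return []  # no trailing digits, nothing to increment
    stem = password[:match.start()]
    digits = match.group(0)
    width = len(digits)  # preserve zero padding, e.g. "09" -> "10"
    start = int(digits)
    return [stem + str(start + i).zfill(width) for i in range(1, steps + 1)]

guesses = increment_trailing_number("Summer7")
```

Here guesses holds "Summer8", "Summer9" and "Summer10", the most likely successors if the user rotated the password a few times.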

Now that large lists of real-world passwords are becoming more and more available, people are trying to automate the extraction of efficient rules from them. Two efforts worth mentioning are looking for hidden Markov models in passwords: the Markov generator by Solar Designer and the Probabilistic Password Cracker by Matt Weir.
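To give a rough idea of what learning from a password list means, here is a toy Python sketch of a first-order character Markov chain: it counts which character follows which in a training list, then scores a candidate by multiplying the observed transition frequencies. This is a deliberately simplified illustration under my own assumptions (plain Markov chain, no smoothing, made-up START and END markers); it is not the algorithm of either tool mentioned above.

```python
from collections import Counter, defaultdict

# Hypothetical boundary markers, assumed not to occur in real passwords.
START, END = "\x02", "\x03"

def train(passwords):
    """Count character-to-character transitions, including start and end."""
    counts = defaultdict(Counter)
    for pw in passwords:
        chars = [START] + list(pw) + [END]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts

def probability(counts, candidate):
    """Probability of the candidate under the trained chain (0.0 if unseen)."""
    chars = [START] + list(candidate) + [END]
    prob = 1.0
    for prev, nxt in zip(chars, chars[1:]):
        row = counts.get(prev)
        total = sum(row.values()) if row else 0
        if total == 0:
            return 0.0
        prob *= row[nxt] / total
    return prob

model = train(["aa", "ab"])
p_ab = probability(model, "ab")
```

A cracker built on this idea would try high-probability candidates first; a real implementation would also need smoothing so that unseen transitions do not get a probability of exactly zero.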