Exploring bad passwords

If your password is in the file rockyou.txt then it’s a bad password. Password cracking software will find it instantly. (Use long, randomly generated passwords; staying off the list of worst passwords is necessary but not sufficient for security.)

The rockyou.txt file currently contains 14,344,394 bad passwords. I poked around in the file and this post reports some things I found.

To make things more interesting, I made myself a rule that I could only use command line utilities.

Pure numeric passwords

I was curious how many of these passwords consisted only of digits so I ran the following.

    grep -P '^\d+$' rockyou.txt | wc -l

This says 2,346,744 of the passwords only contain digits, about 1 in 6.

Digit distribution

I made a file of digits appearing in the passwords

    grep -o -P '\d' rockyou.txt > digits

and looked at the frequency of digits.

    for i in 0 1 2 3 4 5 6 7 8 9
    do 
        grep -c $i digits
    done

This is what I got:

    5740291
    6734380
    5237479
    3767584
    3391342
    3355180
    3118364
    3100596
    3567258
    3855490

The digits are distributed more evenly than I would have expected. 1’s are more common than other digits, but only about twice as common as the least common digits.

Longest bad passwords

How long is the longest bad password? The command

    wc -L rockyou.txt

shows that one line in the file is 285 characters long. What is this password? The command

    grep -P '.{285}' rockyou.txt

shows that it’s some HTML code. Nice try whoever thought of that, but you’ve been pwned.

A similar search for all-digit passwords show that the longest numeric passwords are 255 digits long. One of these is a string of 255 zeros.

Dictionary words

A common bit of advice is to not choose passwords that can be found in a database. That’s good advice as far as it goes, but it doesn’t go very far.

I used the comm utility to see how many bad passwords are not in the dictionary by running

    comm -23 sorted dict | wc -l

and the answer was 14,310,684. Nearly all the bad passwords are not in a dictionary!

(Here sorted is a sorted version of the rockyou.txt file; I believe the file is initially sorted by popularity, worst passwords first. The comm utility complained that my system dictionary isn’t sorted, which I found odd, but I sorted it to make comm happy and dict is the sorted file.)

Curiously, the command

    comm -13 sorted dict | wc -l

shows there are 70,624 words in the dictionary (specifically, the american-english file on my Linux box) that are not on the bad password list.

Smallest ‘good’ numeric password

What is the smallest number not in the list of pure numeric passwords? The following command strips leading zeros from purely numeric passwords, sorts the results as numbers, removes duplicates, and stores the results in a file called nums.

    grep -P '^\d+$' rockyou.txt | sed 's/^0\+//' | sort -n | uniq > nums

The file nums begins with a blank. I removed this with sed.

    sed -i 1d nums

Next I used awk to print instances where the line number does not match the line in the file nums.

    awk '{if (NR-$0 < 0) print $0 }' nums | less

The first number this prints is 61. This means that the first line is 1, the second line is 2, and so on, but the 60th line is 61. That means 60 is missing. The file rockyou.txt does not contain 60. You can verify this: the command

    grep '^60$' rockyou.txt

returns nothing. 60 is the smallest number not in the bad password file. There are passwords that contain ’60’ as a substring, but just 60 as a complete password is not in the file.

Related posts

2 thoughts on “Exploring bad passwords

Comments are closed.