password dictionary generator

I needed to generate a password dictionary that would cover every possible combination for a defined character set.  Python was the first language I learned, so that is where I planned to start.  Before writing the program I decided to Google whether anyone else had tackled this problem in Python, and it turned out someone had.  Siph0n posted his Python code to create a password dictionary over at the BackTrack forums.  I wanted to post it here as a mirror and to discuss the implications of creating a password dictionary with every possible combination.  Below is the Python code.

f=open('wordlist', 'w')

def xselections(items, n):
    if n==0: yield []
    else:
        for i in xrange(len(items)):
            for ss in xselections(items, n-1):
                yield [items[i]]+ss

# Numbers = 48 - 57
# Capital = 65 - 90
# Lower = 97 - 122
numb = range(48,58)
cap = range(65,91)
low = range(97,123)
choice = 0
# Re-prompt on anything that is not 1-7, including non-numeric input
while str(choice).strip() not in [str(n) for n in range(1,8)]:
    choice = raw_input('''
    1) Numbers
    2) Capital Letters
    3) Lowercase Letters
    4) Numbers + Capital Letters
    5) Numbers + Lowercase Letters
    6) Numbers + Capital Letters + Lowercase Letters
    7) Capital Letters + Lowercase Letters
    : '''
)

choice = int(choice)
poss = []
if choice == 1:
    poss += numb
elif choice == 2:
    poss += cap
elif choice == 3:
    poss += low
elif choice == 4:
    poss += numb
    poss += cap
elif choice == 5:
    poss += numb
    poss += low
elif choice == 6:
    poss += numb
    poss += cap
    poss += low
elif choice == 7:
    poss += cap
    poss += low

bigList = []
for i in poss:
    bigList.append(str(chr(i)))

MIN = raw_input("What is the min size of the word? ")
MIN = int(MIN)
MAX = raw_input("What is the max size of the word? ")
MAX = int(MAX)
for i in range(MIN,MAX+1):
    for s in xselections(bigList,i): f.write(''.join(s) + '\n')

# Close the file so any buffered output is flushed to disk
f.close()
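
The script above is written for Python 2 (raw_input, xrange). For readers on Python 3, here is a minimal sketch of the same enumeration. It swaps the recursive xselections for the standard library's itertools.product, which yields words in the same order. The name generate_words is my own and is not part of Siph0n's code.

```python
import itertools
import string


def generate_words(charset, min_len, max_len):
    """Yield every string over charset whose length is min_len..max_len."""
    for length in range(min_len, max_len + 1):
        # product(charset, repeat=length) enumerates all length-long
        # selections with repetition, first character varying slowest.
        for combo in itertools.product(charset, repeat=length):
            yield ''.join(combo)


# Tiny demonstration: digits only, lengths 1 to 2.
words = list(generate_words(string.digits, 1, 2))
print(len(words), words[:3], words[-1])
```

Because generate_words is a generator, the words can be written to a file or printed one at a time without ever holding the whole dictionary in memory.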

If you’re familiar with programming, and Python in particular, you could just grab the code and roll, but I really wanted to discuss the usefulness of an application like this.  First I will cover the basics of getting the program up and running, then move on to other implications such as time, storage, and the usefulness of a password dictionary.

How to install and use the program

  1. You must have Python installed.  The script uses Python 2 syntax (raw_input and xrange), so run it with a Python 2 interpreter.  If you’re running Linux (you should be) then it’s probably already installed.  If you’re running Windows then you will have to download Python.
  2. Now that you have Python installed, simply copy and paste the code above into a text file and name it passwordDictionaryGenerator.py.  The .py extension is the standard convention for Python source files.
  3. Modify the appropriate variables within the program.  The only variables you may want to modify are numb, cap, and low.  These variables contain the ASCII code ranges for the numbers and letters used to generate your dictionary.  For example, you may want your dictionary to contain only a-k rather than a-z; I’ll leave that up to you.
  4. Now to run the program simply type
    python passwordDictionaryGenerator.py

    You will have to answer the questions about which character set you want to use and the minimum and maximum word length for your dictionary.  Once you answer the questions it may seem like the program isn’t doing anything, but it is; it will return you to the command line once it has completed.  The output will be a file called wordlist.
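
To illustrate step 3, here is a small sketch of how a restricted range such as a-k maps to ASCII codes. In the script itself the equivalent change would be low = range(97,108). The helper char_range is a hypothetical name used only for this example, and the snippet is written for Python 3.

```python
def char_range(first, last):
    """Return the characters from first to last inclusive, by ASCII code."""
    return [chr(code) for code in range(ord(first), ord(last) + 1)]


# 'a' is code 97 and 'k' is code 107, so the range must stop at 108.
print(ord('a'), ord('k'))
print(''.join(char_range('a', 'k')))
```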

So now you have this cool program that can generate a password dictionary for you.  How big (MB, GB, TB, etc.) will this dictionary be?  How long will it take to generate?  Let’s tackle the size question first, as it will help us calculate the time as well.  The key to calculating the size is a math concept called permutations, specifically permutations with repetition, since a character may be reused in every position.  A simple equation gives the number of words for a particular character set and word length.  The basic equation is below.

n^r

n = size of the character set (e.g.  a-z + A-Z + 0-9 = 62)

r = length of the word

Now you’ll have to calculate n^r for each length to get every possible combination.  So for passwords up to 6 characters long your equation will look like the following.

n^6 + n^5 + n^4 + n^3 + n^2 + n^1 = every possible combination

Let’s try an example where our character set is a-z (n = 26) and our password is no longer than 6 characters (r = 1-6).  How many words will be in our dictionary?

26^6 + 26^5 + 26^4 + 26^3 + 26^2 + 26^1 = 321,272,406 = total # of words
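
The sum above is easy to check in code. This is a minimal Python 3 sketch; count_words is a name I made up for the example.

```python
def count_words(n, min_len, max_len):
    """Number of words over an n-character set with lengths min_len..max_len."""
    return sum(n ** r for r in range(min_len, max_len + 1))


print(count_words(26, 1, 6))  # 321272406
```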

So now we understand how to calculate the total number of words in our dictionary.  How does that relate to the size?  Each word is written on its own line, so assuming single-byte characters and a one-byte newline, a password of length x takes x + 1 bytes.  To get the size for a particular length, multiply the number of words of that length, n^r, by the size of one line, r + 1; then add up the sizes for every length.  That may have sounded confusing, so hopefully the following graph clears it up some.

I went ahead and generated this dictionary; it took about 30 minutes.  The size matched my calculations.
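
The size calculation described above can be sketched as follows, assuming one byte per character plus a one-byte newline per line. The function name dictionary_size_bytes is hypothetical, and the code is Python 3.

```python
def dictionary_size_bytes(n, min_len, max_len):
    """Total file size: n**r words of length r, each taking r + 1 bytes."""
    return sum((n ** r) * (r + 1) for r in range(min_len, max_len + 1))


size = dictionary_size_bytes(26, 1, 6)
print(size)                       # 2236055952
print(round(size / 2 ** 30, 2))   # size in GiB
```

For the a-z, length 1-6 example this formula gives a little over 2 GB.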

So now you have the basic formula for calculating the size of your desired dictionary.  Let’s take a look at a larger example just to satisfy our curiosity.  Let’s assume the following parameters.

  • character set = a-z, A-Z, & 0-9
  • password length = 1-8
  • n = 62
  • r = 1 – 8

With these parameters the size of our dictionary jumps to 1,800 terabytes or 1.8 petabytes. Take a look at the chart below.
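
To see where the growth comes from, here is a short Python 3 sketch that breaks the size down by word length for these parameters. It uses the same assumption as before (r + 1 bytes per line), and size_by_length is a name chosen for this example. The total lands close to the figure quoted above when a terabyte is taken as 2^40 bytes.

```python
def size_by_length(n, min_len, max_len):
    """Map each word length r to the bytes needed for all n**r words."""
    return {r: (n ** r) * (r + 1) for r in range(min_len, max_len + 1)}


sizes = size_by_length(62, 1, 8)
for r, size in sizes.items():
    print(f"length {r}: {size:>25,} bytes")

total = sum(sizes.values())
print(f"total: {total / 2 ** 40:,.0f} TiB")
```

Nearly all of the space is consumed by the longest words, so each extra character multiplies the storage by more than the size of the character set.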

You can see how quickly the size jumps up. I don’t know about you, but I don’t have a two-petabyte drive lying around. Generating this dictionary is simply infeasible. I did calculate the time it would probably take to generate this dictionary, and it came out to about 11 days. So the time to create such a dictionary is nothing compared to the storage required to house it. On top of that, I don’t know of too many applications that can handle such a large dictionary as input, so that’s another factor to keep in mind when generating your dictionary.

Calculating the time it takes to generate these dictionaries I’ll leave up to you.  The basic idea is that you can run the Python program for a particular password length for a set amount of time and then extrapolate from there.  For the most part time isn’t really a factor, but storage is.

The concepts I’ve talked about here are nothing new. The idea of generating a password dictionary came to me and my coworkers as we were thinking of ways to test a WPA wireless infrastructure. Attacking WPA can be done offline, so we were thinking of generating a dictionary to accomplish this. Hours later we realized how difficult it is to generate such a large dictionary. This was actually good news, because it meant that an attacker would have an extremely difficult time attacking a WPA access point with a complex password. Renderman and the Church of Wifi thought about this problem well before I did and came up with some rainbow tables to help test the strength of your WPA access point. You can’t really create a dictionary with every single combination for a lengthy password; your best bet is to create a dictionary with the most “common” passwords, which is no easy task either.
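
The extrapolation idea mentioned above (time a short run, then scale up) might look something like this in Python 3. It is a rough sketch under stated assumptions: it measures only in-memory generation with itertools.product and ignores disk write speed, so real runs will be slower. The names measure_rate and estimate_seconds are hypothetical.

```python
import itertools
import string
import time


def measure_rate(charset, length):
    """Generate every word of one length and return words per second."""
    start = time.perf_counter()
    count = 0
    for combo in itertools.product(charset, repeat=length):
        ''.join(combo)
        count += 1
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very fast or coarse timers.
    return count / elapsed if elapsed > 0 else float('inf')


def estimate_seconds(n, min_len, max_len, words_per_second):
    """Extrapolate total generation time from a measured rate."""
    total_words = sum(n ** r for r in range(min_len, max_len + 1))
    return total_words / words_per_second


rate = measure_rate(string.ascii_lowercase, 3)
print(f"measured rate: {rate:,.0f} words/second")
print(f"a-z, length 1-6: about {estimate_seconds(26, 1, 6, rate):,.0f} seconds")
```

The measured rate depends heavily on the machine, so treat the result as an order-of-magnitude estimate only.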

The moral of the story is to use lengthy, complex passwords drawn from a large character set, but you knew that already. So I just suggested that this program is somewhat useless; well, it is and it isn’t. You can use this program to generate a small dictionary, but a large dictionary (greater than a couple of terabytes) is probably out of the question. So use this program and let me know what your results are; I’m always interested in your feedback. Happy cracking.

29 Responses to “password dictionary generator”

  1. eskisehirli Says:

    hi
    before all thanks for this script

    im using vista 64 bit with intel
    trying to run this script but it says …..line 4 identation error expected an indentend block

    how can i solve this ?

    thank you

  2. travis Says:

    eskisehirli,

    good catch, i’ve run the code but not the version i copied and pasted into this article. python relies on indentations to differentiate sections of code which is why you were getting the error, there’s no indentations. i fixed the issue, give it a shot now.

  3. steve_j Says:

    Firstly, thanks for a very helpful script.
    I am looking to use this code to generate every combination of upper case characters, (AAAAAA to ZZZZZZ, 6 characters), which i know the program is already capable of, however I wish to precede these 6 characters with AA so that the resulting dictionary contents are AAAAAAAA to AAZZZZZZ. Is this possible?

    Many thanks

  4. Peekr Says:

    A great article that discusses the tool plus the downside as well.

    I would’ve liked to see additional info on using the dictionary, say with Hydra to crack routers login pages and it’s downside.

  5. travis Says:

    Peekr,

    I have given up on Hydra and now use Medusa. I teach network security and was giving a demo of Hydra brute forcing ftp. Hydra missed some of the ftp servers, for what reason I don’t know? Medusa seems to be more reliable for me.

    When it comes to using a dictionary with Medusa I have only tried smaller dictionaries (around 100 passwords). Medusa or Hydra may choke when you try to feed them a 200 MB file, not sure because I haven’t tried. In most scenarios at work I don’t have the time to let Medusa bang away at a protocol, typically I’m checking for low hanging fruit with Medusa and will only let it run for a couple of days at the most.

    Great suggestion though, I’ll look into what Medusa can handle and when/where it makes sense to use.

  6. help Says:

    i need help generating an 13 character sector to every possible combination cant quite figure this out so assistance would be nice thanks acdefginopsux these are the characters it can go all the way down to 2 characters up to the 13 listed if you can help email me

    the_vicious_one@hotmail.com

  7. help Says:

    also the password field I’m trying to make this list for requires you use a non alphabetical character in between each letter except at the begging and end so I’ve chosen % if you can incorporate this it would be greatly appreciated but it is not necessary as i can add it when I’m entering the fields manually.

  8. richie Says:

    this is a great tool however, i was wondering if it couldn’t be modified to pass the output directly to another application to be processed and ignored/accepted as opposed to writing the wordlist file, as this would greatly reduce the amount of space required. just a thought.

  9. travis Says:

    richie,

    i’m not sure what you mean, could you give me an example. most tools that i know of love a text file as input. for me the two tools that jump to mind are john the ripper and medusa. you could script the entire thing, meaning first run this python script and then run either john the ripper or medusa. i haven’t done this before but could definitely give it a shot.

  10. richie Says:

    @travis, thanks for your reply, to be honest, i didn’t have a specific tool in mind. i was just wondering if bruteforcing could be done on the fly without a txt file, as this would take the 2 terrabytes plus space requirement out of the equation. i realise this could drastically increase the amount of time required to test every possible permutation but it would mean that the hit and miss of dictionary attacks would be a thing of the past.

  11. Mihai Says:

    Hi how can i change the program so it can generate passwords just Capital Letters 8 lenght but NO CONSECUTIVE LETTERS? for ex: AGANFADO ?

  12. nick Says:

    Hello,

    Thank you very much for the script. I would like to add a symbols(!,@,#,$,%,^,&,*,(,),_,+,{,},|,?,,,.,,\,-) to the script, would please give me some guides.

    Thank you very much

  13. Mihai Says:

    On i mean No consecutive Duplicates letters .

  14. travis Says:

    nick,

    you could just add another variable that would grab those special characters, the code in this article may not be the way I would go about it but the problem was already solved. So below the variable “low” you could add a variable called “specialChars” and set it equal to “range(33,48) + range(58,65)”. These values are pulled from an ASCII table. Once you’ve added this variable you will either have to add an eighth option or modify one of the seven options to include the specialChars variable. If this isn’t fully clear please let me know.

  15. Elliot Says:

    @Richie

    Old post, no one cares, but so you know:

    You can have python output to stdout and then pipe the output into whatever program (on linux of course). With aircrack attacking a wpa network it would look like:

    ./pwgenerator.py | sudo aircrack-ng -b -w -

    People pipe JTR into aircrack also, because JTR can use rules, incremental and external methods.

  16. Matej Says:

    realy thanks for this script. i have been lookin for it. ….. and sorry my bad english. 🙂

  17. poti Says:

    stumbled upon this site http://dazzlepod.com/uniqpass/ from full-disclosure; there are some common passwords that may be useful here – not sure about the paid list though!

  18. travis Says:

    poti,

    Thanks for the link, I have not used their password list but I’ll keep this handy for future reference.

  19. Tristan Says:

    hi

  20. Mufasa Says:

    Thanks for posting this. It was exactly what I was looking for and works great on my Mac. 🙂

  21. zeroday Says:

    hello how to generate A-Z – 0-9 with exceptions (!) eq: 000 not possible; with minimum letter or minimum digit… this is the problem… ty

  22. travis Says:

    Mufasa,

    Mac’s are like Linux in that way, Python comes already installed.

  23. travis Says:

    zeroday,

    Not sure I understand your problem, could you restate your question?

  24. zeroday Says:

    hi travis.. i would like to generate wordlist with charset AZ-09 (10 chars) with conditions… because conditions (filter) is very important (for me).
    ex: if word have three A or B etc… skip three 0 to.. skip
    How to make intelligent filter ?
    Thank you and sorry my english is very bad im french.
    +

  25. travis Says:

    zeroday,

    So you’re saying option 4 with min size 0 and max size 10 will not work for you? I think I know what you are trying to do but could you give me another example? Sorry for the back and forth.

  26. zeroday Says:

    HT5HM5T3ZK
    U9ZRT47CVB
    … > OK

    00000012AAA
    11111KJLQ0S
    … SKIP IT BECAUSE IT’S PROBABLY IMPOSSIBLE THAT IS THE KEY

    do u understand my friend? :))

  27. travis Says:

    zeroday,

    I think I understand you but not sure the best way to code that solution.

  28. me Says:

    Thank you so much for taking the time to put together such great explanation!

  29. pirater un compte facebook a distance Says:

    I need to to thank you for this good read!! I definitely loved every bit of it.

    I have you book marked to look at new stuff you post…
