Categories
dotnet programming reverse engineering

Reverse engineer an obfuscated .Net application

Some of the concepts I’ll be covering will be new to some people and may be hard to understand but for others who are familiar with this field will find the concepts simple. Hopefully no matter what your comfort level or experience you’ll get something out of this.

First I’ll give a brief intro to .Net and how it varies from other applications. There two main kinds of applications / programs that you install on your Windows 7 / XP / etc machine, these are either compiled or interpreted. Compiled applications can be a standalone executable but usually you’ll have to install it by clicking the usual “Install, Next, Next, Finish” on the other hand an interpreted application doesn’t require this. An interpreted application comes as an EXE but an interpreted application does require an “application virtual machine”, the interpreted application doesn’t run natively on your machine instead the virtual machine takes care of all the execution. In this case the .Net framework is the application virtual machine we’ll need to have installed. You may have downloaded an EXE in the past and got the message that it requires “.Net framework 3.5” that’s because you downloaded an interpreted application. Java is the very similar to the .Net framework in that it employs a virtual machine as well which explains why almost everyone has to have both Java and .Net installed on their windows machine.

Compiled apps are in assembly machine language and we don’t have source code and going from compiled assembly back to original source code is difficult which is why companies like microsoft and apple only give you compiled applications because they want to make it hard for you to get original source code. With interpreted applications it’s quite different, for the most part it’s easy to decompile interpreted applications back into their original source code. This is true generally because the application virtual machine expects much more information about the application it’s about to compile and execute which makes going in reverse easier as well. I may not have explained that well but hopefully this stackoverflow thread will shed some light.

So by nature of having a .Net application it’s easier to get source code from the compiled exe which makes it easier to reverse engineer. Some developers and most users are not aware how easy it is to get source code from applications that run in a virtual machine. Developers that are aware of this sometime utilize obfuscation techniques so that competitors aren’t easily able to obtain source code. Obfuscation is simply a way of hiding something you don’t want others to know about. Typically it goes “source code –> obfuscation –> hard to understand (garbage)”. This is a simple explanation of obfuscation and some techniques are better than others. There are a handful of obfuscation programs on the market that developers can use to hide their code. In the example I’ll be showing you the developer used an obfuscation technique but this isn’t going to stop us from reversing the program then modifying it to our content.

So the first thing you’ll need to do is determine if the exe you have is actually a .Net application and not some other compiled application. There are a couple of tools but the one I like the most is CFF Explorer. Once installed you can right click on any exe and it’ll try and determine what language and compiler was used to make the executable. Let’s browse to firefox.exe, right click and choose CFF Explorer.

From the output of CFF Explorer we can tell that the language Visual C++ was used to create this executable and that it’s a PE (portable executable) file type. PE’s are normally compiled using one of the C based family languages. This differs from application virtual machine compiled executables such as a .Net application. I’ve created a small .Net application called GuessPassword.exe that we’ll discuss in just a bit but first let’s look at the output from CFF Explorer on GuessPassword.exe.

Here we can confirm that it’s a .Net application. So in both cases we have an EXE that we’re not completely sure what type of executable it is but with CFF Explorer we can easily find out. So step one was identifying that you have a .Net application, step two will be to decompile the EXE. There are numerous tools out there that can decompile .Net applications. One of the more popular but paid tools is Reflector but an awesome free open source tool that I’ll be going over is ILSpy. So grab the GuessPassword.exe here and open it up in ILSpy. Keep in mind you’ll need to either have or install the .Net framework for all of this to work. Below is a screenshot of what GuessPassword.exe looks like in ILSpy.

ILSpy will automatically decompile the application, not all at once put as you go down into the tree of the application. So the screenshot above may not mean much to you if it’s your first time looking through a decompiled .Net application. There are a couple of things to keep in mind. One only focus on items within the tree of the application you are reviewing, in this case GuessPassword.

The items that I’ve marked as ignore are dependencies of GuessPassword.exe that ILSpy pulls in. Secondly you only really need to focus on the “pink bricks” inside of ILSpy. Most of the other components of the tree can be ignored as they’re simply tying up loose ends such as defining text boxes the “real” code lies in the pink bricks.

With this being said go ahead and load GuessPassword.exe into ILSpy and click on the first pink brink named “button1_Click(object, EventArgs): void”, you should see the following.

This is a very simple application and from the code above you can probably guess what happens. The application checks to see if the password entered into the text box is “monkey”, if it is the message “Correct” is presented if it isn’t then the message “Incorrect” will be seen. This is bad practice as you should never hard code passwords into your application but I wanted to show you what it would look like when you’re asking for input from the user. Go ahead and run GuessPassword.exe and give it a shot.

So this was fairly straight forward and you can see how easy it is to decompile, review code, reverse engineer, and determine ways to subvert the logic of the application. Even though this is considered straight forward the same concepts apply on all applications. At this point we could actually modify this compiled application and change the password “monkey” to anything we like or make it so that we don’t even need a password, more on that later.

Now it’s time to take our skills to the street. I randomly pulled down an application from download.com called SafeAsHouses. This application is a password safe that keeps all of your passwords in one place, or at least that’s the plan. Our goal is to reverse engineer this application so that we can subvert the authentication and glean all passwords. You can grab a copy of SafeAsHouses over at download.com but I’ve also saved a copy here in case it gets lost in the internets. If you grab it from download.com then the initial password = password and you can change it from there but my copy in the tools directory has the password of “monkey”. Next let’s throw SafeAsHouses into ILSpy and see what we get.

As you can see from the screenshot SafeAsHouses immediately shows some signs of obfuscation. In a normal situation the pink bricks will contain the name of the class used, in the case of GuessPassword.exe it’s “Form1” but SafeAsHouses is hiding this information. Let’s dig further and compare the decompilation of the first pink brick / class for both GuessPassword and SafeAsHouses, first GuessPassword.

Now SafeAsHouses.

Once again we see signs of obfuscation. Anywhere you see a gray box is a sign of obfuscation and this obfuscation technique is essentially hiding names within the code such as variable names. This won’t stop us but it does make the reversing process a little bit more difficult. So now you’ve successfully decompiled the application and see the signs of obfuscation, at this point you could start digging through code but it’s always to run the application to get a better understanding of what the application is doing.

So soon as you launch the application we see that it’s asking for a password, after typing in the correct password we’ll have access to all of our usernames and passwords. If we enter the incorrect password the application will show a message box saying “Password incorrect”.

So we’re trying to figure out how to get around the whole password authentication, that being said there are several areas we can focus on. The first is the fact that the application asks for input before allowing us full access into the application. The second is that the application will close if the input is incorrect. What strings can be seen during this process? Another thing to consider is the fact that the application presents you with one window for authentication then shows you a second window where your passwords are stored.  For each consideration you should ask yourself what would the code look like in a .Net application.

In the case of input it’s more than likely form input. In my simple GuessPassword.exe the code that takes is “this.textBox1.Text” where I provided the name textBox1 which is actually the default variable name give inside of Visual Studio. So in the application you’re trying to reverse you would need to look for a similar construct, this can be done manually or you can search through the decompiled code. Keep in mind when searching or viewing manually an obfuscated application you probably won’t see something similar to “this.Variable.Text”, at the very least an obfuscation tool will mangle “Variable” so keeping your eyes open for “this.something.Text” maybe your next best step.

So the application closes when an incorrect password is typed in, what does the code look like to close an application in .Net? For the most part the application will be closed with an Application.Exit or Environment.Exit so when you go to reverse the application those are the constructs you’ll need to keep in mind for when you search or manually review the decompiled code.

I mentioned “strings” earlier, one thing we notice when the application closes is “Password incorrect” and the title of the window is “ACCESS DENIED”. These strings have to be somewhere in the source code, they might be obfuscated but it’s definitely a good idea to search for them within the source code. If they aren’t in the source code then what other constructs generate these messages? If you decompile my GuessPassword.exe you’ll notice that my incorrect password message is “Incorrect” and that the title of the window is “Incorrect” but these are generated by MessageBox.Show so if the obfuscation tool hides the phrase “Incorrect” it may not hide the function “MessageBox.Show” so it’s just another thing to consider when trying to reverse a .Net application.

Last but not least is the fact that the application goes from one small window that takes your password to a second window that has all your passwords, how does the application do this and what does the code look like when this is performed in a .Net application? This can be handled in a number of different ways. Probably the most popular method is using the “window.Show” and “window.Hide” functions. Another option is using form opacity, form opacity is typically used to make a window more transparent but you could make the window completely transparent with this option as well. So when searching through code it’s a good idea to keep both of these functions in mind.

Now that I’ve laid out some things to consider now it’s time to put this to practice. For each consideration I’ve laid out you can manually look through the decompiled code for these constructs but ideally we’ll want to search for these constructs. At the time of this writing I’ve found a couple of ways to accomplish this. One is to decompile all the code of the application into a text file then search that text file with your favorite text editor. For this I prefer ILSpy to decompile into one text file then I can search inside that text file for any construct I like. First select the program you want to save into a file.

Next choose File > Save Code

Then save to a “single file”, you can save to a project but for searching through code I prefer one flat file.

Now that you have that single file you can open up your favorite text editor and search for any construct or item of interest. Go ahead and search for MessageBox.Show and see what hits you get.

So we see present a message box but these message boxes are all throughout the application so it’s hard to say that this is the functionality we’re looking for. Another thing to note is that the obfuscation is hiding what the message is with large negative numbers. Having all the code in one text file is helpful but it doesn’t put much context around what we’re trying to go after. Most developers don’t code inside one giant text file they usually develop inside an IDE such as Visual Studio which breaks up the code to help with contextual understanding. That being said whenever I’m looking at an application that can be decompiled I always save that decompiled code into text files so that I can search it at anytime for items of interest, IDE’s are nice but plain text rules.

Another great option to search for constructs is using the plugin CodeSearch for the .Net Reflector tool. What’s great about this tool is that if you’re already working inside Reflector or if Reflector is your favorite tool for assessing .Net applications then you have everything in one place. Let’s do the same messagebox.show search inside of the CodeSearch plugin. One thing you’ll need to keep in mind with the CodeSearch plugin is that it will only search through what you’ve clicked on in the left hand column. So if you want to search the entire tree of SafeAsHouses you’ll need to make sure you’re on the root of that application. Below you’ll see that Reflector tags SafeAsHouses as PasswordSafe, so first I clicked on PasswordSafe then performed my search in CodeSearch.

The nice thing about CodeSearch is that it will break down the results into the path were the results can be found. CodeSearch is case sensitive so that’s another thing you’ll need to keep in mind when performing searches. As you can see from the screenshot above we have quite a number of hits from “MessageBox.Show” this is probably because it’s a popular function to use in .Net applications. You can click through each result and it will take you to the location of where it found your search term, from there you’ll have to determine if that location has the functionality you’re looking for. Usually it’s best to search for all the relative constructs to get the best idea of where the functionality you’re looking for resides. For example you should also search for “this.*.Text” looking for form input, “Environment.Exit” when the application closes, “Opacity” “*.Hide” for hiding windows / forms, and “*.Show” for showing windows, etc. So it’s a good idea to search for all these terms as it relates to SafeAsHouses to find the code that actually does the authentication piece. In this case the search term “Opacity” does a really good job of narrowing down the location of the code we’re interested in.

So looking at the decompiled code in the top right column it appears that this is the authentication functionality we’ve been looking for in the SafeAsHouses application. Congratulations we found the functionality we’ve been searching for. It may not seem that obvious but let’s take a closer look, I prefer to use ILSpy in this case because it does a better job of showing the original code.

On line six the developer is assigning the variable to whatever you type into the password field, he’s then comparing that to the obfuscated value we see on line seven. If the password is correct you then enter the if statement, this is crucial you have to enter the if statement in order for the application to open / show you the passwords. I don’t completely understand the role of the opacity code but the code “base.Hide();” on line 17 hides the password window and the code “SO.Show” (where SO is the obfuscation) will show you the second window which contains all your passwords. Line 22 is the pop up message that tells you that your password is incorrect and line 23 closes the application because you entered the wrong password. So after looking at all the functionality in this section we can determine that this is definitely the section we need to focus on. Now we need to figure out how we can modify the code to bypass the authentication. The idea here is that someone has downloaded this application to their machine and use this application to store all their super secret passwords, what we want is to do is figure out all their passwords by modifying / patching the application. With a .Net application it’s fairly easy to modify to get your desired goal. There are a couple of tools that will allow us to modify a .Net application, GrayWolf and the Reflector plugin Reflexil. Both are great tools and must haves in your .Net reversing toolbox but for this demonstration I’ll stick to reflexil. I’m not going to discuss installing reflexil so please refer to the documentation on how to get that setup. Once the plugin is installed simply go to Tools > Reflexil inside of Reflector, you should see something similar to the screen shot below.

So the upper half of this screen shot is the authentication code we’ve identified for the SafeAsHouses application and the bottom half is reflexil. What reflexil shows you is the CIL (common intermediate language) often called IL for short. You can read through the wiki article that I’ve linked to but basically it’s the lower level language used by the .Net virtual machine. For example, you write your code in C Sharp, that code then gets converted to CIL. The CIL is ran through the virtual machine which got installed when you installed the .Net framework on your computer. It’s not important that you understand all the inner workings of the .Net framework all you really need to know is that if we want to modify a .Net application then having a general knowledge of CIL will come in handy. I wouldn’t try to completely understand CIL out of the box because it shouldn’t be too difficult to follow the CIL instructions and map them to the decompiled code. For instance we can compare lines 01, 02, 03 in the CIL to the decompiled code and piece together that System.Window.Forms.TextBox plus the other instructions are getting input from the user via a text box inside a windows form. This is also consistent with how the application behaves.

The next step is to use reflexil to modify the CIL then save those modifications. What we want to do is eliminate the need to provide a valid password, this can be accomplished a number of ways. You’ll notice in the CIL on line 07 there is a CIL instruction, “op_Equality”, that compares the user supplied password to the actual password associated with the application. You’ll notice on line 08 the OpCode instruction is “brfalse“, this essentially means that if the passwords don’t match (false) then go into that if clause which will pop up a message and the application will exit. You’ll also notice the operand for the brfalse opcode is “-> (36)”, this means branch to target on line 36 which jumps into the message box functionality. To get around providing the correct password we can simply change the condition from false to true that way if we provide the incorrect password it will open the application to whereas providing the correct password will cause the application to close. To change the CIL right click on brfalse and select edit.

Now change the opcode from brfalse to brtrue then click update.

Once you’ve modified the CIL instruction you’ll need to save your changes. To do this right click on the root of your application, then go to reflexil, then save as.

I typically stick with the *.Patched naming that way I don’t modify the original executable. After you’ve saved SaveAsHouses.Patched go ahead and double click on the application. It should work out that if you supply the incorrect password you’ll be authenticated but if you supply the correct password the application will close.

Congratulations, that’s it you’ve successfully reversed an obfuscated .Net application and modified the executable to subvert authentication. At this point you’ve accomplished your mission but you could have gone another route. Instead of changing the brfalse to brtrue we could have deleted all those instructions that way it doesn’t even check for the password. The code block below shows what can be deleted to achieve this functionality.

In order to delete this code within the application we’ll need to delete lines 1-28 within reflexil, you can do the usual select line one then while holding shift select line 28. After that right click and select delete then save the changes. Open your patched version of SafeAsHouses and type whatever you like into the password field, this can be the correct password or not it doesn’t matter. Congratulations you’ve accomplished another method to subvert functionality of a standalone .Net application. Just to prove we’ve removed that functionality of the application open up the patched version of SafeAsHouses in either Reflector or ILSpy, below is my screenshot that shows the removed code.

So there you have it, hopefully this explanation was helpful. I’ve been kind of hiding a secret Relfexil has an option called “Obfuscator search” which will automatically try and deobfuscate code. This tool actually works really well and it will deobfuscate SafeAsHouses for you automatically. Even though Reflexil can remove obfuscation on some applications it can’t do them all and even it could the concepts that I’ve covered still apply. So even if you have a deobfuscated application you’ll still need to identify certain constructs within the code to find the functionality you’re looking for.

I’m not here to bash on .Net it’s actually a great platform to quickly stand up applications but people need to be aware that it’s not incredibly difficult to reverse the application, tear it apart, and inject / subvert functionality. That’s all I got, if you see where I’ve made a mistake or left out some important information please let me know. Happy .Net hunting.

Categories
programming windows

Search windows open shares with python

It’s rare during a penetration test that I actually exploit a vulnerability to gain more information. Newcomers to my filed will often use the term “network security”. I don’t care about the network, have the network for all I care. What I’m more concerned about is the information inside the network. The better way to describe it is “information security”. Performing penetration tests one has to keep that in mind, yea it’s fun to exploit some user that’s running an old version of war-ftp but if that user doesn’t yield sensitive information then who cares to some extent.

I often see that professional penetration testers will highlight an open windows share that can be read or written to by everyone. They will often highlight other shares that are accessible by a large group such as Authenticated users. I don’t want to scoff at these types of open shares as they should be investigated by the business owner that created the open shares. The main thing to consider is what information lies within those open shares. Open shares are usually created for a reason, so that users easily share information. This is not bad unless the information in those shares is secret / classified material. To check for this possible sensitive information one would have to search all the files and folders in that share. Now you can use the cute little dog search feature inside of windows explorer to look for this information but using that your hands are somewhat tied. The search feature inside windows explorer actually does a nice job but if you wanted to automate the process to look at multiple shares and search for multiple terms then you’re out of luck. Because of this I wanted to script something that would automate the process. Powershell could have been an option but because I’m already familiar with python I stuck to what I know. This means that in order to run the script you’ll have to have python installed on windows. I could have written the script to work in Linux but that would have meant using cifs to map drives which seemed like more of a headache then just using python on windows.

You’ll need to open up a windows command prompt to run the script and it’s a good idead to add Python to the windows path. So the script takes two arguments. The first argument is the file containing all the shares that you want to search. The second argument is the file that contains all the terms you want to search for. So to run the script you would issue a command similar to below, where searchShares.py is the name of the python script.

python.exe  searchShares.py  shares.txt  searchTerms.txt

Your shares.txt file should look similar to below.

\\one\two
\\three\four\five
\\six\seven\eight\nine

Your searchTerms.txt file should look similar to below.

secret
password
username

In the example above the term “secret” will be recursively searched in all three shares. Then “password” will be recursively searched in all three shares, then so on and so on. The script will output any file, file name, or folder name that matches any of the search terms. Currently the script will read each file in binary format which means if it comes across a word document file (such as document.doc) it doesn’t open / read the file like microsoft word would. The current script reads each line of the binary file looking for your search term. Reading a text file as binary seems to work fine but reading in microsoft office documents as binary have different results. One thing I’ve noticed in my testing is that generally speaking it does just fine searching through a *.doc file but has trouble searching through a *.docx file. Binary searching is not ideal but it’s my current solution. Python has the capability to open microsoft office documents in a more native format but for my first go round I haven’t implemented that solution.

Once you run the script you will see output similar to below.

C:\temp>python searchShares.py shares.txt searchTerms.txt

Walking directory \\192.168.99.184\test

Found \\192.168.99.184\testtest.txt
Found \\192.168.99.184\testTravisAltmanResume.doc
Found \\192.168.99.184\test\onewordDoc1.docx
Found \\192.168.99.184\test\one\twopasswords.txt
Found \\192.168.99.184\test\one\two\threewordDoc2.docx
Searching file \\192.168.99.184\test\test.txt for term secret

Searching file \\192.168.99.184\test\TravisAltmanResume.doc for term secret

Searching file \\192.168.99.184\test\one\wordDoc1.docx for term secret

Searching file \\192.168.99.184\test\one\two\passwords.txt for term secret

Searching file \\192.168.99.184\test\one\two\three\wordDoc2.docx for term secret

Searching file \\192.168.99.184\test\test.txt for term password

Searching file \\192.168.99.184\test\TravisAltmanResume.doc for term password

Searching file \\192.168.99.184\test\one\wordDoc1.docx for term password

Searching file \\192.168.99.184\test\one\two\passwords.txt for term password

Searching file \\192.168.99.184\test\one\two\three\wordDoc2.docx for term password

Searching file \\192.168.99.184\test\test.txt for term username
Searching file \\192.168.99.184\test\TravisAltmanResume.doc for term username
Searching file \\192.168.99.184\test\one\wordDoc1.docx for term username
Searching file \\192.168.99.184\test\one\two\passwords.txt for term username
Searching file \\192.168.99.184\test\one\two\three\wordDoc2.docx for term username

This output on the command prompt is to given as a verbose message so that you know what’s going on with the script. The output on the command prompt will not tell you if it found a search term. The results of your searching is placed in a text file called output.txt located in the current directory. The content of output.txt should look similar to the following.

=== Directories or file names matching search criteria ===

\\192.168.99.184\test\one\two\passwords.txt

=== Files matching search criteria ===

found secret in file \\192.168.99.184\test\one\two\passwords.txt
found password in file \\192.168.99.184\test\one\two\passwords.txt

So you can see that it matches the file name as well as the contents of the file. One thing to keep in mind is that this script can take a while to run. There two factors that control how fast it runs, 1) Speed of the network and 2) Size (GB, MB, etc) of the share. It works best when your network is local and not in another city. The biggest factor is going to be the size of the share. Running this script on a major file sahre that is say 800 GB in size will take a very long time. Keep in mind you can specify specific directories, so instead of searching in the root share such as \\share\one maybe it’s a better idea to searh in \\share\one\two\three. So keep these factors in mind when running the script. Below is the script, simply cut and paste into your text editor of choice and save as searchShares.py


import os
import sys
import re

output = open('output.txt', 'a')
output.write('\n')
fileList = []
shareList = open(sys.argv[1])
eachShare = shareList.readlines();
for shares in eachShare:
    path = shares.rstrip('\r\n')
    print '\nWalking directory ' + path + '\n'
    for root, subFolders, files in os.walk(path):
        #print 'Indexing ' + root + '\n'
        for file in files:
            fileList.append(os.path.join(root,file))
            print 'Found ' + root + file
keywords = open(sys.argv[2])
searchTerm = keywords.readlines();
output.write('=== Directories or file names matching search criteria ===\n')
for term in searchTerm:
    strip = term.rstrip('\r\n')
    if any(strip in s for s in fileList):
        matching = [s for s in fileList if strip in s]
        for item in matching:
            output.write('\n' + item)
output.write('\n\n=== Files matching search criteria ===\n\n')
for term in searchTerm:
    strip = term.strip('\r\n')
    for item in fileList:
        print 'Searching file ' + item + ' for term ' + term
        searchFile = open(item, 'rb')
        for line in searchFile:
            if re.search(strip, line, re.IGNORECASE):
                output.write('found ' + strip + ' in file ' + item + '\n')
                break
    searchFile.close()
output.close()

Let me know if this works / doesn’t work and also let me know if you have any suggestions on how to make it better. One thing I might do in the future is to limit the types of files it searches to say only .txt, .doc, .xls, etc. Happy hunting for information on shares.

Categories
learning programming wireless

password dictionary generator

I had the need to generate a password dictionary that would cover every possible combination for a defined character set.  I first learned to program in Python so I was going to start there first.  Before writing the program I decided to Google and see if anyone else had tackled this problem via Python, turned out they had.  Siph0n posted his Python code to create a password dictionary over at the BackTrack forums.  I wanted to post it here as a mirror and to discuss the implications of creating a password dictionary with every possible combination.  Below is the Python code.

f=open('wordlist', 'w')

def xselections(items, n):
    if n==0: yield []
    else:
        for i in xrange(len(items)):
            for ss in xselections(items, n-1):
                yield [items[i]]+ss

# Numbers = 48 - 57
# Capital = 65 - 90
# Lower = 97 - 122
numb = range(48,58)
cap = range(65,91)
low = range(97,123)
choice = 0
while int(choice) not in range(1,8):
    choice = raw_input('''
    1) Numbers
    2) Capital Letters
    3) Lowercase Letters
    4) Numbers + Capital Letters
    5) Numbers + Lowercase Letters
    6) Numbers + Capital Letters + Lowercase Letters
    7) Capital Letters + Lowercase Letters
    : '''
)

choice = int(choice)
poss = []
if choice == 1:
    poss += numb
elif choice == 2:
    poss += cap
elif choice == 3:
    poss += low
elif choice == 4:
    poss += numb
    poss += cap
elif choice == 5:
    poss += numb
    poss += low
elif choice == 6:
    poss += numb
    poss += cap
    poss += low
elif choice == 7:
    poss += cap
    poss += low

bigList = []
for i in poss:
    bigList.append(str(chr(i)))

MIN = raw_input("What is the min size of the word? ")
MIN = int(MIN)
MAX = raw_input("What is the max size of the word? ")
MAX = int(MAX)
for i in range(MIN,MAX+1):
    for s in xselections(bigList,i): f.write(''.join(s) + '\n')

If you’re familiar with programming and Python in particular then you could just grab the code and roll but I really wanted to discuss the usefulness of an application like this.  First I will discuss the basics of how to get this program up and running but will eventually jump into other implications such as time, storage, and usefulness of a password dictionary.

How to install and use the program

  1. You must have Python installed.  If you’re running Linux (you should be) then it’s probably already installed.  If you’re running then Windows then you will have to download Python.
  2. Now that you have Python installed simply copy and paste the code above into a text file and name it passwordDictionaryGenerator.py.  The .py extension is needed because that’s how Python recognizes code that it’s suppose to execute.
  3. Modify appropriate variables within the program.  The only variables you may want to modify are numb, cap, and low.  These variables contain the ASCII equivalent ranges for the letters and numbers you will be using to generate your dictionary.  You may want to modify these variables so that your dictionary does not contain a-z but only a-k, I’ll leave that up to you.
  4. Now to run the program simply type
    python passwordDictionaryGenerator.py

    You will have to answer the questions about which character set you want to use and how long / short your password dictionary is going to be.  Once you answer the questions it may seem like the program isn’t doing anything but it is, it will spit you back to the command line once the program has completed.  The output will be a file called wordlist.

So now you have this cool program that can generate a password dictionary for you, how big (size MB, GB, TB, etc) will this dictionary be?  How long will it take to generate this dictionary?  Let’s tackle the size question first as it will help us calculate the time as well.  The key to calculating the size is a math term called permutations.  Permutations is a simple equation to determine the number of words for that particular character set and length of word.  The basic equation is below.

nr

n = total character set (e.g.  a-z + A-Z + 0-9 = 62)

r = length of the word

Now you’ll have to calculate nr for each length to get every possible combination.  So for a 6 digit long password your equation will look like the following.

n6 + n5 + n4 + n3 + n2 + n1 = every possible combination

Let’s try an example where our character set is a-z (n = 26) and our password is no longer than 6 (r = 1-6) digits, how many words will be in our dictionary?

266 + 265 + 264 + 263 + 262 + 261 = 321,272,406 = total # of words

So now we understand how to calculate the total number of words in our dictionary.  How does that relate to the size?  Well for the most part if the length of the password is x then the size in bytes will be x + 1 for that particular line.  Then all we have to do is multiply each nr times the size of that particular line to get the size for that particular length.  That may have just sound really confusing so hopefully the following graph clears that up some.

I went ahead and generated this dictionary, it took about 30 minutes.  Turns out the size matched my calculations.

So now you have the basic formula for calculating the size of your desired dictionary.  Let’s take a look at a larger example just to cure our curiosity.  Let’s assume the following parameters.

  • character set = a-z, A-Z, & 0-9
  • password length = 1-8
  • n = 62
  • r = 1 – 8

With these parameters the size of our dictionary jumps to 1,800 terabytes or 1.8 petabytes. Take a look at the chart below.

You can see how quickly the size jumps up. I don’t know about you but I don’t have a two petabyte drive lying around. Generating this dictionary is just infeasible. I did calculate the time it would probably take to generate this dictionary, it came out to be about 11 days. So the time to create such a dictionary is nothing compared to the storage required to house it. Not only that I don’t know to many applications that can handle a large dictionary as input, so that’s another factor you’ll have to keep in mind when generating your dictionary.

Calculating the time it takes to generate these dictionaries I’ll leave up to you.  The basic idea is that you can run the python program for a particular length password for a set amount of time and then extrapolate form there.  For the most part time isn’t really a factor but storage is. The concepts I’ve talked about here are nothing new. The idea of generating a password came to me and my coworkers as we were thinking of ways to test a WPA wireless infrastructure. Attacking WPA can be done offline so we were thinking of generating a dictionary to accomplish this. Hours later we soon realized the difficulty with generating such a large dictionary. This was actually good news because it meant that an attacker would have an extremely difficult time attacking a WPA access point with a complex password. Renderman and the Church of Wifi have thought about this problem way before I did and came up with some rainbow tables to help test the strength of your WPA access point. You can’t really create a dictionary with every single combination for a lengthy password, your best bet is to create a dictionary with the most “common” passwords, which is no easy task either.

The moral of the story is to use lengthy complex passwords with a high character set, but you knew that already. So I just suggested that this program is somewhat useless, well it is but it isn’t. You can use this program to generate a small dictionary but a large dictionary (greater than a couple of terabytes) is probably out of the question. So use this program and let me know what your results are, I’m always interested in your feedback. Happy cracking.