0% found this document useful (0 votes)
115 views4 pages

md5 Algorithm Analysis

it provides md5 algorithm foe encryption and decryption

Uploaded by

shivkant singh
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
115 views4 pages

md5 Algorithm Analysis

it provides md5 algorithm foe encryption and decryption

Uploaded by

shivkant singh
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 4

Proceedings of the 2nd International Symposium on Computer, Communication, Control and Automation (ISCCCA-13)

Security Analysis of MD5 algorithm in Password Storage

Mary Cindy Ah Kioon, ZhaoShun Wang and Shubra Deb Das


School of Computer and Communication Engineering
University of Science and Technology (USTB), Beijing
Beijing 100083, China
email: cindyak86@gmail.com , zhswang@sohu.com and shubra_ustb@yahoo.com

Abstract— Hashing algorithms are commonly used to convert MD5 (Message Digest Algorithm 5) was designed by
passwords into hashes which theoretically cannot be Ron Rivest in 1991. MD5 processes a variable-length
deciphered. This paper analyses the security risks of the message into a fixed-length output of 128 bits. MD5 is a
hashing algorithm MD5 in password storage and discusses popular hash function. It works on blocks of 512-bits, and
different solutions, such as salts and iterative hashing. We processes each block through 4 rounds, where each round in
propose a new approach to using MD5 in password storage by turn processes 16 sub-blocks (each 32-bits). The 512-bit
using external information, a calculated salt and a random key message is divided into 16 sub-blocks before processing. If a
to encrypt the password before the MD5 calculation. We message block is not up to 512-bits, it is padded as shown in
suggest using key stretching to make the hash calculation
Fig. 1. A detailed explanation of the algorithm can be found
slower and using XOR cipher to make the final hash value
impossible to find in any standard rainbow table.
at [1].

Keywords-component; MD5; Password Storage Security;


Data Security; Dictionary attacks; Rainbow Tables

I. INTRODUCTION
With the advent of computer technology, it became more
productive to store information in databases instead of
storing in paper documents. Web applications, needing user
authentication, typically validate the input passwords by Figure 1. Length of message after padding (in bits)
comparing them to the real passwords, which are commonly
stored in the company’s private databases. If the database III. APPLICATION OF MD5 ALGORITHM IN PASSWORD
and hence these passwords were to become compromised,
STORAGE SECURITY
the attackers would have unlimited access to these users’
personal. Nowadays, databases use a hash algorithm to It is highly insecure to store passwords in plaintext in the
secure the stored passwords but there are still security database. In order to increase the security of passwords,
breaches. Recently in 2012, Russian hackers released a big MD5 algorithms can be used to hash the original passwords
list of cracked passwords from the well-known social and the hash values, instead of the plaintext are stored in the
networking sites including LinkedIn. These attacks were database. During authentication, the input password is also
found to be successful due to the use of a weak hashing hashed by MD5 in a similar way, and the result hash value is
algorithm. compared with the hash value in the database for that
particular user.
II. HASH FUNCTION
IV. SECURITY ANALYSIS OF MD5
A hash function is a one-way encryption function that
takes a variable-size input plaintext m and generates a fixed- MD5 algorithm is prone to two main types of attack:
size hash output. It is computationally hard to decipher the dictionary attacks and rainbow tables.
hash and any attempt to crack it is practically infeasible. A
A. Dictionary Attacks
“secure” hash function should be able to resist pre-image
attacks and collision attacks. Due to the pigeonhole principle In dictionary attacks, an attacker tries all the possible
and birthday paradox, there will be some inputs that will passwords in an exhaustive list called a dictionary. The
produce the same hash result. The output length is of fixed attacker hashes each password from the dictionary and
size 128 bits, making a total of 2128 possible output hash performs a binary search on the compromised hashed
values. This value, as big as it may seem, is not infinite, passwords. This method can be made much quicker by pre-
hence resulting in collisions. computing the hash values of these possible passwords and
storing them in a hash table.
A. MD5 algorithm
B. Rainbow Tables
Rainbow tables are made up of hash chains and are more

Published by Atlantis Press, Paris, France.


© the authors, 2013
0706
Proceedings of the 2nd International Symposium on Computer, Communication, Control and Automation (ISCCCA-13)

efficient than hash tables as they optimize the storage cannot be used to decipher the salted hashes. However, two
requirements, although the lookup is made slightly slower. same passwords will still produce the same hash.
Rainbow tables differ from hash tables in that they are 2) Different random salt for each password and storing
created using both reduction and hash functions. Reduction the salt within the database itself: If we use different salts
functions convert a hash value to a plaintext. The plaintext is
for each password, two same passwords will have different
not the original plaintext from which the hash value was
generated, but another one. By alternating the hash function hashes. The attacker has to generate different rainbow tables
with the reduction function, chains of alternating passwords for each individual user, making it computationally harder
and hash values are formed. Only the first (chain’s start point) for an attacker to crack the hashes. These salts can be stored
and last plaintext (chain’s end point) generated are stored in in plaintext in the database. The purpose of the salt is not to
the table. To decipher a hashed password, we first process be secret, but to be random enough to defeat the use of
the hashed password through reduction functions until we rainbow tables.
find a match to a chain’s end point. We then take that chain’s 3) Use an existing column value: An existing column
corresponding start point and regenerate the hash chain and value like username can be used as salt. This solution is
find the original plaintext to the hashed password. Rainbow similar to the second solution discussed above. It defeats the
tables are very easily available online now. There are many use of a standard rainbow table, but a hacker might generate
password cracking systems and websites that use rainbow
tables also, for example, OphCrack. Of course, using a rainbow table for a specific username, for example, “root”
rainbow tables do not guarantee a 100% success rate of or “admin”.
cracking password systems. However, the bigger the 4) Use a variably located calculated salt: The salt
character set used for creating the rainbow table and the location can be prefix (salt appended in front of password),
longer the hash chain length, the bigger will the rainbow infix (salt appended within the password) or postfix (salt
table be. appended at the end of the password). If the salt’s location is
made random, then cracking the passwords is made harder.
V. COUNTERMEASURES RESEARCH
For example, we can set the salt location to be equal to the
A. Information Entropy password’s length modulo 3. The salt can be calculated by
using a random character sequence (stored in the database)
Password strength is usually measured in terms of
and using other characters (embedded within the code). For
information entropy. In simple terms, the higher the
information entropy, the stronger the password and hence the example, the salt can be made to be a combination of the first
more difficult it is to crack it. A password of 6 characters two letters of username, random sequence of characters and
would require only 26 attempts to exhaust all possibilities in the last 2 letters of username.
a brute-force attack, while a password with 42 characters 5) Use a variably located calculated salt including
would need 242 attempts. As can be seen, the longer the information outside the database and the application code:
password and the larger the character set from which it is The hacker now has to break into the database and the server
derived, the stronger the password. As best practice and containing the application code. He also needs to obtain the
preliminary requirement, the application should insist that additional information needed to crack the password.
the user uses a strong password during the registration
process. Strong passwords run less risk of existing in C. Improvement on MD5 processing
dictionaries. Common simple passwords like “123456” have The following methods can be used to improve the MD5
already been banned by Microsoft Hotmail. processing:
B. Salting 1) Improved hash function: The hash computation
function is altered, for example using one of the following
One of the most common reasons to successful password
cracking attacks like the one against LinkedIn was because functions as shown in (1), (2) and (3):
they were using unsalted hashes. This makes it much easier
for hackers to crack the system by using rainbow tables, hash = Hash (password + salt) (1)
especially given the fact that many users use very common,
simple passwords and these similar passwords result in
similar hashes. A salt is a secondary piece of information hash = Hash (Hash (password) + salt) (2)
made of a string of characters which are appended to the
plaintext and then hashed. Salting makes passwords more
resistant to rainbow tables as the salted hashed password will hash = Hash (password + salt + key) (3)
have higher information entropy and hence much less likely
to exist in pre-computed rainbow tables. Typically, a salt 2) Iterative hashing: The password is hashed a number
should be at least 48 bits. Salting can be implemented using of times. MD5 is a fast hashing function, that is, it is
the following ways: computationally fast to calculate. Iterative hashing makes the
1) Single salt for all passwords: Given that the salt is calculation slower, hence computationally slower and more
sufficiently long and complex, a standard rainbow table,

Published by Atlantis Press, Paris, France.


© the authors, 2013
0707
Proceedings of the 2nd International Symposium on Computer, Communication, Control and Automation (ISCCCA-13)

difficult to crack. The number of iterations can typically be


made to be equal to 1000.
3) Key stretching: This makes a password more resistant
to pre-computation attacks by making the attack workload
bigger. Iterative hashing is used, where a weak key is fed
into the hash algorithm and the output results in a stronger
key. There are 3 key stretching methods depending on the
input used for the iterative hashing:
a) Simple Key stretching: Only the key is hashed
Figure 2. Generate complex password through columnar transposition
iteratively, as in (4). No salt is involved.
Fourthly, an additional random information string of 128
key = Hash (key) (4) bits is generated for each user and stored in an external file,
e.g. in a flash drive.
b) Password Key stretching: The password along with Finally, the password is hashed using a formula based on
the key are both used in the loop. key stretching. The hashing process is similar to a cipher
c) Salted Key Stretching: The key, password and salt block chaining method, where the output of one round is
are used in the loop. This method is the best of the three key used in the input of the other round, as shown in Fig. 3. By
stretching methods. calculating the XOR result of the hash value at one round
4) Transform the password before hashing: Before with the one at the previous round, the resulting hashed value
is made impossible to find in any standard rainbow table.
calculating the MD5 hash for the password, the latter is
The final hashed password is then stored in the database.
transformed using a simple cipher method. The system authenticates a user by calculating the hash value
5) Chaining method and XOR(Exclusive OR) cipher: If (the random key is retrieved from the database for use),
iterative hashing is used, the hash output from the current which is then compared to the stored hashed password. An
iteration is used in the input for the next iteration. We use a example of how the hashed passwords will appear in the
simple XOR cipher to compute the final hash by “XORing” database is shown in Fig. 4.
the hash output value from all iterations. A simple XOR
cipher is typically of the form shown in (5). If the key is finalHash = HV0 ^ HV1 ^ ……^ HVN ,
made random enough, the ciphertext will be almost
impossible to crack. HV0 = Hash (CpxPassword, additionalInfo);
HV1 = Hash (CpxPassword, HV0, salt);
plaintext XOR key = ciphertext (5) HVN= Hash (CpxPassword, HVN-1, salt);
N is the number of iterations and ^ is XOR.
HV: Hash Value and
D. Example of an improved MD5 processing CpxPassword: Complex Password
We will now demonstrate how we can hash passwords in
Figure 3. Example of hashed passwords in a database using the improved
databases using an improved version of MD5. There are five MD5
main steps involved.
First, a random key string of random length is first
generated. Its character set is {0-9, a-z, A-Z}.This random
key string is used to generate the complex password and is
also stored in the database for later use during password
authentication.
Secondly, the password is transformed into a complex
password through columnar transposition cipher. Assuming
that the random key is “YDCiA” and the password is
“crazyrichard”, the password is first converted into a matrix
of 5 columns (same as length of random key) and the blank
cells are alternately filled with “*” and “@”, as shown in
Fig.2. Using columnar transposition cipher, the complex
password generated is “ya*ac*ridcrrzh@”.
Thirdly, the salt is calculated by finding the XOR value
of the random key string with the complex password, row by Figure 4. Example of hashed passwords in a database using the improved
MD5
row. In our example, the salt is " %i\b\"s6*\r\"".
The overall algorithm can be summarized in Fig. 5. The
initialization vector used here is the additional information,

Published by Atlantis Press, Paris, France.


© the authors, 2013
0708
Proceedings of the 2nd International Symposium on Computer, Communication, Control and Automation (ISCCCA-13)

which is a random string of 128 bits. Each user has a initialization vectors and salt are different for each user. For
different initialization vector value. two users with the same passwords, the final hash value will
be completely different.
VII. CONCLUSION
Password storage security is one important aspect of data
security as most systems nowadays require an authentication
method using passwords. Hashing algorithms such as MD5
are commonly used for encrypting plaintext passwords into
strings that theoretically cannot be deciphered by hackers
due to their one-way encryption feature. However, with time,
attacks became possible through the use of dictionary tables
and rainbow tables. In this paper, we discussed different
Figure 5. Improved MD5 processing (IV: Initialization Vector) methods to thwart these attacks: (1) the use of a strong
password to reduce the probability of it existing in a
dictionary, (2) using salts, (3) key stretching and iteration
VI. PERFORMANCE ANALYSIS hashing to make the MD5 computation slower, (4) chaining
First of all, an attack using a standard rainbow table method, where the output of one iteration is used in the input
would fail because the hashed password stored in the of the next iteration and the use of a different initialization
database is not of hexadecimal form and hence would not vector for each password, (5) encrypting the password before
exist in any standard rainbow table. We tried a few online hashing and (6) XOR cipher to make the final hash value
MD5 decryption tools like http://www.md5decrypter.co.uk/ impossible to find in any rainbow table. An implementation
and downloaded software tools like Cain and Abel. But, of the proposed approach is carried out using C# as
since all of them use rainbow tables where the MD5 hashes programming language and Microsoft SQL Server as
are in hex form and our stored hash value is in ASCII, the database. With our proposed approach, the attacker will now
attacks would already fail from the beginning, as shown in have to hack into the database, the server containing the
Fig. 6. application code as well as the external file.
Also, by XORing the output hash values from each ACKNOWLEDGMENT
iteration makes it almost impossible to find out the original
hash output at the first round. Generally, if we XOR the The work reported in this paper was supported by
plaintext with a key to calculate the cipher text, we can get the Beijing Natural Science Foundation of China (Grant
back the plaintext simply by using XORing the cipher text No. 4112037).
with the key. However, if the key is random, then it is almost The authors would like to thank and acknowledge the
impossible to get back the plaintext. In our example, the key support and assistance of relatives, friends and colleagues.
is totally random as we are hashing intermediate hash output
values and hence given the final hash value, it is impossible REFERENCES
to decipher it. [1] Rivest, R. The MD5 message-digest algorithm. RFC 1321, 37 (April
1992).
[2] Zhang Shaolan, Xing Guobo, Yang Yixian, Improvement and
Security Analysis on MD5 [J]. Computer Application, 2009, vol.
29(4):947-949.
[3] Xiaoling Zheng, JiDong Jin, Research for the Application and Safety
of MD5 Algorithm in Password Authentication, 9th International
Conference on Fuzzy Systems and Knowledge Discovery, 2012.
[4] H. Mirvaziri, Kasmiran Jumari, Mahamod Ismail, Z. Mohd Hanapi, A
new Hash Function Based on Combination of Existing Digest
Algorithms , The 5th Student Conference on Research and
Development – SCOReD 2007, 11-12 December 2007, Malaysia.
[5] Md. Didarul Alam Chawdhury, and A.H.M. Ashfak Habib, Security
Figure 6. Invalid character in the MD5 Hash in Cain and Abel
Enhancement of MD5 Hashed Passwords by using the Unused Bits of
TCP Header, Proceedings of 11th International Conference on
Also, even if two users have two same passwords, the Computer and Information Technology (ICCIT 2008) 25-27
random key used to encrypt the passwords will be different, December, 2008, Khulna, Bangladesh
resulting in different complex passwords. Furthermore, the

Published by Atlantis Press, Paris, France.


© the authors, 2013
0709

You might also like