Daffodil International University
Cryptography and Information Security
5. List cryptographic hash algorithms and list their applications. Discuss
MD5 hash algorithm or any other hash algorithm in detail.
Muhammad Ashik Iqbal
M.Sc. in CSE
Cryptographic Hash Algorithms
A cryptographic hash function is a deterministic procedure that takes an arbitrary block of data and
returns a fixed-size bit string, the (cryptographic) hash value, such that an accidental or intentional
change to the data will change the hash value. The data to be encoded is often called the "message",
and the hash value is sometimes called the Message Digest or simply Digest.
The ideal cryptographic hash function has four main properties:
• It is easy to compute the hash value for any given message.
• It is infeasible to find a message that has a given hash.
• It is infeasible to modify a message without changing its hash.
• It is infeasible to find two different messages with the same hash.
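The fixed output size and the sensitivity to small changes described above can be observed with a short Python sketch. SHA-256 from the standard `hashlib` module is used here purely as a convenient example; the helper names `digest_hex` and `differing_bits` and the sample messages are illustrative assumptions, not part of the original report.

```python
import hashlib


def digest_hex(message: bytes) -> str:
    """Return the SHA-256 digest of `message` as a hexadecimal string."""
    return hashlib.sha256(message).hexdigest()


def differing_bits(a_hex: str, b_hex: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")


short = digest_hex(b"a")
long_ = digest_hex(b"a" * 1_000_000)
d1 = digest_hex(b"Pay Bob 100 taka")
d2 = digest_hex(b"Pay Bob 900 taka")

# Each hex digit encodes 4 bits, so both digests are 256 bits long.
print(len(short) * 4, len(long_) * 4)
# Hashing the same message again gives the same digest (deterministic).
print(d1 == digest_hex(b"Pay Bob 100 taka"))
# A one-character change alters many output bits.
print(differing_bits(d1, d2))
```

The last line prints how many of the 256 output bits changed; for a good hash function this is expected to be roughly half of them.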
Cryptographic hash functions have many information security applications, notably in digital signatures,
message authentication codes (MACs), and other forms of authentication. They can also be used as
ordinary hash functions, to index data in hash tables; as fingerprints, to detect duplicate data or
uniquely identify files; or as checksums to detect accidental data corruption. Indeed, in information
security contexts, cryptographic hash values are sometimes called (digital) fingerprints, checksums, or
just hash values, even though all these terms stand for functions with rather different properties and purposes.
There is a long list of cryptographic hash functions, although many have been found to be vulnerable
and should not be used. Even if a hash function has never been broken, a successful attack against a
weakened variant thereof may undermine the experts' confidence and lead to its abandonment. For
instance, in August 2004 weaknesses were found in a number of hash functions that were popular at
the time, including SHA-0, RIPEMD, and MD5. This has called into question the long-term security of
later algorithms derived from these hash functions, in particular SHA-1 (a strengthened
version of SHA-0), RIPEMD-128, and RIPEMD-160 (both strengthened versions of RIPEMD). Neither
SHA-0 nor RIPEMD is widely used, since they were replaced by their strengthened versions.
As of 2009, the two most commonly used cryptographic hash functions are MD5 and SHA-1. However,
MD5 has been broken; an attack against it was used to break SSL in 2008.
SHA-0 and SHA-1 are members of the SHA family of hash functions developed by the NSA. In February
2005, a successful attack on SHA-1 was reported, finding collisions in about 2^69 hashing operations,
rather than the 2^80 expected for a 160-bit hash function. In August 2005, another successful attack on
SHA-1 was reported, finding collisions in 2^63 operations. Theoretical weaknesses of SHA-1 exist as well,
suggesting that it may be practical to break within years. Most recently, in June 2009 an attack on
SHA-1 was found which can theoretically find a collision in 2^52 operations. New applications can avoid
these problems by using more advanced members of the SHA family, such as SHA-2, or using techniques
such as randomized hashing that do not require collision resistance.
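The figure of 2^80 operations for a 160-bit hash comes from the birthday bound: a generic collision search on an n-bit hash needs about 2^(n/2) hash computations. The small sketch below works through that arithmetic for the SHA-1 attack costs quoted above (2^69, 2^63 and 2^52); the function name `generic_collision_cost_log2` is an illustrative assumption.

```python
def generic_collision_cost_log2(output_bits: int) -> float:
    """Base-2 logarithm of the birthday-bound collision cost, about 2^(n/2)."""
    return output_bits / 2


generic = generic_collision_cost_log2(160)  # 80.0 for a 160-bit hash such as SHA-1

# Attack costs as reported in the text, given as powers of two.
reported = [("February 2005", 69), ("August 2005", 63), ("June 2009", 52)]
for date, attack_log2 in reported:
    speedup_log2 = generic - attack_log2
    print(f"{date}: 2^{attack_log2} operations, "
          f"about 2^{speedup_log2:.0f} times cheaper than generic search")
```

Each reported attack is therefore measured against the 2^80 baseline: the lower the exponent, the further the function falls short of its ideal collision resistance.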
However, to ensure the long-term robustness of applications that use hash functions, there is a
competition to design a replacement for SHA-2, which will be given the name SHA-3 and become a FIPS
standard around 2012.
Some of the following algorithms are known to be insecure. Note that this list does not include
candidates in the current NIST hash function competition.
List of Hash Algorithms
Sizes are in bits unless stated otherwise.

Algorithm             Output size          Internal state size  Block size  Length size  Word size  Collision attacks (complexity)
HAVAL                 256/224/192/160/128  256                  1024        64           32         Yes
MD2                   128                  384                  128         No           8          Almost
MD4                   128                  128                  512         64           32         Yes (2^8)
MD5                   128                  128                  512         64           32         Yes (2^5)
PANAMA                256                  8736                 256         No           32         Yes
RadioGatún            Arbitrarily long     58 words             3 words     No           1-64       Yes
RIPEMD                128                  128                  512         64           32         Yes
RIPEMD-128/256        128/256              128/256              512         64           32         No
RIPEMD-160/320        160/320              160/320              512         64           32         No
SHA-0                 160                  160                  512         64           32         Yes (2^39)
SHA-1                 160                  160                  512         64           32         With flaws (2^52)
SHA-256/224           256/224              256                  512         64           32         No
SHA-512/384           512/384              512                  1024        128          64         No
Tiger(2)-192/160/128  192/160/128          192                  512         64           64         No
Applications of Hash Algorithms
A typical use of a cryptographic hash would be as follows: Alice poses a tough math problem to Bob,
and claims she has solved it. Bob would like to try it himself, but would yet like to be sure that Alice is
not bluffing. Therefore, Alice writes down her solution, appends a random nonce, computes its hash
and tells Bob the hash value (whilst keeping the solution and nonce secret). This way, when Bob comes
up with the solution himself a few days later, Alice can prove that she had the solution earlier by
revealing the nonce to Bob. (This is an example of a simple commitment scheme; in actual practice,
Alice and Bob will often be computer programs, and the secret would be something less easily spoofed
than a claimed puzzle solution).
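The commitment scheme between Alice and Bob can be sketched in a few lines of Python. This is a minimal illustration, assuming SHA-256 as the hash and a 32-byte random nonce appended to the solution; the function names `commit` and `verify` are hypothetical and not taken from any particular library.

```python
import hashlib
import hmac
import secrets


def commit(solution: bytes) -> tuple[bytes, bytes]:
    """Alice's step: hash the solution with a fresh random nonce appended.

    Returns (commitment, nonce). Alice publishes the commitment and keeps
    both the solution and the nonce secret.
    """
    nonce = secrets.token_bytes(32)
    commitment = hashlib.sha256(solution + nonce).digest()
    return commitment, nonce


def verify(commitment: bytes, nonce: bytes, solution: bytes) -> bool:
    """Bob's step: recompute the hash once the nonce has been revealed."""
    expected = hashlib.sha256(solution + nonce).digest()
    return hmac.compare_digest(commitment, expected)


# Alice commits today; Bob checks later with the solution he found himself.
commitment, nonce = commit(b"x = 42")
print(verify(commitment, nonce, b"x = 42"))  # True
print(verify(commitment, nonce, b"x = 43"))  # False
```

The nonce keeps Bob from testing guesses against the published hash, while the collision resistance of the hash keeps Alice from later opening the commitment to a different solution.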
Another important application of secure hashes is verification of message integrity. Determining
whether any changes have been made to a message (or a file), for example, can be accomplished by
comparing message digests calculated before, and after, transmission (or any other event).
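A before-and-after comparison of digests might look like the following sketch. It assumes SHA-256 and reads the file in chunks so that large files do not have to fit in memory; `file_digest` and `is_unchanged` are illustrative names.

```python
import hashlib
import hmac


def file_digest(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 digest of the file at `path` as a hex string."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_unchanged(path: str, expected_hex: str) -> bool:
    """Compare the file's current digest with one recorded earlier."""
    return hmac.compare_digest(file_digest(path), expected_hex.lower())
```

A typical use would be to record `file_digest(path)` before transmission and call `is_unchanged(path, recorded)` afterwards. The check is only as trustworthy as the channel over which the recorded digest was obtained.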
A message digest can also serve as a means of reliably identifying a file; several source code
management systems, including Git, Mercurial and Monotone, use the sha1sum of various types of
content (file content, directory trees, ancestry information, etc.) to uniquely identify them.
A related application is password verification. Passwords are usually not stored in cleartext, for obvious
reasons, but instead in digest form. To authenticate a user, the password presented by the user is
hashed and compared with the stored hash. This is sometimes referred to as one-way encryption.
For both security and performance reasons, most digital signature algorithms specify that only the
digest of the message be "signed", not the entire message. Hash functions can also be used in the
generation of pseudorandom bits.
Hashes are used to identify files on peer-to-peer filesharing networks. For example, in an ed2k link, an
MD4-variant hash is combined with the file size, providing sufficient information for locating file
sources, downloading the file and verifying its contents. Magnet links are another example. Such file
hashes are often the top hash of a hash list or a hash tree which allows for additional benefits.
• Used Alone
o File integrity verification
o Public key fingerprint
o Password storage
• Combined with encryption functions
MD5 Hash Algorithm
In cryptography, MD5 (Message-Digest algorithm 5) is a widely used cryptographic hash function with a
128-bit hash value. As an Internet standard (RFC 1321), MD5 has been employed in a wide variety of
security applications, and is also commonly used to check the integrity of files. However, it has been
shown that MD5 is not collision resistant; as such, MD5 is not suitable for applications like SSL
certificates or digital signatures that rely on this property. An MD5 hash is typically expressed as a
32-digit hexadecimal number.
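The 128-bit value and its 32-digit hexadecimal form can be checked with Python's standard `hashlib` module. This sketch assumes an interpreter build in which MD5 is available (some restricted builds disable it); it is shown for illustration only, not as a recommendation to use MD5.

```python
import hashlib

raw_digest = hashlib.md5(b"").digest()      # 16 raw bytes
hex_digest = hashlib.md5(b"").hexdigest()   # the same value in hexadecimal

print(len(raw_digest) * 8)  # 128
print(len(hex_digest))      # 32
print(hex_digest)           # d41d8cd98f00b204e9800998ecf8427e
```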
MD5 was designed by Ron Rivest in 1991 to replace an earlier hash function, MD4. In 1996, a flaw was
found with the design of MD5. While it was not a clearly fatal weakness, cryptographers began
recommending the use of other algorithms, such as SHA-1 (which has since been found vulnerable). In
2004, more serious flaws were discovered, making further use of the algorithm for security purposes
questionable. In 2007 a group of researchers including Arjen Lenstra described how to create a pair of
files that share the same MD5 checksum. In an attack on MD5 published in December 2008, a group of
researchers used this technique to fake SSL certificate validity. US-CERT of the U.S. Department of
Homeland Security said MD5 "should be considered cryptographically broken and unsuitable for further
use," and most U.S. government applications will be required to move to the SHA-2 family of hash
functions by 2010.
MD5 digests have been widely used in the software world to provide some assurance that a transferred
file has arrived intact. For example, file servers often provide a pre-computed MD5 checksum for the
files, so that a user can compare the checksum of the downloaded file to it. Unix-based operating
systems include MD5 sum utilities in their distribution packages, whereas Windows users use third-party applications.
However, now that it is easy to generate MD5 collisions, it is possible for the person who created the
file to create a second file with the same checksum, so this technique cannot protect against some
forms of malicious tampering. Also, in some cases the checksum cannot be trusted (for example, if it
was obtained over the same channel as the downloaded file), in which case MD5 can only provide error-
checking functionality: it will recognize a corrupt or incomplete download, which becomes more likely
when downloading larger files.
MD5 is widely used to store passwords. To mitigate the vulnerabilities mentioned above, one
can add a salt to the passwords before hashing them. Some implementations may apply the hashing
function more than once.
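Salting and repeated hashing can be sketched as follows. Note that this example does not use MD5: it swaps in PBKDF2-HMAC-SHA256 from Python's standard library, which combines a salt with many hash iterations. The iteration count and the names `new_password_record` and `check_password` are illustrative assumptions.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # illustrative value, not a recommendation


def new_password_record(password: str) -> tuple[bytes, bytes]:
    """Create a (salt, digest) pair to store instead of the password itself."""
    salt = secrets.token_bytes(16)  # fresh random salt for every user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(candidate: str, salt: bytes, stored_digest: bytes) -> bool:
    """Hash the presented password with the stored salt and compare digests."""
    digest = hashlib.pbkdf2_hmac("sha256", candidate.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(digest, stored_digest)


salt, stored = new_password_record("correct horse")
print(check_password("correct horse", salt, stored))  # True
print(check_password("wrong guess", salt, stored))    # False
```

Because each user gets a different salt, identical passwords produce different stored digests, and the repeated hashing makes every guess more expensive for an attacker.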
MD5 processes a variable-length message into a fixed-length output of 128 bits. The input message is
broken up into 512-bit blocks (sixteen 32-bit little-endian integers); the message is padded so
that its length is divisible by 512. The padding works as follows: first a single bit, 1, is appended to the
end of the message. This is followed by as many zeros as are required to bring the length of the
message up to 64 bits less than a multiple of 512. The remaining bits are filled up with a 64-bit integer
representing the length of the original message, in bits.
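The padding rule can be written out directly. The sketch below assumes a byte-oriented message (a whole number of bytes), so the single 1 bit followed by seven 0 bits becomes the byte 0x80, and the 64-bit length is stored in little-endian order as in MD5; `md5_pad` is an illustrative name.

```python
import struct


def md5_pad(message: bytes) -> bytes:
    """Return `message` padded to a multiple of 512 bits (64 bytes)."""
    # Original length in bits, reduced modulo 2^64 to fit the length field.
    bit_length = (8 * len(message)) & 0xFFFFFFFFFFFFFFFF
    # Append the single 1 bit (plus seven 0 bits, since we work in bytes).
    padded = message + b"\x80"
    # Append zero bytes until the length is 56 mod 64 (448 mod 512 in bits).
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    # Append the original bit length as a 64-bit little-endian integer.
    padded += struct.pack("<Q", bit_length)
    return padded


print(len(md5_pad(b"")))         # 64: the empty message fills one block
print(len(md5_pad(b"a" * 55)))   # 64: 55 bytes is the most that fits in one block
print(len(md5_pad(b"a" * 56)))   # 128: one more byte forces a second block
```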
For 32-bit words A, B, C, define:
F(A,B,C) = (A ∧ B) ∨ (¬A ∧ C)
G(A,B,C) = (A ∧ C) ∨ (B ∧ ¬C)
H(A,B,C) = A ⊕ B ⊕ C
I(A,B,C) = B ⊕ (A ∨ ¬C)
where ∧, ∨, ¬, ⊕ are AND, OR, NOT, and XOR respectively.
• Round 0: Steps 0 thru 15, uses F function
• Round 1: Steps 16 thru 31, uses G function
• Round 2: Steps 32 thru 47, uses H function
• Round 3: Steps 48 thru 63, uses I function
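The four functions and the round schedule above translate almost literally into code. In this sketch the 32-bit mask is needed because Python integers are unbounded; the helper `round_function` is an illustrative name for the step-to-function mapping.

```python
MASK = 0xFFFFFFFF  # keep results within 32 bits


def F(a: int, b: int, c: int) -> int:
    return ((a & b) | (~a & c)) & MASK


def G(a: int, b: int, c: int) -> int:
    return ((a & c) | (b & ~c)) & MASK


def H(a: int, b: int, c: int) -> int:
    return (a ^ b ^ c) & MASK


def I(a: int, b: int, c: int) -> int:
    return (b ^ (a | ~c)) & MASK


ROUND_FUNCTIONS = (F, G, H, I)


def round_function(step: int):
    """Return the function used at `step`: 0-15 F, 16-31 G, 32-47 H, 48-63 I."""
    if not 0 <= step <= 63:
        raise ValueError("MD5 has steps 0 through 63")
    return ROUND_FUNCTIONS[step // 16]
```

F acts as a bitwise selector: where a bit of A is 1 the result takes the bit of B, otherwise the bit of C. G does the same with C as the selector.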
MD5 One Step
MD5 One Step Notation
• Let MD5_{i…j}(A,B,C,D,M) be steps i thru j
o “Initial value” (A,B,C,D) at i, message M
• Note that MD5_{0…63}(IV,M) ≠ h(M)
o Due to padding and final transformation
• Let f(IV,M) = (Q_60,Q_63,Q_62,Q_61) + IV
o Where Q_i denotes the output of step i
o Where “+” is addition mod 2^32 per 32-bit word
• Then f is the MD5 compression function
MD5 Compression Function
• Let M = (M_0,M_1), each M_i is 512 bits
• Then h(M) = f(f(IV,M_0),M_1)
o Assuming M includes padding
• That is, f(IV,M_0) acts as “IV” for M_1
o Can be extended to any number of M_i
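The chaining described above, where the output of f on one block becomes the "IV" for the next, can be sketched generically. The compression function used here, `toy_compress`, is a stand-in built from truncated SHA-256 and is NOT MD5's compression function; it and the name `iterate_compression` are assumptions made only to keep the example runnable.

```python
import hashlib
from typing import Callable, Iterable

BLOCK_BYTES = 64  # 512-bit blocks


def iterate_compression(f: Callable[[bytes, bytes], bytes],
                        iv: bytes,
                        blocks: Iterable[bytes]) -> bytes:
    """Compute f(...f(f(IV, M_0), M_1)..., M_n) over already padded blocks."""
    state = iv
    for block in blocks:
        if len(block) != BLOCK_BYTES:
            raise ValueError("each block must be exactly 512 bits")
        state = f(state, block)  # the output becomes the next block's "IV"
    return state


def toy_compress(state: bytes, block: bytes) -> bytes:
    """Stand-in compression function (not MD5): 128 bits of SHA-256(state || block)."""
    return hashlib.sha256(state + block).digest()[:16]


iv = bytes(16)
m0, m1 = b"\x01" * BLOCK_BYTES, b"\x02" * BLOCK_BYTES
two_block = iterate_compression(toy_compress, iv, [m0, m1])
print(two_block == toy_compress(toy_compress(iv, m0), m1))  # True
```

This structure is why the collision attack discussed later can treat the result of the first block as a chosen "IV" for the second block.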
MD5 Attack History
• Dobbertin “almost” able to break MD5 using his MD4 attack (ca 1996)
o Showed that MD5 might be vulnerable
• In 2004, Wang published one MD5 collision
o No explanation of method was given
• Based on one collision, Wang’s method was reverse engineered by an Australian team
o Ironically, this reverse engineering work has been the primary source for improvements to Wang’s attack
MD5 Attack: Overview
• Determine two 1024-bit messages:
o M′ = (M′_0,M′_1) and M = (M_0,M_1)
• So that MD5 hashes are the same
o That is, a collision attack
• Attack is efficient
o Many improvements to Wang’s original approach
• Note that
o Each M_i and M′_i is a 512-bit block
o Each block is 16 words, 32 bits/word
• A differential cryptanalysis attack
• Idea is to use first block to generate desired “IV” for 2nd block
o Can be viewed as a “chosen IV” attack
Cryptographic hashing is very important today. It is used for storing passwords, verifying file
integrity, fingerprinting public keys, and in combination with various encryption functions. In many of
these applications its use cannot be avoided. Collisions and other attacks are ways in which a hash
function can be broken. Nevertheless, cryptographic hashing plays an important role in Cryptography and
Information Security.
I tried my level best to complete this report on Cryptography & Information Security. I gathered a lot of
information about cryptographic hashing algorithms and their uses from the internet and from books to
prepare this report. MD5 was also an interesting topic to study: how its algorithm works, how many rounds
it has, and so on. I also got a brief idea of the attacks on MD5, which was important to know.
I would like to thank our honorable course teacher, who helped us greatly in learning various important
areas of Cryptography & Information Security. It would have been very difficult for us to gain good
coverage of this important subject of computer science without a teacher like him.
• Wikipedia Links
• SlideShare Links
• Data Communications and Networking, by Forouzan, 4th Edition
• Cryptography & Network Security, by William Stallings, 4th Edition