Upcoming SlideShare
×

# Message Authentication using Message Digests and the MD5 Algorithm

5,955 views

Published on

Published in: Technology, Education
5 Likes
Statistics
Notes
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

Views
Total views
5,955
On SlideShare
0
From Embeds
0
Number of Embeds
5
Actions
Shares
0
439
0
Likes
5
Embeds 0
No embeds

No notes for slide

### Message Authentication using Message Digests and the MD5 Algorithm

1. 1. Message authentication is important whereundetected manipulation of messages can have disastrous effects. 1
2. 2.  Message authentication is important where undetected manipulation of messages can have disastrous effects. Examples include Internet Commerce and Network Management. 2
3. 3. 3
4. 4. 4
5. 5. 5
6. 6. 6
7. 7.  A hash function H is a transformation that takes an input m and returns a fixed-size string, which is called the hash value h (that is, h = H(m)). Hash functions with just this property have a variety of general computational uses, but when employed in cryptography, the hash functions are usually chosen to have some additional properties. 7
8. 8.  The basic requirements for a cryptographic hash function are as follows. ◦ The input can be of any length. ◦ The output has a fixed length. ◦ H(x) is relatively easy to compute for any given x. ◦ H(x) is one-way. ◦ H(x) is collision-free. 8
9. 9.  A hash function H is said to be one-way if it is hard to invert, where ``hard to invert means that given a hash value h, it is computationally infeasible to find some input x such that H(x) = h. 9
10. 10.  The hash value represents concisely the longer message or document from which it was computed; this value is called the message digest. One can think of a message digest as a ``digital fingerprint of the larger document. Examples of well known hash functions are MD2 and MD5 and SHA 10
11. 11.  Damgard and Merkle greatly influenced cryptographic hash function design by defining a hash function in terms of what is called a compression function. A compression function takes a fixed-length input and returns a shorter, fixed-length output. Given a compression function, a hash function can be defined by repeated applications of the compression function until the entire message has been processed. 11
12. 12.  In this process, a message of arbitrary length is broken into blocks whose length depends on the compression function, and “padded” (for security reasons) so the size of the message is a multiple of the block size. The blocks are then processed sequentially, taking as input the result of the hash so far and the current message block, with the final output being the hash value for the message. 12
13. 13.  The following five steps are performed to compute the message digest of the message. Step 1. Append Padding Bits Step 2. Append Length Step 3. Initialize MD Buffer Step 4. Process Message in 16-Word Blocks Step 5. Output 13
14. 14.  The message is "padded" (extended) so that its length (in bits) is congruent to 448, modulo 512. That is, the message is extended so that it is just 64 bits shy of being a multiple of 512 bits long. Padding is always performed, even if the length of the message is already congruent to 448, modulo 512. Padding is performed as follows: a single "1" bit is appended to the message, and then "0" bits are appended so that the length in bits of the padded message becomes congruent to 448, modulo 512. In all, at least one bit and at most 512 bits are appended. 14
15. 15.  A 64-bit representation of b (the length of the message before the padding bits were added) is appended to the result of the previous step. In the unlikely event that b is greater than 2^64, then only the low-order 64 bits of b are used. (These bits are appended as two 32-bit words and appended low-order word first in accordance with the previous conventions.) 15
16. 16.  A four-word buffer (A,B,C,D) is used to compute the message digest. Here each of A, B, C, D is a 32-bit register. These registers are initialized to the following values in hexadecimal, low- order bytes first): 16
17. 17. 17
18. 18. 18
19. 19. 19
20. 20. 20
21. 21. 21
22. 22. 22
23. 23. The functions G, H, and I are similar to the function F, in that they act in "bitwise parallel" to produce their output from the bits of X, Y, and Z, in such a manner that if the corresponding bits of X, Y, and Z are independent and unbiased, then each bit of G(X,Y,Z), H(X,Y,Z), and I(X,Y,Z) will be independent and unbiased. Note that the function H is the bit-wise "xor" or "parity" function of its inputs. 23
24. 24. This step uses a 64-element table T[1 ... 64]constructed from the sine function. Let T[i] denotethe i-th element of the table, which is equal to theinteger part of 4294967296 times abs(sin(i)), wherei is in radians. The elements of the table are givenin the following slide. 24
25. 25. 25
26. 26.  The message digest produced as output is A, B, C, D. That is, we begin with the low-order byte of A, and end with the high-order byte of D. 26
27. 27.  MD4 SHA-1 RIPEMD-160 27
28. 28. 28
29. 29. 29
30. 30.  2004: First MD5 collision attack ◦ Only difference between messages in random looking 128 collision bytes ◦ Currently < 1 second on PC MD5( ) = MD5( )
31. 31.  Attack scenarios ◦ Generate specific collision blocks ◦ Use document format IF…THEN…ELSE ◦ Both payloads present in both files ◦ Colliding PostScript files with different contents ◦ Similar examples with other formats: DOC, PDF ◦ Colliding executables with different execution flows
32. 32.  2007: Stronger collision attack ◦ Chosen-Prefix Collisions ◦ Messages can differ freely up to the random looking 716 collision bytes ◦ Currently approx. 1 day on PS3+PC MD5( ) = MD5( )
33. 33.  Second generation attack scenarios ◦ Using chosen-prefix collisions ◦ No IF…THEN…ELSE necessary  Each file contains single payload instead of both  Collision blocks not actively used in format ◦ Colliding executables  Malicious payload cannot be scanned in harmless executable ◦ Colliding documents (PDF, DOC, …)  Collision blocks put inside hidden raw image data
34. 34. Certificates with colliding to-be-signed parts ◦ generate a pair of certificates ◦ sign the legitimate certificate ◦ copy the signature into the rogue certPrevious work ◦ Different RSA public keys in 2005  using 2004 collision attack ◦ Different identities in 2006  using chosen-prefix collisions  the theory is well known since 2007
35. 35. set by serial number serial numberthe CA validity period validity period chosen prefix (difference) real cert rogue cert domain name domain name real cert real cert RSA key RSA key collision bits (computed) X.509 extensions identical bytes X.509 extensions (copied from real cert) signature signature
36. 36. ◦ We collected 30,000 website certificates  9,000 of them were signed with MD5  97% of those were issued by RapidSSL◦ CAs still using MD5 in 2008:  RapidSSL  FreeSSL  TrustCenter  RSA Data Security  Thawte  verisign.co.jp
37. 37. ◦ RapidSSL uses a fully automated system◦ The certificate is issued exactly 6 seconds after we click the button and expires in one year.
38. 38. RapidSSL uses sequential serial numbers:Nov 3 07:42:02 2008 GMT 643004Nov 3 07:43:02 2008 GMT 643005Nov 3 07:44:08 2008 GMT 643006Nov 3 07:45:02 2008 GMT 643007Nov 3 07:46:02 2008 GMT 643008Nov 3 07:47:03 2008 GMT 643009Nov 3 07:48:02 2008 GMT 643010Nov 3 07:49:02 2008 GMT 643011Nov 3 07:50:02 2008 GMT 643012Nov 3 07:51:12 2008 GMT 643013Nov 3 07:51:29 2008 GMT 643014Nov 3 07:52:02 2008 GMT ?
39. 39. ◦ Remote counter  increases only when people buy certs  we can do a query-and-increment operation at a cost of buying one certificate◦ Cost  \$69 for a new certificate  renewals are only \$45  up to 20 free reissues of a certificate  \$2.25/query-and-increment operation
40. 40. 1. Get the serial number S on Friday2. Predict the value for time T on Sunday to be S+10003. Generate the collision bits4. Shortly before time T buy enough certs to increment the counter to S+9995. Send colliding request at time T and get serial number S+1000
41. 41. Based on the 2007chosen-prefix collisionspaper with newimprovements1-2 days on a cluster of200 PlayStation 3’sEquivalent to 8000desktop CPU cores or\$20,000 on Amazon EC2
42. 42. serial number validity period rogue CA cert chosen prefix (difference) rogue CA RSA keyreal cert domain name rogue CA X.509 CA bit! extensions real cert collision bits Netscape Comment RSA key (computed) Extension (contents ignored by browsers)X.509 extensions identical bytes (copied from real cert) signature signature
43. 43. ◦ 3 failed attempts  problems with timing  other CA requests stealing our serial number◦ Finally success on the 4th attempt!◦ Total cost of certificates: USD \$657
44. 44. Part IV
45. 45. ◦ We can sign fully trusted certificates◦ Perfect man-in-the-middle attacks◦ A malicious attacker can pick a more realistic CA name and fool even experts
46. 46. MITM requires connection hijacking: ◦ Insecure wireless networks ◦ ARP spoofing ◦ Proxy autodiscovery ◦ DNS spoofing ◦ Owning routers
47. 47. Part V
48. 48. ◦ We’re not releasing the private key◦ Our CA cert was backdated to Aug 2004  just for demo purposes, a real malicious attacker can get a cert that never expires◦ Browser vendors can blacklist our cert  we notified them in advance◦ Users might be able to blacklist our cert
49. 49. Our CA cert is not easily revocable! ◦ CRL and OCSP get the revocation URL from the cert itself ◦ Our cert contains no such URL ◦ Revocation checking is disabled in Firefox 2 and IE6 anywaysPossiblefixes: Large organizations can set uptheir own custom OCSP server and force OCSPrevocation checking.
50. 50. Extended Validation (EV) certs: ◦ supported by all major browsers ◦ EV CAs are not allowed to use MD5 ◦ safe against this attackDo users really know how to tell the differencebetween EV and regular certs?
51. 51. With optimizations the attack might be donefor \$2000 on Amazon EC2 in 1 dayWe want to prevent malicious entities fromrepeating the attack: ◦ We are not releasing our collision finding implementation or improved methods until we feel it’s safe ◦ We’ve talked to the affected CAs: they will switch to SHA-1 very, very soon
52. 52. No way to tell. ◦ The theory has been public since 2007 ◦ Our legitimate certificate is completely innocuous, the collision bits are hidden in the RSA key, but they look randomCan we still trust CA certs that have been usedto sign anything with MD5 in the last few years?
53. 53. ◦ We need defense in depth  random serial numbers  random delay when signing certs◦ Future challenges:  second preimage against MD5  collisions in SHA-1◦ Dropping support for a broken crypto primitive is very hard in practice  but crypto can be broken overnight  what do we do if SHA-1 or RSA falls tomorrow?
54. 54. Part VI
55. 55. ◦ No need to panic, the Internet is not completely broken◦ The affected CAs are switching to SHA-1◦ Making the theoretical possible is sometimes the only way you can affect change and secure the Internet