Published in: Technology, News & Politics

  1. 1. • What is Unicode? • How Apps deal with Unicode • Unicode Transformation Attack • Real World Examples • How To Manipulate Applications • Remediation
  2. 2. <scrİpt> <script> < < g g
  3. 3. • Unicode lets computer systems support more languages, allowing for worldwide use • Characters are stored as one or more bytes, depending on the encoding (for example UTF-8 or UTF-16) • It provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language
  4. 4. • Every character has a unique number • A = U+0041 • < = U+003C
  5. 5. • Classic example: c0rn ;) o=U+006F, ο=U+03BF, о=U+043E • Latin Small Letter O, Greek Small Letter Omicron, Cyrillic Small Letter O • Searches for each of the above can return different results in Google
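The three lookalike letters above can be told apart programmatically. The following is a minimal Python sketch using only the standard `unicodedata` module; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# Latin, Greek, and Cyrillic letters that all render like "o"
lookalikes = ["\u006f", "\u03bf", "\u043e"]
for ch in lookalikes:
    print(describe(ch))

# The three strings are distinct even though they look the same
print(len(set(lookalikes)))  # prints 3
```

A string comparison or a search index sees three different characters here, which is why visually identical queries can return different results.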
  6. 6. • Data can be entered using Unicode to disguise malicious code and permit various Unicode transformation issues, such as Best-Fit Mapping
  7. 7. • Best-fit mapping occurs when a character X gets transformed into an entirely different character Y. • Character X in the source encoding doesn't exist in the destination encoding, so the app attempts to find a best match. • This happens when characters are transcoded between Unicode and another encoding.
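As a rough illustration of best-fit behavior, the sketch below transcodes text to ASCII in two ways. The `BEST_FIT` table and both function names are hypothetical; real best-fit tables are defined per code page by the platform and are much larger.

```python
# Hypothetical best-fit table (an assumption for illustration only)
BEST_FIT = {
    "\uFF1C": "<",  # FULLWIDTH LESS-THAN SIGN
    "\uFF1E": ">",  # FULLWIDTH GREATER-THAN SIGN
    "\u0130": "I",  # LATIN CAPITAL LETTER I WITH DOT ABOVE
}


def to_ascii_best_fit(text: str) -> str:
    """Transcode to ASCII, substituting a 'closest' character when listed."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.append(BEST_FIT.get(ch, "?"))
    return "".join(out)


def to_ascii_strict(text: str) -> str:
    """Refuse to transcode anything without an exact ASCII equivalent."""
    return text.encode("ascii").decode("ascii")  # raises UnicodeEncodeError


payload = "\uFF1Cb\uFF1E"
print(to_ascii_best_fit(payload))  # prints <b>
```

The strict variant raises an error for the same payload instead of silently producing markup characters, which is usually the safer behavior for security-sensitive input.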
  8. 8. Bypass filters:
  9. 9. • The application lowercases the input after filtering. • The string "script" is blocked by the filter, but the string "scrİpt" is allowed. • Many lookalikes can be used: AΑАᐱᗅᗋᗩᴀᴬ⍲A
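The filter-then-transform ordering can be sketched in Python as follows. Both functions are hypothetical stand-ins, not any real application's code. Note one assumption in particular: Python's `lower()` turns İ (U+0130) into "i" followed by a combining dot (U+0307), so this sketch also assumes a later lossy conversion to ASCII that drops the combining mark. Other platforms and locales may lowercase İ differently.

```python
def naive_filter(value: str) -> bool:
    """Hypothetical blocklist check: accept input that lacks 'script'."""
    return "script" not in value


def post_process(value: str) -> str:
    """Hypothetical later step: lowercase, then lossy conversion to ASCII."""
    lowered = value.lower()  # in Python, U+0130 becomes 'i' + U+0307
    return lowered.encode("ascii", "ignore").decode("ascii")


payload = "<scr\u0130pt>"
print(naive_filter(payload))   # the filter accepts the payload
print(post_process(payload))   # prints <script>
```

Running the filter on the output of `post_process` instead of on the raw input would catch this payload, which is the point of checking after all transformations.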
  10. 10. • Unicode character U+FF1C FULLWIDTH LESS-THAN SIGN (＜) is transformed into U+003C LESS-THAN SIGN (<) due to best-fit. • Unicode transformation can be used for Cross-Site Scripting or SQL Injection. • Overlong UTF-8 sequences decode to the same characters in a lenient decoder: • %C0%BE = > • %C0%BC = <
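The %C0%BC and %C0%BE mappings follow from how two-byte UTF-8 sequences are laid out. The sketch below contrasts a hypothetical non-conforming decoder (the name `lenient_decode_2byte` is invented here) with Python's built-in decoder, which rejects such overlong forms.

```python
def lenient_decode_2byte(b1: int, b2: int) -> str:
    """Hypothetical non-conforming decoder: no check for overlong forms."""
    # 110xxxxx 10yyyyyy -> code point xxxxxyyyyyy
    return chr(((b1 & 0x1F) << 6) | (b2 & 0x3F))


def strict_decode(data: bytes) -> str:
    """Python's built-in UTF-8 decoder rejects overlong sequences."""
    return data.decode("utf-8")


print(lenient_decode_2byte(0xC0, 0xBC))  # prints <
print(lenient_decode_2byte(0xC0, 0xBE))  # prints >

try:
    strict_decode(bytes([0xC0, 0xBC]))
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```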
  11. 11. • The URL-encoded GET parameter locale is set to acux5291%C0%BEz1%C0%BCz2a%90bcxuca5291 • Here is part of the HTTP request: https://vendors-unit.prudential.com/OA_HTML/help?locale=acux5291%C0%BEz1%C0%BCz2a%90bcxuca5291&group=FND:LIBRARY:US&topic=US/FND/@ICX_FWK_LABS_HOME_PAGE
  12. 12. • In the HTTP response, the two-byte sequence %C0%BC was converted to the short form (<): <input type="hidden" value="acux5291&gt;z1<z2a�bcxuca5291" name="group"> • The input acux5291%C0%BEz1%C0%BCz2a%90bcxuca5291 is transformed into acux5291&gt;z1<z2a�bcxuca5291
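For comparison, here is a small Python sketch of what a conforming decoder does with the same parameter value. It uses the standard `urllib.parse.unquote_to_bytes`; the variable names are illustrative, and the sketch does not reproduce the vulnerable server's behavior.

```python
from urllib.parse import unquote_to_bytes

# Percent-decode the parameter value into raw bytes
raw = unquote_to_bytes("acux5291%C0%BEz1%C0%BCz2a%90bcxuca5291")

# Decode as UTF-8, replacing every invalid byte with U+FFFD
decoded = raw.decode("utf-8", errors="replace")

print(decoded)
print("<" in decoded, ">" in decoded)  # prints False False
```

With a conforming decoder the invalid bytes become replacement characters rather than angle brackets, so no markup character reaches the HTML output.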
  13. 13. • ?locale=%c0%bcscript%3E&group=FND:LIBRARY:US&topic=US/FND/@ICX_FWK_LABS_HOME_PAGE • ?locale=%3E&group=FND:LIBRARY:US&topic=US/FND/@ICX_FWK_LABS_HOME_PAGE • ?locale=%c0%bcscript/%3E&group=FND:LIBRARY:US&topic=US/FND/@ICX_FWK_LABS_HOME_PAGE
  14. 14. • Spotify supported Unicode usernames. • The existing user account bigbird was hijacked as follows. • The attacker creates a new Spotify account with username ᴮᴵᴳᴮᴵᴿᴰ (string u'\u1d2e\u1d35\u1d33\u1d2e\u1d35\u1d3f\u1d30'). • The attacker requests a password reset for the new account. • A password reset link is sent to the email address of the new account and is used to change the password. • Instead of logging in with username ᴮᴵᴳᴮᴵᴿᴰ, the attacker logs in with username bigbird and the new password. • Account compromised.
  15. 15. • The canonical_username function, which behaves like “toLower”, was applied only once, and it is not idempotent: applying it a second time can give a different result. • A user signs up with username BigBird, which is normalized to bigbird. • Another user signs up as ᴮᴵᴳᴮᴵᴿᴰ, which is normalized to BIGBIRD the first time, but to bigbird the next time. • ᴮᴵᴳᴮᴵᴿᴰ requests a password reset email, and the reset link can then be used to reset bigbird’s account.
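The non-idempotent behavior described above can be reproduced with a small Python sketch. `flawed_canonical` is a hypothetical stand-in, not Spotify's actual canonical_username code: it lowercases first and then applies NFKC normalization, which is one ordering that happens to show the same BIGBIRD-then-bigbird effect.

```python
import unicodedata


def flawed_canonical(name: str) -> str:
    """Hypothetical stand-in: lowercase first, then NFKC-normalize."""
    # The modifier letters have no lowercase form, so lower() leaves them
    # alone, and NFKC then maps them to plain capital letters.
    return unicodedata.normalize("NFKC", name.lower())


def is_idempotent_for(name: str) -> bool:
    """Check whether a second application changes the result."""
    once = flawed_canonical(name)
    return flawed_canonical(once) == once


attacker = "\u1d2e\u1d35\u1d33\u1d2e\u1d35\u1d3f\u1d30"  # ᴮᴵᴳᴮᴵᴿᴰ
first = flawed_canonical(attacker)
second = flawed_canonical(first)
print(first, second, is_idempotent_for(attacker))  # BIGBIRD bigbird False
```

If signup stores `first` but a later lookup effectively computes `second`, the attacker's account and the victim's account collide on the name bigbird.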
  16. 16. • Use canonicalization – An important aspect of input sanitization – Converts data with various possible representations into a single standard "canonical" representation deemed acceptable by the application, for example by mapping all characters to lower case – Treats “BigBird”, “ᴮᴵᴳᴮᴵᴿᴰ” and “bigbird” as the same, since canonicalization maps all of them to ‘bigbird’
  17. 17. • The vulnerability was noticed when compromised accounts started retweeting a tweet containing a "♥" symbol together with a string of code. • Users didn’t even have to click on the tweet sent out by the Twitter account @derGeruhn. Just viewing the tweet caused the user to retweet it automatically. • Affected accounts thereby involuntarily retweeted cross-site scripting (XSS) code. • The tweet was retweeted more than 84,000 times.
  18. 18. • TweetDeck didn’t escape HTML characters if a Unicode character was in the tweet text. • The Unicode heart (which gets replaced with an image by TweetDeck) somehow prevents the tweet from being HTML-escaped. • TweetDeck was not supposed to display this as an image, because it is simple text, which should be escaped to "&amp;hearts;".
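The expected behavior, escaping unconditionally whether or not the text contains non-ASCII characters, can be sketched with Python's standard `html.escape`. The function name `render_tweet_text` is hypothetical and this is not TweetDeck's code.

```python
import html


def render_tweet_text(text: str) -> str:
    """Hypothetical renderer: always escape, regardless of the characters."""
    return html.escape(text)


print(render_tweet_text("&hearts;"))  # prints &amp;hearts;
print(render_tweet_text("<script>alert('XSS')</script>\u2665"))
```

The heart character passes through unchanged while every markup character is escaped, so any later emoji-to-image replacement operates on text that is already safe.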
  19. 19. 1. When converting strings used in security-sensitive operations, use documented options that prevent the use of best-fit mappings. 2. Choose a suitable canonical form and canonicalize all user input into that form before any authorization decisions are made. 3. Carry out security checks after UTF-8 decoding is complete. X is only allowed if X==canonical(X)
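The rule "X is only allowed if X==canonical(X)" can be sketched in Python as below. The choice of canonical form here (NFKC normalization plus case folding) is an assumption for illustration, and the names `canonical`, `is_allowed`, and `accept_bytes` are hypothetical; an application should pick the form that suits its own identifiers.

```python
import unicodedata


def canonical(value: str) -> str:
    """One possible canonical form (an assumption): NFKC plus case folding."""
    folded = unicodedata.normalize("NFKC", value).casefold()
    # Normalize again because case folding can undo normalization
    return unicodedata.normalize("NFKC", folded)


def is_allowed(value: str) -> bool:
    """Accept input only if it is already in canonical form."""
    return value == canonical(value)


def accept_bytes(raw: bytes) -> bool:
    """Decode strictly first, then run the check on the decoded text."""
    try:
        text = raw.decode("utf-8")  # rejects overlong and invalid sequences
    except UnicodeDecodeError:
        return False
    return is_allowed(text)


print(is_allowed("bigbird"))                  # prints True
print(is_allowed("BigBird"))                  # prints False
print(accept_bytes(bytes([0xC0, 0xBC])))      # prints False
```

Rejecting non-canonical input, rather than silently rewriting it, means two different submitted strings can never end up referring to the same account or bypassing the same filter.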
  20. 20. • Here’s a chart with all the new emoji in yellow, including my favorite, “1F595”, which will be a hit on Twitter. • http://www.unicode.org/charts/PDF/Unicode-7.0/U70-1F300.pdf
  21. 21. • http://www.rishida.net/tools/conversion/ • http://www.fileformat.info/info/unicode/char/a.htm • http://www.panix.com/~eli/unicode/convert.cgi?text=Unicode • http://unicode-table.com/en/ • http://www.unicode.org/charts/PDF/Unicode-7.0/U70-1F300.pdf