UNICODE
Hacking The International Character System
Introduction
• Standard for representing text for most of
the world’s writing systems	

• The most recent version at the time of this presentation is Unicode 6.0

• Widely adopted by most programming
platforms, operating systems and The Web	

• The most widely used Unicode encodings
are UTF-8 and UTF-16
Introduction to UTF-8
• UTF-8 (UCS Transformation Format - 8bit)	

• Backwards compatible with ASCII	

• Simple ASCII chars are represented by a
single byte	

• Other characters take 2 to 4 bytes under the
current standard (RFC 3629); the original design
allowed up to 31 bits spanning 6 bytes
UTF-8 Encoding Table
Bits  Byte 1    Byte 2    Byte 3    Byte 4    Byte 5    Byte 6
7     0XXXXXXX
11    110XXXXX  10XXXXXX
16    1110XXXX  10XXXXXX  10XXXXXX
21    11110XXX  10XXXXXX  10XXXXXX  10XXXXXX
26    111110XX  10XXXXXX  10XXXXXX  10XXXXXX  10XXXXXX
31    1111110X  10XXXXXX  10XXXXXX  10XXXXXX  10XXXXXX  10XXXXXX
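The first four rows of the table can be turned into a small encoder. The following is a minimal sketch, not a reference implementation: the function name encode_utf8 is made up for illustration, and it assumes the modern limit of U+10FFFF, so the 5- and 6-byte rows are deliberately left out.

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode one code point by hand, following the 1- to 4-byte table rows."""
    if code_point < 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x80:        # 7 bits  -> 0XXXXXXX
        return bytes([code_point])
    if code_point < 0x800:       # 11 bits -> 110XXXXX 10XXXXXX
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:     # 16 bits -> 1110XXXX 10XXXXXX 10XXXXXX
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 21 bits -> 11110XXX 10XXXXXX 10XXXXXX 10XXXXXX
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# Compare the hand-rolled result with Python's built-in encoder.
for cp in (0x2E, 0x441, 0x200E, 0x1F600):
    print(f"U+{cp:04X}", encode_utf8(cp).hex(" "), chr(cp).encode("utf-8").hex(" "))
```

The encoder always picks the shortest row that fits the code point, which is what the standard requires.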
UTF-8 Encoding Rules
• Every ASCII character is also a valid UTF-8 character
(7 bits, or 128 characters)

• For every other UTF-8 byte sequence the first byte
indicates the length of the sequence in bytes	

• The rest of the bytes from the byte sequence have 10
as the two most significant bits	

• This helps to easily find where a byte sequence
starts and ends	

• There are more rules but this is a good start...
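The first two rules are enough to cut a byte stream into sequences. The sketch below assumes 1- to 4-byte sequences only; sequence_length, is_continuation and split_sequences are hypothetical helper names, and the code does not apply the additional rules (it will not reject overlong forms, for instance).

```python
def sequence_length(first_byte: int) -> int:
    """Read the sequence length from the leading bits of the first byte."""
    if first_byte & 0x80 == 0x00:   # 0XXXXXXX
        return 1
    if first_byte & 0xE0 == 0xC0:   # 110XXXXX
        return 2
    if first_byte & 0xF0 == 0xE0:   # 1110XXXX
        return 3
    if first_byte & 0xF8 == 0xF0:   # 11110XXX
        return 4
    raise ValueError(f"0x{first_byte:02X} cannot start a sequence")


def is_continuation(byte: int) -> bool:
    """Continuation bytes have 10 as their two most significant bits."""
    return byte & 0xC0 == 0x80


def split_sequences(data: bytes) -> list:
    """Split a byte string into per-character byte sequences."""
    sequences = []
    i = 0
    while i < len(data):
        n = sequence_length(data[i])
        chunk = data[i:i + n]
        if len(chunk) != n or not all(is_continuation(b) for b in chunk[1:]):
            raise ValueError(f"broken sequence at offset {i}")
        sequences.append(chunk)
        i += n
    return sequences


print(split_sequences("a\u0441\u200e".encode("utf-8")))
```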
Interesting UTF-8 Characters
• Unicode also provides a lot of function (control) characters, shown here with their UTF-8 bytes

• Byte Order Mark (BOM) - 0xEF, 0xBB, 0xBF may be placed at the start of the document to indicate UTF-8

• Left to Right Mark (LRM) - 0xE2, 0x80, 0x8E is placed to indicate text orientation

• In HTML - &lrm; &#x200E; or &#8206;

• Right to Left Mark (RLM) - 0xE2, 0x80, 0x8F is placed to indicate text orientation

• In HTML - &rlm; &#x200F; or &#8207;

• Left to Right Embedding (LRE) - 0xE2, 0x80, 0xAA

• In HTML - &#x202A;

• Right to Left Embedding (RLE) - 0xE2, 0x80, 0xAB

• In HTML - &#x202B;

• There are more...
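The byte values listed on this slide can be checked with a few lines of Python. This is only an illustrative sketch; the CONTROLS table and the utf8_hex helper are names invented here.

```python
import html

# Code points of the control characters listed on the slide.
CONTROLS = {
    "BOM": "\ufeff",
    "LRM": "\u200e",
    "RLM": "\u200f",
    "LRE": "\u202a",
    "RLE": "\u202b",
}


def utf8_hex(ch: str) -> str:
    """Format the UTF-8 bytes of a character the way the slide writes them."""
    return ", ".join(f"0x{b:02X}" for b in ch.encode("utf-8"))


for name, ch in CONTROLS.items():
    # Print the code point, its UTF-8 bytes and the HTML numeric reference.
    print(f"{name}: U+{ord(ch):04X} -> {utf8_hex(ch)} -> &#x{ord(ch):X};")

# Named and numeric HTML references resolve to the same character.
print(html.unescape("&lrm;") == html.unescape("&#x200E;") == html.unescape("&#8206;"))
```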
Clarifications
• How exactly does the hex sequence 0xE2, 0x80, 0x8E map to
&#x200E; in HTML?

• 0xE2, 0x80, 0x8E is the UTF-8 encoding of the code point U+200E

• &#x200E; is 0x20, 0x0E in UTF-16 (big-endian)

• also known as 0x0000200E in UTF-32

• There is no magic! You simply need to know which
encoding system you are working with and find out what
characters it supports.

• http://www.decodeunicode.org - is a good reference
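To make the point concrete, here is a short sketch that encodes the same code point (U+200E) in all three encodings. The helper name encodings is an assumption for illustration; big-endian variants are used so that no byte order mark is added to the output.

```python
def encodings(ch: str) -> dict:
    """Return the hex bytes of one character in UTF-8, UTF-16 and UTF-32."""
    return {
        "UTF-8": ch.encode("utf-8").hex(" "),
        "UTF-16": ch.encode("utf-16-be").hex(" "),
        "UTF-32": ch.encode("utf-32-be").hex(" "),
    }


for name, hex_bytes in encodings("\u200e").items():
    print(f"{name:7} {hex_bytes}")
```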
Multiple Representations
• The same character can be represented in multiple ways by decoders that accept overlong UTF-8 sequences

• For example

• . (DOT) is represented as 0x2E

• It is also the equivalent of 0xC0, 0xAE

• It is also the equivalent of 0xE0, 0x80, 0xAE

• It is also the equivalent of 0xF0, 0x80, 0x80, 0xAE

• It is also the equivalent of 0xF8, 0x80, 0x80, 0x80, 0xAE

• It is also the equivalent of 0xFC, 0x80, 0x80, 0x80, 0x80, 0xAE

• The standard forbids these overlong forms, but lenient decoders may still accept them
Translating the . (DOT)
HEX                Byte 1    Byte 2    Byte 3    Byte 4    Byte 5    Byte 6
2E                 00101110
C0 AE              11000000  10101110
E0 80 AE           11100000  10000000  10101110
F0 80 80 AE        11110000  10000000  10000000  10101110
F8 80 80 80 AE     11111000  10000000  10000000  10000000  10101110
FC 80 80 80 80 AE  11111100  10000000  10000000  10000000  10000000  10101110
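The table can be reproduced with a deliberately lenient decoder. The sketch below is an assumption about how a careless decoder might behave, written only to show why all six rows collapse to 0x2E; lenient_decode is a hypothetical name, and Python's own decoder is strict and rejects the overlong rows.

```python
def lenient_decode(seq: bytes) -> int:
    """Decode one sequence of up to 6 bytes WITHOUT rejecting overlong forms."""
    first = seq[0]
    if first < 0x80:
        return first
    # Count the leading 1 bits of the first byte to get the sequence length.
    length = 0
    mask = 0x80
    while first & mask:
        length += 1
        mask >>= 1
    if length < 2 or length > 6 or length != len(seq):
        raise ValueError("bad first byte or wrong number of bytes")
    value = first & (mask - 1)          # payload bits of the first byte
    for byte in seq[1:]:
        if byte & 0xC0 != 0x80:         # every other byte must start with 10
            raise ValueError("bad continuation byte")
        value = (value << 6) | (byte & 0x3F)
    return value


DOT_FORMS = [
    bytes([0x2E]),
    bytes([0xC0, 0xAE]),
    bytes([0xE0, 0x80, 0xAE]),
    bytes([0xF0, 0x80, 0x80, 0xAE]),
    bytes([0xF8, 0x80, 0x80, 0x80, 0xAE]),
    bytes([0xFC, 0x80, 0x80, 0x80, 0x80, 0xAE]),
]

for form in DOT_FORMS:
    print(form.hex(" "), "->", hex(lenient_decode(form)))
```

A strict decoder, such as bytes.decode("utf-8") in Python, raises UnicodeDecodeError for every row except the first.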
Half and Full Width Forms
• Graphic characters are traditionally classed as
halfwidth and fullwidth characters

• In a fixed width font a halfwidth character takes
half the width of a fullwidth character

• In Unicode you can find characters which are
presented in their halfwidth and fullwidth forms	

• http://www.unicode.org/charts/PDF/UFF00.pdf -
for more information
Fullwidth Latin Characters
• Halfwidth and Fullwidth notations make sense when
used for characters such as those found in the Japanese
and Chinese character sets

• The specifications also talk about Latin characters
presented in their fullwidth forms

• As a result the following mappings are possible

• A - 0x41 (halfwidth) = Ａ - 0xEF, 0xBC, 0xA1 (fullwidth)

• B - 0x42 (halfwidth) = Ｂ - 0xEF, 0xBC, 0xA2 (fullwidth)

• etc.
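One way to observe this mapping is Unicode compatibility normalization (NFKC), which folds fullwidth Latin letters back to their ordinary forms. The sketch below uses Python's standard unicodedata module; the helper name to_halfwidth is made up, and note that NFKC changes other compatibility characters too, so this is an illustration rather than a dedicated width converter.

```python
import unicodedata


def to_halfwidth(text: str) -> str:
    """Fold compatibility characters, including fullwidth Latin, via NFKC."""
    return unicodedata.normalize("NFKC", text)


fullwidth = "\uff21\uff22"                     # fullwidth A and B
print(fullwidth.encode("utf-8").hex(" "))      # the byte values from the slide
print(fullwidth == "AB")                       # a plain comparison fails
print(to_halfwidth(fullwidth) == "AB")         # equal after normalization
```

A filter that compares raw strings before normalization sees two different values, while code that normalizes later sees the same one.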
Security Considerations
• Visual Security Issues	

• Internationalized names	

• Left to Right and Right to Left representations	

• Charset Translation Issues	

• Occurs when strings are normalized before and after
translation between character sets	

• Characters with multiple representations

• The same character can be represented in multiple ways
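Case Study: Windows Filename Mangling
• Consider the following files, where [RTLO] and [LTRO] stand for the Right-to-Left Override (U+202E) and Left-to-Right Override (U+202D) control characters

• [RTLO]cod.stnemucodtnatropmi.exe

• [RTLO]cod.yrammusevituc[LTRO]n1c[LTRO].exe

• [RTLO]gpj.!nuf_stohsnee[LTRO]n1c[LTRO].scr

• Visually, these files are displayed differently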

• exe.importantdocuments.doc	

• n1c.executivesummary.doc	

• n1c.screenshots_fun!.jpg
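A simple defensive check is to look for bidirectional control characters in a filename before showing or trusting it. The sketch below is one possible approach under stated assumptions: BIDI_CONTROLS and find_bidi_controls are hypothetical names, and the list of code points covers the common marks, embeddings, overrides and isolates.

```python
# Bidirectional control characters that can reorder how a name is displayed.
BIDI_CONTROLS = {
    "\u200e", "\u200f",                                # LRM, RLM
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
}


def find_bidi_controls(name: str) -> list:
    """Return (index, code point) pairs for every bidi control in the name."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(name) if ch in BIDI_CONTROLS]


# The first filename from the slide: the real extension is still .exe.
suspicious = "\u202ecod.stnemucodtnatropmi.exe"
print(find_bidi_controls(suspicious))
print(suspicious.endswith(".exe"))
```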
Case Study: The PAYPAL Scam
• What is the difference between paypal.com
and paypai.com or between intel.com and
lntel.com?

• How about сitybank.com?

• 0000000: d181 6974 7962 616e 6b2e 636f 6d ..itybank.com

• 0xd1, 0x81 is the Cyrillic letter с (U+0441), which looks like the Latin letter c
although they are different characters
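Mixed-script names like this one can be flagged with a rough heuristic: take the first word of each letter's Unicode character name as its script. This is only a sketch; scripts_used is a hypothetical helper, and real software would more likely rely on the Unicode script property and the guidance in UTR #36.

```python
import unicodedata


def scripts_used(label: str) -> set:
    """Approximate the scripts in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # e.g. "CYRILLIC SMALL LETTER ES" -> "CYRILLIC"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


spoofed = "\u0441itybank"          # starts with Cyrillic U+0441
print(spoofed.encode("utf-8").hex(" "))
print(sorted(scripts_used(spoofed)))
print(sorted(scripts_used("citybank")))
```

A label that mixes Latin with another script is not always malicious, but it is a reasonable trigger for a warning.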
Case Study: Directory Traversal
• Let’s say an application shows images by requesting /getimage.jsp?name=image.jpg

• The attacker tries to retrieve an arbitrary file by requesting /getimage.jsp?name=../../../../boot.ini

• Unfortunately the attack fails because the application checks
for the presence of the ../ character sequence

• ../ is 0x2E, 0x2E, 0x2F in hex

• ../ is also 0x2E, 0xC0, 0xAE, 0x2F when the second dot is written in overlong UTF-8

• Since 0x2E, 0xC0, 0xAE, 0x2F is not equal to 0x2E, 0x2E, 0x2F
the security check is bypassed; if a later layer decodes the overlong
sequence as a dot, the file content is retrieved
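The bypass and one possible fix can be sketched as follows. The function names naive_is_safe and strict_is_safe are invented for illustration, and the sketch assumes the check runs on the percent-decoded bytes of the name parameter. The idea is to decode strictly first and validate the decoded text, so the check and the file system see the same string.

```python
from urllib.parse import unquote_to_bytes


def naive_is_safe(raw: bytes) -> bool:
    """Byte-level check that only knows the shortest form of '../'."""
    return b"../" not in raw


def strict_is_safe(raw: bytes) -> bool:
    """Decode strictly first, then look for '../' in the decoded text."""
    try:
        text = raw.decode("utf-8")   # Python's decoder rejects overlong forms
    except UnicodeDecodeError:
        return False                 # malformed input is refused outright
    return "../" not in text


# '.' followed by an overlong '.' (0xC0, 0xAE) and '/'.
payload = unquote_to_bytes("%2E%C0%AE%2Fboot.ini")

print(naive_is_safe(payload))        # the naive filter lets it through
print(strict_is_safe(payload))       # the strict version refuses it
print(strict_is_safe(b"image.jpg"))
```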
References
• http://en.wikipedia.org/wiki/Mapping_of_Unicode_characters	

• http://decodeunicode.org	

• http://unicode.org/reports/tr36/	

• http://www.fileformat.info	

• http://blog.commtouch.com/cafe/email-security-news/using-unicode-to-trick-users-to-install-malware/

• https://dc414.org/wp-content/uploads/2011/01/righttoleften-override.pdf	

• http://norman.com/security_center/security_center_archive/2011/rtlo_unicode_hole/	

• http://en.wikipedia.org/wiki/Halfwidth_and_Fullwidth_Forms	

• http://www.unicode.org/charts/PDF/UFF00.pdf

More Related Content

What's hot

Character Sets
Character SetsCharacter Sets
Character Sets
Leo Hernandez
 
Ascii and Unicode (Character Codes)
Ascii and Unicode (Character Codes)Ascii and Unicode (Character Codes)
Ascii and Unicode (Character Codes)
Project Student
 
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
Andrei Zmievski
 
Data encryption and tokenization for international unicode
Data encryption and tokenization for international unicodeData encryption and tokenization for international unicode
Data encryption and tokenization for international unicode
Ulf Mattsson
 
Jun 29 new privacy technologies for unicode and international data standards ...
Jun 29 new privacy technologies for unicode and international data standards ...Jun 29 new privacy technologies for unicode and international data standards ...
Jun 29 new privacy technologies for unicode and international data standards ...
Ulf Mattsson
 
20180324 leveraging unix tools
20180324 leveraging unix tools20180324 leveraging unix tools
20180324 leveraging unix tools
David Horvath
 
Using unicode with php
Using unicode with phpUsing unicode with php
Using unicode with php
Elizabeth Smith
 
File handling in vb.net
File handling in vb.netFile handling in vb.net
File handling in vb.net
Everywhere
 
Xml For Dummies Chapter 6 Adding Character(S) To Xml
Xml For Dummies   Chapter 6 Adding Character(S) To XmlXml For Dummies   Chapter 6 Adding Character(S) To Xml
Xml For Dummies Chapter 6 Adding Character(S) To Xmlphanleson
 
Unicode, PHP, and Character Set Collisions
Unicode, PHP, and Character Set CollisionsUnicode, PHP, and Character Set Collisions
Unicode, PHP, and Character Set Collisions
Ray Paseur
 
Understanding Character Encodings
Understanding Character EncodingsUnderstanding Character Encodings
Understanding Character Encodings
Mobisoft Infotech
 
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
stepheneisenhauer
 
Anton Dorfman - Reversing data formats what data can reveal
Anton Dorfman - Reversing data formats what data can revealAnton Dorfman - Reversing data formats what data can reveal
Anton Dorfman - Reversing data formats what data can revealDefconRussia
 
Overview of character encoding
Overview of character encodingOverview of character encoding
Overview of character encoding
Duy Lâm
 
Abap slide class4 unicode-plusfiles
Abap slide class4 unicode-plusfilesAbap slide class4 unicode-plusfiles
Abap slide class4 unicode-plusfilesMilind Patil
 
What character is that
What character is thatWhat character is that
What character is that
Anders Karlsson
 

What's hot (20)

Character Sets
Character SetsCharacter Sets
Character Sets
 
Ascii and Unicode (Character Codes)
Ascii and Unicode (Character Codes)Ascii and Unicode (Character Codes)
Ascii and Unicode (Character Codes)
 
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
 
Data encryption and tokenization for international unicode
Data encryption and tokenization for international unicodeData encryption and tokenization for international unicode
Data encryption and tokenization for international unicode
 
Jun 29 new privacy technologies for unicode and international data standards ...
Jun 29 new privacy technologies for unicode and international data standards ...Jun 29 new privacy technologies for unicode and international data standards ...
Jun 29 new privacy technologies for unicode and international data standards ...
 
20180324 leveraging unix tools
20180324 leveraging unix tools20180324 leveraging unix tools
20180324 leveraging unix tools
 
Using unicode with php
Using unicode with phpUsing unicode with php
Using unicode with php
 
Ascii 03
Ascii 03Ascii 03
Ascii 03
 
File handling in vb.net
File handling in vb.netFile handling in vb.net
File handling in vb.net
 
Xml For Dummies Chapter 6 Adding Character(S) To Xml
Xml For Dummies   Chapter 6 Adding Character(S) To XmlXml For Dummies   Chapter 6 Adding Character(S) To Xml
Xml For Dummies Chapter 6 Adding Character(S) To Xml
 
Filehandling
FilehandlingFilehandling
Filehandling
 
Unicode, PHP, and Character Set Collisions
Unicode, PHP, and Character Set CollisionsUnicode, PHP, and Character Set Collisions
Unicode, PHP, and Character Set Collisions
 
Understanding Character Encodings
Understanding Character EncodingsUnderstanding Character Encodings
Understanding Character Encodings
 
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...
 
Anton Dorfman - Reversing data formats what data can reveal
Anton Dorfman - Reversing data formats what data can revealAnton Dorfman - Reversing data formats what data can reveal
Anton Dorfman - Reversing data formats what data can reveal
 
Php Unicode I18n
Php Unicode I18nPhp Unicode I18n
Php Unicode I18n
 
ASCII-EBCDIC-HEX
ASCII-EBCDIC-HEXASCII-EBCDIC-HEX
ASCII-EBCDIC-HEX
 
Overview of character encoding
Overview of character encodingOverview of character encoding
Overview of character encoding
 
Abap slide class4 unicode-plusfiles
Abap slide class4 unicode-plusfilesAbap slide class4 unicode-plusfiles
Abap slide class4 unicode-plusfiles
 
What character is that
What character is thatWhat character is that
What character is that
 

Viewers also liked

Advanced JS Deobfuscation
Advanced JS DeobfuscationAdvanced JS Deobfuscation
Advanced JS Deobfuscation
Minded Security
 
Secure Coding - Web Application Security Vulnerabilities and Best Practices
Secure Coding - Web Application Security Vulnerabilities and Best PracticesSecure Coding - Web Application Security Vulnerabilities and Best Practices
Secure Coding - Web Application Security Vulnerabilities and Best PracticesWebsecurify
 
CODE BLUE 2014 : Joy of a bug hunter by Masato Kinugawa
CODE BLUE 2014 : Joy of a bug hunter by Masato KinugawaCODE BLUE 2014 : Joy of a bug hunter by Masato Kinugawa
CODE BLUE 2014 : Joy of a bug hunter by Masato Kinugawa
CODE BLUE
 
Bug-hunter's Sorrow
Bug-hunter's SorrowBug-hunter's Sorrow
Bug-hunter's Sorrow
Masato Kinugawa
 
NoSQL Injections in Node.js - The case of MongoDB
NoSQL Injections in Node.js - The case of MongoDBNoSQL Injections in Node.js - The case of MongoDB
NoSQL Injections in Node.js - The case of MongoDB
Sqreen
 
X-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS Filter
X-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS FilterX-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS Filter
X-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS Filter
Masato Kinugawa
 
SecurityCamp2015「バグハンティング入門」
SecurityCamp2015「バグハンティング入門」SecurityCamp2015「バグハンティング入門」
SecurityCamp2015「バグハンティング入門」
Masato Kinugawa
 

Viewers also liked (7)

Advanced JS Deobfuscation
Advanced JS DeobfuscationAdvanced JS Deobfuscation
Advanced JS Deobfuscation
 
Secure Coding - Web Application Security Vulnerabilities and Best Practices
Secure Coding - Web Application Security Vulnerabilities and Best PracticesSecure Coding - Web Application Security Vulnerabilities and Best Practices
Secure Coding - Web Application Security Vulnerabilities and Best Practices
 
CODE BLUE 2014 : Joy of a bug hunter by Masato Kinugawa
CODE BLUE 2014 : Joy of a bug hunter by Masato KinugawaCODE BLUE 2014 : Joy of a bug hunter by Masato Kinugawa
CODE BLUE 2014 : Joy of a bug hunter by Masato Kinugawa
 
Bug-hunter's Sorrow
Bug-hunter's SorrowBug-hunter's Sorrow
Bug-hunter's Sorrow
 
NoSQL Injections in Node.js - The case of MongoDB
NoSQL Injections in Node.js - The case of MongoDBNoSQL Injections in Node.js - The case of MongoDB
NoSQL Injections in Node.js - The case of MongoDB
 
X-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS Filter
X-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS FilterX-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS Filter
X-XSS-Nightmare: 1; mode=attack XSS Attacks Exploiting XSS Filter
 
SecurityCamp2015「バグハンティング入門」
SecurityCamp2015「バグハンティング入門」SecurityCamp2015「バグハンティング入門」
SecurityCamp2015「バグハンティング入門」
 

Similar to Unicode - Hacking The International Character System

Lecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.pptLecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.ppt
Alula Tafere
 
Applied physics iii lecture3 digital_codes
Applied physics iii lecture3 digital_codesApplied physics iii lecture3 digital_codes
Applied physics iii lecture3 digital_codesJaphet Munnah
 
expect("").length.toBe(1)
expect("").length.toBe(1)expect("").length.toBe(1)
expect("").length.toBe(1)
Philip Hofstetter
 
C101 – Intro to Programming with C
C101 – Intro to Programming with CC101 – Intro to Programming with C
C101 – Intro to Programming with C
gpsoft_sk
 
Comprehasive Exam - IT
Comprehasive Exam - ITComprehasive Exam - IT
Comprehasive Exam - IT
guest6ddfb98
 
Pipiot - the double-architecture shellcode constructor
Pipiot - the double-architecture shellcode constructorPipiot - the double-architecture shellcode constructor
Pipiot - the double-architecture shellcode constructor
Moshe Zioni
 
COMPUTER INTRODUCTION
COMPUTER INTRODUCTIONCOMPUTER INTRODUCTION
COMPUTER INTRODUCTION
Amit Sharma
 
Unicode 101
Unicode 101Unicode 101
Unicode 101
davidfstr
 
Character Encoding issue with PHP
Character Encoding issue with PHPCharacter Encoding issue with PHP
Character Encoding issue with PHP
Ravi Raj
 
U-SQL Reading & Writing Files (SQLBits 2016)
U-SQL Reading & Writing Files (SQLBits 2016)U-SQL Reading & Writing Files (SQLBits 2016)
U-SQL Reading & Writing Files (SQLBits 2016)
Michael Rys
 
presentation_python_7_1569170870_375360.pptx
presentation_python_7_1569170870_375360.pptxpresentation_python_7_1569170870_375360.pptx
presentation_python_7_1569170870_375360.pptx
ansariparveen06
 
Unicode
UnicodeUnicode
Unicode
ESUG
 
Character encoding and unicode format
Character encoding and unicode formatCharacter encoding and unicode format
Character encoding and unicode format
AdityaSharma1452
 
introduction for computers
introduction for computersintroduction for computers
introduction for computers
Yogesh Chaure
 
Intro compute
Intro computeIntro compute
Intro compute
Usman Shah
 
Intro compute
Intro computeIntro compute
Intro compute
GHOTRAANGEL
 
Intro computeRRR
Intro computeRRRIntro computeRRR
Intro computeRRR
GHOTRAANGEL
 
Lesson4.2 u4 l1 binary squences
Lesson4.2 u4 l1 binary squencesLesson4.2 u4 l1 binary squences
Lesson4.2 u4 l1 binary squences
Lexume1
 

Similar to Unicode - Hacking The International Character System (20)

Lecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.pptLecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.ppt
 
Using unicode with php
Using unicode with phpUsing unicode with php
Using unicode with php
 
Applied physics iii lecture3 digital_codes
Applied physics iii lecture3 digital_codesApplied physics iii lecture3 digital_codes
Applied physics iii lecture3 digital_codes
 
expect("").length.toBe(1)
expect("").length.toBe(1)expect("").length.toBe(1)
expect("").length.toBe(1)
 
C101 – Intro to Programming with C
C101 – Intro to Programming with CC101 – Intro to Programming with C
C101 – Intro to Programming with C
 
Comprehasive Exam - IT
Comprehasive Exam - ITComprehasive Exam - IT
Comprehasive Exam - IT
 
Pipiot - the double-architecture shellcode constructor
Pipiot - the double-architecture shellcode constructorPipiot - the double-architecture shellcode constructor
Pipiot - the double-architecture shellcode constructor
 
COMPUTER INTRODUCTION
COMPUTER INTRODUCTIONCOMPUTER INTRODUCTION
COMPUTER INTRODUCTION
 
C# basics...
C# basics...C# basics...
C# basics...
 
Unicode 101
Unicode 101Unicode 101
Unicode 101
 
Character Encoding issue with PHP
Character Encoding issue with PHPCharacter Encoding issue with PHP
Character Encoding issue with PHP
 
U-SQL Reading & Writing Files (SQLBits 2016)
U-SQL Reading & Writing Files (SQLBits 2016)U-SQL Reading & Writing Files (SQLBits 2016)
U-SQL Reading & Writing Files (SQLBits 2016)
 
presentation_python_7_1569170870_375360.pptx
presentation_python_7_1569170870_375360.pptxpresentation_python_7_1569170870_375360.pptx
presentation_python_7_1569170870_375360.pptx
 
Unicode
UnicodeUnicode
Unicode
 
Character encoding and unicode format
Character encoding and unicode formatCharacter encoding and unicode format
Character encoding and unicode format
 
introduction for computers
introduction for computersintroduction for computers
introduction for computers
 
Intro compute
Intro computeIntro compute
Intro compute
 
Intro compute
Intro computeIntro compute
Intro compute
 
Intro computeRRR
Intro computeRRRIntro computeRRR
Intro computeRRR
 
Lesson4.2 u4 l1 binary squences
Lesson4.2 u4 l1 binary squencesLesson4.2 u4 l1 binary squences
Lesson4.2 u4 l1 binary squences
 

More from Websecurify

Security Challenges in Node.js
Security Challenges in Node.jsSecurity Challenges in Node.js
Security Challenges in Node.js
Websecurify
 
Next Generation of Web Application Security Tools
Next Generation of Web Application Security ToolsNext Generation of Web Application Security Tools
Next Generation of Web Application Security Tools
Websecurify
 
Web Application Security 101 - 14 Data Validation
Web Application Security 101 - 14 Data ValidationWeb Application Security 101 - 14 Data Validation
Web Application Security 101 - 14 Data Validation
Websecurify
 
Web Application Security 101 - 12 Logging
Web Application Security 101 - 12 LoggingWeb Application Security 101 - 12 Logging
Web Application Security 101 - 12 Logging
Websecurify
 
Web Application Security 101 - 10 Server Tier
Web Application Security 101 - 10 Server TierWeb Application Security 101 - 10 Server Tier
Web Application Security 101 - 10 Server Tier
Websecurify
 
Web Application Security 101 - 07 Session Management
Web Application Security 101 - 07 Session ManagementWeb Application Security 101 - 07 Session Management
Web Application Security 101 - 07 Session Management
Websecurify
 
Web Application Security 101 - 06 Authentication
Web Application Security 101 - 06 AuthenticationWeb Application Security 101 - 06 Authentication
Web Application Security 101 - 06 Authentication
Websecurify
 
Web Application Security 101 - 05 Enumeration
Web Application Security 101 - 05 EnumerationWeb Application Security 101 - 05 Enumeration
Web Application Security 101 - 05 Enumeration
Websecurify
 
Web Application Security 101 - 04 Testing Methodology
Web Application Security 101 - 04 Testing MethodologyWeb Application Security 101 - 04 Testing Methodology
Web Application Security 101 - 04 Testing Methodology
Websecurify
 
Web Application Security 101 - 03 Web Security Toolkit
Web Application Security 101 - 03 Web Security ToolkitWeb Application Security 101 - 03 Web Security Toolkit
Web Application Security 101 - 03 Web Security Toolkit
Websecurify
 
Web Application Security 101 - 02 The Basics
Web Application Security 101 - 02 The BasicsWeb Application Security 101 - 02 The Basics
Web Application Security 101 - 02 The Basics
Websecurify
 

More from Websecurify (11)

Security Challenges in Node.js
Security Challenges in Node.jsSecurity Challenges in Node.js
Security Challenges in Node.js
 
Next Generation of Web Application Security Tools
Next Generation of Web Application Security ToolsNext Generation of Web Application Security Tools
Next Generation of Web Application Security Tools
 
Web Application Security 101 - 14 Data Validation
Web Application Security 101 - 14 Data ValidationWeb Application Security 101 - 14 Data Validation
Web Application Security 101 - 14 Data Validation
 
Web Application Security 101 - 12 Logging
Web Application Security 101 - 12 LoggingWeb Application Security 101 - 12 Logging
Web Application Security 101 - 12 Logging
 
Web Application Security 101 - 10 Server Tier
Web Application Security 101 - 10 Server TierWeb Application Security 101 - 10 Server Tier
Web Application Security 101 - 10 Server Tier
 
Web Application Security 101 - 07 Session Management
Web Application Security 101 - 07 Session ManagementWeb Application Security 101 - 07 Session Management
Web Application Security 101 - 07 Session Management
 
Web Application Security 101 - 06 Authentication
Web Application Security 101 - 06 AuthenticationWeb Application Security 101 - 06 Authentication
Web Application Security 101 - 06 Authentication
 
Web Application Security 101 - 05 Enumeration
Web Application Security 101 - 05 EnumerationWeb Application Security 101 - 05 Enumeration
Web Application Security 101 - 05 Enumeration
 
Web Application Security 101 - 04 Testing Methodology
Web Application Security 101 - 04 Testing MethodologyWeb Application Security 101 - 04 Testing Methodology
Web Application Security 101 - 04 Testing Methodology
 
Web Application Security 101 - 03 Web Security Toolkit
Web Application Security 101 - 03 Web Security ToolkitWeb Application Security 101 - 03 Web Security Toolkit
Web Application Security 101 - 03 Web Security Toolkit
 
Web Application Security 101 - 02 The Basics
Web Application Security 101 - 02 The BasicsWeb Application Security 101 - 02 The Basics
Web Application Security 101 - 02 The Basics
 

Recently uploaded

Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
Max Andersen
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
Boni García
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
Drona Infotech
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
Google
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...
Crescat
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
Rakesh Kumar R
 
Graspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code AnalysisGraspan: A Big Data System for Big Code Analysis
Graspan: A Big Data System for Big Code Analysis
Aftab Hussain
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
Aftab Hussain
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
Łukasz Chruściel
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
Google
 
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket ManagementUtilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate provides Smarter, Better, Faster, Safer Locate Ticket Management
Utilocate
 
Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604Orion Context Broker introduction 20240604
Orion Context Broker introduction 20240604
Fermin Galan
 
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOMLORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
LORRAINE ANDREI_LEQUIGAN_HOW TO USE ZOOM
lorraineandreiamcidl
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
Hornet Dynamics
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
Alina Yurenko
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke
 
Empowering Growth with Best Software Development Company in Noida - Deuglo
Empowering Growth with Best Software  Development Company in Noida - DeugloEmpowering Growth with Best Software  Development Company in Noida - Deuglo
Empowering Growth with Best Software Development Company in Noida - Deuglo
Deuglo Infosystem Pvt Ltd
 
Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"Navigating the Metaverse: A Journey into Virtual Evolution"
Navigating the Metaverse: A Journey into Virtual Evolution"
Donna Lenk
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
Ayan Halder
 

Recently uploaded (20)

Quarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden ExtensionsQuarkus Hidden and Forbidden Extensions
Quarkus Hidden and Forbidden Extensions
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)APIs for Browser Automation (MoT Meetup 2024)
APIs for Browser Automation (MoT Meetup 2024)
 
Mobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona InfotechMobile App Development Company In Noida | Drona Infotech
Mobile App Development Company In Noida | Drona Infotech
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...

Unicode - Hacking The International Character System

  • 2. Introduction
      • Standard for representing text in most of the world’s writing systems
      • The most recent version at the time of writing is Unicode 6.0
      • Widely adopted by most programming platforms, operating systems and the Web
      • The most widely used Unicode encodings are UTF-8 and UTF-16
  • 3. Introduction to UTF-8
      • UTF-8 (UCS Transformation Format, 8-bit)
      • Backwards compatible with ASCII
      • Simple ASCII chars are represented by a single byte
      • Other characters take 2 to 4 bytes; the original design allowed up to 31 bits spread across 6 bytes, but the current standard (RFC 3629) stops at 4 bytes, which covers every code point up to U+10FFFF
  • 4. UTF-8 Encoding Table
      Bits  Byte 1    Byte 2    Byte 3    Byte 4    Byte 5    Byte 6
      7     0XXXXXXX
      11    110XXXXX  10XXXXXX
      16    1110XXXX  10XXXXXX  10XXXXXX
      21    11110XXX  10XXXXXX  10XXXXXX  10XXXXXX
      26    111110XX  10XXXXXX  10XXXXXX  10XXXXXX  10XXXXXX
      31    1111110X  10XXXXXX  10XXXXXX  10XXXXXX  10XXXXXX  10XXXXXX
      • The 26-bit and 31-bit rows are the legacy 5- and 6-byte forms
  • 5. UTF-8 Encoding Rules
      • Every ASCII character is also a valid UTF-8 character (up to 7 bits or 128 characters)
      • For every other UTF-8 byte sequence, the first byte indicates the length of the sequence in bytes
      • The remaining bytes of the sequence have 10 as their two most significant bits
      • This makes it easy to find where a byte sequence starts and ends
      • There are more rules but this is a good start...
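The rules above can be sketched as a small hand-written encoder. This is only an illustration, not how any particular library does it: the names `utf8_encode` and `sequence_length` are made up for this example, and the sketch covers only the 1- to 4-byte forms that current UTF-8 permits.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one code point using the 1- to 4-byte rows of the encoding table."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates cannot be encoded in UTF-8")
    if code_point < 0x80:       # 7 bits:  0XXXXXXX
        return bytes([code_point])
    if code_point < 0x800:      # 11 bits: 110XXXXX 10XXXXXX
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:    # 16 bits: 1110XXXX 10XXXXXX 10XXXXXX
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 21 bits: 11110XXX 10XXXXXX 10XXXXXX 10XXXXXX
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def sequence_length(first_byte: int) -> int:
    """Read the length of a sequence from its leading byte."""
    if first_byte < 0x80:
        return 1
    if first_byte >> 5 == 0b110:
        return 2
    if first_byte >> 4 == 0b1110:
        return 3
    if first_byte >> 3 == 0b11110:
        return 4
    raise ValueError("continuation byte or invalid leading byte")


# The hand-written encoder agrees with Python's built-in codec.
for cp in (0x41, 0xE9, 0x200E, 0x1F600):
    assert utf8_encode(cp) == chr(cp).encode("utf-8")
```

A byte starting with 10 is rejected by `sequence_length`, which is exactly the property that lets a reader resynchronise in the middle of a stream.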
  • 6. Interesting UTF-8 Characters
      • Unicode also defines many formatting characters, shown here with their UTF-8 bytes
      • Byte Order Mark (BOM): 0xEF, 0xBB, 0xBF, placed at the start of the document to indicate UTF-8
      • Left to Right Mark (LRM): 0xE2, 0x80, 0x8E, placed to indicate text orientation
          • In HTML: &#x200E;, &#8206; or &lrm;
      • Right to Left Mark (RLM): 0xE2, 0x80, 0x8F, placed to indicate text orientation
          • In HTML: &#x200F;, &#8207; or &rlm;
      • Left to Right Embedding (LRE): 0xE2, 0x80, 0xAA
          • In HTML: &#x202A;
      • Right to Left Embedding (RLE): 0xE2, 0x80, 0xAB
          • In HTML: &#x202B;
      • There are more...
  • 7. Clarifications
      • How exactly does the hex sequence 0xE2, 0x80, 0x8E map to &#x200E; in HTML?
      • 0xE2, 0x80, 0x8E is the UTF-8 encoding
      • &#x200E; names the code point U+200E, which is 0x20, 0x0E in UTF-16
      • Also known as 0x0000200E in UTF-32
      • There is no magic! You simply need to know which encoding you are working with and which characters it supports.
      • http://www.decodeunicode.org is a good reference
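A quick way to check this relationship is to encode the same code point three ways with Python's standard codecs. The sketch below assumes big-endian byte order for UTF-16 and UTF-32 so that the bytes match the values quoted in the slide.

```python
import html

lrm = "\u200e"  # LEFT-TO-RIGHT MARK, code point U+200E

# One code point, three encodings.
print(lrm.encode("utf-8").hex(" "))      # e2 80 8e
print(lrm.encode("utf-16-be").hex(" "))  # 20 0e
print(lrm.encode("utf-32-be").hex(" "))  # 00 00 20 0e

# The HTML character references all name the same code point.
references = ["&#x200E;", "&#8206;", "&lrm;"]
decoded = [html.unescape(ref) for ref in references]
print(all(ch == lrm for ch in decoded))  # True
```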
  • 8. Multiple Representations
      • The same character can be represented in multiple ways
      • For example, . (DOT) is represented as 0x2E
      • A decoder that does not reject overlong forms also reads each of the following as a dot, even though the standard forbids them:
          • 0xC0, 0xAE
          • 0xE0, 0x80, 0xAE
          • 0xF0, 0x80, 0x80, 0xAE
          • 0xF8, 0x80, 0x80, 0x80, 0xAE
          • 0xFC, 0x80, 0x80, 0x80, 0x80, 0xAE
  • 9. Translating the . (DOT)
      Hex                Byte 1    Byte 2    Byte 3    Byte 4    Byte 5    Byte 6
      2E                 00101110
      C0 AE              11000000  10101110
      E0 80 AE           11100000  10000000  10101110
      F0 80 80 AE        11110000  10000000  10000000  10101110
      F8 80 80 80 AE     11111000  10000000  10000000  10000000  10101110
      FC 80 80 80 80 AE  11111100  10000000  10000000  10000000  10000000  10101110
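To see why every row ends up as the same dot, here is a minimal sketch of a deliberately permissive decoder. `lenient_decode` is a hypothetical helper written for this example: it strips the prefix bits and concatenates the payload bits without checking for overlong forms, which is the mistake the slides are describing.

```python
def lenient_decode(data: bytes) -> int:
    """Decode one sequence WITHOUT rejecting overlong forms (unsafe on purpose)."""
    first = data[0]
    if first < 0x80:
        return first
    # Count the leading 1 bits of the first byte to get the sequence length.
    length = 0
    mask = 0x80
    while first & mask:
        length += 1
        mask >>= 1
    if length < 2 or length > 6 or len(data) != length:
        raise ValueError("malformed sequence")
    value = first & (mask - 1)          # payload bits of the leading byte
    for byte in data[1:]:
        if byte & 0xC0 != 0x80:         # continuation bytes start with 10
            raise ValueError("malformed continuation byte")
        value = (value << 6) | (byte & 0x3F)
    return value


forms = [
    bytes([0x2E]),
    bytes([0xC0, 0xAE]),
    bytes([0xE0, 0x80, 0xAE]),
    bytes([0xF0, 0x80, 0x80, 0xAE]),
    bytes([0xF8, 0x80, 0x80, 0x80, 0xAE]),
    bytes([0xFC, 0x80, 0x80, 0x80, 0x80, 0xAE]),
]
print([hex(lenient_decode(form)) for form in forms])  # six times '0x2e'

# A strict decoder, such as Python's built-in one, refuses the overlong forms.
try:
    bytes([0xC0, 0xAE]).decode("utf-8")
except UnicodeDecodeError as error:
    print("rejected:", error.reason)
```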
  • 10. Half and Full Width Forms
      • Graphic characters are traditionally classed as halfwidth or fullwidth
      • In a fixed-width font, a halfwidth character takes half the width of a fullwidth character
      • Unicode contains characters in both their halfwidth and fullwidth forms
      • http://www.unicode.org/charts/PDF/UFF00.pdf for more information
  • 11. Fullwidth Latin Characters
      • Halfwidth and fullwidth notations make sense for characters such as those found in the Japanese and Chinese character sets
      • The specification also includes Latin characters in fullwidth form
      • As a result the following mappings are possible
          • A - 0x41 (halfwidth) = Ａ - 0xEF, 0xBC, 0xA1 (fullwidth)
          • B - 0x42 (halfwidth) = Ｂ - 0xEF, 0xBC, 0xA2 (fullwidth)
          • etc.
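The mapping can be observed with Python's `unicodedata` module, which folds fullwidth forms to their ordinary counterparts under NFKC normalization. This is a sketch of one way to see the equivalence; `fold_width` is a name invented for this example, and NFKC folds more than width differences.

```python
import unicodedata


def fold_width(text: str) -> str:
    """Apply NFKC normalization, which (among other things) folds fullwidth forms."""
    return unicodedata.normalize("NFKC", text)


fullwidth_ab = "\uff21\uff22"          # FULLWIDTH LATIN CAPITAL LETTER A and B
print(fullwidth_ab.encode("utf-8").hex(" "))  # ef bc a1 ef bc a2
print(fold_width(fullwidth_ab))               # AB

# The same folding applies to punctuation, which matters for input filters:
fullwidth_dots = "\uff0e\uff0e\uff0f"  # fullwidth full stop, full stop, solidus
print(fold_width(fullwidth_dots))             # ../
```

A filter that inspects the text before this kind of normalization, while a later layer normalizes it, sees a different string from the one that is finally used.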
  • 12. Security Considerations
      • Visual Security Issues
          • Internationalized names
          • Left to Right and Right to Left representations
      • Charset Translation Issues
          • Occur when strings are normalized before and after translation between character sets
      • Characters in multiple representation
          • The same character can be represented in multiple ways
  • 13. Case Study: Windows Filename Mangling
      • Consider the following files, where [RTLO] and [LTRO] stand for the right-to-left override (U+202E) and left-to-right override (U+202D) control characters
          • [RTLO]cod.stnemucodtnatropmi.exe
          • [RTLO]cod.yrammusevituc[LTRO]n1c[LTRO].exe
          • [RTLO]gpj.!nuf_stohsnee[LTRO]n1c[LTRO].scr
      • When displayed, these file names look different
          • exe.importantdocuments.doc
          • n1c.executivesummary.doc
          • n1c.screenshots_fun!.jpg
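A defensive check for this trick can be sketched as a scan for bidirectional control characters. The set below and the helper name `has_bidi_controls` are choices made for this illustration, not an exhaustive or standard list, and the simple string reversal only approximates what a renderer shows for this plain ASCII name.

```python
# Explicit directional marks, embeddings, overrides and isolates.
BIDI_CONTROLS = {
    "\u200e", "\u200f",                                # LRM, RLM
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
}


def has_bidi_controls(name: str) -> bool:
    """Return True if the name contains any bidirectional control character."""
    return any(ch in BIDI_CONTROLS for ch in name)


# First file name from the slide, with [RTLO] written as U+202E.
stored_name = "\u202ecod.stnemucodtnatropmi.exe"

print(stored_name.endswith(".exe"))    # True: the system still sees an .exe
print(stored_name[1:][::-1])           # exe.importantdocuments.doc
print(has_bidi_controls(stored_name))  # True: flag or reject this name
```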
  • 14. Case Study: The PAYPAL Scam
      • What is the difference between paypal.com and paypai.com or between intel.com and lntel.com?
      • How about citybank.com?
      • 0000000: d181 6974 7962 616e 6b2e 636f 6d  ..itybank.com
      • 0xd1, 0x81 is the Cyrillic letter с (U+0441), which looks like the Latin letter c (0x63) although they are different characters
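The hex dump above can be decoded and inspected with the standard library. This is a small sketch; `describe_non_ascii` is a hypothetical helper, and listing non-ASCII characters is only a first-pass check, not a complete homograph defence.

```python
import unicodedata

# The bytes from the hex dump in the slide.
spoofed = bytes.fromhex("d18169747962616e6b2e636f6d").decode("utf-8")
genuine = "citybank.com"


def describe_non_ascii(name: str) -> list[tuple[str, str]]:
    """List every non-ASCII character in the name with its Unicode name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
            for ch in name if ord(ch) > 0x7F]


print(spoofed == genuine)           # False, although the two look alike on screen
print(describe_non_ascii(spoofed))  # [('U+0441', 'CYRILLIC SMALL LETTER ES')]
print(describe_non_ascii(genuine))  # []
```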
  • 15. Case Study: Directory Traversal
      • Let’s say an application shows images by requesting /getimage.jsp?name=image.jpg
      • The attacker tries to retrieve an arbitrary file by requesting /getimage.jsp?name=../../../../boot.ini
      • Unfortunately the attack fails because the application checks for the presence of the ../ character sequence
      • ../ is 0x2E, 0x2E, 0x2F in hex
      • ../ can also be written as 0x2E, 0xC0, 0xAE, 0x2F using an overlong UTF-8 dot
      • Since 0x2E, 0xC0, 0xAE, 0x2F is not equal to 0x2E, 0x2E, 0x2F, the byte-level check is bypassed; a lenient decoder later turns 0xC0, 0xAE back into a dot and the file content is retrieved
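The bypass and one possible fix can be sketched as follows. Both functions are hypothetical stand-ins for the application's check: `naive_filter` compares raw bytes, as the slide describes, while `safe_name` decodes strictly first and validates the decoded text afterwards. Real code would also need to handle URL decoding and path canonicalisation, which are left out here.

```python
def naive_filter(raw: bytes) -> bool:
    """Return True if the raw bytes pass a byte-level check for '../'."""
    return b"../" not in raw


def safe_name(raw: bytes) -> str:
    """Decode strictly, then validate the decoded text."""
    text = raw.decode("utf-8")  # raises UnicodeDecodeError on overlong forms
    if "../" in text or "..\\" in text:
        raise ValueError("path traversal attempt")
    return text


plain_attack = b"../boot.ini"
overlong_attack = b".\xc0\xae/boot.ini"  # second dot written as 0xC0, 0xAE

print(naive_filter(plain_attack))     # False: caught by the byte check
print(naive_filter(overlong_attack))  # True: the byte check is bypassed

try:
    safe_name(overlong_attack)
except UnicodeDecodeError:
    print("overlong sequence rejected before any path check")
```

The design choice is to validate the same representation that is later used: decode once, strictly, and run every check on the decoded string.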
  • 16. References
      • http://en.wikipedia.org/wiki/Mapping_of_Unicode_characters
      • http://decodeunicode.org
      • http://unicode.org/reports/tr36/
      • http://www.fileformat.info
      • http://blog.commtouch.com/cafe/email-security-news/using-unicode-to-trick-users-to-install-malware/
      • https://dc414.org/wp-content/uploads/2011/01/righttoleften-override.pdf
      • http://norman.com/security_center/security_center_archive/2011/rtlo_unicode_hole/
      • http://en.wikipedia.org/wiki/Halfwidth_and_Fullwidth_Forms
      • http://www.unicode.org/charts/PDF/UFF00.pdf