In this presentation we explore some of the problems of unicode and how they can be used for nefarious purposes in order to exploit a range of critical vulnerabilities including SQL Injection, XSS and many other.
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6Andrei Zmievski
n the halcyon days of early 2005, a project was launched to bring long overdue native Unicode and internationalization support to PHP. It was deemed so far reaching and important that PHP needed to have a version bump. After more than 4 years of development, the project (and PHP 6 for now) was shelved. This talk will introduce Unicode and i18n concepts, explain why Web needs Unicode, why PHP needs Unicode, how we tried to solve it (with examples), and what eventually happened. No sordid details will be left uncovered.
Data encryption and tokenization for international unicodeUlf Mattsson
Unicode is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard is maintained by the Unicode Consortium, and as of March 2020, it has a total of 143,859 characters, with Unicode 13.0 (these characters consist of 143,696 graphic characters and 163 format characters) covering 154 modern and historic scripts, as well as multiple symbol sets and emoji. The character repertoire of the Unicode Standard is synchronized with ISO/IEC 10646, each being code-for-code identical with the other.
The Unicode Standard consists of a set of code charts for visual reference, an encoding method and set of standard character encodings, a set of reference data files, and a number of related items, such as character properties, rules for normalization, decomposition, collation, rendering, and bidirectional text display order (for the correct display of text containing both right-to-left scripts, such as Arabic and Hebrew, and left-to-right scripts). Unicode's success at unifying character sets has led to its widespread and predominant use in the internationalization and localization of computer software. The standard has been implemented in many recent technologies, including modern operating systems, XML, Java (and other programming languages), and the .NET Framework.
Unicode can be implemented by different character encodings. The Unicode standard defines Unicode Transformation Formats (UTF) UTF-8, UTF-16, and UTF-32, and several other encodings. The most commonly used encodings are UTF-8, UTF-16, and UCS-2 (a precursor of UTF-16 without full support for Unicode)
Jun 29 new privacy technologies for unicode and international data standards ...Ulf Mattsson
Protecting the increasing use International Unicode characters is required by a growing number of Privacy Laws in many countries and general Privacy Concerns with private data. Current approaches to protect International Unicode characters will increase the size and change the data formats. This will break many applications and slow down business operations. The current approach is also randomly returning data in new and unexpected languages. New approach with significantly higher performance and a memory footprint can be customizable and fit on small IoT devices.
We will discuss new approaches to achieve portability, security, performance, small memory footprint and language preservation for privacy protecting of Unicode data. These new approaches provide granular protection for all Unicode languages and customizable alphabets and byte length preserving protection of privacy protected characters.
Old Approaches
Major Issues
Protecting the increasing use International Unicode characters is required by a growing number of Privacy Laws in many countries and general Privacy Concerns with private data.
Old approaches to protect International Unicode characters will typically increase the size and change the data formats.
This will break many applications and slow down business operations. This is an example of an old approach that is also randomly returning data in new and unexpected languages
our application is great – and popular. You have translation efforts underway, everything is going well – and wait a minute, what’s the report of strange question mark characters all over the page? Unicode is pain. UTF-32, UTF-16, UTF-8 and then something else is thrown in the mix … Multibyte and codepoints, it all sounds like greek. But it doesn’t have to be so scary. PHP support for Unicode has been improving, even without native unicode string support. Learn the basics of unicode is and how it works, why you would add support for it in your application, how to deal with issues, and the pain points of implementation.
Unicode, PHP, and Character Set CollisionsRay Paseur
In recent years UTF-8 has become the dominant character encoding scheme, supplanting extended ASCII. This has led to an uneasy transition for users of PHP, where the assumption has always been that one character equals one byte. This presentation is for the DC PHP Developers' Community meeting on September 10, 2014. It examines the history of character set encoding and the ways that the PHP community is responding to the transition to UTF-8. Not surprisingly, there are surprises in the process! The slides are derived from the article here:
http://iconoun.com/articles/collisions
A character is a sign or a symbol in a writing system. In computing a character can be, a letter, a digit, a punctuation or mathematical symbol or a control character.Computers only understand binary data. To represents the characters as required by human languages, the concept of character sets was introduced. In this PPT I have explained the charactor encoding. More info: http://mobisoftinfotech.com/resources/media/understanding-character-encodings
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...stepheneisenhauer
An informal introduction to file format identification using tools like "file" and DROID, originally presented internally as a workshop at UNT Libraries.
Character sets and collations are am important part of the database setup. In this presentation I show you the history of character sets and how they are used today, how UTF-8 works and how MySQL handles all this.
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6Andrei Zmievski
n the halcyon days of early 2005, a project was launched to bring long overdue native Unicode and internationalization support to PHP. It was deemed so far reaching and important that PHP needed to have a version bump. After more than 4 years of development, the project (and PHP 6 for now) was shelved. This talk will introduce Unicode and i18n concepts, explain why Web needs Unicode, why PHP needs Unicode, how we tried to solve it (with examples), and what eventually happened. No sordid details will be left uncovered.
Data encryption and tokenization for international unicodeUlf Mattsson
Unicode is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard is maintained by the Unicode Consortium, and as of March 2020, it has a total of 143,859 characters, with Unicode 13.0 (these characters consist of 143,696 graphic characters and 163 format characters) covering 154 modern and historic scripts, as well as multiple symbol sets and emoji. The character repertoire of the Unicode Standard is synchronized with ISO/IEC 10646, each being code-for-code identical with the other.
The Unicode Standard consists of a set of code charts for visual reference, an encoding method and set of standard character encodings, a set of reference data files, and a number of related items, such as character properties, rules for normalization, decomposition, collation, rendering, and bidirectional text display order (for the correct display of text containing both right-to-left scripts, such as Arabic and Hebrew, and left-to-right scripts). Unicode's success at unifying character sets has led to its widespread and predominant use in the internationalization and localization of computer software. The standard has been implemented in many recent technologies, including modern operating systems, XML, Java (and other programming languages), and the .NET Framework.
Unicode can be implemented by different character encodings. The Unicode standard defines Unicode Transformation Formats (UTF) UTF-8, UTF-16, and UTF-32, and several other encodings. The most commonly used encodings are UTF-8, UTF-16, and UCS-2 (a precursor of UTF-16 without full support for Unicode)
Jun 29 new privacy technologies for unicode and international data standards ...Ulf Mattsson
Protecting the increasing use International Unicode characters is required by a growing number of Privacy Laws in many countries and general Privacy Concerns with private data. Current approaches to protect International Unicode characters will increase the size and change the data formats. This will break many applications and slow down business operations. The current approach is also randomly returning data in new and unexpected languages. New approach with significantly higher performance and a memory footprint can be customizable and fit on small IoT devices.
We will discuss new approaches to achieve portability, security, performance, small memory footprint and language preservation for privacy protecting of Unicode data. These new approaches provide granular protection for all Unicode languages and customizable alphabets and byte length preserving protection of privacy protected characters.
Old Approaches
Major Issues
Protecting the increasing use International Unicode characters is required by a growing number of Privacy Laws in many countries and general Privacy Concerns with private data.
Old approaches to protect International Unicode characters will typically increase the size and change the data formats.
This will break many applications and slow down business operations. This is an example of an old approach that is also randomly returning data in new and unexpected languages
our application is great – and popular. You have translation efforts underway, everything is going well – and wait a minute, what’s the report of strange question mark characters all over the page? Unicode is pain. UTF-32, UTF-16, UTF-8 and then something else is thrown in the mix … Multibyte and codepoints, it all sounds like greek. But it doesn’t have to be so scary. PHP support for Unicode has been improving, even without native unicode string support. Learn the basics of unicode is and how it works, why you would add support for it in your application, how to deal with issues, and the pain points of implementation.
Unicode, PHP, and Character Set CollisionsRay Paseur
In recent years UTF-8 has become the dominant character encoding scheme, supplanting extended ASCII. This has led to an uneasy transition for users of PHP, where the assumption has always been that one character equals one byte. This presentation is for the DC PHP Developers' Community meeting on September 10, 2014. It examines the history of character set encoding and the ways that the PHP community is responding to the transition to UTF-8. Not surprisingly, there are surprises in the process! The slides are derived from the article here:
http://iconoun.com/articles/collisions
A character is a sign or a symbol in a writing system. In computing a character can be, a letter, a digit, a punctuation or mathematical symbol or a control character.Computers only understand binary data. To represents the characters as required by human languages, the concept of character sets was introduced. In this PPT I have explained the charactor encoding. More info: http://mobisoftinfotech.com/resources/media/understanding-character-encodings
Digging into File Formats: Poking around at data using file, DROID, JHOVE, an...stepheneisenhauer
An informal introduction to file format identification using tools like "file" and DROID, originally presented internally as a workshop at UNT Libraries.
Character sets and collations are am important part of the database setup. In this presentation I show you the history of character sets and how they are used today, how UTF-8 works and how MySQL handles all this.
CODE BLUE 2014 : Joy of a bug hunter by Masato KinugawaCODE BLUE
Recently, The number of enterprise which pays rewards for reporting security bugs is increasing. I am also received a large amount of rewards through the reward programs for reporting bugs. Actually, I earn a living with rewards, so it is not exaggeration to say that I am a professional bug hunter. I will make a speech such as how to be a professional bug hunter, actual of rules from the point of view of a positive attendance and how to discover vulnerabilities including technical topics.
My talks at Voxxed Days Zurich 2016. This is about he history of character encodings and unicode. And it's all about APIs stuck in the 90ies where things were very different.
Pipiot - the double-architecture shellcode constructorMoshe Zioni
Presentation Abstract:
When compiling shellcode - it is always constrained to what architecture you are intended it to run on. So with that thought in mind - I started my latest challenge/research/journey into assembly polyglotism - focusing on the two top main architectures around - x86 and ARM.
Through research, sleepless nights and a lot of coffee (thank you, coffee) I found out that it is, indeed, possible - and, while exploring different routes and directions, devised a constructive, repeatable method.
The Pipiot method is a constructuive way to break the limitation enforced by previously known shellcode construction and make a payload that can run on more than only one architecture of choice.
In this session you will learn on how this system works and how to apply its logic to your exploit payload construction, discuss possible impacts, strong and weak points of the method and of course - all provided with a follow-through live demo.
I hope You all like it. I hope It is very beneficial for you all. I really thought that you all get enough knowledge from this presentation. This presentation is about materials and their classifications. After you read this presentation you knowledge is not as before.
The following illustrates some of the common security challanges Node.js developers are up against. The presentation covers various types of JavaScript-related hacks and NoSQL injection hacking via Express and MongoDB.
Next Generation of Web Application Security ToolsWebsecurify
In this presentation we explore what makes Websecurify Suite unique. There are a few demos of Websecurify Suite itself and Cohesion - Websecurify's continuous integration security toolkit.
Web Application Security 101 - 14 Data ValidationWebsecurify
In part 14 of Web Application Security 101 you will learn about SQL Injection, Cross-site Scripting, Local File Includes and other common types of data validation problems.
Web Application Security 101 - 10 Server TierWebsecurify
In part 10 of Web Application Security 101 we explore the various issues that effect the server tier such as default files, default configuration, misconfigured insecure servers and more.
Web Application Security 101 - 07 Session ManagementWebsecurify
In part 7 of Web Application Security 101 we will explore the various security aspects of modern session management systems. We will particularly explore vulnerabilities such as weak session management and more. We will also look into session bruteforce attacks
Web Application Security 101 - 06 AuthenticationWebsecurify
In part 6 of Web Application Security 101 we will look into vulnerabilities effecting the authentication system. You will learn about password bruteforce attacks, cracking captures, bypassing the login system and more.
Web Application Security 101 - 05 EnumerationWebsecurify
In part 5 of Web Application Security 101 we will dive into the various enumeration techniques attackers use to fingerprint web applications. This steps is very important because it gives a lot of insight about weak areas that can be exploited at later stage. You will learn about fingerprinting software versions and firewalls, discovering virtual hosts, google hacking and more.
Web Application Security 101 - 04 Testing MethodologyWebsecurify
In part 4 of Web Application Security 101 we will dive deep into the standard testing methodology used by penetration testers and vulnerability researchers when testing web application for security vulnerabilities.
Web Application Security 101 - 03 Web Security ToolkitWebsecurify
In part 3 of Web Application Security 101 you will get introduced to the standard security toolkit. You will get access to Websecurify Suite to start hacking your way through the rest of the course.
Web Application Security 101 - 02 The BasicsWebsecurify
In part 2 of Web Application Security 101 we cover the basics of HTTP, HTML, XML, JSON, JavaScript, CSS and more in order to get you up to speed with the technology. This knowledge will be used during the rest of the course to explore the various security aspects effecting web applications today.
Quarkus Hidden and Forbidden ExtensionsMax Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Mobile App Development Company In Noida | Drona InfotechDrona Infotech
Looking for a reliable mobile app development company in Noida? Look no further than Drona Infotech. We specialize in creating customized apps for your business needs.
Visit Us For : https://www.dronainfotech.com/mobile-application-development/
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteGoogle
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
👉👉 Click Here To Get More Info 👇👇
https://sumonreview.com/ai-pilot-review/
AI Pilot Review: Key Features
✅Deploy AI expert bots in Any Niche With Just A Click
✅With one keyword, generate complete funnels, websites, landing pages, and more.
✅More than 85 AI features are included in the AI pilot.
✅No setup or configuration; use your voice (like Siri) to do whatever you want.
✅You Can Use AI Pilot To Create your version of AI Pilot And Charge People For It…
✅ZERO Manual Work With AI Pilot. Never write, Design, Or Code Again.
✅ZERO Limits On Features Or Usages
✅Use Our AI-powered Traffic To Get Hundreds Of Customers
✅No Complicated Setup: Get Up And Running In 2 Minutes
✅99.99% Up-Time Guaranteed
✅30 Days Money-Back Guarantee
✅ZERO Upfront Cost
See My Other Reviews Article:
(1) TubeTrivia AI Review: https://sumonreview.com/tubetrivia-ai-review
(2) SocioWave Review: https://sumonreview.com/sociowave-review
(3) AI Partner & Profit Review: https://sumonreview.com/ai-partner-profit-review
(4) AI Ebook Suite Review: https://sumonreview.com/ai-ebook-suite-review
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Crescat
Crescat is industry-trusted event management software, built by event professionals for event professionals. Founded in 2017, we have three key products tailored for the live event industry.
Crescat Event for concert promoters and event agencies. Crescat Venue for music venues, conference centers, wedding venues, concert halls and more. And Crescat Festival for festivals, conferences and complex events.
With a wide range of popular features such as event scheduling, shift management, volunteer and crew coordination, artist booking and much more, Crescat is designed for customisation and ease-of-use.
Over 125,000 events have been planned in Crescat and with hundreds of customers of all shapes and sizes, from boutique event agencies through to international concert promoters, Crescat is rigged for success. What's more, we highly value feedback from our users and we are constantly improving our software with updates, new features and improvements.
If you plan events, run a venue or produce festivals and you're looking for ways to make your life easier, then we have a solution for you. Try our software for free or schedule a no-obligation demo with one of our product specialists today at crescat.io
Graspan: A Big Data System for Big Code AnalysisAftab Hussain
We built a disk-based parallel graph system, Graspan, that uses a novel edge-pair centric computation model to compute dynamic transitive closures on very large program graphs.
We implement context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases such as Linux shows that their Graspan implementations scale to millions of lines of code and are much simpler than their original implementations.
These analyses were used to augment the existing checkers; these augmented checkers found 132 new NULL pointer bugs and 1308 unnecessary NULL tests in Linux 4.4.0-rc5, PostgreSQL 8.3.9, and Apache httpd 2.2.18.
- Accepted in ASPLOS ‘17, Xi’an, China.
- Featured in the tutorial, Systemized Program Analyses: A Big Data Perspective on Static Analysis Scalability, ASPLOS ‘17.
- Invited for presentation at SoCal PLS ‘16.
- Invited for poster presentation at PLDI SRC ‘16.
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeAftab Hussain
Understanding variable roles in code has been found to be helpful by students
in learning programming -- could variable roles help deep neural models in
performing coding tasks? We do an exploratory study.
- These are slides of the talk given at InteNSE'23: The 1st International Workshop on Interpretability and Robustness in Neural Software Engineering, co-located with the 45th International Conference on Software Engineering, ICSE 2023, Melbourne Australia
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppGoogle
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
👉👉 Click Here To Get More Info 👇👇
https://sumonreview.com/ai-fusion-buddy-review
AI Fusion Buddy Review: Key Features
✅Create Stunning AI App Suite Fully Powered By Google's Latest AI technology, Gemini
✅Use Gemini to Build high-converting Converting Sales Video Scripts, ad copies, Trending Articles, blogs, etc.100% unique!
✅Create Ultra-HD graphics with a single keyword or phrase that commands 10x eyeballs!
✅Fully automated AI articles bulk generation!
✅Auto-post or schedule stunning AI content across all your accounts at once—WordPress, Facebook, LinkedIn, Blogger, and more.
✅With one keyword or URL, generate complete websites, landing pages, and more…
✅Automatically create & sell AI content, graphics, websites, landing pages, & all that gets you paid non-stop 24*7.
✅Pre-built High-Converting 100+ website Templates and 2000+ graphic templates logos, banners, and thumbnail images in Trending Niches.
✅Say goodbye to wasting time logging into multiple Chat GPT & AI Apps once & for all!
✅Save over $5000 per year and kick out dependency on third parties completely!
✅Brand New App: Not available anywhere else!
✅ Beginner-friendly!
✅ZERO upfront cost or any extra expenses
✅Risk-Free: 30-Day Money-Back Guarantee!
✅Commercial License included!
See My Other Reviews Article:
(1) AI Genie Review: https://sumonreview.com/ai-genie-review
(2) SocioWave Review: https://sumonreview.com/sociowave-review
(3) AI Partner & Profit Review: https://sumonreview.com/ai-partner-profit-review
(4) AI Ebook Suite Review: https://sumonreview.com/ai-ebook-suite-review
#AIFusionBuddyReview,
#AIFusionBuddyFeatures,
#AIFusionBuddyPricing,
#AIFusionBuddyProsandCons,
#AIFusionBuddyTutorial,
#AIFusionBuddyUserExperience
#AIFusionBuddyforBeginners,
#AIFusionBuddyBenefits,
#AIFusionBuddyComparison,
#AIFusionBuddyInstallation,
#AIFusionBuddyRefundPolicy,
#AIFusionBuddyDemo,
#AIFusionBuddyMaintenanceFees,
#AIFusionBuddyNewbieFriendly,
#WhatIsAIFusionBuddy?,
#HowDoesAIFusionBuddyWorks
Utilocate offers a comprehensive solution for locate ticket management by automating and streamlining the entire process. By integrating with Geospatial Information Systems (GIS), it provides accurate mapping and visualization of utility locations, enhancing decision-making and reducing the risk of errors. The system's advanced data analytics tools help identify trends, predict potential issues, and optimize resource allocation, making the locate ticket management process smarter and more efficient. Additionally, automated ticket management ensures consistency and reduces human error, while real-time notifications keep all relevant personnel informed and ready to respond promptly.
The system's ability to streamline workflows and automate ticket routing significantly reduces the time taken to process each ticket, making the process faster and more efficient. Mobile access allows field technicians to update ticket information on the go, ensuring that the latest information is always available and accelerating the locate process. Overall, Utilocate not only enhances the efficiency and accuracy of locate ticket management but also improves safety by minimizing the risk of utility damage through precise and timely locates.
Zoom is a comprehensive platform designed to connect individuals and teams efficiently. With its user-friendly interface and powerful features, Zoom has become a go-to solution for virtual communication and collaboration. It offers a range of tools, including virtual meetings, team chat, VoIP phone systems, online whiteboards, and AI companions, to streamline workflows and enhance productivity.
E-commerce Application Development Company.pdfHornet Dynamics
Your business can reach new heights with our assistance as we design solutions that are specifically appropriate for your goals and vision. Our eCommerce application solutions can digitally coordinate all retail operations processes to meet the demands of the marketplace while maintaining business continuity.
Software Engineering, Software Consulting, Tech Lead, Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Transaction, Spring MVC, OpenShift Cloud Platform, Kafka, REST, SOAP, LLD & HLD.
Do you want Software for your Business? Visit Deuglo
Deuglo has top Software Developers in India. They are experts in software development and help design and create custom Software solutions.
Deuglo follows seven steps methods for delivering their services to their customers. They called it the Software development life cycle process (SDLC).
Requirement — Collecting the Requirements is the first Phase in the SSLC process.
Feasibility Study — after completing the requirement process they move to the design phase.
Design — in this phase, they start designing the software.
Coding — when designing is completed, the developers start coding for the software.
Testing — in this phase when the coding of the software is done the testing team will start testing.
Installation — after completion of testing, the application opens to the live server and launches!
Maintenance — after completing the software development, customers start using the software.
Navigating the Metaverse: A Journey into Virtual Evolution"Donna Lenk
Join us for an exploration of the Metaverse's evolution, where innovation meets imagination. Discover new dimensions of virtual events, engage with thought-provoking discussions, and witness the transformative power of digital realms."
2. Introduction
• Standard for representing text for most of
the world’s writing systems
• The most recent version is Unicode 6.0
• Widely adopted by most programming
platforms, operating systems and The Web
• The most widely used unicode encodings
are UTF-8 and UTF-16
3. Introduction to UTF-8
• UTF-8 (UCS Transformation Format - 8bit)
• Backwards compatible with ASCII
• Simple ASCII chars are represented by a
single byte
• Other characters can include up to 4
bytes but 31 bits in total spanning across
6 physical bytes
5. UTF-8 Encoding Rules
• Every ASCII character is also valid UTF-8 character
(up to 7 bits or 128 characters)
• For every other UTF-8 byte sequence the first byte
indicates the length of the sequence in bytes
• The rest of the bytes from the byte sequence have 10
as the two most significant bits
• This helps to easily find where a byte sequence
starts and ends
• There are more rules but this is a good start...
6. Interesting UTF-8
Characters
• UTF-8 also provides a lot of function characters such as
• Byte Order Mark (BOM) - 0xEF, 0xBB, 0xBF are placed at the start of the document to indicate UTF-8
• Left to Right Mark (LRM) - 0xE2, 0x80, 0x8E are placed to indicate text orientation
• In HTML - ‎ ‎ or ‎
• Right to Left Mark (RLM) - 0xE2, 0x80, 0x8F are placed to indicate text orientation
• In HTML - ‏ ‏ or ‏
• Left to Right Embedding (LRE) - 0xE2, 0x80, 0xAA
• In HTML - ‪
• Right to Left Embedding (RLE) - 0xE2, 0x80, 0xAB
• In HTML - ‫
• There are more...
7. Clarifications
• How exactly the hex sequence 0xE2, 0x80, 0x8E maps to
‎ in HTML?
• 0xE2, 0x80, 0x8E is UTF-8
• ‎ is 0x20, 0x0E in UTF-16
• also known as 0x0000200E in UTF-32
• There is no magic!You simply need to know which
encoding system you are working with and find out what
characters it supports.
• http://www.decodeunicode.org - is a good reference
8. Multiple
Representations
• The same character can be represented multiple ways
• For example
• . (DOT) is represented as 0x2E
• It is also the equivalent of 0xC0, 0xAE
• It is also the equivalent of 0xE0, 0x80, 0xAE
• It is also the equivalent of 0xF0, 0x80, 0x80, 0xAE
• It is also the equivalent of 0xF8, 0x80, 0x80, 0x80, 0xAE
• It is also the equivalent of 0xFC, 0x80, 0x80, 0x80, 0x80, 0xAE
10. Half and Full Width
Forms
• Graphic characters are traditionally classed as
halfwidth and fullwidth characters
• In a fixed width font a halfwidth character takes
the half of the width of a fullwidth character
• In Unicode you can find characters which are
presented in their halfwidth and fullwidth forms
• http://www.unicode.org/charts/PDF/UFF00.pdf -
for more information
11. Fullwidth Latin
Characters
• Halfwidth and Fullwidth notations make sense when
used for characters such as those found in the Japanese
and Chinese character sets
• The specifications also talk about latin characters
presented in their fullwidth forms
• As a result the following mappings are possible
• A - 0x41 (halfwidth) = A - 0xEF, 0xBC, 0xA1 (fullwidth)
• B - 0x42 (halfwidth) = B - 0xEF, 0xBC, 0xA2 (fullwidth)
• etc.
12. Security Considerations
• Visual Security Issues
• Internationalized names
• Left to Right and Right to Left representations
• Charset Translation Issues
• Occurs when strings are normalized before and after
translation between character sets
• Characters in multiple representation
• The same character can be represented in multiple ways
13. Case Study:Windows
Filename Mangling
• Consider the following files
• [RTLO]cod.stnemucodtnatropmi.exe
• [RTLO]cod.yrammusevituc[LTRO]n1c[LTRO].exe
• [RTLO]gpj.!nuf_stohsnee[LTRO]n1c[LTRO].scr
• Visually these files look different
• exe.importantdocuments.doc
• n1c.executivesummary.doc
• n1c.screenshots_fun!.jpg
14. Case Study:The
PAYPAL Scam
• What is the difference between paypal.com
and paypai.com or between intel.com and
lntel.com?
• How about citybank.com?
• 0000000: d181 6974 7962 616e 6b2e 636f 6d ..itybank.com
• 0xd1, 0x81 is the Cyrillic letter c which looks like the latin letter c
although they are very different
15. Case Study: Directory
Traversal
• Let’s say an application shows images by requesting /getimage.jsp?
name=image.jpg
• The attacker tries to retrieve an arbitrary file by requesting /
getimage.jsp?name=../../../../boot.ini
• Unfortunately the attack fails because the application checks
for the presence of ../ character sequence
• ../ is 0x2E, 0x2E, 0x5C in hex
• ../ is also 0x2E, 0xC0, 0xAE, 0x5C in overlong UTF-8
• Since 0x2E, 0xC0, 0xAE, 0x5C is not equal to 0x2E, 0x2E, 0x5C
the security check is bypassed and the file content retrieved