My talk at Voxxed Days Zurich 2016. This is about the history of character encodings and Unicode. And it's all about APIs stuck in the 90s, where things were very different.
8. What is a string?
• Computers are really good at dealing with
numbers
• Humans are really good at dealing with text
• Marketing demands that computers do text
• We need to translate between text and
numbers
9. When all you have is a hammer,
every problem looks like a nail
10. In the beginning
• Array of characters
• C says char*
• char is defined as the “smallest addressable
unit that can contain basic character set”.
Integer type. Might be signed or unsigned
• char ends up being a byte
• Assign some meaning to each byte value
11. Interacting with the
world
• Just dump the contents of the memory into
a file
• Read back the same contents and put it in
memory
• Problem solved.
• Until you need to do this across machines
12. Interoperability
• char and the mapping from number to
letter is inherently implementation
dependent
• So is by definition the file you dump your
char* into
• Can’t move files between machines
13. ASCII
• “American Standard Code for Information
Interchange”
• Published 1963
• Uses 7 bits per character (circumventing
the signedness-issue)
• Perfectly fine for what everybody is using
(English)
14. But I need ümläüte
• Machines were used where people speak strange
languages (i.e. not English)
• ASCII is 7bit.
• Most machines have 8-bit bytes.
• Using that additional bit gives us another 128
characters!
• Depending on your country, these upper 128 characters
had different meanings
• No problem as texts usually don’t leave their country
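A minimal Python sketch of why the upper half of the byte range became a problem: the same byte value reads as a different letter depending on which country's table is assumed. The two codecs used here (latin-1 and cp1251) are illustrative picks, not ones named in the talk.

```python
# One byte, two national code pages, two different letters.
raw = bytes([0xE4])

western = raw.decode("latin-1")   # ISO 8859-1, Western European
cyrillic = raw.decode("cp1251")   # Windows-1251, Cyrillic

print(repr(western), repr(cyrillic))

# Plain ASCII (values below 0x80) reads the same under both tables.
ascii_agrees = b"hello".decode("latin-1") == b"hello".decode("cp1251")
```

As long as a text stays within one country, both sides assume the same table and nobody notices.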
17. Thüs wäs nöt pюssїҌlҿ!
I apologize to all Russians for butchering their script.
18. Unicode 1.0
• 16 bits per character
• Published in 1991, revised in 1992
• Jumped on by everybody who wanted “to
do it right”
• APIs were made Unicode compliant by
extending the size of a character to 16 bits.
20. Still just dumping
memory
• wchar is 16 bits
• Endianness? See if we care!
• To save to a file: Dump memory contents.
• To load from a file: Read file into memory
• Note they didn’t dare extend char to 16
bits
• Let’s call this “Unicode”
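The endianness point can be sketched in Python, assuming the "memory dump" is just the raw 16-bit units in the writing machine's byte order. The variable names are made up for illustration.

```python
text = "hi"

# What a little-endian and a big-endian machine would each dump to disk.
dump_le = text.encode("utf-16-le")
dump_be = text.encode("utf-16-be")

# Reading the little-endian dump on a big-endian machine does not fail.
# It silently yields different characters.
misread = dump_le.decode("utf-16-be")

print(dump_le, dump_be, [hex(ord(ch)) for ch in misread])
```

Nothing in the file says which byte order was used, so the reader has to guess.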
21. 16 bits everywhere
• Windows API (XxxxXxxW uses wchar
which is 16 bits wide)
• Java uses 16 bits
• Objective C uses 16 bits
• And of course, JavaScript uses 16 bits
• C and by extension Unix stayed away from
this.
22. That’s perfect. By using
16 bit characters, we
can store all of Unicode!
25. It didn’t work out so
well
• By just dumping memory, there’s no way to
know how to read it back
• You have no clue whether you have just read
Unicode or old-style data.
• Remember: it’s just numbers.
• Heuristics suck (try typing “Bush hid the
facts” in Windows Notepad, saving,
reloading)
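A small Python sketch of why "it's just numbers" hurts. This does not reproduce the heuristic Notepad used; it only shows that the very same bytes decode without error both as ASCII and as 16-bit text, so a reader has nothing but guesswork to go on.

```python
data = b"Bush hid the facts"   # 18 bytes of plain ASCII

as_ascii = data.decode("ascii")
as_utf16 = data.decode("utf-16-le")   # also succeeds: 9 code units

print(as_ascii)
print(len(as_utf16), [hex(ord(ch)) for ch in as_utf16])
```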
27. We learned
• Unicode Transformation Format has
happened
• specifically UTF-8 happened
• Unicode 2.0 happened
• Programming environments learned
28. Unicode 2.0+
• Theoretically unlimited code space
• Doesn’t talk about bits any more
• The terminology is code point.
• Currently about 1.1M code points (U+0000 to U+10FFFF)
• The old characters (0000 - FFFF) are in
the BMP (Basic Multilingual Plane)
29. Unicode Transformation
Format
• Specifies how to store Unicode on disk
• Specifies exact byte encoding for every
Unicode code point
• Available with 8-, 16- and 32-bit code units
(UTF-8, UTF-16, UTF-32)
• Not every byte sequence is a valid UTF
byte sequence (finally!)
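The last bullet can be checked with a short Python sketch. The helper name `is_valid_utf8` is made up for this example; it simply relies on Python's strict UTF-8 decoder.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

checks = {
    "utf8_a_umlaut": is_valid_utf8(b"\xc3\xa4"),    # two-byte sequence
    "latin1_a_umlaut": is_valid_utf8(b"\xe4"),      # lead byte without continuation
    "stray_ff": is_valid_utf8(b"\xff"),             # 0xFF never occurs in UTF-8
    "overlong_slash": is_valid_utf8(b"\xc0\xaf"),   # overlong form is rejected
}
print(checks)
```

Because old-style 8-bit data usually fails this check, a reader can at least tell that something is not UTF-8.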
30. UTF-8
• Uses an 8bit encoding to store code points
• Is the same as ASCII for whatever’s in ASCII
• Uses multiple bytes to encode code points
outside of ASCII
• The old algorithms don’t work any more
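As a rough illustration of the bullets above, here is a hand-written UTF-8 encoder in Python. The function name `encode_utf8` is hypothetical and the code is a sketch for reading, not a replacement for the built-in codec.

```python
def encode_utf8(cp: int) -> bytes:
    """Encode one Unicode code point as UTF-8 (1 to 4 bytes)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable code point")
    if cp < 0x80:                      # ASCII: one byte, unchanged
        return bytes([cp])
    if cp < 0x800:                     # two bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:                   # three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),   # four bytes: 11110xxx and three 10xxxxxx
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])

for cp in (0x41, 0xE4, 0x20AC, 0x1F4A9):
    print(hex(cp), encode_utf8(cp).hex(), len(encode_utf8(cp)))
```

One character can now take between one and four bytes, which is exactly why byte-indexing algorithms stop working.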
31. UTF-16
• Combines the worst of both worlds
• Uses one 16-bit unit to encode a code point in the BMP
• Uses surrogate pairs to encode a code point
outside of the BMP
• Wastes memory for ASCII, has byte-ordering
issues and still breaks the old algorithms.
• Is the only way for these 16bit bandwagon
jumpers to support Unicode 2.0 and later
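The surrogate pair arithmetic can be sketched in a few lines of Python. The helper name `to_surrogate_pair` is made up for this example.

```python
def to_surrogate_pair(cp: int) -> tuple[int, int]:
    """Split a code point outside the BMP into a UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points outside the BMP need a pair")
    offset = cp - 0x10000              # 20 bits remain
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low

high, low = to_surrogate_pair(0x1F4A9)   # PILE OF POO
print(hex(high), hex(low))
```

Each half fits in 16 bits, but neither half means anything on its own.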
32. UTF-32
• 4 bytes per code point
• Byte ordering issues
• Still breaking the old algorithms due to
combining marks
34. It’s not a collection of
bytes
• A string is a sequence of graphemes
• Or a sequence of Unicode Code Points
• A byte array is a sequence of bytes
• Both are incompatible with each other
• You can encode a string into a byte array
• You can decode a byte array into a string
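Python 3 happens to model this split directly with `str` and `bytes`, so it makes a convenient sketch of the idea. The variable names are illustrative.

```python
text = "\u00e4"                 # a string: one code point
blob = text.encode("utf-8")     # a byte array: two bytes

round_trip = blob.decode("utf-8")

# Decoding with the wrong encoding does not fail, it produces mojibake.
mojibake = blob.decode("latin-1")

# Strings and byte arrays cannot be mixed without an explicit encoding.
try:
    text + blob
    mixing_fails = False
except TypeError:
    mixing_fails = True

print(blob, repr(round_trip), repr(mojibake), mixing_fails)
```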
36. Intermission: Combining
Marks
• ä (U+00E4) is not the same as ä (U+0061 U+0308)
• ä can be “LATIN SMALL LETTER A WITH
DIAERESIS”
• it can also be “LATIN SMALL LETTER A”
followed by “COMBINING DIAERESIS”
• both look exactly the same
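A short Python sketch of the two spellings. Normalization is not mentioned on the slide; it is added here as the usual way to make the two forms compare equal.

```python
import unicodedata

precomposed = "\u00e4"     # one code point
decomposed = "a\u0308"     # letter followed by a combining mark

names_pre = [unicodedata.name(ch) for ch in precomposed]
names_dec = [unicodedata.name(ch) for ch in decomposed]

same_raw = precomposed == decomposed
same_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
same_nfd = unicodedata.normalize("NFD", precomposed) == decomposed

print(names_pre, names_dec, same_raw, same_nfc, same_nfd)
```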
37. Counting Lengths
• You could count the length in graphemes
• Or the length in unicode code points
• Or the length of your binary blob once
your string has been encoded.
• Or something in between because you
were trigger-happy in the 90s
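The different answers to "how long is this string?" can be put side by side in Python. The helper `lengths` is hypothetical, and its grapheme count is only a rough approximation (it skips combining marks and ignores things like ZWJ emoji sequences); real grapheme segmentation needs a library such as ICU.

```python
import unicodedata

def lengths(s: str) -> dict[str, int]:
    """Count one string in several units."""
    return {
        "approx_graphemes": sum(1 for ch in s if unicodedata.combining(ch) == 0),
        "code_points": len(s),
        "utf16_units": len(s.encode("utf-16-le")) // 2,
        "utf8_bytes": len(s.encode("utf-8")),
    }

sample = "a\u0308\U0001F4A9"   # decomposed a-umlaut followed by PILE OF POO
result = lengths(sample)
print(result)
```

The "something in between" from the last bullet is the `utf16_units` figure.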
38. Which brings us back to
JS, Java and friends
• Live back in 1996
• Strings specified as being stored in UCS-2
(Fixed 16 bits per character)
• Leak their implementation in the API
• Don’t know about Unicode 2.0
39. Cheating abounds
• Applications still want to support Unicode
2.0
• We need to display these piles of poo!
• String APIs use UTF-16, encoding
characters outside of the BMP as surrogate
pairs
• String APIs often don’t know about UTF-16
41. String methods are leaky
• String.length returns a mish-mash of byte
length and code point length (really the number of
16-bit code units) for strings with characters
outside the BMP
• substr() can break strings
• charAt() can return non-existing code
points
• and let’s not talk about to*Case
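The leaks listed above can be imitated in Python by working on 16-bit units by hand. This is an emulation of the behavior, not JavaScript or Java code, and the helper names `utf16_units` and `units_to_str` are made up for the sketch.

```python
def utf16_units(s: str) -> list[int]:
    """Return the 16-bit code units a UTF-16 based string API would see."""
    raw = s.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]

def units_to_str(units: list[int]) -> str:
    """Turn 16-bit code units back into a string, strictly."""
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    return raw.decode("utf-16-le")

poo = "\U0001F4A9"
units = utf16_units(poo)

length_in_units = len(units)   # what a length counted in 16-bit units reports
first_unit = units[0]          # what an index into 16-bit units hands back

# Cutting after the first unit leaves a lone surrogate, which is not valid text.
try:
    units_to_str(units[:1])
    cut_is_valid = True
except UnicodeDecodeError:
    cut_is_valid = False

# Case mapping can change the length of a string.
upper_sharp_s = "\u00df".upper()

print(length_in_units, hex(first_unit), cut_is_valid, upper_sharp_s)
```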
45. Real-World example
«("💩")» is 5 graphemes long. I counted
6 underscores :-)
Also: Mixed Content Warning? What gives?
46. Some did it ok
• Python 3.3 (PEP 393)
• Ruby 1.9 (avoids political issues by giving a
lot of freedom)
• Perl (awesome libraries since forever)
• Swift after a very rough start
• ICU, ICU4C (http://icu-project.org/)
49. This was just the tip of
the iceberg!
• Localization issues (Collation, Case change)
• Security issues (Encoding, Homographs)
• Broken Software (including “US UTF-8”)
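A tiny Python sketch of the homograph issue mentioned above. The word used is an arbitrary example, and the `script_hints` helper is a crude stand-in that reads the first word of each character name; it is not a real script-property lookup.

```python
import unicodedata

latin = "paypal"
spoof = "p\u0430yp\u0430l"   # both letters "a" replaced by CYRILLIC SMALL LETTER A

def script_hints(s: str) -> set[str]:
    """Collect the first word of each character's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in s}

looks_equal_but_is_not = latin != spoof
hints_latin = script_hints(latin)
hints_spoof = script_hints(spoof)

print(looks_equal_but_is_not, hints_latin, hints_spoof)
```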