Üɳîḉỗḋę
ᨐЉⰖ닖ぼຢഐဩᚠඐ༃ꘐ
Character Encoding
Maps characters to numbers that can be
represented in binary form.
Character Encoding Humor
Terms
• Repertoire
o Full set of abstract characters that a system supports
• Coded Character Set
o Assigns code points (integers) to characters
• Character Encoding Form
o Maps code points to code values that can be represented in binary in
a limited number of bits
• Character Encoding Scheme
o Maps code values to octets
In the beginning, there was ASCII.
ISO-8859-1 (Latin 1)
ISO-8859-2 (Central Europe)
ISO-8859-6 (Arabic)
History of Unicode
Not very interesting.
Fun Facts About Unicode
• 1.1 million code points, of which over 110,000 are currently assigned.
• Codespace is divided into 17 planes, each with 2^16 (65,536) code points.
o Basic Multilingual Plane
▪ Almost all modern languages
▪ Most code points are CJK
o Supplementary Multilingual Plane
▪ Historic scripts, hieroglyphs, emoji, card suit symbols, etc.
o Supplementary Ideographic Plane
▪ CJK
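A minimal sketch of the plane arithmetic, assuming only that each plane holds 2^16 code points as stated above. The helper name plane_of is made up for illustration and is not part of any library.

```python
def plane_of(code_point: int) -> int:
    """Return the plane number (0-16) that a code point belongs to."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode codespace")
    # Each plane holds 2**16 code points, so the plane is the value above bit 16.
    return code_point >> 16

PLANE_COUNT = 17
TOTAL_CODE_POINTS = PLANE_COUNT * 2**16  # 1,114,112, the "1.1 million" above

print(plane_of(0x00E9))   # 0: Basic Multilingual Plane
print(plane_of(0x1F600))  # 1: Supplementary Multilingual Plane (an emoji)
print(plane_of(0x20000))  # 2: Supplementary Ideographic Plane
```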
Basic Multilingual Plane
Character Mapping
ج is U+062C (ARABIC LETTER JEEM)
http://en.wikibooks.org/wiki/Unicode/Character_reference
pâté
U+0070 U+00E2 U+0074 U+00E9
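The mapping can be checked with a few lines of Python, whose strings are sequences of code points. This is only an illustration; the helper name code_points is an assumption, and the word is written with escapes so the precomposed characters are guaranteed.

```python
import unicodedata

def code_points(text: str) -> list[str]:
    """Format every character of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

word = "p\u00e2t\u00e9"  # "pâté" with precomposed â and é
print(code_points(word))            # ['U+0070', 'U+00E2', 'U+0074', 'U+00E9']
print(unicodedata.name("\u062c"))   # ARABIC LETTER JEEM
```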
UTF-32
• Every code point is represented with 32 bits
• Direct representation of a code point.
U+0070 U+00E2 U+0074 U+00E9
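A quick check of the "direct representation" claim, sketched with Python's built-in utf-32-be codec. Big-endian order is chosen here only so the hex output reads the same as the code points.

```python
word = "p\u00e2t\u00e9"  # "pâté"

# Big-endian UTF-32 without a byte order mark: 4 octets per code point.
utf32 = word.encode("utf-32-be")

print(len(utf32))         # 16
print(utf32.hex(" ", 4))  # 00000070 000000e2 00000074 000000e9
```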
UTF-16
• Code points in BMP are mapped to single
16-bit values
• Code points from other planes use surrogate
pairs
U+0070 U+00E2 U+0074 U+00E9
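The surrogate pair rule can be sketched as below. The function name utf16_code_units is hypothetical, and the sketch does not reject invalid input such as lone surrogates.

```python
def utf16_code_units(code_point: int) -> list[int]:
    """Return the 16-bit code units that UTF-16 uses for one code point."""
    if code_point < 0x10000:
        # BMP code point: a single 16-bit value equal to the code point.
        return [code_point]
    # Other planes: subtract 0x10000, then split the 20-bit result in two.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return [high, low]

print([hex(u) for u in utf16_code_units(0x00E9)])   # ['0xe9']
print([hex(u) for u in utf16_code_units(0x1F600)])  # ['0xd83d', '0xde00']
```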
UTF-8
• Highly recommended and widely adopted for
internet use.
• Uses 1 octet for ASCII characters, between
2 and 4 octets for other code points.
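The octet counts are easy to confirm with the built-in utf-8 codec. The four sample characters are chosen here for illustration, one for each length.

```python
samples = ["p", "\u00e9", "\u20ac", "\U0001F600"]  # p, é, €, an emoji

# Map each code point to the number of octets its UTF-8 form takes.
lengths = {f"U+{ord(ch):04X}": len(ch.encode("utf-8")) for ch in samples}

print(lengths)  # {'U+0070': 1, 'U+00E9': 2, 'U+20AC': 3, 'U+1F600': 4}
```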
UTF-8 and ISO-8859-1
http://www.unicode.org/Public/MAPPINGS/ISO8859/8859-1.TXT
UTF-8 and ISO-8859-1 (cont.)
Characters in the Latin 1 Supplement (the upper half of
ISO-8859-1) are encoded as 2 octets in UTF-8.
When UTF-8 data is interpreted as ISO-8859-1,
a Latin 1 Supplement character will appear as
Â or Ã followed by another character.
Here is my résumé (UTF-8) becomes Here is my rÃ©sumÃ© (read as ISO-8859-1)
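The effect is reproducible in a few lines: encode as UTF-8, then decode the same bytes as ISO-8859-1. This is a sketch of the failure mode only; the last step shows that the damage is reversible as long as the misread text has not been altered.

```python
original = "Here is my r\u00e9sum\u00e9"  # "Here is my résumé"

utf8_bytes = original.encode("utf-8")
# Wrong assumption on the receiving side: treat the bytes as ISO-8859-1.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # Here is my rÃ©sumÃ©

# Undo the mistake by reversing the wrong decode, then decoding correctly.
repaired = misread.encode("iso-8859-1").decode("utf-8")
print(repaired == original)  # True
```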
Encoding UTF-8
1. The Unicode code point for "€" is U+20AC.
2. According to the scheme table above, this will take three bytes to encode, since it is between U+0800 and U+FFFF.
3. Hexadecimal 20AC is binary 0010000010101100. The two leading zeros are added because, as the scheme table shows,
a three-byte encoding needs exactly sixteen bits from the code point.
4. Because it is a three-byte encoding, the leading byte starts with three 1s, then a 0 (1110...)
5. The remaining bits of this byte are taken from the code point (11100010), leaving ...000010101100.
6. Each of the continuation bytes starts with 10 and takes six bits of the code point (so 10000010, then 10101100).
The three bytes 11100010 10000010 10101100 can be more concisely written in hexadecimal, as E2 82 AC.
https://github.com/dhumbert/Unicode/blob/master/utf8.c
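The steps above can be sketched in Python as follows. This is an independent illustration, not a port of the linked utf8.c, and the function name utf8_encode is an assumption.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode a single code point as UTF-8 using bit operations."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if code_point <= 0x7F:
        # 1 octet: 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:
        # 2 octets: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:
        # 3 octets: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 4 octets: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])

print(utf8_encode(0x20AC).hex(" "))  # e2 82 ac
```

The result for U+20AC matches the E2 82 AC worked out by hand, and agrees with Python's own codec for the sample code points used in these slides.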
UTF-8 URL Encoding
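A small sketch of percent-encoding, assuming this slide refers to the usual practice of encoding non-ASCII characters as UTF-8 octets and writing each octet as %XX. It uses urllib.parse.quote and unquote from the Python standard library, which default to UTF-8.

```python
from urllib.parse import quote, unquote

word = "p\u00e2t\u00e9"  # "pâté"

# Each non-ASCII character becomes its UTF-8 octets, written as %XX.
encoded = quote(word)
print(encoded)                   # p%C3%A2t%C3%A9
print(unquote(encoded) == word)  # True
```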
Byte Order Mark
• U+FEFF
• Little endian vs. big endian
• Commonly used for UTF-16 and UTF-32
• Unnecessary in UTF-8
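The byte order mark is just U+FEFF encoded like any other code point, which is why it reveals the byte order. A short sketch using the codecs constants from the Python standard library:

```python
import codecs

bom = "\ufeff"

be = bom.encode("utf-16-be")
le = bom.encode("utf-16-le")
print(be.hex(), le.hex())  # feff fffe

# The standard library exposes the same byte sequences as constants.
print(be == codecs.BOM_UTF16_BE, le == codecs.BOM_UTF16_LE)  # True True

# In UTF-8 the octet order is fixed, so the mark carries no ordering information.
# Python only writes it when asked to, via the "utf-8-sig" codec.
print("A".encode("utf-8"))      # b'A'
print("A".encode("utf-8-sig"))  # b'\xef\xbb\xbfA'
```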
Heuristics for Detecting Unicode
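Only the title of this slide survives, so the heuristics it showed are unknown. As one possible illustration, and purely an assumption, a detector can look for a byte order mark first and then test whether the data is valid UTF-8. The function name guess_encoding is hypothetical.

```python
import codecs
from typing import Optional

# UTF-32 marks are checked before UTF-16 because FF FE 00 00 starts with FF FE.
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

def guess_encoding(data: bytes) -> Optional[str]:
    """Guess a Unicode encoding, or return None if nothing fits."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    try:
        # Text in a single-byte encoding rarely forms valid UTF-8 by accident.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return "utf-8"

print(guess_encoding("r\u00e9sum\u00e9".encode("utf-8")))       # utf-8
print(guess_encoding("r\u00e9sum\u00e9".encode("iso-8859-1")))  # None
```

A guess like this can still be wrong: pure ASCII data is valid in many encodings, and BOM-less UTF-16 is not recognised at all.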
Collation
Unicode collation algorithm
http://www.unicode.org/reports/tr10/
Criticisms of Unicode
Too complex
Criticisms of Unicode
Inefficient compared to
single-byte encodings
Criticisms of Unicode
Klingon script not present

Editor's Notes

  1. Started in 1987 by engineers from Xerox and Apple.
  2. Unicode is a superset of ISO-8859-1. Unicode code points of the Latin 1 Supplement are the same as the byte values in ISO-8859-1. However, these characters require 2 bytes to encode in UTF-8.
  3. The reason is that the first octet of the encoded form is 11000010 or 11000011 in binary, C2 or C3 in hexadecimal, which means Â or Ã in ISO-8859-1. The second octet has "10" as its first 2 bits, so it falls in the range 80 to BF and is interpreted as a C1 control code or a Latin 1 Supplement character.
  4. Browser encodes into UTF-8
  5. UTF-8 byte order is always the same
  6. However, use of a BOM, especially for UTF-8, is discouraged by the Unicode Consortium.
  7. But human script is complex. For example, is a letter with a diacritic one character or two?
  8. Rejected in 2001 by representatives of Adobe, Apple, IBM, Microsoft, and Sun. Boycott appropriately