SlideShare a Scribd company logo
Unicode
• A standard character encoding designed to
support all of the world's languages
• Unicode represents characters differently than
ASCII
• Characters are mapped to a code point
A 65
Code Point
1000001
UTF-32
UTF-16
UTF-8
UTF-32
• Uses 4 bytes (32 bits)
• Example:
– A (100 0001)
0 1 0 0 0 0 0 1
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
UTF-32
• Problem:
1 KB
in
ASCII
4 KB
in
UTF-32
UTF-16
• Stores each char in either 16-bit or two 16-bit
0 0 …. 0 0
0 0 …. 0 0 0 0 …. 0 0
16 bits
16 bits16 bits
UTF-16
• Problem:
1 KB
in
ASCII
2 KB
in
UTF-16
UTF-8
• It supports every language you’ll probably ever need.
• No need for Windows-1252 this and Windows-1253 that.
• Its code point range is from 0x00 to 0x10FFFF
• It uses a variable (1 to 4) byte encoding.
UTF-8 (1-byte)
• 1-byte UTF-8 is used for code points in the range 0x00 to 0x7F.
• 1-byte UTF-8  ASCII
MSBit is 0
code point  representation
• Examples of 1-byte UTF-8:
– “A” -> 0100 0001
– “&” -> 0010 0110
– “5” -> 0011 01010 X X X X X X X
UTF-8 (2-byte)
• 2-byte UTF-8
code point != representation
• The code point is broken apart into two pieces.
• The five MSBits of the code point are assigned to the first byte
and the six LSBits are assigned to the second byte.
UTF-8 (2-byte)
For the first byte of 2-byte UTF-8:
• The three MSBits are set to 110
• The remaining bits are the five MSBits of the code point.
For the second byte of 2-byte UTF-8
• The two MSBits are set to 10
• The remaining bits are the six LSBits of the code point.
UTF-8 (2-byte)
1 1 0 X X X X X
1 0 X X X X X X
Leading Byte
Continuation Byte
UTF-8 (3-byte)
• 3-byte UTF-8 is used for code points in the range 0x0800 to
0xFFFF.
• 3-byte UTF-8
code point != representation
• The code point is broken apart into three pieces.
UTF-8 (3-byte)
• The four MSBits of the code point are assigned to the first
byte.
• The middle six bits are assigned to the second byte.
• The six LSBits are assigned to the third byte.
UTF-8 (3-byte)
For the first byte of 3-byte UTF-8
• The four MSBits are set to 1110
• The remaining bits are the four MSBits of the code point.
For the second byte of 3-byte UTF-8
• The two MSBits are set to 10
• The remaining bits are the six middle bits of the code point.
UTF-8 (3-byte)
For the third byte of 3-byte UTF-8
• The two MSBits are set to 10
• The remaining bits are the six LSBits of the code point.
UTF-8 (3-byte)
1 1 1 0 X X X X
1 0 X X X X X X
Leading Byte
Continuation Byte
1 0 X X X X X X
Continuation Byte
UTF-8 (4-byte)
• 4-byte UTF-8 is used for code points in the range 0x10000 to
0x10FFFF.
• 4-byte UTF-8
code point != representation
• The code point is broken apart into four pieces.
UTF-8 (4-byte)
• The three MSBits of the code point are assigned to the first
byte.
• The next six MSBits are assigned to the second byte.
• Another of the next six MSBits are assigned to the third byte.
• The six LSBits are assigned to the fourth byte.
UTF-8 (4-byte)
For the first byte of 4-byte UTF-8
• The five MSBits are set to 11110
• The remaining bits are the three MSBits of the code point.
For the second byte of 4-byte UTF-8
• The two MSBits are set to 10
• The remaining bits are the next six middle bits of the code point.
UTF-8 (4-byte)
For the third byte of 4-byte UTF-8
• The two MSBits are set to 10
• The remaining bits are the next six middle bits of the code point.
For the fourth byte of 4-byte UTF-8
• The two MSBits are set to 10
• The remaining bits are the six LSBits of the code point.
Examoles
10011100101001

More Related Content

What's hot

IP address
IP addressIP address
IP address
luckychauhan33
 
IP classes
IP classesIP classes
IP classes
Meenakshi Paul
 
IP Configuration
IP ConfigurationIP Configuration
IP ConfigurationStephen Raj
 
Wireshark course, Ch 03: Capture and display filters
Wireshark course, Ch 03: Capture and display filtersWireshark course, Ch 03: Capture and display filters
Wireshark course, Ch 03: Capture and display filters
Yoram Orzach
 
ONOS: Open Network Operating System. An Open-Source Distributed SDN Operating...
ONOS: Open Network Operating System. An Open-Source Distributed SDN Operating...ONOS: Open Network Operating System. An Open-Source Distributed SDN Operating...
ONOS: Open Network Operating System. An Open-Source Distributed SDN Operating...
ON.LAB
 
Linux : Booting and runlevels
Linux : Booting and runlevelsLinux : Booting and runlevels
Linux : Booting and runlevels
John Ombagi
 
클라우드 환경을 위한 네트워크 가상화와 NSX(기초편)
클라우드 환경을 위한 네트워크 가상화와 NSX(기초편)클라우드 환경을 위한 네트워크 가상화와 NSX(기초편)
클라우드 환경을 위한 네트워크 가상화와 NSX(기초편)
Laehyoung Kim
 
Iptables Configuration
Iptables ConfigurationIptables Configuration
Iptables Configuration
stom123
 
Job Automation using Linux
Job Automation using LinuxJob Automation using Linux
Job Automation using Linux
Jishnu Pradeep
 
Introdução ao Software Livre
Introdução ao Software LivreIntrodução ao Software Livre
Introdução ao Software Livre
PeslPinguim
 
Ip addressing
Ip addressingIp addressing
Ip addressing
Online
 
Ipv4 and Ipv6
Ipv4 and Ipv6Ipv4 and Ipv6
Ipv4 and Ipv6
Rishav Bhurtel
 
Apache web service
Apache web serviceApache web service
Apache web service
Manash Kumar Mondal
 
Basic Cisco 800 Router Configuration for Internet Access
Basic Cisco 800 Router Configuration for Internet AccessBasic Cisco 800 Router Configuration for Internet Access
Basic Cisco 800 Router Configuration for Internet Access
Harris Andrea
 
Introdução ao Linux - Aula 01
Introdução ao Linux - Aula 01Introdução ao Linux - Aula 01
Introdução ao Linux - Aula 01Ivaldo Cardoso
 
IP Address - IPv4 & IPv6
IP Address - IPv4 & IPv6IP Address - IPv4 & IPv6
IP Address - IPv4 & IPv6
Adeel Rasheed
 
Share File easily between computers using sftp
Share File easily between computers using sftpShare File easily between computers using sftp
Share File easily between computers using sftp
Tushar B Kute
 
Internet Protocols
Internet ProtocolsInternet Protocols
Internet Protocols
Anil Neupane
 
mobile_我家已經有一臺有線路由器,請問WF2412要如何設定才能接在這原有的路由器下正常運作?
mobile_我家已經有一臺有線路由器,請問WF2412要如何設定才能接在這原有的路由器下正常運作?mobile_我家已經有一臺有線路由器,請問WF2412要如何設定才能接在這原有的路由器下正常運作?
mobile_我家已經有一臺有線路由器,請問WF2412要如何設定才能接在這原有的路由器下正常運作?臺灣塔米歐
 

What's hot (20)

Ip address
Ip addressIp address
Ip address
 
IP address
IP addressIP address
IP address
 
IP classes
IP classesIP classes
IP classes
 
IP Configuration
IP ConfigurationIP Configuration
IP Configuration
 
Wireshark course, Ch 03: Capture and display filters
Wireshark course, Ch 03: Capture and display filtersWireshark course, Ch 03: Capture and display filters
Wireshark course, Ch 03: Capture and display filters
 
ONOS: Open Network Operating System. An Open-Source Distributed SDN Operating...
ONOS: Open Network Operating System. An Open-Source Distributed SDN Operating...ONOS: Open Network Operating System. An Open-Source Distributed SDN Operating...
ONOS: Open Network Operating System. An Open-Source Distributed SDN Operating...
 
Linux : Booting and runlevels
Linux : Booting and runlevelsLinux : Booting and runlevels
Linux : Booting and runlevels
 
클라우드 환경을 위한 네트워크 가상화와 NSX(기초편)
클라우드 환경을 위한 네트워크 가상화와 NSX(기초편)클라우드 환경을 위한 네트워크 가상화와 NSX(기초편)
클라우드 환경을 위한 네트워크 가상화와 NSX(기초편)
 
Iptables Configuration
Iptables ConfigurationIptables Configuration
Iptables Configuration
 
Job Automation using Linux
Job Automation using LinuxJob Automation using Linux
Job Automation using Linux
 
Introdução ao Software Livre
Introdução ao Software LivreIntrodução ao Software Livre
Introdução ao Software Livre
 
Ip addressing
Ip addressingIp addressing
Ip addressing
 
Ipv4 and Ipv6
Ipv4 and Ipv6Ipv4 and Ipv6
Ipv4 and Ipv6
 
Apache web service
Apache web serviceApache web service
Apache web service
 
Basic Cisco 800 Router Configuration for Internet Access
Basic Cisco 800 Router Configuration for Internet AccessBasic Cisco 800 Router Configuration for Internet Access
Basic Cisco 800 Router Configuration for Internet Access
 
Introdução ao Linux - Aula 01
Introdução ao Linux - Aula 01Introdução ao Linux - Aula 01
Introdução ao Linux - Aula 01
 
IP Address - IPv4 & IPv6
IP Address - IPv4 & IPv6IP Address - IPv4 & IPv6
IP Address - IPv4 & IPv6
 
Share File easily between computers using sftp
Share File easily between computers using sftpShare File easily between computers using sftp
Share File easily between computers using sftp
 
Internet Protocols
Internet ProtocolsInternet Protocols
Internet Protocols
 
mobile_我家已經有一臺有線路由器,請問WF2412要如何設定才能接在這原有的路由器下正常運作?
mobile_我家已經有一臺有線路由器,請問WF2412要如何設定才能接在這原有的路由器下正常運作?mobile_我家已經有一臺有線路由器,請問WF2412要如何設定才能接在這原有的路由器下正常運作?
mobile_我家已經有一臺有線路由器,請問WF2412要如何設定才能接在這原有的路由器下正常運作?
 

Similar to 4 character encoding-unicode

Unicode Encoding Forms
Unicode Encoding FormsUnicode Encoding Forms
Unicode Encoding Forms
Mehdi Hasan
 
Data Communication & Computer Networks : Data Types
Data Communication & Computer Networks : Data TypesData Communication & Computer Networks : Data Types
Data Communication & Computer Networks : Data Types
Dr Rajiv Srivastava
 
004 NUMBER SYSTEM (1).pdf
004 NUMBER SYSTEM (1).pdf004 NUMBER SYSTEM (1).pdf
004 NUMBER SYSTEM (1).pdf
MaheShiva
 
expect("").length.toBe(1)
expect("").length.toBe(1)expect("").length.toBe(1)
expect("").length.toBe(1)
Philip Hofstetter
 
CCS103 Bits, Bytes, Binary
CCS103 Bits, Bytes, BinaryCCS103 Bits, Bytes, Binary
CCS103 Bits, Bytes, Binary
Richard Homa
 
Unicode - Hacking The International Character System
Unicode - Hacking The International Character SystemUnicode - Hacking The International Character System
Unicode - Hacking The International Character System
Websecurify
 
CSC103 Bits, Bytes & Binary
CSC103 Bits, Bytes & BinaryCSC103 Bits, Bytes & Binary
CSC103 Bits, Bytes & Binary
Richard Homa
 
HTTP 완벽가이드 16장
HTTP 완벽가이드 16장HTTP 완벽가이드 16장
HTTP 완벽가이드 16장
HyeonSeok Choi
 
What character is that
What character is thatWhat character is that
What character is that
Anders Karlsson
 
Numbersystemcont
NumbersystemcontNumbersystemcont
Numbersystemcont
Sajib
 
Lecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.pptLecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.ppt
Alula Tafere
 
Pentium Real Mode.ppt
Pentium Real Mode.pptPentium Real Mode.ppt
Pentium Real Mode.ppt
PramodBorole2
 
UTF-8: The Secret of Character Encoding
UTF-8: The Secret of Character EncodingUTF-8: The Secret of Character Encoding
UTF-8: The Secret of Character EncodingBert Pattyn
 
Character encoding and unicode format
Character encoding and unicode formatCharacter encoding and unicode format
Character encoding and unicode format
AdityaSharma1452
 
Topic 1 Data Representation
Topic 1 Data RepresentationTopic 1 Data Representation
Topic 1 Data Representationekul
 
Topic 1 Data Representation
Topic 1 Data RepresentationTopic 1 Data Representation
Topic 1 Data Representation
Kyle
 

Similar to 4 character encoding-unicode (20)

Unicode Encoding Forms
Unicode Encoding FormsUnicode Encoding Forms
Unicode Encoding Forms
 
Data Communication & Computer Networks : Data Types
Data Communication & Computer Networks : Data TypesData Communication & Computer Networks : Data Types
Data Communication & Computer Networks : Data Types
 
004 NUMBER SYSTEM (1).pdf
004 NUMBER SYSTEM (1).pdf004 NUMBER SYSTEM (1).pdf
004 NUMBER SYSTEM (1).pdf
 
expect("").length.toBe(1)
expect("").length.toBe(1)expect("").length.toBe(1)
expect("").length.toBe(1)
 
CCS103 Bits, Bytes, Binary
CCS103 Bits, Bytes, BinaryCCS103 Bits, Bytes, Binary
CCS103 Bits, Bytes, Binary
 
Unicode - Hacking The International Character System
Unicode - Hacking The International Character SystemUnicode - Hacking The International Character System
Unicode - Hacking The International Character System
 
Unicode
UnicodeUnicode
Unicode
 
CSC103 Bits, Bytes & Binary
CSC103 Bits, Bytes & BinaryCSC103 Bits, Bytes & Binary
CSC103 Bits, Bytes & Binary
 
HTTP 완벽가이드 16장
HTTP 완벽가이드 16장HTTP 완벽가이드 16장
HTTP 완벽가이드 16장
 
Cache mapping
Cache mappingCache mapping
Cache mapping
 
What character is that
What character is thatWhat character is that
What character is that
 
Numbersystemcont
NumbersystemcontNumbersystemcont
Numbersystemcont
 
lec2.pdf
lec2.pdflec2.pdf
lec2.pdf
 
Lecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.pptLecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.ppt
 
Pentium Real Mode.ppt
Pentium Real Mode.pptPentium Real Mode.ppt
Pentium Real Mode.ppt
 
Ip v4
Ip v4Ip v4
Ip v4
 
UTF-8: The Secret of Character Encoding
UTF-8: The Secret of Character EncodingUTF-8: The Secret of Character Encoding
UTF-8: The Secret of Character Encoding
 
Character encoding and unicode format
Character encoding and unicode formatCharacter encoding and unicode format
Character encoding and unicode format
 
Topic 1 Data Representation
Topic 1 Data RepresentationTopic 1 Data Representation
Topic 1 Data Representation
 
Topic 1 Data Representation
Topic 1 Data RepresentationTopic 1 Data Representation
Topic 1 Data Representation
 

More from irdginfo

Quicksort Presentation
Quicksort PresentationQuicksort Presentation
Quicksort Presentation
irdginfo
 
10 merge sort
10 merge sort10 merge sort
10 merge sortirdginfo
 
9 big o-notation
9 big o-notation9 big o-notation
9 big o-notationirdginfo
 
8 elementary sorts-bubble
8 elementary sorts-bubble8 elementary sorts-bubble
8 elementary sorts-bubbleirdginfo
 
8 elementary sorts-shell
8 elementary sorts-shell8 elementary sorts-shell
8 elementary sorts-shellirdginfo
 
8 elementary sorts-insertion
8 elementary sorts-insertion8 elementary sorts-insertion
8 elementary sorts-insertionirdginfo
 
8 elementary sorts-selection
8 elementary sorts-selection8 elementary sorts-selection
8 elementary sorts-selectionirdginfo
 
7 searching injava-binary
7 searching injava-binary7 searching injava-binary
7 searching injava-binaryirdginfo
 
6 arrays injava
6 arrays injava6 arrays injava
6 arrays injavairdginfo
 
5 data structures-hashtable
5 data structures-hashtable5 data structures-hashtable
5 data structures-hashtableirdginfo
 
5 data structures-tree
5 data structures-tree5 data structures-tree
5 data structures-treeirdginfo
 
5 data structures-stack
5 data structures-stack5 data structures-stack
5 data structures-stackirdginfo
 
5 data structures-arraysandlinkedlist
5 data structures-arraysandlinkedlist5 data structures-arraysandlinkedlist
5 data structures-arraysandlinkedlistirdginfo
 
4 character encoding-ascii
4 character encoding-ascii4 character encoding-ascii
4 character encoding-asciiirdginfo
 
4 character encoding
4 character encoding4 character encoding
4 character encodingirdginfo
 
3 number systems-floatingpoint
3 number systems-floatingpoint3 number systems-floatingpoint
3 number systems-floatingpointirdginfo
 
2 number systems-scientificnotation
2 number systems-scientificnotation2 number systems-scientificnotation
2 number systems-scientificnotationirdginfo
 
1 number systems-hex
1 number systems-hex1 number systems-hex
1 number systems-hexirdginfo
 
1 number systems-unsignedsignedintegers
1 number systems-unsignedsignedintegers1 number systems-unsignedsignedintegers
1 number systems-unsignedsignedintegersirdginfo
 
1 number systems-octal
1 number systems-octal1 number systems-octal
1 number systems-octalirdginfo
 

More from irdginfo (20)

Quicksort Presentation
Quicksort PresentationQuicksort Presentation
Quicksort Presentation
 
10 merge sort
10 merge sort10 merge sort
10 merge sort
 
9 big o-notation
9 big o-notation9 big o-notation
9 big o-notation
 
8 elementary sorts-bubble
8 elementary sorts-bubble8 elementary sorts-bubble
8 elementary sorts-bubble
 
8 elementary sorts-shell
8 elementary sorts-shell8 elementary sorts-shell
8 elementary sorts-shell
 
8 elementary sorts-insertion
8 elementary sorts-insertion8 elementary sorts-insertion
8 elementary sorts-insertion
 
8 elementary sorts-selection
8 elementary sorts-selection8 elementary sorts-selection
8 elementary sorts-selection
 
7 searching injava-binary
7 searching injava-binary7 searching injava-binary
7 searching injava-binary
 
6 arrays injava
6 arrays injava6 arrays injava
6 arrays injava
 
5 data structures-hashtable
5 data structures-hashtable5 data structures-hashtable
5 data structures-hashtable
 
5 data structures-tree
5 data structures-tree5 data structures-tree
5 data structures-tree
 
5 data structures-stack
5 data structures-stack5 data structures-stack
5 data structures-stack
 
5 data structures-arraysandlinkedlist
5 data structures-arraysandlinkedlist5 data structures-arraysandlinkedlist
5 data structures-arraysandlinkedlist
 
4 character encoding-ascii
4 character encoding-ascii4 character encoding-ascii
4 character encoding-ascii
 
4 character encoding
4 character encoding4 character encoding
4 character encoding
 
3 number systems-floatingpoint
3 number systems-floatingpoint3 number systems-floatingpoint
3 number systems-floatingpoint
 
2 number systems-scientificnotation
2 number systems-scientificnotation2 number systems-scientificnotation
2 number systems-scientificnotation
 
1 number systems-hex
1 number systems-hex1 number systems-hex
1 number systems-hex
 
1 number systems-unsignedsignedintegers
1 number systems-unsignedsignedintegers1 number systems-unsignedsignedintegers
1 number systems-unsignedsignedintegers
 
1 number systems-octal
1 number systems-octal1 number systems-octal
1 number systems-octal
 

Recently uploaded

CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
camakaiclarkmusic
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
BhavyaRajput3
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
Thiyagu K
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
Sandy Millin
 
Introduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp NetworkIntroduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp Network
TechSoup
 
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdfAdversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Po-Chuan Chen
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
Jean Carlos Nunes Paixão
 
Polish students' mobility in the Czech Republic
Polish students' mobility in the Czech RepublicPolish students' mobility in the Czech Republic
Polish students' mobility in the Czech Republic
Anna Sz.
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
Nguyen Thanh Tu Collection
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
JosvitaDsouza2
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
timhan337
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
Delapenabediema
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
DeeptiGupta154
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
Special education needs
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
heathfieldcps1
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
Celine George
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 

Recently uploaded (20)

CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
 
Unit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdfUnit 8 - Information and Communication Technology (Paper I).pdf
Unit 8 - Information and Communication Technology (Paper I).pdf
 
2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...2024.06.01 Introducing a competency framework for languag learning materials ...
2024.06.01 Introducing a competency framework for languag learning materials ...
 
Introduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp NetworkIntroduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp Network
 
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdfAdversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 
Polish students' mobility in the Czech Republic
Polish students' mobility in the Czech RepublicPolish students' mobility in the Czech Republic
Polish students' mobility in the Czech Republic
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 

4 character encoding-unicode

  • 1. Unicode • A standard character encoding designed to support all of the world's languages • Unicode represents characters differently than ASCII • Characters are mapped to a code point A 65 Code Point 1000001 UTF-32 UTF-16 UTF-8
  • 2. UTF-32 • Uses 4 bytes (32 bits) • Example: – A (100 0001) 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  • 4. UTF-16 • Stores each char in either 16-bit or two 16-bit 0 0 …. 0 0 0 0 …. 0 0 0 0 …. 0 0 16 bits 16 bits16 bits
  • 6. UTF-8 • It supports every language you’ll probably ever need. • No need for Windows-1252 this and Windows-1253 that. • Its code point range is from 0x00 to 0x10FFFF • It uses a variable (1 to 4) byte encoding.
  • 7. UTF-8 (1-byte) • 1-byte UTF-8 is used for code points in the range 0x00 to 0x7F. • 1-byte UTF-8  ASCII MSBit is 0 code point  representation • Examples of 1-byte UTF-8: – “A” -> 0100 0001 – “&” -> 0010 0110 – “5” -> 0011 01010 X X X X X X X
  • 8. UTF-8 (2-byte) • 2-byte UTF-8 code point != representation • The code point is broken apart into two pieces. • The five MSBits of the code point are assigned to the first byte and the six LSBits are assigned to the second byte.
  • 9. UTF-8 (2-byte) For the first byte of 2-byte UTF-8: • The three MSBits are set to 110 • The remaining bits are the five MSBits of the code point. For the second byte of 2-byte UTF-8 • The two MSBits are set to 10 • The remaining bits are the six LSBits of the code point.
  • 10. UTF-8 (2-byte) 1 1 0 X X X X X 1 0 X X X X X X Leading Byte Continuation Byte
  • 11. UTF-8 (3-byte) • 3-byte UTF-8 is used for code points in the range 0x0800 to 0xFFFF. • 3-byte UTF-8 code point != representation • The code point is broken apart into three pieces.
  • 12. UTF-8 (3-byte) • The four MSBits of the code point are assigned to the first byte. • The middle six bits are assigned to the second byte. • The six LSBits are assigned to the third byte.
  • 13. UTF-8 (3-byte) For the first byte of 3-byte UTF-8 • The four MSBits are set to 1110 • The remaining bits are the four MSBits of the code point. For the second byte of 3-byte UTF-8 • The two MSBits are set to 10 • The remaining bits are the six middle bits of the code point.
  • 14. UTF-8 (3-byte) For the third byte of 3-byte UTF-8 • The two MSBits are set to 10 • The remaining bits are the six LSBits of the code point.
  • 15. UTF-8 (3-byte) 1 1 1 0 X X X X 1 0 X X X X X X Leading Byte Continuation Byte 1 0 X X X X X X Continuation Byte
  • 16. UTF-8 (4-byte) • 4-byte UTF-8 is used for code points in the range 0x10000 to 0x10FFFF. • 4-byte UTF-8 code point != representation • The code point is broken apart into four pieces.
  • 17. UTF-8 (4-byte) • The three MSBits of the code point are assigned to the first byte. • The next six MSBits are assigned to the second byte. • Another of the next six MSBits are assigned to the third byte. • The six LSBits are assigned to the fourth byte.
  • 18. UTF-8 (4-byte) For the first byte of 4-byte UTF-8 • The five MSBits are set to 11110 • The remaining bits are the three MSBits of the code point. For the second byte of 4-byte UTF-8 • The two MSBits are set to 10 • The remaining bits are the next six middle bits of the code point.
  • 19. UTF-8 (4-byte) For the third byte of 4-byte UTF-8 • The two MSBits are set to 10 • The remaining bits are the next six middle bits of the code point. For the fourth byte of 4-byte UTF-8 • The two MSBits are set to 10 • The remaining bits are the six LSBits of the code point.