SlideShare a Scribd company logo
1 of 20
Unicode
• A standard character encoding designed to
support all of the world's languages
• Unicode represents characters differently than
ASCII
• Characters are mapped to a code point
A 65
Code Point
1000001
UTF-32
UTF-16
UTF-8
UTF-32
• Uses 4 bytes (32 bits)
• Example:
– A (100 0001)
0 1 0 0 0 0 0 1
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
UTF-32
• Problem:
1 KB
in
ASCII
4 KB
in
UTF-32
UTF-16
• Stores each char in either 16-bit or two 16-bit
0 0 …. 0 0
0 0 …. 0 0 0 0 …. 0 0
16 bits
16 bits16 bits
UTF-16
• Problem:
1 KB
in
ASCII
2 KB
in
UTF-16
UTF-8
• It supports every language you’ll probably ever need.
• No need for Windows-1252 this and Windows-1253 that.
• Its code point range is from 0x00 to 0x10FFFF
• It uses a variable (1 to 4) byte encoding.
UTF-8 (1-byte)
• 1-byte UTF-8 is used for code points in the range 0x00 to 0x7F.
• 1-byte UTF-8  ASCII
MSBit is 0
code point  representation
• Examples of 1-byte UTF-8:
– “A” -> 0100 0001
– “&” -> 0010 0110
– “5” -> 0011 01010 X X X X X X X
UTF-8 (2-byte)
• 2-byte UTF-8
code point != representation
• The code point is broken apart into two pieces.
• The five MSBits of the code point are assigned to the first byte
and the six LSBits are assigned to the second byte.
UTF-8 (2-byte)
For the first byte of 2-byte UTF-8:
• The three MSBits are set to 110
• The remaining bits are the five MSBits of the code point.
For the second byte of 2-byte UTF-8
• The two MSBits are set to 10
• The remaining bits are the six LSBits of the code point.
UTF-8 (2-byte)
1 1 0 X X X X X
1 0 X X X X X X
Leading Byte
Continuation Byte
UTF-8 (3-byte)
• 3-byte UTF-8 is used for code points in the range 0x0800 to
0xFFFF.
• 3-byte UTF-8
code point != representation
• The code point is broken apart into three pieces.
UTF-8 (3-byte)
• The four MSBits of the code point are assigned to the first
byte.
• The middle six bits are assigned to the second byte.
• The six LSBits are assigned to the third byte.
UTF-8 (3-byte)
For the first byte of 3-byte UTF-8
• The four MSBits are set to 1110
• The remaining bits are the four MSBits of the code point.
For the second byte of 3-byte UTF-8
• The two MSBits are set to 10
• The remaining bits are the six middle bits of the code point.
UTF-8 (3-byte)
For the third byte of 3-byte UTF-8
• The two MSBits are set to 10
• The remaining bits are the six LSBits of the code point.
UTF-8 (3-byte)
1 1 1 0 X X X X
1 0 X X X X X X
Leading Byte
Continuation Byte
1 0 X X X X X X
Continuation Byte
UTF-8 (4-byte)
• 4-byte UTF-8 is used for code points in the range 0x10000 to
0x10FFFF.
• 4-byte UTF-8
code point != representation
• The code point is broken apart into four pieces.
UTF-8 (4-byte)
• The three MSBits of the code point are assigned to the first
byte.
• The next six MSBits are assigned to the second byte.
• Another of the next six MSBits are assigned to the third byte.
• The six LSBits are assigned to the fourth byte.
UTF-8 (4-byte)
For the first byte of 4-byte UTF-8
• The five MSBits are set to 11110
• The remaining bits are the three MSBits of the code point.
For the second byte of 4-byte UTF-8
• The two MSBits are set to 10
• The remaining bits are the next six middle bits of the code point.
UTF-8 (4-byte)
For the third byte of 4-byte UTF-8
• The two MSBits are set to 10
• The remaining bits are the next six middle bits of the code point.
For the fourth byte of 4-byte UTF-8
• The two MSBits are set to 10
• The remaining bits are the six LSBits of the code point.
Examoles
10011100101001

More Related Content

What's hot

Final draft intel core i5 processors architecture
Final draft intel core i5 processors architectureFinal draft intel core i5 processors architecture
Final draft intel core i5 processors architectureJawid Ahmad Baktash
 
IPv6 next generation protocol
IPv6 next generation protocolIPv6 next generation protocol
IPv6 next generation protocolRupshanker Mishra
 
Internet Protocol Version 6
Internet Protocol Version 6Internet Protocol Version 6
Internet Protocol Version 6sandeepjain
 
Overview of Spanning Tree Protocol
Overview of Spanning Tree ProtocolOverview of Spanning Tree Protocol
Overview of Spanning Tree ProtocolArash Foroughi
 
Project ACRN Device Passthrough Introduction
Project ACRN Device Passthrough IntroductionProject ACRN Device Passthrough Introduction
Project ACRN Device Passthrough IntroductionProject ACRN
 
Read-only rootfs: theory and practice
Read-only rootfs: theory and practiceRead-only rootfs: theory and practice
Read-only rootfs: theory and practiceChris Simmonds
 
Approaches to Network Automation
Approaches to Network AutomationApproaches to Network Automation
Approaches to Network AutomationAPNIC
 
Linux Binary Exploitation - Return-oritend Programing
Linux Binary Exploitation - Return-oritend ProgramingLinux Binary Exploitation - Return-oritend Programing
Linux Binary Exploitation - Return-oritend ProgramingAngel Boy
 
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareHKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareLinaro
 
The sparc architecture (3)
The sparc architecture (3)The sparc architecture (3)
The sparc architecture (3)vishuupra
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningCastLabKAIST
 
How to Choose a Software Update Mechanism for Embedded Linux Devices
How to Choose a Software Update Mechanism for Embedded Linux DevicesHow to Choose a Software Update Mechanism for Embedded Linux Devices
How to Choose a Software Update Mechanism for Embedded Linux DevicesLeon Anavi
 

What's hot (20)

Final draft intel core i5 processors architecture
Final draft intel core i5 processors architectureFinal draft intel core i5 processors architecture
Final draft intel core i5 processors architecture
 
IPv6 address
IPv6 addressIPv6 address
IPv6 address
 
IPv6 next generation protocol
IPv6 next generation protocolIPv6 next generation protocol
IPv6 next generation protocol
 
Internet Protocol Version 6
Internet Protocol Version 6Internet Protocol Version 6
Internet Protocol Version 6
 
Overview of Spanning Tree Protocol
Overview of Spanning Tree ProtocolOverview of Spanning Tree Protocol
Overview of Spanning Tree Protocol
 
Project ACRN Device Passthrough Introduction
Project ACRN Device Passthrough IntroductionProject ACRN Device Passthrough Introduction
Project ACRN Device Passthrough Introduction
 
CCNP Route EIGRP Overview
CCNP Route  EIGRP OverviewCCNP Route  EIGRP Overview
CCNP Route EIGRP Overview
 
Read-only rootfs: theory and practice
Read-only rootfs: theory and practiceRead-only rootfs: theory and practice
Read-only rootfs: theory and practice
 
Approaches to Network Automation
Approaches to Network AutomationApproaches to Network Automation
Approaches to Network Automation
 
Linux Binary Exploitation - Return-oritend Programing
Linux Binary Exploitation - Return-oritend ProgramingLinux Binary Exploitation - Return-oritend Programing
Linux Binary Exploitation - Return-oritend Programing
 
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted FirmwareHKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
HKG15-505: Power Management interactions with OP-TEE and Trusted Firmware
 
Dynamic NAT
Dynamic NATDynamic NAT
Dynamic NAT
 
Developing for polar fire soc
Developing for polar fire socDeveloping for polar fire soc
Developing for polar fire soc
 
The sparc architecture (3)
The sparc architecture (3)The sparc architecture (3)
The sparc architecture (3)
 
Introduction to VTK
Introduction to VTKIntroduction to VTK
Introduction to VTK
 
Hardware Acceleration for Machine Learning
Hardware Acceleration for Machine LearningHardware Acceleration for Machine Learning
Hardware Acceleration for Machine Learning
 
How to Choose a Software Update Mechanism for Embedded Linux Devices
How to Choose a Software Update Mechanism for Embedded Linux DevicesHow to Choose a Software Update Mechanism for Embedded Linux Devices
How to Choose a Software Update Mechanism for Embedded Linux Devices
 
Apache hive
Apache hiveApache hive
Apache hive
 
Ipx protocol slide share
Ipx protocol slide shareIpx protocol slide share
Ipx protocol slide share
 
Intel Pentium Pro
Intel Pentium ProIntel Pentium Pro
Intel Pentium Pro
 

Similar to 4 character encoding-unicode

Data Communication & Computer Networks : Data Types
Data Communication & Computer Networks : Data TypesData Communication & Computer Networks : Data Types
Data Communication & Computer Networks : Data TypesDr Rajiv Srivastava
 
004 NUMBER SYSTEM (1).pdf
004 NUMBER SYSTEM (1).pdf004 NUMBER SYSTEM (1).pdf
004 NUMBER SYSTEM (1).pdfMaheShiva
 
expect("").length.toBe(1)
expect("").length.toBe(1)expect("").length.toBe(1)
expect("").length.toBe(1)Philip Hofstetter
 
CCS103 Bits, Bytes, Binary
CCS103 Bits, Bytes, BinaryCCS103 Bits, Bytes, Binary
CCS103 Bits, Bytes, BinaryRichard Homa
 
Unicode - Hacking The International Character System
Unicode - Hacking The International Character SystemUnicode - Hacking The International Character System
Unicode - Hacking The International Character SystemWebsecurify
 
CSC103 Bits, Bytes & Binary
CSC103 Bits, Bytes & BinaryCSC103 Bits, Bytes & Binary
CSC103 Bits, Bytes & BinaryRichard Homa
 
HTTP 완벽가이드 16장
HTTP 완벽가이드 16장HTTP 완벽가이드 16장
HTTP 완벽가이드 16장HyeonSeok Choi
 
Numbersystemcont
NumbersystemcontNumbersystemcont
NumbersystemcontSajib
 
Lecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.pptLecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.pptAlula Tafere
 
Pentium Real Mode.ppt
Pentium Real Mode.pptPentium Real Mode.ppt
Pentium Real Mode.pptPramodBorole2
 
UTF-8: The Secret of Character Encoding
UTF-8: The Secret of Character EncodingUTF-8: The Secret of Character Encoding
UTF-8: The Secret of Character EncodingBert Pattyn
 
Character encoding and unicode format
Character encoding and unicode formatCharacter encoding and unicode format
Character encoding and unicode formatAdityaSharma1452
 
Topic 1 Data Representation
Topic 1 Data RepresentationTopic 1 Data Representation
Topic 1 Data Representationekul
 
Topic 1 Data Representation
Topic 1 Data RepresentationTopic 1 Data Representation
Topic 1 Data RepresentationKyle
 

Similar to 4 character encoding-unicode (20)

Data Communication & Computer Networks : Data Types
Data Communication & Computer Networks : Data TypesData Communication & Computer Networks : Data Types
Data Communication & Computer Networks : Data Types
 
004 NUMBER SYSTEM (1).pdf
004 NUMBER SYSTEM (1).pdf004 NUMBER SYSTEM (1).pdf
004 NUMBER SYSTEM (1).pdf
 
expect("").length.toBe(1)
expect("").length.toBe(1)expect("").length.toBe(1)
expect("").length.toBe(1)
 
CCS103 Bits, Bytes, Binary
CCS103 Bits, Bytes, BinaryCCS103 Bits, Bytes, Binary
CCS103 Bits, Bytes, Binary
 
Unicode - Hacking The International Character System
Unicode - Hacking The International Character SystemUnicode - Hacking The International Character System
Unicode - Hacking The International Character System
 
Unicode
UnicodeUnicode
Unicode
 
CSC103 Bits, Bytes & Binary
CSC103 Bits, Bytes & BinaryCSC103 Bits, Bytes & Binary
CSC103 Bits, Bytes & Binary
 
HTTP 완벽가이드 16장
HTTP 완벽가이드 16장HTTP 완벽가이드 16장
HTTP 완벽가이드 16장
 
Cache mapping
Cache mappingCache mapping
Cache mapping
 
What character is that
What character is thatWhat character is that
What character is that
 
Numbersystemcont
NumbersystemcontNumbersystemcont
Numbersystemcont
 
lec2.pdf
lec2.pdflec2.pdf
lec2.pdf
 
Lecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.pptLecture_ASCII and Unicode.ppt
Lecture_ASCII and Unicode.ppt
 
Pentium Real Mode.ppt
Pentium Real Mode.pptPentium Real Mode.ppt
Pentium Real Mode.ppt
 
Ip v4
Ip v4Ip v4
Ip v4
 
UTF-8: The Secret of Character Encoding
UTF-8: The Secret of Character EncodingUTF-8: The Secret of Character Encoding
UTF-8: The Secret of Character Encoding
 
Character encoding and unicode format
Character encoding and unicode formatCharacter encoding and unicode format
Character encoding and unicode format
 
Topic 1 Data Representation
Topic 1 Data RepresentationTopic 1 Data Representation
Topic 1 Data Representation
 
Topic 1 Data Representation
Topic 1 Data RepresentationTopic 1 Data Representation
Topic 1 Data Representation
 
AES.ppt
AES.pptAES.ppt
AES.ppt
 

More from irdginfo

Quicksort Presentation
Quicksort PresentationQuicksort Presentation
Quicksort Presentationirdginfo
 
10 merge sort
10 merge sort10 merge sort
10 merge sortirdginfo
 
9 big o-notation
9 big o-notation9 big o-notation
9 big o-notationirdginfo
 
8 elementary sorts-bubble
8 elementary sorts-bubble8 elementary sorts-bubble
8 elementary sorts-bubbleirdginfo
 
8 elementary sorts-shell
8 elementary sorts-shell8 elementary sorts-shell
8 elementary sorts-shellirdginfo
 
8 elementary sorts-insertion
8 elementary sorts-insertion8 elementary sorts-insertion
8 elementary sorts-insertionirdginfo
 
8 elementary sorts-selection
8 elementary sorts-selection8 elementary sorts-selection
8 elementary sorts-selectionirdginfo
 
7 searching injava-binary
7 searching injava-binary7 searching injava-binary
7 searching injava-binaryirdginfo
 
6 arrays injava
6 arrays injava6 arrays injava
6 arrays injavairdginfo
 
5 data structures-hashtable
5 data structures-hashtable5 data structures-hashtable
5 data structures-hashtableirdginfo
 
5 data structures-tree
5 data structures-tree5 data structures-tree
5 data structures-treeirdginfo
 
5 data structures-stack
5 data structures-stack5 data structures-stack
5 data structures-stackirdginfo
 
5 data structures-arraysandlinkedlist
5 data structures-arraysandlinkedlist5 data structures-arraysandlinkedlist
5 data structures-arraysandlinkedlistirdginfo
 
4 character encoding-ascii
4 character encoding-ascii4 character encoding-ascii
4 character encoding-asciiirdginfo
 
4 character encoding
4 character encoding4 character encoding
4 character encodingirdginfo
 
3 number systems-floatingpoint
3 number systems-floatingpoint3 number systems-floatingpoint
3 number systems-floatingpointirdginfo
 
2 number systems-scientificnotation
2 number systems-scientificnotation2 number systems-scientificnotation
2 number systems-scientificnotationirdginfo
 
1 number systems-hex
1 number systems-hex1 number systems-hex
1 number systems-hexirdginfo
 
1 number systems-unsignedsignedintegers
1 number systems-unsignedsignedintegers1 number systems-unsignedsignedintegers
1 number systems-unsignedsignedintegersirdginfo
 
1 number systems-octal
1 number systems-octal1 number systems-octal
1 number systems-octalirdginfo
 

More from irdginfo (20)

Quicksort Presentation
Quicksort PresentationQuicksort Presentation
Quicksort Presentation
 
10 merge sort
10 merge sort10 merge sort
10 merge sort
 
9 big o-notation
9 big o-notation9 big o-notation
9 big o-notation
 
8 elementary sorts-bubble
8 elementary sorts-bubble8 elementary sorts-bubble
8 elementary sorts-bubble
 
8 elementary sorts-shell
8 elementary sorts-shell8 elementary sorts-shell
8 elementary sorts-shell
 
8 elementary sorts-insertion
8 elementary sorts-insertion8 elementary sorts-insertion
8 elementary sorts-insertion
 
8 elementary sorts-selection
8 elementary sorts-selection8 elementary sorts-selection
8 elementary sorts-selection
 
7 searching injava-binary
7 searching injava-binary7 searching injava-binary
7 searching injava-binary
 
6 arrays injava
6 arrays injava6 arrays injava
6 arrays injava
 
5 data structures-hashtable
5 data structures-hashtable5 data structures-hashtable
5 data structures-hashtable
 
5 data structures-tree
5 data structures-tree5 data structures-tree
5 data structures-tree
 
5 data structures-stack
5 data structures-stack5 data structures-stack
5 data structures-stack
 
5 data structures-arraysandlinkedlist
5 data structures-arraysandlinkedlist5 data structures-arraysandlinkedlist
5 data structures-arraysandlinkedlist
 
4 character encoding-ascii
4 character encoding-ascii4 character encoding-ascii
4 character encoding-ascii
 
4 character encoding
4 character encoding4 character encoding
4 character encoding
 
3 number systems-floatingpoint
3 number systems-floatingpoint3 number systems-floatingpoint
3 number systems-floatingpoint
 
2 number systems-scientificnotation
2 number systems-scientificnotation2 number systems-scientificnotation
2 number systems-scientificnotation
 
1 number systems-hex
1 number systems-hex1 number systems-hex
1 number systems-hex
 
1 number systems-unsignedsignedintegers
1 number systems-unsignedsignedintegers1 number systems-unsignedsignedintegers
1 number systems-unsignedsignedintegers
 
1 number systems-octal
1 number systems-octal1 number systems-octal
1 number systems-octal
 

Recently uploaded

Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupJonathanParaisoCruz
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
MICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxMICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxabhijeetpadhi001
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentInMediaRes1
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxAvyJaneVismanos
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.arsicmarija21
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxJiesonDelaCerna
 

Recently uploaded (20)

Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
MARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized GroupMARGINALIZATION (Different learners in Marginalized Group
MARGINALIZATION (Different learners in Marginalized Group
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
MICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxMICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptx
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Meghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media ComponentMeghan Sutherland In Media Res Media Component
Meghan Sutherland In Media Res Media Component
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Final demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptxFinal demo Grade 9 for demo Plan dessert.pptx
Final demo Grade 9 for demo Plan dessert.pptx
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
CELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptxCELL CYCLE Division Science 8 quarter IV.pptx
CELL CYCLE Division Science 8 quarter IV.pptx
 

4 character encoding-unicode

  • 1. Unicode • A standard character encoding designed to support all of the world's languages • Unicode represents characters differently than ASCII • Characters are mapped to a code point A 65 Code Point 1000001 UTF-32 UTF-16 UTF-8
  • 2. UTF-32 • Uses 4 bytes (32 bits) • Example: – A (100 0001) 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  • 4. UTF-16 • Stores each char in either 16-bit or two 16-bit 0 0 …. 0 0 0 0 …. 0 0 0 0 …. 0 0 16 bits 16 bits16 bits
  • 6. UTF-8 • It supports every language you’ll probably ever need. • No need for Windows-1252 this and Windows-1253 that. • Its code point range is from 0x00 to 0x10FFFF • It uses a variable (1 to 4) byte encoding.
  • 7. UTF-8 (1-byte) • 1-byte UTF-8 is used for code points in the range 0x00 to 0x7F. • 1-byte UTF-8  ASCII MSBit is 0 code point  representation • Examples of 1-byte UTF-8: – “A” -> 0100 0001 – “&” -> 0010 0110 – “5” -> 0011 01010 X X X X X X X
  • 8. UTF-8 (2-byte) • 2-byte UTF-8 code point != representation • The code point is broken apart into two pieces. • The five MSBits of the code point are assigned to the first byte and the six LSBits are assigned to the second byte.
  • 9. UTF-8 (2-byte) For the first byte of 2-byte UTF-8: • The three MSBits are set to 110 • The remaining bits are the five MSBits of the code point. For the second byte of 2-byte UTF-8 • The two MSBits are set to 10 • The remaining bits are the six LSBits of the code point.
  • 10. UTF-8 (2-byte) 1 1 0 X X X X X 1 0 X X X X X X Leading Byte Continuation Byte
  • 11. UTF-8 (3-byte) • 3-byte UTF-8 is used for code points in the range 0x0800 to 0xFFFF. • 3-byte UTF-8 code point != representation • The code point is broken apart into three pieces.
  • 12. UTF-8 (3-byte) • The four MSBits of the code point are assigned to the first byte. • The middle six bits are assigned to the second byte. • The six LSBits are assigned to the third byte.
  • 13. UTF-8 (3-byte) For the first byte of 3-byte UTF-8 • The four MSBits are set to 1110 • The remaining bits are the four MSBits of the code point. For the second byte of 3-byte UTF-8 • The two MSBits are set to 10 • The remaining bits are the six middle bits of the code point.
  • 14. UTF-8 (3-byte) For the third byte of 3-byte UTF-8 • The two MSBits are set to 10 • The remaining bits are the six LSBits of the code point.
  • 15. UTF-8 (3-byte) 1 1 1 0 X X X X 1 0 X X X X X X Leading Byte Continuation Byte 1 0 X X X X X X Continuation Byte
  • 16. UTF-8 (4-byte) • 4-byte UTF-8 is used for code points in the range 0x10000 to 0x10FFFF. • 4-byte UTF-8 code point != representation • The code point is broken apart into four pieces.
  • 17. UTF-8 (4-byte) • The three MSBits of the code point are assigned to the first byte. • The next six MSBits are assigned to the second byte. • Another of the next six MSBits are assigned to the third byte. • The six LSBits are assigned to the fourth byte.
  • 18. UTF-8 (4-byte) For the first byte of 4-byte UTF-8 • The five MSBits are set to 11110 • The remaining bits are the three MSBits of the code point. For the second byte of 4-byte UTF-8 • The two MSBits are set to 10 • The remaining bits are the next six middle bits of the code point.
  • 19. UTF-8 (4-byte) For the third byte of 4-byte UTF-8 • The two MSBits are set to 10 • The remaining bits are the next six middle bits of the code point. For the fourth byte of 4-byte UTF-8 • The two MSBits are set to 10 • The remaining bits are the six LSBits of the code point.