SlideShare a Scribd company logo
1 of 37
Raed Alfayez
SaudiNIC, CITC
Sep-2015, APTLD/APNIC
 Introduction
– About SaudiNIC
– Variants in the Arabic script and Arabic Language
 Variant Management System
– Concepts
– Requirements
 SaudiNIC ’s VMS
– Master Key Algorithm
– Variants Filters
 Administering the domain name space under (.sa)
since 1995 & ( .‫السعودية‬ ) since 2010.
 Operated by Communication and Information
Technology Commission (CITC) governmental
org.
 Coordinating with regional and international
bodies in order to present the local community
needs
 Leading the local community effort towards
supporting Arabic language in Domain Names
3
4
5
 Ranked as the 5nd language by native speakers in
the world.
– Native speakers: 295 million
 Considered as Official/Co-official language in 25
country
6
Source: http://en.wikipedia.org/wiki/Arabic_script
 Arabic language character codes occupy a very small
section of the Arabic script Unicode chart
1. Written from right to left.
2. Most characters have different shapes
– depending on their position (beginning,
middle, or end) within a word
– probably conjugated with preceding and
succeeding characters.
– These different shapes for a single
character do not count as different code
points but they are handled using
different fonts
– Letters that can be joined are always
joined in both hand-written and printed
Arabic (cursive style).
‫ن‬ ‫ا‬ ‫ب‬ ‫ج‬
‫ج‬‫ـج‬ ‫ـجـ‬ ‫جـ‬
‫ب‬‫ـب‬ ‫ـبـ‬ ‫بـ‬
‫ا‬ ‫ـا‬
‫ن‬‫ـن‬ ‫ـنـ‬ ‫نـ‬
‫جبان‬
3. Two sets of numerals are used:
– Arabic : 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
– Arabic-Indic : 0،1،2،3،4،5،6،7،8،9
– Numerals are written from left to right
4. Tashkeel (diacritic)
– A small sing (not a letter) that is usually put on top
or under a character for the purpose of correct
pronunciation
– It is not widely used except incase of the possibility
of mispronouncing words that have the same letters
but with different pronunciations, and hence having
different meanings.
5. Abbreviations are not widely used
– When an abbreviation is written (in domain name)
characters will be joined together … leads to a
different word and pronunciation
َ‫ج‬َ‫ب‬َ‫ان‬
َ‫ان‬َّ‫ب‬‫ج‬
 The 2nd most widely used alphabetic writing
system in the world.
 Used by many languages such as:
– Arabic, Urdu, Persian, Turkish, Kurdish, Pashto, …etc
 It is widely used by more than 43 countries
– more than one billion potential users could be concerned
in using Arabic script domain names.
10
Source: http://en.wikipedia.org/wiki/Arabic_script
11
– Some languages may add or change
characters to represent phonemes that do not
appear in Arabic phonology.
– A new character usually created based on modifying
the basic shape of an existing Arabic character, for
example, by adding more dots.
 There are 64 “variants” for
“Google.com” domain due to
lower/upper case of ASCII
letters.
– If you type any of them you will
reach the same site
– The solution was done by DNS
protocols
– All are allocated and delegated
 But this is not the case for
other languages!
– Arabic (‫)کلی‬ vs. Urdu (‫!)كلى‬
Example of
ASCII Variants
Google.com
gOogle.com
goOgle.com
gooGle.com
GooGle.com
GooglE.com
…etc.
 Across the script: There are a number of groups of characters that have
the same shapes (Homoglyph).
– eg. Kaf, Heh, Yeh, Alef, … groups
13
‫أ‬
‫آ‬
‫إ‬
‫ى‬
‫ة‬
 There is a need for a system to solve and manage
variants in the whole Arabic script.
– It should be achieved through coordination between a
Registry and Language communities.
– The goal is to secure the TLD name space in a simple and
logical manner.
• Enhance security (Limit domain phishing)
• Ensure domain name reachability
• Easy to use and manage
• One key for all variants
• Variants based on character position
• Single user input device
• International (cross-languages)
reachability
• Simple user interface
• Study the whole script
Concepts
•Define Supported Language(s)
•Coordinate with Language Communities
•Solve any variant conflicts
•Finalize Variant Tables
•Use mechanism to group variants under
one key
•Provide clever tools to manage variants
Requirements
 One Key for all Variants (Master key)
– Storing all possible variants is not a visible nor a practical solution,
especially for longer domain names as they generate larger variant list.
– A new identification mechanism is needed to:
• Easily manage the whole variants list with one unique identifier
• Speed up the lookup process
• Eliminate the need of saving all possible variants (save storage space)
– Example: Master Key algorithm
– Result:
• Efficacy (Store only one Key instead of all possible variants).
Label Approximately # of variants
‫اتصال‬ 300
‫اتصاالت‬ 6,000
‫االتصاالت‬ 60,000
‫هيئة‬-‫االتصاالت‬ 2,879,999
‫هيئة‬-‫االتصاالت‬-‫وتقنية‬-‫المعلومات‬ 82,944,000,000
 Variants base on character position:
– In Arabic script languages, characters may take different shapes
depending on their position (isolated, final, medial or initial) within
a word.
– Therefore, a Variants Management System should considers
character position when deciding if 2 code points are variants or not.
– Example (‫:)هدهد‬
• 0647 = 06BE = 06C1= 06D5
• 16 possible variant
• 4 valid variants (25%)
• 12 without risk (75%)
– Result:
• More Accuracy
 A label is composed using a single input character set
table
– Arabic label can be typed using “one” keyboard layout (input
device)
– There are no mixing between code points from different keyboards
(Arabic Keyboard layout , Urdu keyboard layout ..etc).
– Example (‫:)كلى‬
• 18 Possible variants
• 10 Blocked because of language mixing (56%)
– Example ( ‫القرآن‬-‫الكريم‬ )
• 11,900 Possible variants
• 11,888 Blocked because of language mixing (99%)
– Result: More Accuracy (only valid allocate-able variants)
LANGUAGE UNICODE LABEL
Arabic (U+0643) (U+0644) (U+0649)
‫کلی‬
Urdu (U+06A9) (U+0644) (U+06CC)
‫كلى‬
LANGUAGE UNICODE LABEL
N/A (U+0643) (U+0644) (U+06CC)
‫ك‬‫ل‬‫ی‬
N/A (U+06A9) (U+0644) (U+0649)
‫ک‬‫ل‬‫ى‬
‫ك‬(0643)
‫ى‬(0649)
‫ک‬(06A9)
‫ی‬(06CC)
‫ك‬‫ل‬‫ى‬.‫موقع‬
‫ک‬‫ل‬‫ی‬.‫موقع‬
Visit our website:
‫كلى‬.‫موقع‬
 International Reachability
– End users should be able to reach their domain names
regardless of their location.
– Input devices (language table) that would be used to reach a
domain name (based on the user location) should be carefully
considered when defining variants.
– For example:
• A user registered the domain “‫”كلى‬ (all characters from the Arabic
language)
• if another user try to reach that domain name from an Internet café
in Pakistan he/she will type “‫”کلی‬ (all characters from the Urdu
language)
• If that variant was not allocated, delegated and hosted then the
domain name will not be reachable.
– In summary, variants need to be studied from both:
• Similarity point of view (by language community) and
• Reachability pointy of view (based on input devices used by other
language communities).
 Simple User Interface
– Myths about registrants and variants:
• Registrant can decide which variant to allocate from a huge list of variants.
• Registrant may know the differences between code points
– Arabic KAF (U+0643) and KEHEH (U+06A9)
• Registrant may know which variant should be allocated in order for their
domain name to be reached globally.
• Registrant can handle complex user interface to manage variants.
– Please note: it is unpractical to list all allocate-able variants
• As the list may contain hundreds of allocate-able variants.
– Hence: the Registry should help registrants to:
• Generate the best (desired) allocate-able variants (as helping examples)
• Provide easy way to enable/disable them
• Provide a way for advance users to manually type other desired allocate-able
variants.
– Registries should provide a separate web interface and an EPP
command that list the best (desired) allocate-able variants using a
clever way to help in managing variants
• i.e. using multiple filters to minimize the generated list (Variant Filters)
 Study variants across the whole Arabic script
– A full study should be conducted across the whole
Arabic Script in order to identify all possible variants
against code points in the supported language table
• Some existing solutions only check the variants between code
points within only the support language tables.
– This way whenever a new language is added there will
be no need to restudy the previous supported languages
and change their variant tables.
– Result:
• less key regeneration when adding new languages to the
registry.
 The Registry need to choose which language(s) will
be supported under their TLD
 Registry should coordinate with language
communities (or language experts) to achieve the
following:
– Language Table,
– Variant Table (including Variant Types/Action)
• Language communities should study their code points across the
whole script when identifying variants
• Action on the variants (Allocated, Blocked ..etc)
– Identify how their users may type a domain name using input
devices from other languages
• Must be allocated variants
• E.g. Arabic user may use Urdu keyboard to register and/or reach an
Arabic domain name
 Registry should solve any conflicts in variants or variants types
– Examples:
• Language 1: A = B , Language 2: A <> B !
• Language 3: C = D (Blocked) , Language 4: C = D (Allocated)!
 Registry Finalize the following:
– Supported language tables(s)
• List of code points for each language
• Will be used to stop mixing characters from different languages
– Variant Table
• Variants, Variants Types/Actions (e.g. Allocate-able, Blocked)
• Will be used to generate allocate-able variants
– Filters
• Will be used to suggest desired variants
 Registry should use a mechanism to secure variants from being
registered by others by grouping them under one key
– E.g. Master Key algorithm
 Registry should provide a simple and clever way for Registrants to:
– Register a domain name in their language
– List allocate-able and/or desired variants
– Enable and Disable allocate-able variants
 SaudiNIC has developed a complete VMS:
– Based on the stated Concepts
– Provides the stated Requirements
 We developed a Master Key algorithm and
Variant Filters to:
– Secure our name space
– Ensure domain name reachability
– Simplify variants management
Unicode
+
IDNA
+
Supported
Language(s)
Variants
Tables
Registrant
Different Input Devices
User
Registry
Language
Tables
 Requested Domain name
 Master Key
 Allocate-able Variants
 Must be allocated Variants
 Predicted Desired Allocate-able Variants
 Predicted Undesired Allocate-able Variants
 Blocked Variants
Domain name
Master Key
All Variants
Filters
Master Key
Algorithm
Registry System
 Generates a unique key for a domain name label and all of
its possible variants (based on the character position), the
new key can be used in the lookup process for both:
– Domain name availability
– Variants generation and allocation
 For example:
– “G41B G42M G43F” represents 18 variants:
• ‫كلى‬ (U+0643) (U+0644) (U+0649)
• ‫کلى‬ (U+06A9) (U+0644) (U+0649)
• ‫کلی‬ (U+06A9) (U+0644) (U+06CC)
• ‫كلي‬ (U+0643) (U+0644) (U+064A)
• ‫کلي‬ (U+06A9) (U+0644) (U+064A)
• ‫کلۍ‬ (U+06A9) (U+0644) (U+06CD)
• ‫کلے‬ (U+06A9) (U+0644) (U+06D2)
• …etc ,
the full list: http://arabic-
domains.org/adn_tools/m
k/index.php?T=1&M=%D9
%83%D9%84%D9%89
 Goal:
– To reduce the huge size of allocate-able variants by intelligently
display only the desired variants
 How?
– Linguistically we study words in the Arabic language to find some
rules to help identifying desired variants
– We used N-grams model to statically study the repetitive patters in
Arabic words
• An example of 2-gram for word “cars”: ca, ar, rs
• We studied 2, 3 and 4-grams for more than 7 million non-repetitive words
in the Arabic language
• Source: Books, Newspapers, Refereed Academic Journals.. Etc.
– We studied high-frequency patterns and then built some rules/filters
based on them: ( ‫الـ‬*,‫ألـ‬*,‫آلـ‬* ,… etc.)
– Then we developed a ranking system to order allocate-able variants
based on weight given by each rule.
– We have confirmed our findings with linguists and researchers.
 Sample of our variant rules ( 21+ rules):
– AlefMadaEnd
• Input: ‫خطأ‬-‫ظمأ‬
• Filtered: ‫خطآ‬-‫ظمآ‬ , ‫خطآ‬-‫ظما‬ , ‫خطأ‬-‫ظمآ‬ ..etc
– AlefHamzaDownEnd
• Input: ‫خطأ‬-‫ظمأ‬
• Filtered: ‫خطإ‬-‫ظمإ‬ , ‫خطإ‬-‫ظما‬ , ‫خطأ‬-‫ظمإ‬ ..etc
– Alf-Altareef:
• Input:‫القرآن‬
• Filtered: ‫,ألقرآن‬ ‫,إلقرآن‬ ‫آلقرآن‬
– Alef-letter-Alef
• Input:‫رايات‬
• Filtered: ‫,رآيآت‬ ‫,رإيإت‬ ‫رأيأت‬
– .. etc.
Note: Filtered variants are
still can be allocated after
manual verification
 Easy interface for registrants:
 For more information about the Master Key
algorithm and Variant filters :
– http://arabic-domains.org/docs/Master_Key_Algorithm.pdf
– http://nic.sa/en/view/doc64
– http://arabic-domains.org/adn_tools/mk/index.php (Demo)
 Best practices for managing Arabic domain
names registries
– Available soon on: www.nic.sa
‫زايرة‬ ‫ميكنمك‬ ‫املعلومات‬ ‫من‬ ‫يد‬‫ز‬‫للم‬:
For more information you can visit:
‫ئة‬‫ي‬‫ه‬-‫االتصاالت‬.‫سعودية‬‫ل‬‫ا‬
citc.gov.sa
‫جسل‬.‫سعودية‬‫ل‬‫ا‬
nic.sa

More Related Content

Similar to SaudiNIC Variant Management System

English de lenguaje de programacion
English de lenguaje de programacionEnglish de lenguaje de programacion
English de lenguaje de programacionVillalba Griselda
 
Programming Languages and the Programming Process
Programming Languages and the Programming ProcessProgramming Languages and the Programming Process
Programming Languages and the Programming ProcessSajib Barua
 
1.2 Evaluation of PLs.ppt
1.2 Evaluation of PLs.ppt1.2 Evaluation of PLs.ppt
1.2 Evaluation of PLs.pptmeenabairagi1
 
Lec21&22.pptx programing language and there study
Lec21&22.pptx programing language and there studyLec21&22.pptx programing language and there study
Lec21&22.pptx programing language and there studysamiullahamjad06
 
Introduction To Computer Programming
Introduction To Computer ProgrammingIntroduction To Computer Programming
Introduction To Computer ProgrammingHussain Buksh
 
ICANN 51: IDN Root Zone LGR (workshop)
ICANN 51: IDN Root Zone LGR (workshop)ICANN 51: IDN Root Zone LGR (workshop)
ICANN 51: IDN Root Zone LGR (workshop)ICANN
 
Arabic Domain Names: What’s been done so far?
Arabic Domain Names: What’s been done so far?Arabic Domain Names: What’s been done so far?
Arabic Domain Names: What’s been done so far?ahzoman
 
computer language with full detail
computer language with full detail computer language with full detail
computer language with full detail sonykhan3
 
Lecture 1 introduction to language processors
Lecture 1  introduction to language processorsLecture 1  introduction to language processors
Lecture 1 introduction to language processorsRebaz Najeeb
 
Lect 1. introduction to programming languages
Lect 1. introduction to programming languagesLect 1. introduction to programming languages
Lect 1. introduction to programming languagesVarun Garg
 
Cobbbbbbbnnnnnnnnnnnnnnnnncepts of PL.pptx
Cobbbbbbbnnnnnnnnnnnnnnnnncepts of PL.pptxCobbbbbbbnnnnnnnnnnnnnnnnncepts of PL.pptx
Cobbbbbbbnnnnnnnnnnnnnnnnncepts of PL.pptxmehrankhan7842312
 
3. WEB TECHNOLOGIES.pptx B.Pharm sem 2 CAP
3. WEB TECHNOLOGIES.pptx B.Pharm sem 2 CAP3. WEB TECHNOLOGIES.pptx B.Pharm sem 2 CAP
3. WEB TECHNOLOGIES.pptx B.Pharm sem 2 CAPVedika Narvekar
 
Programming Languages Categories / Programming Paradigm By: Prof. Lili Saghafi
Programming Languages Categories / Programming Paradigm By: Prof. Lili Saghafi Programming Languages Categories / Programming Paradigm By: Prof. Lili Saghafi
Programming Languages Categories / Programming Paradigm By: Prof. Lili Saghafi Professor Lili Saghafi
 
[EclipseCon France 2017] Language Server Protocol in action
[EclipseCon France 2017] Language Server Protocol in action[EclipseCon France 2017] Language Server Protocol in action
[EclipseCon France 2017] Language Server Protocol in actionMickael Istria
 

Similar to SaudiNIC Variant Management System (20)

English de lenguaje de programacion
English de lenguaje de programacionEnglish de lenguaje de programacion
English de lenguaje de programacion
 
Programming Languages and the Programming Process
Programming Languages and the Programming ProcessProgramming Languages and the Programming Process
Programming Languages and the Programming Process
 
1.2 Evaluation of PLs.ppt
1.2 Evaluation of PLs.ppt1.2 Evaluation of PLs.ppt
1.2 Evaluation of PLs.ppt
 
Presentation on Programming Languages.
Presentation on Programming Languages.Presentation on Programming Languages.
Presentation on Programming Languages.
 
Lec21&22.pptx programing language and there study
Lec21&22.pptx programing language and there studyLec21&22.pptx programing language and there study
Lec21&22.pptx programing language and there study
 
.Net framework
.Net framework.Net framework
.Net framework
 
Introduction To Computer Programming
Introduction To Computer ProgrammingIntroduction To Computer Programming
Introduction To Computer Programming
 
Intro Ch 13B.ppt
Intro Ch 13B.pptIntro Ch 13B.ppt
Intro Ch 13B.ppt
 
ICANN 51: IDN Root Zone LGR (workshop)
ICANN 51: IDN Root Zone LGR (workshop)ICANN 51: IDN Root Zone LGR (workshop)
ICANN 51: IDN Root Zone LGR (workshop)
 
Arabic Domain Names: What’s been done so far?
Arabic Domain Names: What’s been done so far?Arabic Domain Names: What’s been done so far?
Arabic Domain Names: What’s been done so far?
 
computer language with full detail
computer language with full detail computer language with full detail
computer language with full detail
 
Introduction to compilers
Introduction to compilersIntroduction to compilers
Introduction to compilers
 
Lecture 1 introduction to language processors
Lecture 1  introduction to language processorsLecture 1  introduction to language processors
Lecture 1 introduction to language processors
 
Lect 1. introduction to programming languages
Lect 1. introduction to programming languagesLect 1. introduction to programming languages
Lect 1. introduction to programming languages
 
Cobbbbbbbnnnnnnnnnnnnnnnnncepts of PL.pptx
Cobbbbbbbnnnnnnnnnnnnnnnnncepts of PL.pptxCobbbbbbbnnnnnnnnnnnnnnnnncepts of PL.pptx
Cobbbbbbbnnnnnnnnnnnnnnnnncepts of PL.pptx
 
3. WEB TECHNOLOGIES.pptx B.Pharm sem 2 CAP
3. WEB TECHNOLOGIES.pptx B.Pharm sem 2 CAP3. WEB TECHNOLOGIES.pptx B.Pharm sem 2 CAP
3. WEB TECHNOLOGIES.pptx B.Pharm sem 2 CAP
 
Programming Languages Categories / Programming Paradigm By: Prof. Lili Saghafi
Programming Languages Categories / Programming Paradigm By: Prof. Lili Saghafi Programming Languages Categories / Programming Paradigm By: Prof. Lili Saghafi
Programming Languages Categories / Programming Paradigm By: Prof. Lili Saghafi
 
Introduction to Compiler design
Introduction to Compiler design Introduction to Compiler design
Introduction to Compiler design
 
Intermediate Languages
Intermediate LanguagesIntermediate Languages
Intermediate Languages
 
[EclipseCon France 2017] Language Server Protocol in action
[EclipseCon France 2017] Language Server Protocol in action[EclipseCon France 2017] Language Server Protocol in action
[EclipseCon France 2017] Language Server Protocol in action
 

More from APNIC

DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024APNIC
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...APNIC
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024APNIC
 
Networking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGNetworking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGAPNIC
 
IP addressing and IPv6, presented by Paul Wilson at IETF 119
IP addressing and IPv6, presented by Paul Wilson at IETF 119IP addressing and IPv6, presented by Paul Wilson at IETF 119
IP addressing and IPv6, presented by Paul Wilson at IETF 119APNIC
 
draft-harrison-sidrops-manifest-number-01, presented at IETF 119
draft-harrison-sidrops-manifest-number-01, presented at IETF 119draft-harrison-sidrops-manifest-number-01, presented at IETF 119
draft-harrison-sidrops-manifest-number-01, presented at IETF 119APNIC
 
Making an RFC in Today's IETF, presented by Geoff Huston at IETF 119
Making an RFC in Today's IETF, presented by Geoff Huston at IETF 119Making an RFC in Today's IETF, presented by Geoff Huston at IETF 119
Making an RFC in Today's IETF, presented by Geoff Huston at IETF 119APNIC
 
IPv6 Operational Issues (with DNS), presented by Geoff Huston at IETF 119
IPv6 Operational Issues (with DNS), presented by Geoff Huston at IETF 119IPv6 Operational Issues (with DNS), presented by Geoff Huston at IETF 119
IPv6 Operational Issues (with DNS), presented by Geoff Huston at IETF 119APNIC
 
Is DNS ready for IPv6, presented by Geoff Huston at IETF 119
Is DNS ready for IPv6, presented by Geoff Huston at IETF 119Is DNS ready for IPv6, presented by Geoff Huston at IETF 119
Is DNS ready for IPv6, presented by Geoff Huston at IETF 119APNIC
 
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...APNIC
 
APNIC Update and RIR Policies for ccTLDs, presented at APTLD 85
APNIC Update and RIR Policies for ccTLDs, presented at APTLD 85APNIC Update and RIR Policies for ccTLDs, presented at APTLD 85
APNIC Update and RIR Policies for ccTLDs, presented at APTLD 85APNIC
 
NANOG 90: 'BGP in 2023' presented by Geoff Huston
NANOG 90: 'BGP in 2023' presented by Geoff HustonNANOG 90: 'BGP in 2023' presented by Geoff Huston
NANOG 90: 'BGP in 2023' presented by Geoff HustonAPNIC
 
DNS-OARC 42: Is the DNS ready for IPv6? presentation by Geoff Huston
DNS-OARC 42: Is the DNS ready for IPv6? presentation by Geoff HustonDNS-OARC 42: Is the DNS ready for IPv6? presentation by Geoff Huston
DNS-OARC 42: Is the DNS ready for IPv6? presentation by Geoff HustonAPNIC
 
APAN 57: APNIC Report at APAN 57, Bangkok, Thailand
APAN 57: APNIC Report at APAN 57, Bangkok, ThailandAPAN 57: APNIC Report at APAN 57, Bangkok, Thailand
APAN 57: APNIC Report at APAN 57, Bangkok, ThailandAPNIC
 
Lao Digital Week 2024: It's time to deploy IPv6
Lao Digital Week 2024: It's time to deploy IPv6Lao Digital Week 2024: It's time to deploy IPv6
Lao Digital Week 2024: It's time to deploy IPv6APNIC
 
AINTEC 2023: Networking in the Penumbra!
AINTEC 2023: Networking in the Penumbra!AINTEC 2023: Networking in the Penumbra!
AINTEC 2023: Networking in the Penumbra!APNIC
 
CNIRC 2023: Global and Regional IPv6 Deployment 2023
CNIRC 2023: Global and Regional IPv6 Deployment 2023CNIRC 2023: Global and Regional IPv6 Deployment 2023
CNIRC 2023: Global and Regional IPv6 Deployment 2023APNIC
 
AFSIG 2023: APNIC Foundation and support for Internet development
AFSIG 2023: APNIC Foundation and support for Internet developmentAFSIG 2023: APNIC Foundation and support for Internet development
AFSIG 2023: APNIC Foundation and support for Internet developmentAPNIC
 
AFNOG 1: Afghanistan IP Deployment Status
AFNOG 1: Afghanistan IP Deployment StatusAFNOG 1: Afghanistan IP Deployment Status
AFNOG 1: Afghanistan IP Deployment StatusAPNIC
 
AFSIG 2023: Internet routing and addressing
AFSIG 2023: Internet routing and addressingAFSIG 2023: Internet routing and addressing
AFSIG 2023: Internet routing and addressingAPNIC
 

More from APNIC (20)

DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
 
On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024On Starlink, presented by Geoff Huston at NZNOG 2024
On Starlink, presented by Geoff Huston at NZNOG 2024
 
Networking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOGNetworking in the Penumbra presented by Geoff Huston at NZNOG
Networking in the Penumbra presented by Geoff Huston at NZNOG
 
IP addressing and IPv6, presented by Paul Wilson at IETF 119
IP addressing and IPv6, presented by Paul Wilson at IETF 119IP addressing and IPv6, presented by Paul Wilson at IETF 119
IP addressing and IPv6, presented by Paul Wilson at IETF 119
 
draft-harrison-sidrops-manifest-number-01, presented at IETF 119
draft-harrison-sidrops-manifest-number-01, presented at IETF 119draft-harrison-sidrops-manifest-number-01, presented at IETF 119
draft-harrison-sidrops-manifest-number-01, presented at IETF 119
 
Making an RFC in Today's IETF, presented by Geoff Huston at IETF 119
Making an RFC in Today's IETF, presented by Geoff Huston at IETF 119Making an RFC in Today's IETF, presented by Geoff Huston at IETF 119
Making an RFC in Today's IETF, presented by Geoff Huston at IETF 119
 
IPv6 Operational Issues (with DNS), presented by Geoff Huston at IETF 119
IPv6 Operational Issues (with DNS), presented by Geoff Huston at IETF 119IPv6 Operational Issues (with DNS), presented by Geoff Huston at IETF 119
IPv6 Operational Issues (with DNS), presented by Geoff Huston at IETF 119
 
Is DNS ready for IPv6, presented by Geoff Huston at IETF 119
Is DNS ready for IPv6, presented by Geoff Huston at IETF 119Is DNS ready for IPv6, presented by Geoff Huston at IETF 119
Is DNS ready for IPv6, presented by Geoff Huston at IETF 119
 
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
Benefits of doing Internet peering and running an Internet Exchange (IX) pres...
 
APNIC Update and RIR Policies for ccTLDs, presented at APTLD 85
APNIC Update and RIR Policies for ccTLDs, presented at APTLD 85APNIC Update and RIR Policies for ccTLDs, presented at APTLD 85
APNIC Update and RIR Policies for ccTLDs, presented at APTLD 85
 
NANOG 90: 'BGP in 2023' presented by Geoff Huston
NANOG 90: 'BGP in 2023' presented by Geoff HustonNANOG 90: 'BGP in 2023' presented by Geoff Huston
NANOG 90: 'BGP in 2023' presented by Geoff Huston
 
DNS-OARC 42: Is the DNS ready for IPv6? presentation by Geoff Huston
DNS-OARC 42: Is the DNS ready for IPv6? presentation by Geoff HustonDNS-OARC 42: Is the DNS ready for IPv6? presentation by Geoff Huston
DNS-OARC 42: Is the DNS ready for IPv6? presentation by Geoff Huston
 
APAN 57: APNIC Report at APAN 57, Bangkok, Thailand
APAN 57: APNIC Report at APAN 57, Bangkok, ThailandAPAN 57: APNIC Report at APAN 57, Bangkok, Thailand
APAN 57: APNIC Report at APAN 57, Bangkok, Thailand
 
Lao Digital Week 2024: It's time to deploy IPv6
Lao Digital Week 2024: It's time to deploy IPv6Lao Digital Week 2024: It's time to deploy IPv6
Lao Digital Week 2024: It's time to deploy IPv6
 
AINTEC 2023: Networking in the Penumbra!
AINTEC 2023: Networking in the Penumbra!AINTEC 2023: Networking in the Penumbra!
AINTEC 2023: Networking in the Penumbra!
 
CNIRC 2023: Global and Regional IPv6 Deployment 2023
CNIRC 2023: Global and Regional IPv6 Deployment 2023CNIRC 2023: Global and Regional IPv6 Deployment 2023
CNIRC 2023: Global and Regional IPv6 Deployment 2023
 
AFSIG 2023: APNIC Foundation and support for Internet development
AFSIG 2023: APNIC Foundation and support for Internet developmentAFSIG 2023: APNIC Foundation and support for Internet development
AFSIG 2023: APNIC Foundation and support for Internet development
 
AFNOG 1: Afghanistan IP Deployment Status
AFNOG 1: Afghanistan IP Deployment StatusAFNOG 1: Afghanistan IP Deployment Status
AFNOG 1: Afghanistan IP Deployment Status
 
AFSIG 2023: Internet routing and addressing
AFSIG 2023: Internet routing and addressingAFSIG 2023: Internet routing and addressing
AFSIG 2023: Internet routing and addressing
 

Recently uploaded

Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts servicevipmodelshub1
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012rehmti665
 
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With RoomVIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Roomgirls4nights
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130  Available With RoomVIP Kolkata Call Girl Kestopur 👉 8250192130  Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Roomdivyansh0kumar0
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITMgdsc13
 
Call Girls in East Of Kailash 9711199171 Delhi Enjoy Call Girls With Our Escorts
Call Girls in East Of Kailash 9711199171 Delhi Enjoy Call Girls With Our EscortsCall Girls in East Of Kailash 9711199171 Delhi Enjoy Call Girls With Our Escorts
Call Girls in East Of Kailash 9711199171 Delhi Enjoy Call Girls With Our Escortsindian call girls near you
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)Christopher H Felton
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebJames Anderson
 
Russian Call girls in Dubai +971563133746 Dubai Call girls
Russian  Call girls in Dubai +971563133746 Dubai  Call girlsRussian  Call girls in Dubai +971563133746 Dubai  Call girls
Russian Call girls in Dubai +971563133746 Dubai Call girlsstephieert
 
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on DeliveryCall Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Deliverybabeytanya
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...SofiyaSharma5
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)Damian Radcliffe
 
VIP Call Girls Pune Madhuri 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Madhuri 8617697112 Independent Escort Service PuneVIP Call Girls Pune Madhuri 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Madhuri 8617697112 Independent Escort Service PuneCall girls in Ahmedabad High profile
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Delhi Call girls
 
Gram Darshan PPT cyber rural in villages of india
Gram Darshan PPT cyber rural  in villages of indiaGram Darshan PPT cyber rural  in villages of india
Gram Darshan PPT cyber rural in villages of indiaimessage0108
 
AlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with FlowsAlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with FlowsThierry TROUIN ☁
 

Recently uploaded (20)

Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts serviceChennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
Chennai Call Girls Alwarpet Phone 🍆 8250192130 👅 celebrity escorts service
 
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
Call Girls South Delhi Delhi reach out to us at ☎ 9711199012
 
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With RoomVIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
VIP Kolkata Call Girls Salt Lake 8250192130 Available With Room
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
 
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130  Available With RoomVIP Kolkata Call Girl Kestopur 👉 8250192130  Available With Room
VIP Kolkata Call Girl Kestopur 👉 8250192130 Available With Room
 
Git and Github workshop GDSC MLRITM
Git and Github  workshop GDSC MLRITMGit and Github  workshop GDSC MLRITM
Git and Github workshop GDSC MLRITM
 
Call Girls in East Of Kailash 9711199171 Delhi Enjoy Call Girls With Our Escorts
Call Girls in East Of Kailash 9711199171 Delhi Enjoy Call Girls With Our EscortsCall Girls in East Of Kailash 9711199171 Delhi Enjoy Call Girls With Our Escorts
Call Girls in East Of Kailash 9711199171 Delhi Enjoy Call Girls With Our Escorts
 
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
A Good Girl's Guide to Murder (A Good Girl's Guide to Murder, #1)
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 
Russian Call girls in Dubai +971563133746 Dubai Call girls
Russian  Call girls in Dubai +971563133746 Dubai  Call girlsRussian  Call girls in Dubai +971563133746 Dubai  Call girls
Russian Call girls in Dubai +971563133746 Dubai Call girls
 
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on DeliveryCall Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
Call Girls In Mumbai Central Mumbai ❤️ 9920874524 👈 Cash on Delivery
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)
 
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 22 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
VIP Call Girls Pune Madhuri 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Madhuri 8617697112 Independent Escort Service PuneVIP Call Girls Pune Madhuri 8617697112 Independent Escort Service Pune
VIP Call Girls Pune Madhuri 8617697112 Independent Escort Service Pune
 
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in  Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Jamuna Vihar Delhi reach out to us at 🔝9953056974🔝
 
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
Best VIP Call Girls Noida Sector 75 Call Me: 8448380779
 
Gram Darshan PPT cyber rural in villages of india
Gram Darshan PPT cyber rural  in villages of indiaGram Darshan PPT cyber rural  in villages of india
Gram Darshan PPT cyber rural in villages of india
 
AlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with FlowsAlbaniaDreamin24 - How to easily use an API with Flows
AlbaniaDreamin24 - How to easily use an API with Flows
 

SaudiNIC Variant Management System

  • 2.  Introduction – About SaudiNIC – Variants in the Arabic script and Arabic Language  Variant Management System – Concepts – Requirements  SaudiNIC ’s VMS – Master Key Algorithm – Variants Filters
  • 3.  Administering the domain name space under (.sa) since 1995 & ( .‫السعودية‬ ) since 2010.  Operated by Communication and Information Technology Commission (CITC) governmental org.  Coordinating with regional and international bodies in order to present the local community needs  Leading the local community effort towards supporting Arabic language in Domain Names 3
  • 4. 4
  • 5. 5
  • 6.  Ranked as the 5nd language by native speakers in the world. – Native speakers: 295 million  Considered as Official/Co-official language in 25 country 6 Source: http://en.wikipedia.org/wiki/Arabic_script
  • 7.  Arabic language character codes occupy a very small section of the Arabic script Unicode chart
  • 8. 1. Written from right to left. 2. Most characters have different shapes – depending on their position (beginning, middle, or end) within a word – probably conjugated with preceding and succeeding characters. – These different shapes for a single character do not count as different code points but they are handled using different fonts – Letters that can be joined are always joined in both hand-written and printed Arabic (cursive style). ‫ن‬ ‫ا‬ ‫ب‬ ‫ج‬ ‫ج‬‫ـج‬ ‫ـجـ‬ ‫جـ‬ ‫ب‬‫ـب‬ ‫ـبـ‬ ‫بـ‬ ‫ا‬ ‫ـا‬ ‫ن‬‫ـن‬ ‫ـنـ‬ ‫نـ‬ ‫جبان‬
  • 9. 3. Two sets of numerals are used: – Arabic : 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 – Arabic-Indic : 0،1،2،3،4،5،6،7،8،9 – Numerals are written from left to right 4. Tashkeel (diacritic) – A small sing (not a letter) that is usually put on top or under a character for the purpose of correct pronunciation – It is not widely used except incase of the possibility of mispronouncing words that have the same letters but with different pronunciations, and hence having different meanings. 5. Abbreviations are not widely used – When an abbreviation is written (in domain name) characters will be joined together … leads to a different word and pronunciation َ‫ج‬َ‫ب‬َ‫ان‬ َ‫ان‬َّ‫ب‬‫ج‬
  • 10.  The 2nd most widely used alphabetic writing system in the world.  Used by many languages such as: – Arabic, Urdu, Persian, Turkish, Kurdish, Pashto, …etc  It is widely used by more than 43 countries – more than one billion potential users could be concerned in using Arabic script domain names. 10 Source: http://en.wikipedia.org/wiki/Arabic_script
  • 11. 11 – Some languages may add or change characters to represent phonemes that do not appear in Arabic phonology. – A new character usually created based on modifying the basic shape of an existing Arabic character, for example, by adding more dots.
  • 12.  There are 64 “variants” for “Google.com” domain due to lower/upper case of ASCII letters. – If you type any of them you will reach the same site – The solution was done by DNS protocols – All are allocated and delegated  But this is not the case for other languages! – Arabic (‫)کلی‬ vs. Urdu (‫!)كلى‬ Example of ASCII Variants Google.com gOogle.com goOgle.com gooGle.com GooGle.com GooglE.com …etc.
  • 13.  Across the script: There are a number of groups of characters that have the same shapes (Homoglyph). – eg. Kaf, Heh, Yeh, Alef, … groups 13
  • 15.  There is a need for a system to solve and manage variants in the whole Arabic script. – It should be achieved through coordination between a Registry and Language communities. – The goal is to secure the TLD name space in a simple and logical manner. • Enhance security (Limit domain phishing) • Ensure domain name reachability • Easy to use and manage
  • 16. • One key for all variants • Variants based on character position • Single user input device • International (cross-languages) reachability • Simple user interface • Study the whole script Concepts •Define Supported Language(s) •Coordinate with Language Communities •Solve any variant conflicts •Finalize Variant Tables •Use mechanism to group variants under one key •Provide clever tools to manage variants Requirements
  • 17.  One Key for all Variants (Master key) – Storing all possible variants is not a visible nor a practical solution, especially for longer domain names as they generate larger variant list. – A new identification mechanism is needed to: • Easily manage the whole variants list with one unique identifier • Speed up the lookup process • Eliminate the need of saving all possible variants (save storage space) – Example: Master Key algorithm – Result: • Efficacy (Store only one Key instead of all possible variants). Label Approximately # of variants ‫اتصال‬ 300 ‫اتصاالت‬ 6,000 ‫االتصاالت‬ 60,000 ‫هيئة‬-‫االتصاالت‬ 2,879,999 ‫هيئة‬-‫االتصاالت‬-‫وتقنية‬-‫المعلومات‬ 82,944,000,000
  • 18.  Variants base on character position: – In Arabic script languages, characters may take different shapes depending on their position (isolated, final, medial or initial) within a word. – Therefore, a Variants Management System should considers character position when deciding if 2 code points are variants or not. – Example (‫:)هدهد‬ • 0647 = 06BE = 06C1= 06D5 • 16 possible variant • 4 valid variants (25%) • 12 without risk (75%) – Result: • More Accuracy
  • 19.  A label is composed using a single input character set table – Arabic label can be typed using “one” keyboard layout (input device) – There are no mixing between code points from different keyboards (Arabic Keyboard layout , Urdu keyboard layout ..etc). – Example (‫:)كلى‬ • 18 Possible variants • 10 Blocked because of language mixing (56%) – Example ( ‫القرآن‬-‫الكريم‬ ) • 11,900 Possible variants • 11,888 Blocked because of language mixing (99%) – Result: More Accuracy (only valid allocate-able variants) LANGUAGE UNICODE LABEL Arabic (U+0643) (U+0644) (U+0649) ‫کلی‬ Urdu (U+06A9) (U+0644) (U+06CC) ‫كلى‬ LANGUAGE UNICODE LABEL N/A (U+0643) (U+0644) (U+06CC) ‫ك‬‫ل‬‫ی‬ N/A (U+06A9) (U+0644) (U+0649) ‫ک‬‫ل‬‫ى‬
  • 21.  International Reachability – End users should be able to reach their domain names regardless of their location. – Input devices (language table) that would be used to reach a domain name (based on the user location) should be carefully considered when defining variants. – For example: • A user registered the domain “‫”كلى‬ (all characters from the Arabic language) • if another user try to reach that domain name from an Internet café in Pakistan he/she will type “‫”کلی‬ (all characters from the Urdu language) • If that variant was not allocated, delegated and hosted then the domain name will not be reachable. – In summary, variants need to be studied from both: • Similarity point of view (by language community) and • Reachability pointy of view (based on input devices used by other language communities).
  • 22.  Simple User Interface – Myths about registrants and variants: • Registrant can decide which variant to allocate from a huge list of variants. • Registrant may know the differences between code points – Arabic KAF (U+0643) and KEHEH (U+06A9) • Registrant may know which variant should be allocated in order for their domain name to be reached globally. • Registrant can handle complex user interface to manage variants. – Please note: it is unpractical to list all allocate-able variants • As the list may contain hundreds of allocate-able variants. – Hence: the Registry should help registrants to: • Generate the best (desired) allocate-able variants (as helping examples) • Provide easy way to enable/disable them • Provide a way for advance users to manually type other desired allocate-able variants. – Registries should provide a separate web interface and an EPP command that list the best (desired) allocate-able variants using a clever way to help in managing variants • i.e. using multiple filters to minimize the generated list (Variant Filters)
  • 23.  Study variants across the whole Arabic script – A full study should be conducted across the whole Arabic Script in order to identify all possible variants against code points in the supported language table • Some existing solutions only check the variants between code points within only the support language tables. – This way whenever a new language is added there will be no need to restudy the previous supported languages and change their variant tables. – Result: • less key regeneration when adding new languages to the registry.
  • 24.  The Registry need to choose which language(s) will be supported under their TLD  Registry should coordinate with language communities (or language experts) to achieve the following: – Language Table, – Variant Table (including Variant Types/Action) • Language communities should study their code points across the whole script when identifying variants • Action on the variants (Allocated, Blocked ..etc) – Identify how their users may type a domain name using input devices from other languages • Must be allocated variants • E.g. Arabic user may use Urdu keyboard to register and/or reach an Arabic domain name
  • 25.  Registry should solve any conflicts in variants or variants types – Examples: • Language 1: A = B , Language 2: A <> B ! • Language 3: C = D (Blocked) , Language 4: C = D (Allocated)!  Registry Finalize the following: – Supported language tables(s) • List of code points for each language • Will be used to stop mixing characters from different languages – Variant Table • Variants, Variants Types/Actions (e.g. Allocate-able, Blocked) • Will be used to generate allocate-able variants – Filters • Will be used to suggest desired variants  Registry should use a mechanism to secure variants from being registered by others by grouping them under one key – E.g. Master Key algorithm  Registry should provide a simple and clever way for Registrants to: – Register a domain name in their language – List allocate-able and/or desired variants – Enable and Disable allocate-able variants
  • 26.  SaudiNIC has developed a complete VMS: – Based on the stated Concepts – Provides the stated Requirements  We developed a Master Key algorithm and Variant Filters to: – Secure our name space – Ensure domain name reachability – Simplify variants management
  • 27. Unicode + IDNA + Supported Language(s) Variants Tables Registrant Different Input Devices User Registry Language Tables  Requested Domain name  Master Key  Allocate-able Variants  Must be allocated Variants  Predicted Desired Allocate-able Variants  Predicted Undesired Allocate-able Variants  Blocked Variants Domain name Master Key All Variants Filters Master Key Algorithm Registry System
  • 28.  Generates a unique key for a domain name label and all of its possible variants (based on the character position), the new key can be used in the lookup process for both: – Domain name availability – Variants generation and allocation  For example: – “G41B G42M G43F” represents 18 variants: • ‫كلى‬ (U+0643) (U+0644) (U+0649) • ‫کلى‬ (U+06A9) (U+0644) (U+0649) • ‫کلی‬ (U+06A9) (U+0644) (U+06CC) • ‫كلي‬ (U+0643) (U+0644) (U+064A) • ‫کلي‬ (U+06A9) (U+0644) (U+064A) • ‫کلۍ‬ (U+06A9) (U+0644) (U+06CD) • ‫کلے‬ (U+06A9) (U+0644) (U+06D2) • …etc , the full list: http://arabic- domains.org/adn_tools/m k/index.php?T=1&M=%D9 %83%D9%84%D9%89
  • 29.  Goal: – To reduce the huge size of allocate-able variants by intelligently display only the desired variants  How? – Linguistically we study words in the Arabic language to find some rules to help identifying desired variants – We used N-grams model to statically study the repetitive patters in Arabic words • An example of 2-gram for word “cars”: ca, ar, rs • We studied 2, 3 and 4-grams for more than 7 million non-repetitive words in the Arabic language • Source: Books, Newspapers, Refereed Academic Journals.. Etc. – We studied high-frequency patterns and then built some rules/filters based on them: ( ‫الـ‬*,‫ألـ‬*,‫آلـ‬* ,… etc.) – Then we developed a ranking system to order allocate-able variants based on weight given by each rule. – We have confirmed our findings with linguists and researchers.
  • 30.  Sample of our variant rules ( 21+ rules): – AlefMadaEnd • Input: ‫خطأ‬-‫ظمأ‬ • Filtered: ‫خطآ‬-‫ظمآ‬ , ‫خطآ‬-‫ظما‬ , ‫خطأ‬-‫ظمآ‬ ..etc – AlefHamzaDownEnd • Input: ‫خطأ‬-‫ظمأ‬ • Filtered: ‫خطإ‬-‫ظمإ‬ , ‫خطإ‬-‫ظما‬ , ‫خطأ‬-‫ظمإ‬ ..etc – Alf-Altareef: • Input:‫القرآن‬ • Filtered: ‫,ألقرآن‬ ‫,إلقرآن‬ ‫آلقرآن‬ – Alef-letter-Alef • Input:‫رايات‬ • Filtered: ‫,رآيآت‬ ‫,رإيإت‬ ‫رأيأت‬ – .. etc. Note: Filtered variants are still can be allocated after manual verification
  • 31.
  • 32.
  • 33.
  • 34.
  • 35.  Easy interface for registrants:
  • 36.  For more information about the Master Key algorithm and Variant filters : – http://arabic-domains.org/docs/Master_Key_Algorithm.pdf – http://nic.sa/en/view/doc64 – http://arabic-domains.org/adn_tools/mk/index.php (Demo)  Best practices for managing Arabic domain names registries – Available soon on: www.nic.sa
  • 37. ‫زايرة‬ ‫ميكنمك‬ ‫املعلومات‬ ‫من‬ ‫يد‬‫ز‬‫للم‬: For more information you can visit: ‫ئة‬‫ي‬‫ه‬-‫االتصاالت‬.‫سعودية‬‫ل‬‫ا‬ citc.gov.sa ‫جسل‬.‫سعودية‬‫ل‬‫ا‬ nic.sa