#ICANN50
IDN Variant TLD Program
Update
Sarmad Hussain
IDN Variant TLD Program
ICANN
25 June 2014
ICANN 50: IDN Variant TLD Program Update

The session provides a presentation and status update on the IDN Variant TLD Program. Update includes progress made on the implementation of the IDN Root LGR Procedure, status of the Maximal Starting Repertoire, community and Generation Panels' updates.


  1. IDN Variant TLD Program Update
     Sarmad Hussain
     IDN Variant TLD Program
     ICANN
     25 June 2014
  3. Agenda
     • Program Update – 15 min
     • MSR – 15 min
     • Community updates:
       o Arabic Generation Panel – 15 min
       o CJK Coordination Report – 15 min
       o Neo-Brahmi Prospective Generation Panel – 15 min
     • Q&A – 15 min
  4. Program Update
  5. IDN Variant Program: A Brief Overview
     Phase 1: 2011
       o Script case studies conducted: Arabic, Chinese, Cyrillic, Devanagari, Greek, Latin
     Phase 2: 2011 – 2012
       o Integrated Issues Report development and publication
     Phase 3: 2012 – 2013
       o Creation of the Procedure to Develop and Maintain the Label Generation Rules for the Root Zone in Respect of IDNA Labels (LGR Procedure)
       o Development of the “Study on Examining the User Experience Implications of Active Variant TLDs” and of the XML specification for representing Label Generation Rules
     Phase 4: 2013 – 2015 (in progress)
       o Implementation of the LGR Procedure
       o Process development for incorporating the LGR
       o Ongoing work on the XML specification
  6. LGR Procedure Overview
     [Diagram: Generation Panels (one per script or writing system) propose script LGRs; the Integration Panel accepts or rejects each proposal and integrates accepted ones into the unified LGR for the Root Zone.]
     Generation Panels (to be formed by script communities)
       • Generate proposals for script-specific LGRs, based on community expertise and requirements
     Integration Panel
       • Integrates them into a common Root Zone LGR while minimizing the risk to the Root Zone as a shared resource
     Label Generation Rules (LGR)
       • Which labels are permissible
       • Which variant labels exist
       • Which variant labels may be allocated
     Status: Arabic GP seated; Chinese GP in formation; Integration Panel formed; MSR published
     * URLs available on Slide 10 (Resources) and last slide
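The three questions an LGR answers (which labels are permissible, which variant labels exist, which may be allocated) can be pictured with a small data model. This is only a sketch under stated assumptions: `ToyLGR`, its fields, and the rule that a variant label is blocked as soon as any blocked code point variant is used are hypothetical simplifications, not the LGR Procedure's definitions.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Dict, Set


@dataclass
class ToyLGR:
    """Hypothetical, simplified model of Label Generation Rules."""
    repertoire: Set[str]
    # code point -> {variant code point: "allocatable" or "blocked"}
    variants: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def is_permissible(self, label: str) -> bool:
        """A label is permissible if every code point is in the repertoire."""
        return len(label) > 0 and all(ch in self.repertoire for ch in label)

    def variant_labels(self, label: str) -> Dict[str, str]:
        """Map each variant label of `label` to a disposition.

        Assumption: a variant label is "blocked" if it uses at least one
        blocked code point variant, otherwise "allocatable".
        """
        if not self.is_permissible(label):
            raise ValueError("label is outside the repertoire")
        options = []
        for ch in label:
            choices = [(ch, "original")]
            choices += list(self.variants.get(ch, {}).items())
            options.append(choices)
        result = {}
        for combo in product(*options):
            candidate = "".join(ch for ch, _ in combo)
            if candidate == label:
                continue  # the original label is not its own variant
            kinds = {kind for _, kind in combo}
            result[candidate] = "blocked" if "blocked" in kinds else "allocatable"
        return result


# Toy data only; not a real repertoire.
lgr = ToyLGR(
    repertoire={"一", "二"},
    variants={"一": {"壹": "allocatable", "弌": "blocked"}},
)
labels = lgr.variant_labels("一二")
```

With this toy data, `labels` holds one allocatable and one blocked variant label for the input.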
  7. LGR Project Status – Milestones
     • Community Work
       o Arabic Generation Panel seated
       o Chinese Generation Panel in formation
       o Japanese, Korean and Neo-Brahmi Generation Panels being organized
       o Individual expressions of interest for other scripts
     • ICANN/Integration Panel Work
       o MSR-1 released for Public Comments
       o MSR-1 published – 20 June
       o Ongoing outreach efforts
  8. MSR-1 Public Comment Process
     • Public Comment Report released on 20 June 2014:
       https://www.icann.org/en/system/files/files/report-comments-msr-20jun14-en.pdf
     • Inputs received:
       o General:
         § Community support for the conservative approach
       o Process:
         § Need for additions to the MSR and LGR (scripts already included)
         § Need for additions to the MSR and LGR (scripts not included)
         § Need for review by the relevant script community
         § Need for outreach to additional script communities
       o Code points:
         § Inclusion of specific code points
         § Considering inclusion of languages spoken by smaller communities
  9. Cyclical, additive nature of MSR & LGR
     • Motivation
       o Portions of the MSR have not been reviewed by the relevant script community
       o At this time, there is not sufficient data to decide on a code point
       o The status of a code point may change over time
     • Recommendation for further feedback
       o Cyclical releases of MSR and LGR, so communities and ICANN can organize their work based on community need and practical considerations
       o Additive releases of MSR and LGR, if there is new evidence for a code point and no impact on the security and stability of the existing system
  10. Outreach Efforts
      • Outreach efforts focused on organizing GPs
        o Quick Guide Kit for Generation Panels
        o IDN interviews and videos
        o Email to the ICANN executive mailing list asking recipients to reach out to their contacts to create new GPs
      • Keeping the community informed
        o Targeted events such as IGF, regional IGFs and ICANN meetings
        o Web announcements and updates on the project Community Wiki
        o Brochures and collateral materials
      • Facilitate GP-IP interaction
  11. Moving Forward
      • Suggested Plan
        o MSR-2 to cover additional scripts by Q1 2015
        o LGR-1 by Q3 2015 (anticipated; based on community proposals)
      • Call to Action
        o Current panels to finish proposals in early 2015 to submit for LGR-1
        o New panels to provide input for future releases of the LGR
  12. LGR Procedure Depends on Community Work
      • Generation Panels and LGR proposals are REQUIRED for IDN variants to be considered for delegation
      • Get involved:
        o Form a generation panel
        o Volunteer to join a generation panel
        o Take part in public review of the MSR, LGR proposals, integrated LGR, etc.
        o Disseminate information to interested individuals and communities
      [Graphic: Arabic, Bengali, Chinese, Cyrillic, Devanagari, Georgian, Greek, Gujarati, Gurmukhi, Hebrew, Japanese, Korean, Latin, Sinhala, Tamil, Telugu, Thai]
  13. Want To Know More? Join us for the LGR Workshop!
      • IDN Root Zone LGR Generation Panels Workshop
        Wednesday, 25 June 2014, 13:00–15:00 BST, Balmoral Room
  14. Resources
      • Toolkit for ‘How to form a Generation Panel’
        o Quick Guide Kit for Generation Panels
      • Project mailing lists:
        o LGR@icann.org: Communicate with LGR community members and the Integration Panel on matters related to LGR work
        o IntegrationPanel@icann.org: Contact the Integration Panel members directly on all matters related to LGR work
        o ArabicGP@icann.org, ChineseGP@icann.org, CyrillicGP@icann.org, KoreanGP@icann.org, NeoBrahmiGP@icann.org: script community dedicated mailing lists
        o idntlds@icann.org: Contact ICANN to submit Generation Panel proposals, individual statements of interest, work reports, updates, etc.
        o vip@icann.org: Discuss issues related to the IDN Variant TLDs Program; subscribe at https://mm.icann.org/mailman/listinfo/vip
  15. Version 1 of the Maximal Starting Repertoire (MSR-1)
  16. MSR-1 Available
      • MSR-1 released on 20 June 2014:
        https://www.icann.org/news/announcement-2-2014-06-20-en
      • Work can now proceed for 22 scripts to create LGRs for the Root Zone
      • Generation Panels will:
        o pick a repertoire from within the MSR
        o decide whether code point variants exist
          § decide whether these should lead to allocatable or blocked variant labels
        o generate an LGR proposal for public comment and for review (and integration) by the Integration Panel
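Since a Generation Panel picks its repertoire from within the MSR, a proposal can be sanity-checked as a subset test. The sketch below is illustrative only: the function name and the integer values are hypothetical stand-ins, not actual MSR-1 contents.

```python
def outside_msr(proposed_repertoire, msr):
    """Return the proposed code points that are not part of the MSR, sorted."""
    return sorted(set(proposed_repertoire) - set(msr))


# Hypothetical stand-in values, not real MSR-1 data.
toy_msr = {0x1001, 0x1002, 0x1003}
toy_proposal = {0x1001, 0x1004}

rejected = outside_msr(toy_proposal, toy_msr)
```

An empty result would mean the proposed repertoire stays within the starting repertoire; here `rejected` lists the one stand-in value that does not.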
  17. MSR-1 Content in Numbers
      • 22 scripts
        o Arabic, Bengali, Cyrillic, Devanagari, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Lao, Latin, Malayalam, Oriya, Sinhala, Tamil, Telugu and Thai
      • ‘Common’ and ‘Inherited’ (shared)
      • 32,790 code points
        o From 97,973 PVALID/CONTEXT code points defined in Unicode 6.3
        o 11,172 Hangul syllables and 19,850 Han ideographs
  18. MSR-1 Public Comment Process
      • Analysis of inputs received:
        o Requests for revision in code point analysis
        o Request for attention to languages with smaller communities
      • Response:
        o 7 additional code points
        o Updated MSR Overview and Rationale document to address inputs received, including explicit use of languages on EGIDS to assess ‘established vitality’ of scripts in use
      • Public Comment Report:
        https://www.icann.org/en/system/files/files/report-comments-msr-20jun14-en.pdf
  19. Expanded Graded Intergenerational Disruption Scale (EGIDS)
      • EGIDS
        o Not based on population size, but on “established vitality”
      • Used as a proxy for “effective demand” for the writing system
        o Not a perfect correlation; some writing systems are not stable
      • For the MSR, the IP used the cut-off between Level 4 and Level 5
      • 4: Educational
        o Language in vigorous use, with standardization and literature being sustained through a widespread system of institutionally supported education
      • 5: Developing
        o Language in vigorous use, with literature in a standardized form being used by some though this is not yet widespread or sustainable
      https://www.ethnologue.com/about/language-status
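The Level 4 / Level 5 cut-off can be read as a simple threshold test. A minimal sketch, assuming that numerically lower EGIDS levels mean stronger vitality and that Level 4 and stronger fall on the included side; the language names and levels below are placeholders, not real ratings.

```python
EGIDS_CUTOFF = 4  # assumption: levels 0..4 pass, 5 and above do not


def meets_vitality_threshold(egids_level: int) -> bool:
    """True if a language's EGIDS level is on the included side of the cut-off."""
    return egids_level <= EGIDS_CUTOFF


# Placeholder data: names and levels are invented for illustration.
toy_languages = {"language_a": 2, "language_b": 4, "language_c": 5}

included = sorted(
    name for name, level in toy_languages.items()
    if meets_vitality_threshold(level)
)
```

The slide notes the correlation with demand for a writing system is imperfect, so a filter like this would be one input among several rather than a final decision.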
  20. MSR - Next Steps
      • MSR-1 is only the start
      • MSR-2 will complete the repertoire
        o Adds some or all of the deferred scripts:
          § Armenian, Ethiopic, Khmer, Myanmar, Thaana and Tibetan
        o Possible further extensions where warranted, including existing script repertoire
      • In the meantime, MSR-1 is the basis for LGR-1
  21. MSR and LGR Timeline
      [Timeline diagram: Generation Panels GP1 to GP5 plotted against the MSR-1, MSR-2, LGR-1 and LGR-2 releases.]
      • GP4’s script is not included in MSR-1, but is otherwise eligible. It can engage in early dialogue with ICANN and the IP, and may pre-emptively begin work on its LGR proposal.
      • GP3 requires a code point excluded from MSR-1, and possesses convincing evidence that the code point is eligible. It would pre-emptively work on its LGR proposal and submit it after MSR-2 is released (incorporating the requested code point).
  22. IP-GP Communications
      • Integration Panel available to help Generation Panels make progress and to ensure successful submissions of script LGRs
      • When scripts are related, coordination between GPs is needed, so that consistency between LGRs is agreed between GPs before submitting the LGR to the IP
      • Use integrationpanel@icann.org to reach the Integration Panel
        o Mailing list is archived and public
  23. Representing LGRs in XML
      • What Does XML-LGR Enable?
        Simple validity checking, as well as variant label generation and disposition.
  24. XML Status and Next Steps
      • Finalize specification and move tool(s) to production state
      • Internet draft currently under review by the community
        o http://tools.ietf.org/html/draft-davies-idntables-07
        o Send feedback or discuss on the public mailing list: vip@icann.org
      • Use the specification as the basis for Root LGR work
      • Creating LGRs or converting them to XML format is generally a straightforward process
      • The internet draft contains detailed examples
        o Including how to convert RFC 3743-style IDN tables
      • MSR-1 can be used as a template for a simple LGR with no variants
      • ICANN is working on setting up a process to support the community in making syntactically correct submissions
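To make "simple validity checking" from an XML representation concrete, here is a rough sketch that reads a toy document into a repertoire and a variant table. The element and attribute names (`lgr`, `data`, `char`, `var`, `cp`, `type`) are illustrative assumptions modeled loosely on the kind of structure the draft describes; the draft itself remains the authority on the actual schema, namespaces and permitted values.

```python
import xml.etree.ElementTree as ET

# Toy document; structure and names are assumptions, not the draft's schema.
TOY_LGR_XML = """
<lgr>
  <data>
    <char cp="4E00">
      <var cp="58F9" type="allocatable"/>
      <var cp="5F0C" type="blocked"/>
    </char>
    <char cp="4E8C"/>
  </data>
</lgr>
"""


def load_toy_lgr(xml_text):
    """Parse the toy XML into (repertoire, variants) keyed by integer code point."""
    root = ET.fromstring(xml_text)
    repertoire = set()
    variants = {}
    for char in root.iter("char"):
        cp = int(char.get("cp"), 16)
        repertoire.add(cp)
        for var in char.findall("var"):
            variants.setdefault(cp, {})[int(var.get("cp"), 16)] = var.get("type")
    return repertoire, variants


def is_valid_label(label, repertoire):
    """Simple validity check: every code point must be in the repertoire."""
    return len(label) > 0 and all(ord(ch) in repertoire for ch in label)


repertoire, variants = load_toy_lgr(TOY_LGR_XML)
```

A table with no `var` children at all corresponds to the "simple LGR with no variants" case mentioned on the slide.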
  25. Community Work
      Arabic Generation Panel
  26. Task Force on Arabic Script IDNs: Overview and Progress
      @ ICANN London Meeting (June ‘14)
      Task Force on Arabic Script IDNs (TF-AIDN)
      Middle East Strategy Working Group (MESWG)
      tf-aidn@meswg.org
  27. Community Driven Way Forward: Task Force on Arabic Script IDNs
      • Creation and oversight by the community-based Middle East Strategy Working Group (MESWG; https://community.icann.org/display/MES/MESWG+Members)
      • TF-AIDN Objectives: a holistic approach
        – Arabic Script Label Generation Ruleset (LGR) for the Root Zone
        – Second level LGRs for the Arabic script
        – Arabic script Internationalized Registration Data
        – Universal acceptability of Arabic script IDNs
        – Technical challenges around registration of Arabic script IDNs
        – Operational software for registry and registrar operations
        – DNS security matters specifically related to Arabic script IDNs
        – Technical training material around Arabic script IDNs
  28. Membership
      • Currently 26 members; applications still being received
      • From 15 countries: Australia, Egypt, England, Ethiopia, Germany, Iran, Jordan, Lebanon, Malaysia, Morocco, Pakistan, Palestine, Saudi Arabia, Sudan, and UAE
      • Speaking more than nine languages (Arabic, Malay, Saraiki, Sindhi, Pashto, Persian, Punjabi, Torwali, Urdu), with expertise in use of Arabic script from East Asia, South Asia, Middle East, North Africa and Africa
      • Coming from diverse disciplines: academia (linguistics and technical), registries, registrars, national and regional policy bodies, community based organizations, technical community
  29. Task Force on Arabic Script IDNs
      • Membership open, community based
      • Details and interests of members posted by MESWG
      • Discussions publicly archived
      • Details at http://lists.meswg.org/mailman/listinfo/tf-aidn
      • Background and Introduction to TF-AIDN
        – https://community.icann.org/display/MES/Task+Force+on+Arabic+Script+IDNs
      • Workspace, news and document archive
        – https://community.icann.org/display/MES/TF-AIDN+Work+Space
      • Email Archive
        – http://lists.meswg.org/pipermail/tf-aidn/
  30. Arabic Script TLDs Assigned or Delegated
      1. الجزائر
      2. عمان
      3. ایران
      4. امارات
      5. بازار
      6. پاکستان
      7. الاردن
      8. بھارت
      9. المغرب
      10. السعودية
      11. سودان
      12. مليسيا
      13. شبكة
      14. سورية
      15. تونس
      16. مصر
      17. قطر
      18. فلسطين
      19. موقع
  31. SUMMARY OF THE CODE POINTS
      Description                                     No. of codes
      DISALLOWED by MSR                               48
      ALLOWED by MSR                                  227
      Not allowed by IDNA2008                         64
      Total                                           339
      Submitted to MSR                                172
      Discussion on codes to be handed over to LGR    Work is under process
  32. IDN Variants Needs and Challenges
      Security and Stability Needs
      • Example: the same word spelled with two visually similar letters gives two different labels
        – پاكستان (with U+0643): xn--mgbai9a5eva00b
        – پاکستان (with U+06A9): xn--mgbai9azgqp6j
      • 120+ cases of visually same or similar Arabic script characters identified by case study team
        – Variants must not be allocated independently
        – Variants may need activation to allow user access (w/ different KB)
      • 16 IDN ccTLD applications, 4 of them with variants
      Security and Stability Challenges
      • Consistency and innumerability
        – Consistent across and within TLDs
        – Minimal activation for manageability
      • Management tools
        – Registration
        – Configuration and Maintenance
        – Security and Monitoring
      • Usability in applications
        – Browsing, emailing, etc.
        – Searching, privacy, etc.
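The U+0643 / U+06A9 example can be reproduced with Python's built-in `punycode` codec. This is a minimal sketch: the helper name `to_a_label` is hypothetical, the code skips the IDNA validity checks a registry would apply, and the printed strings are meant to be compared against the A-labels on the slide rather than taken as authoritative.

```python
def to_a_label(u_label: str) -> str:
    """Encode a Unicode label as an ACE label (no IDNA validation performed)."""
    return "xn--" + u_label.encode("punycode").decode("ascii")


ARABIC_KAF = "\u0643"  # U+0643
KEHEH = "\u06A9"       # U+06A9, visually similar in many fonts

prefix = "\u067E\u0627"              # first two letters of the word
suffix = "\u0633\u062A\u0627\u0646"  # last four letters of the word

with_kaf = prefix + ARABIC_KAF + suffix
with_keheh = prefix + KEHEH + suffix

a_label_kaf = to_a_label(with_kaf)
a_label_keheh = to_a_label(with_keheh)

print(a_label_kaf)
print(a_label_keheh)
```

The two strings look alike on screen but encode to different labels in the DNS, which is why the slide asks that such variants not be allocated independently.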
  33. Progress
      Work Accomplished
        – Arabic Script Generation Panel
        – Principles for Inclusion, Exclusion, and Deferral of Arabic Script Variants
        – MSR Analysis and Feedback
        – Principles on Variants
        – Code Points for LGR
      Outreach to the Community
        – Launch at the Arab IGF Meeting in Algiers
        – Presentation during the IGF in Bali
        – Outreach during the ME DNS Forum
        – Presentation to the community at ICANN Singapore
        – Presentation to the community at the APTLD Meeting
  34. Current Work and Next Steps
      • XML Manual [June 30, 2014]
      • Finalize the discussions on Code Points [August 28, 2014]
      • Finalize the discussions on Variants [September 30, 2014]
      • Whole Label Rules – Aug – Oct 14
        – Document principles for whole label variants
        – Define whole label variants
        – Release for Public Comments
      • Finalization – Nov – Dec 14
        – Finalize LGR for Arabic script
        – Submit to ICANN/IP
        – Release for Public Comments
  35. Thank You
      تريما کاسيه
      شكراً
      شکریہ
      با سپاس
      تشكر
  36. Community Work
      CJK Coordination Report
  37. CJK Rules For The Root Zone
      Kenny Huang, Ph.D. 黃勝雄博士
      Member, CDNC / CGP
      Co-author, RFC3743 IETF
      Member, Executive Council, APNIC
      Member, Board of Directors, TWNIC
      huangksh@gmail.com
      2014.Jun
  38. Problem: CJK Is Complicated
      Putting CJK labels in the root zone is even more complicated
  39. Institutionalized Problem Solving: Structure
  40. Constraints for CJK LGR
      Independent Tasks
      • Each CJK Panel creates an LGR
      • Each LGR includes a repertoire and variants
        o Define label permissions
        o Define variant labels
        o Assign dispositions: Allocatable, Block
      Coordination Tasks
      • If an LGR includes Han characters:
        o The variant *mappings* must agree for all the panels
        o The variant *types* may be different
        o The repertoires may be different
      * Presented by Lee Han Chuan & IP, Shanghai, 29 May 2014
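The coordination constraint (mappings must agree, types and repertoires may differ) can be checked mechanically. The sketch below assumes each panel's LGR is a plain dictionary from code point to `{variant: type}` and interprets "agree" as "same set of variant targets for every shared code point"; both the data layout and that interpretation are assumptions for illustration.

```python
def mapping_conflicts(lgr_a, lgr_b):
    """Return shared code points whose variant target sets differ.

    Variant types (dispositions) are deliberately ignored, and code points
    present in only one repertoire are not treated as conflicts.
    """
    shared = set(lgr_a) & set(lgr_b)
    return sorted(cp for cp in shared if set(lgr_a[cp]) != set(lgr_b[cp]))


# Toy LGRs: code point -> {variant code point: type}
toy_chinese = {
    "一": {"壹": "allocatable", "弌": "blocked", "壱": "blocked"},
    "人": {},
}
toy_japanese = {
    "一": {},  # no variants defined: mapping disagreement
    "山": {},  # not in the other repertoire: allowed
}
toy_japanese_aligned = {
    "一": {"壹": "blocked", "弌": "blocked", "壱": "blocked"},  # types differ only
}

conflicts = mapping_conflicts(toy_chinese, toy_japanese)
no_conflicts = mapping_conflicts(toy_chinese, toy_japanese_aligned)
```

Here the first comparison flags the shared code point, while the second passes because only the types differ.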
  41. Overlap Case Illustration
      Chinese LGR for 一 (U+4E00):
        Variant   Unicode   Disposition
        壹        U+58F9    allocate
        弌        U+5F0C    block
        壱        U+58F1    block
      Japanese LGR for 一 (U+4E00): no variants
      [Flow diagram: the Generation Panels' LGRs reach an "Integrate?" decision; if true, they go into the Integrated Root Zone Label Generation Rules; if false, they are rejected back to the Generation Panel.]
  42. High Level Conflict Strategies
      ID  Strategy                                 Pros                                 Cons                                               Rank
      1   Adopt X; abandon Rcjk                    Permit X                             No label rule
      2   Adopt X; intersection ∩(Rcjk)            Permit X; permit ∩(variants/disp)    Rules changed
      3   Adopt X; union ∪(Rcjk)                   Permit X; permit ∪(variants/disp)    Rules changed
      4   Abandon X and Rcjk                       No conflict                          Label not available
      5   Adopt rules based on frequency of use    Fair & scientific approach           Rules changed; fairness doesn’t mean appropriate
      Legend (CJK overlap): C: rule Rc; J: rule Rj; K: rule Rk
  43. Unified CJK LGR Illustration
      Inputs for 一 (U+4E00):
        Chinese LGR: 壹 (U+58F9) allocate; 弌 (U+5F0C) block; 壱 (U+58F1) block
        Japanese LGR: no variants
      Integrated LGR under Union: 壹 (U+58F9) allocate; 弌 (U+5F0C) block; 壱 (U+58F1) block
      Integrated LGR under Intersection: no variants
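The union and intersection strategies can be sketched as two merge functions over per-code-point variant tables. The tie-break used when both panels list the same variant with different dispositions (the more restrictive "block" wins) is an assumption made for this sketch; the slides do not specify one.

```python
def merge_union(rc, rj):
    """Union strategy: keep every variant listed by either panel."""
    merged = dict(rc)
    for variant, disposition in rj.items():
        if variant in merged and merged[variant] != disposition:
            merged[variant] = "block"  # assumed tie-break: restrictive wins
        else:
            merged[variant] = disposition
    return merged


def merge_intersection(rc, rj):
    """Intersection strategy: keep only variants listed by both panels."""
    merged = {}
    for variant in rc.keys() & rj.keys():
        same = rc[variant] == rj[variant]
        merged[variant] = rc[variant] if same else "block"
    return merged


# Variant tables for U+4E00, as shown on the illustration slide.
chinese_rules = {"壹": "allocate", "弌": "block", "壱": "block"}
japanese_rules = {}

union_result = merge_union(chinese_rules, japanese_rules)
intersection_result = merge_intersection(chinese_rules, japanese_rules)
```

With these inputs the union keeps the Chinese variant set and the intersection is empty, matching the two integrated outcomes in the illustration.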
  44. CJK Integration Methodology: Divide & Conquer (D&C)
      [Diagram labels: Diversified CJK Demands (C Demands, J Demands); Requires; Strategic Direction; Plan and Define; Root Zone Admin; Resources: CJK Overlap, JK Overlap, CJ Overlap, CK Overlap, CJ Usage Pattern, CK Usage Pattern; Services: LGR Constraints, Evaluation Method; Split; Merge; CJK Rules; Variant Dispositions; Unified CJK Rules; Minimal Viable Solution]
  45. Splitting Non-overlapping Code Points From Repertoires
      Step 1
      CJK Han overlap in IANA IDN Repository:
        C-Han: 19520 (CNNIC/TWNIC)
        J-Han: 6356 (JPRS)
        K-Han: 0 (KRNIC)
        C/J overlap: 6181
      No conflict (unified code points): 13339 + 175 = 13514
      Develop conflict strategy for the problem domain (unsolved overlap): 6181
      Rules: Rc (Chinese LGR), Rj (Japanese LGR), Rk (Korean LGR)
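The counts on this slide follow from plain set arithmetic. The sketch below rebuilds them with integer stand-ins sized to match the slide's figures; the integers are not real code points, and the variable names are hypothetical.

```python
C_HAN_SIZE = 19520
J_HAN_SIZE = 6356
OVERLAP_SIZE = 6181

# Stand-in sets: consecutive integers arranged so that exactly
# OVERLAP_SIZE elements are shared between the two repertoires.
c_han = set(range(C_HAN_SIZE))
j_start = C_HAN_SIZE - OVERLAP_SIZE
j_han = set(range(j_start, j_start + J_HAN_SIZE))
k_han = set()  # K-Han: 0 on the slide

overlap = c_han & j_han   # needs a conflict strategy
c_only = c_han - j_han    # no conflict, Chinese rules apply
j_only = j_han - c_han    # no conflict, Japanese rules apply

unified_without_conflict = len(c_only) + len(j_only)
print(len(c_only), len(j_only), unified_without_conflict, len(overlap))
```

Running it prints the four figures to compare against the slide: the two non-overlapping counts, their sum, and the overlap that remains as the problem domain.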
  46. Engineering Design
      Step 2
      Corpora: TC: Apple News; SC: Sina News; JP: Mainichi News
      • Computation for word usage and frequency
      • C/J overlap code points: matching usage and frequency of use
      • Split unused code points
      • Split code points of low frequency of use
      • Sample size is statistically significant
  47. Splitting Unused Code Points from The Overlap
      Step 3
      C / J overlap data set: 6181
        Total unused: 2739
        J only: 203
        C only: 1927
        C / J usage overlap: 1312
        Total used: 3442
      Unified code points: 2739 + 203 + 1927 = 4869
      Problem domain (unsolved overlap): 1312
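The split into unused, C only, J only, and used-by-both can be expressed as a four-way partition of the overlap. This sketch assumes "used" means "observed at least once in the corresponding news corpus"; the function name and toy sets are hypothetical.

```python
def split_by_usage(overlap, used_in_c, used_in_j):
    """Partition overlap code points by where they are observed in use."""
    in_c = overlap & used_in_c
    in_j = overlap & used_in_j
    return {
        "unused": overlap - in_c - in_j,
        "c_only": in_c - in_j,
        "j_only": in_j - in_c,
        "both": in_c & in_j,  # still needs a conflict strategy
    }


# Toy data, not real corpus results.
toy_overlap = {"a", "b", "c", "d", "e"}
toy_used_c = {"a", "b", "x"}
toy_used_j = {"b", "c"}

parts = split_by_usage(toy_overlap, toy_used_c, toy_used_j)

# Consistency check of the slide's figures.
SLIDE = {"unused": 2739, "j_only": 203, "c_only": 1927, "both": 1312}
slide_total = sum(SLIDE.values())
slide_used = SLIDE["j_only"] + SLIDE["c_only"] + SLIDE["both"]
```

The slide's own numbers are consistent with such a partition: the four parts add up to the 6181 overlap, and the three "used" parts add up to 3442.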
  48. Computing Frequency of Use of Code Points
      Step 4
      Initial data set: 1312
  49. Top 10 Most Popular Words
      Rank   TC          SC           JP
      1      的, 2774    日, 20942    日, 822
      2      人, 1005    月, 20315    年, 496
      3      在, 975     人, 4430     国, 393
      4      一, 964     国, 3754     会, 345
      5      是, 960     中, 3521     月, 325
      6      不, 951     被, 2791     人, 325
      7      中, 896     称, 2340     大, 319
      8      有, 883     地, 2226     市, 253
      9      大, 776     南, 2152     本, 251
      10     台, 718     生, 2027     中, 250
  50. [Bar chart: Top 20 code points where Chinese frequency of use > Japanese frequency of use; series: C-Freq, J-Freq; y-axis: frequency of use (%)]
      Generated data set: 939
  51. [Bar chart: Top 20 code points where Chinese frequency of use < Japanese frequency of use; series: C-Freq, J-Freq; y-axis: frequency of use (%)]
      Generated data set: 363
  52. [Bar chart: code points where Chinese frequency of use = Japanese frequency of use; series: C-Freq, J-Freq; y-axis: frequency of use (%)]
      Code point   Frequency of use (%)
      8FCE         0.0222
      7D20         0.0144
      675F         0.0112
      541B         0.0056
      846C         0.0056
      79E9         0.0022
      82BD         0.0012
      96C0         0.0012
      5857         0.0012
      5353         0.0012
      Generated data set: 10
  53. Frequency of Use Reassembly
      C / J usage overlap data set: 1312
        Freq C > J: 939
        Freq J > C: 363
        J = C: 10
      Unified code points: 363 + 939 = 1302
      Problem domain (unsolved overlap): 10
      Rules: Rc, Rj
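The frequency step can be sketched as: count how often each overlap code point occurs in each corpus, normalize, then sort code points into "C higher", "J higher", and "tied". The normalization by total corpus length and the use of exact fractions are assumptions of this sketch, and the corpora are toy strings rather than the news data behind the slides.

```python
from collections import Counter
from fractions import Fraction


def frequency_of_use(corpus, code_points):
    """Relative frequency of each code point within a corpus (exact fractions)."""
    if not corpus:
        raise ValueError("corpus must not be empty")
    counts = Counter(ch for ch in corpus if ch in code_points)
    return {cp: Fraction(counts[cp], len(corpus)) for cp in code_points}


def partition_by_frequency(freq_c, freq_j):
    """Split shared code points by which side uses them more often."""
    c_higher, j_higher, tied = [], [], []
    for cp in sorted(freq_c.keys() & freq_j.keys()):
        if freq_c[cp] > freq_j[cp]:
            c_higher.append(cp)
        elif freq_c[cp] < freq_j[cp]:
            j_higher.append(cp)
        else:
            tied.append(cp)
    return c_higher, j_higher, tied


toy_points = {"日", "月", "人"}
toy_corpus_c = "日日月人"
toy_corpus_j = "日月月人"

c_higher, j_higher, tied = partition_by_frequency(
    frequency_of_use(toy_corpus_c, toy_points),
    frequency_of_use(toy_corpus_j, toy_points),
)
```

On the slide the same three buckets hold 939, 363, and 10 code points, which together account for the 1312 in the usage overlap.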
  54. Data Processing & Computation Recap
      Filtering process: >20K Han code points → (splitting non-overlapping) → 6181 CJK overlap → (splitting unused) → 1312 usage overlap → (frequency of use computation) → 10 code points
      Logic design: Methodology Review; CJK Coordination; Re-Sampling & Computation; Statistical Justification
      Problem domain was effectively reduced
  55. Future Work
      [Histogram: Chinese frequency of use minus Japanese frequency of use; x-axis: %; y-axis: number of X within the same range; Mean = 0.034465, S.D. = 0.158477; regions marked Rc and Rj]
      • Overlap range redefine
      • Expand (?) Std Dev.
      • Requires intensive CJK coordination & deliberation
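One possible reading of "overlap range redefine" is to treat code points whose frequency difference lies within some multiple of the standard deviation as near-ties that need coordination, instead of requiring exact equality. The sketch below is speculative: centering the band on zero, the parameter `k`, and the toy frequencies are all assumptions, not something the slides state.

```python
from statistics import pstdev


def near_tie_code_points(freq_c, freq_j, k=1.0):
    """Code points whose C-minus-J frequency difference is within k std devs of zero."""
    shared = freq_c.keys() & freq_j.keys()
    diffs = {cp: freq_c[cp] - freq_j[cp] for cp in shared}
    sigma = pstdev(list(diffs.values()))
    return sorted(cp for cp, d in diffs.items() if abs(d) <= k * sigma)


# Toy frequencies (percent), invented for illustration.
toy_freq_c = {"a": 0.60, "b": 0.10, "c": 0.21, "d": 0.30}
toy_freq_j = {"a": 0.10, "b": 0.50, "c": 0.20, "d": 0.30}

needs_coordination = near_tie_code_points(toy_freq_c, toy_freq_j, k=0.5)
```

Widening `k` enlarges the set that requires CJK deliberation, which is the trade-off the slide flags with "Expand (?)".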
  56. Re-consider Language Tag
      [Diagram labels: C tag, J tag, K tag; Policy; TLD registries; IANA/Verisign; provisioning; distribution masters; root server operators; publication; root servers; Internet query; DNS resolvers]
      Language tag support
      Sources of Language Tag
      • RFC 2860: The name space of language tags is administered by IANA
      • ISO Standard 639:
        o When a language has both an IANA-registered tag and a tag derived from an ISO registered code, one MUST use the ISO tag.
        o Maintenance Agency: International Information Centre for Terminology (Austria)
  57. Perfection Syndrome
      “Engineering isn't about perfect solutions; it's about doing the best you can with limited resources.” Randy Pausch
  58. Community Work
      Neo-Brahmi Prospective Generation Panel
  59. Neo-Brahmi Generation Panel
  60. What is Brahmi?
      • An ancient script
      • Most of the modern scripts in the Indian subcontinent have been derived from Brahmi
      • Geographically, the scripts are used in Central Asia, South Asia and South-East Asia
      • These scripts are used by multiple language families, largely Indo-Aryan and Dravidian
  61. What Neo-Brahmi?
      • Of all the scripts derived from “Brahmi”, not all are in modern usage
      • Approach is in consonance with the conservatism principle of the LGR procedure
  62. Neo-Brahmi Generation Panel
      • Currently the group has 10 members
      • Mixed bag of expertise, such as linguistics and Unicode
      • Need more members to cover the diversity within the group
      • Will try to cover possibly all the major scripts/languages of the Brahmi family
      • The group is currently working on gaining more participation within and outside India
      • Interested individuals can send their expression of interest to neobrahmiGP@icann.org and idntlds@icann.org
  63. Progress so far…
      • Reviewed and commented on the Maximal Starting Repertoire for the Root
      • A workshop is planned at APrIGF on “Bringing diverse linguistic communities together for a unified IDN ruleset”, to reach out to the community for wider participation in the panel
      • Working on the Neo-Brahmi Generation Panel proposal
        – May submit to ICANN by end of August/early September
  64. Neo-Brahmi GP Internal Composition
      Integration Panel
        Neo-Brahmi GP
          Script Section: Devanagari SP, Tamil Sub-panel, Telugu SP, Gujarati SP, Gurmukhi SP, Bengali SP
          Language Section:
            Devanagari SP: Hindi, Marathi, Konkani, Nepali, Bodo, Dogri, Maithili, Santhali
            Bengali SP: Bangla, Assamese, Manipuri
  65. ICANN IDN Team: Thank You
      USEFUL LINKS:
      • The LGR Procedure:
        http://www.icann.org/en/resources/idn/variant-tlds/lgr-procedure-20mar13-en.pdf
      • MSR-1 Public Comment:
        https://www.icann.org/public-comments/msr-2014-03-03-en
      • MSR-1 released:
        https://www.icann.org/news/announcement-2-2014-06-20-en
      • V07 Internet Draft for LGR Rules Toolset Project Published:
        http://tools.ietf.org/html/draft-davies-idntables-07
      • Call for Generation Panels to Develop Root Zone Label Generation Rules:
        http://www.icann.org/en/news/announcements/announcement-11jul13-en.htm
      • Setting up and running a Generation Panel:
        https://community.icann.org/display/croscomlgrprocedure/Generation+Panels
      • Community Wiki LGR Project website:
        https://community.icann.org/display/croscomlgrprocedure/Root+Zone+LGR+Project
      • For more info on the IDN Variant related pages, please visit:
        https://www.icann.org/resources/pages/variant-tlds-2012-05-08-en
      • To submit expressions of interest, or if you have additional questions, please contact ICANN at: idntlds@icann.org
