Internationalised Domain
Names, Foreign Language
Websites, & Investigations

  Jonathan D. Abolins
  Thu, 28 July 2011
  11:00 AM - 12:00 PM PDT (GMT-07:00)

  Post-Webinar Version with additional notes.
Introduction
   About me

   Why this topic

   Some notes about this presentation’s approach.
Note About Translation Tools
   Machine translation tools help a lot.

   But they can also leave out much or mislead.

   Helps to know the languages involved or work
    with a competent translator.
       But the translators might not know about some
        recent Internet developments.
Quick Overview of Terms
   Labels – example: www.veresoftware.com
     (Label 1: www, Label 2: veresoftware, Label 3: com)

   TLD – Top Level Domain (e.g., .com or .uk)
   ccTLD – Country Code TLD (e.g., .uk, .ru)
   IDN – Internationalised Domain Name
   Unicode
   ACE – ASCII Compatible Encoding
   Punycode (RFC 3492), a form of ACE
OSINT in an Alphabet Soup of
the Networked World




But see http://www.cartoonistgroup.com/store/add.php?iid=8381
Sometimes, alphabet soup is soup, not a coded message.
A Couple of Examples of
Non-English Windows 7 Desktops
   First is Russian.
   Second is Arabic. Note the shift to the right.
   They were done by switching the languages on
    one of my Windows 7 Ultimate PCs.
   The GUI labels for My Documents, My Music,
    etc. are localised. But the underlying directory
    names, as seen via the dir command in a CMD
    window, did not change.
The Net No Longer “Speaks”
Primarily English
   Old days
       Had to use code pages (character encodings) for
        non-Latin text. Can be confusing.
       Difficult to mix languages.
   Now
       Unicode covers most of the world’s writing systems.
        90+ scripts.
       Still encounter code pages.
But Underlying Code is
Universal
   Bits & Bytes
   Programming languages
   HTML codes
   IP Addresses
   Etc.
   This can work to your advantage!
If a foreign site offers English, why
read the foreign language version?




 http://krebsonsecurity.com/2010/12/russian-police-only-translate-the-good-news/
What if you can’t read Russian?
File/Pathnames May Have Clues…




       http://www.mvd.ru/news/
File/Pathnames May Have Clues…




       http://www.mvd.ru/presscenter/
Note for the Previous Slides…
   Sometimes the foreign site might be using a site
    structure developed in the English speaking
    world. Particularly the case with some Web
    forums.
   Other times, the Web designers are trying to
    avoid problems with mixing texts for directory
    and file names.
   In any case, the file path info often can be a
    help.
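A minimal Python sketch of the file path tip above. It only splits a URL into its path segments so that English-looking directory names stand out. The helper name path_clues is made up for this note; the two URLs are the ones from the previous slides.

```python
from urllib.parse import urlsplit

def path_clues(url):
    """Return the non-empty path segments of a URL, in order."""
    path = urlsplit(url).path
    return [segment for segment in path.split("/") if segment]

# Even without reading Russian, "news" and "presscenter" hint at the content.
for url in ("http://www.mvd.ru/news/", "http://www.mvd.ru/presscenter/"):
    print(url, "->", path_clues(url))
```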
Tip: Google Chrome Has
Built-in Translation Function




http://habrahabr.ru/blogs/DIY/
Search Tip:
A Picture is Worth 1K Words
   An image search might help to zero in on the entries of
    interest.
   Especially useful if you want to save time wading
    through foreign language hits.
   Example search for the RASKAT (Раскат) data
    destruction device from Russia. Look for images
    that look “computerish”.
Google Translate Annoyance:
URL Conversion




    Tried to type in “http://www.xakep.ru”
    but Google “Russified” it.

    Uncheck the Phonetic Typing box before entering URLs
    for site translation.
Internationalised Domain
Names (IDN)
   Intro – The Phonebook Analogy

    Imagine a phonebook where people could have entries in their preferred
    scripts. Mr. Wong could have his in Chinese. Ms. Romanov could have
    hers in Russian. And so on. Many people will choose to have both Latin
    text and foreign text entries for the same phone number. Makes it easier
    for their family and friends to find them. But others fret about the
    different texts.

    Underneath it all, however, the phone system hardware, networks, and
    the phone numbers remain the same.

    Something like this is happening with the Internet.
The First Four IDN ccTLDs
In May 2010
 United Arab Emirates: ‫.اﻣﺎرات‬

 Saudi Arabia: ‫.اﻟﺳﻌودﯾﺔ‬

 Russian Federation: .рф

 Egypt: ‫.ﻣﺻر‬



More IDN ccTLDs have been launched.
Remember, IDNs can also exist under non-IDN TLDs.
  Example: ‫.גינדי‬com or bücher.com

http://blog.icann.org/2010/05/idn-cctlds-%E2%80%93-the-first-four/
Examples of IDNs & Punycode
   ‫.גינדי‬com
   스타벅스코리아.com
   газпром.рф
   ‫ﺳﺟل.ﻣﺻر‬
   汕头大学.中国
   xn--pssza05mm53a.xn--fiqs8s/
Gindi Realty (Israel)
‫.גינדי‬com




Punycode:   http://xn--6dbcrb7a.com/
Offline IDN Example
Starbucks Korea
 스타벅스코리아.com




Punycode: http://xn--oy2b35ckwhba574atvuzkc.com/
Shantou University (PRC)
汕头大学.中国/




 Same as
 http://stu.edu.cn


Punycode: http://xn--pssza05mm53a.xn--fiqs8s/
Sajela.MiSr (Egypt)
‫ﺳﺟل.ﻣﺻر‬




Punycode: http://xn--rgbn6c.xn--wgbh1c/
Fun with Arabic & Other
RTL (Right-to-Left) IDN URLs
   Reading direction can switch.
   Example URL.
    http://‫/ﺳﺟل.ﻣﺻر‬Files/GeneralPolicy.pdf
    1 ---->   <----------2   3 --------------------------------------------->


   The direction changes can cause problems in
    various tools and procedures.
   This is where Punycode really helps.
    http://xn--rgbn6c.xn--wgbh1c/Files/GeneralPolicy.pdf
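One possible way to get a direction-safe form of such a URL, sketched in Python. It converts only the host name to its xn-- form and leaves the path alone. The function name url_to_ace is invented for this note, user info in the URL is ignored, and the Arabic host is written with \u escapes (an assumed spelling of the name on this slide) so that the code itself stays left-to-right.

```python
from urllib.parse import urlsplit, urlunsplit

def url_to_ace(url):
    """Return the URL with its host name converted to ASCII (xn--) form."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Python's built-in "idna" codec converts each label of the host name.
    ace_host = host.encode("idna").decode("ascii")
    # Keep an explicit port if one was given; user info is ignored in this sketch.
    netloc = ace_host if parts.port is None else "%s:%d" % (ace_host, parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

# Assumed code points for the Egyptian registry host shown on the slide.
rtl_host = "\u0633\u062c\u0644.\u0645\u0635\u0631"
print(url_to_ace("http://" + rtl_host + "/Files/GeneralPolicy.pdf"))
```

The ASCII result reads strictly left to right, so it can be logged, pasted into reports, or passed to other tools without the display order shifting.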
Punycode
   DNS works with Punycode for IDN labels
    Example: ‫ﺳﺟل.ﻣﺻر‬
    Punycode: xn--rgbn6c.xn--wgbh1c
   .xn--wgbh1c is Punycode for the ‫ ﻣﺻر‬IDN ccTLD.
       Note the distinctive xn-- prefix.
   Much safer way to store & use IDNs.
   Various online and offline tools for conversion.
   Conversion works in both directions.
    Unicode IDN <-> Punycode.
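For those who prefer a script to a web page, here is a small Python sketch of the two-way conversion. It relies on Python's built-in "idna" codec, which follows the older IDNA 2003 rules; newer registries may apply IDNA 2008, so treat the output as a working aid rather than the final word. The helper names to_ace and to_unicode are made up for this note.

```python
def to_ace(name):
    """Unicode domain name -> ASCII Compatible Encoding (Punycode labels)."""
    return name.encode("idna").decode("ascii")

def to_unicode(ace_name):
    """ASCII (xn--) domain name -> Unicode form."""
    return ace_name.encode("ascii").decode("idna")

print(to_ace("bücher.com"))
print(to_unicode("xn--p1ai"))  # the Russian Federation IDN ccTLD
```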
An Online Converter




http://idnaconv.phlymail.de/
idn: An Offline IDN Converter
(Linux)
Challenges with IDNs
   Recognising what it is.
    (domain name, URL, e-mail address).
   Which end is the ccTLD?
   What language is it?
   What country of registry?
   Sad 'cause I can't find the ‫( ص‬Saad) key.
    (How do I enter the IDN?)
       Some characters have multiple codes.
   Many tools don't work correctly with IDNs.
   Homograph (Look-alike) Attacks
Recognising IDNs. Not just URLs.
How About IDN E-mail Addresses?
   What if you found a note with this:
    ваше_имя@письмо.рф ?

   Would you know it’s an e-mail address?
   Would your translator recognise it as an e-mail
    address?
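A rough Python sketch for handling a note like the one above. It splits the text at the last "@" and converts only the domain part to its xn-- form, which can then be looked up like any other domain. The function name email_to_ace is hypothetical, and the check is deliberately loose; it does not validate the address.

```python
def email_to_ace(address):
    """Split an address at its last '@' and convert only the domain to ASCII form."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an e-mail-like string: %r" % address)
    # The local part is left untouched; only the domain goes through the idna codec.
    return local, domain.encode("idna").decode("ascii")

local, domain = email_to_ace("ваше_имя@письмо.рф")
print("local part:", local)
print("domain (ACE):", domain)
```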
By the Way, What About Vocalisation of
URLs & e-Mail Addresses in Foreign
Languages?
   The way a URL or an email address – IDN or not – is
    said can differ across languages.
   How is the “at” symbol or the “dot” said?
   Example with Russian and “Ivan@pochta.ru”:
    “Ivan sobachka pochta tochka ru”
    or
    “Ivan sobachka pochta dot ru”
       Sobachka (собачка – “little dog”) is a popular Russian way of
        vocalising the “@” sign.
       Tochka (точка – “point”) or Dot (дот) used for the “.” mark.


How to say an e-mail address in Russian:
http://www.themoscowtimes.com/opinion/article/the-really-cool-people-say-dot/439857.html
What Does the IDN URL Mean?
How Do I Type the IDN?
   Copy & Paste
       Directly from page
       Google Translate
       Wikipedia
   Keyboard input
       Need the right keyboard or
        keytops.
       System setup for allowing
        the foreign language input.
   Character map tools
One Character, Multiple Codes




http://singapore41.icann.org/meetings/singapore2011/presentation-idn-variant-tlds-update-20jun11-en.pdf
Common Net Commands & IDN
   Windows cmd CLI a problem w/o modification
   Tools have to be able to handle Unicode.
   ping
   nslookup
   dig
   Whois (can be tricky at times)
   Punycode is more reliable.
Not All Our Tools Are Unicode
or IDN-Ready
Whois & IDN ccTLD Domains
   Whois on the domain name might not always
    work well with some IDN ccTLD domains.
   But there are options, including:
       Get and lookup IP address
       Use IANA db & Delegation Record
IANA Root Zone db
http://www.iana.org/domains/root/db/#
IANA Delegation Records




http://www.iana.org/domains/root/db/xn--p1ai.html
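The delegation record address above follows a visible pattern: the Punycode form of the TLD plus ".html". A small Python sketch, assuming that pattern holds for other IDN ccTLDs as well (only the .рф example on this slide confirms it); the function name iana_delegation_url is invented for this note.

```python
def iana_delegation_url(tld):
    """Build the IANA root zone database URL for a TLD given in Unicode or ASCII."""
    # Strip a leading dot, lower-case, then convert to the xn-- form if needed.
    ace = tld.lstrip(".").lower().encode("idna").decode("ascii")
    return "http://www.iana.org/domains/root/db/%s.html" % ace

print(iana_delegation_url(".рф"))
```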
Security Concern:
Homograph Attacks
Are These Sets The Same?


 АаВьСсЕеНКкМРрОоТуХхЗ

 AaBbCcEeHKkMPpOoTyXx3
Looking at the Underlying Code
   АаВьСсЕеНКкМРрОоТуХхЗ <-Cyrillic
    0410 0430 0412 044C 0421 0441 0415 0435 041D
    041A 043A 041C 0420 0440 041E 043E 0422 0443
    0425 0445 0417


   AaBbCcEeHKkMPpOoTyXx3 <-ASCII
    0041 0061 0042 0062 0043 0063 0045 0065 0048
    004B 006B 004D 0050 0070 004F 006F 0054 0079
    0058 0078 0033
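The same comparison can be reproduced with a few lines of Python. This is only a sketch: it prints the code point and the official Unicode name for each character, which makes look-alikes easy to tell apart. The helper name code_points is made up for this note, and the Cyrillic sample is written with \u escapes so that it cannot be confused with the Latin one.

```python
import unicodedata

def code_points(text):
    """Return (character, hex code point, Unicode name) for each character."""
    return [(ch, "%04X" % ord(ch), unicodedata.name(ch, "UNKNOWN")) for ch in text]

cyrillic = "\u0410\u0430\u0412\u044c"  # first four characters of the Cyrillic set
latin = "AaBb"                         # first four characters of the ASCII set

for (c1, cp1, name1), (c2, cp2, name2) in zip(code_points(cyrillic), code_points(latin)):
    print(cp1, name1, "|", cp2, name2)
```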
Homographs for Fraud
& Punycode for Detection
   http://www.facebook.com/
    Really is http://www.facebook.com/
   http://www.facebοok.com/
    http://www.xn--facebok-dpf.com/
   http://www.faceboοk.com/
    http://www.xn--facebok-epf.com/
   http://www.facebοοk.com/
    http://www.xn--facebk-m0ea.com/



    http://idnaconv.phlymail.de/
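A rough Python heuristic along the same lines. It decodes any xn-- labels in a host name and flags labels whose letters come from more than one script, judged by the first word of each letter's Unicode name. That script guess is a shortcut chosen for this note, not an official algorithm, and the function names are hypothetical. A label written entirely in one look-alike script would not be flagged.

```python
import unicodedata

def scripts_in(label):
    """Rough script guess: the first word of each letter's Unicode name."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0] for ch in label if ch.isalpha()}

def suspicious_labels(hostname):
    """Return (label as given, Unicode label, scripts) for labels that mix scripts."""
    hits = []
    for label in hostname.split("."):
        text = label
        if label.lower().startswith("xn--"):
            # Decode the Punycode label back to Unicode before inspecting it.
            text = label.encode("ascii").decode("idna")
        scripts = scripts_in(text)
        if len(scripts) > 1:
            hits.append((label, text, sorted(scripts)))
    return hits

# "facebook" with one Greek omicron (U+03BF) in place of a Latin "o".
spoof = "www.faceb\u03bfok.com".encode("idna").decode("ascii")
print(spoof, suspicious_labels(spoof))
print("www.facebook.com", suspicious_labels("www.facebook.com"))
```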
Homograph Attack Concerns
   Raised by various people, including 3ric
    Johanson at Shmoocon in 2005.
   He registered www.xn--pypal-4ve.com to spoof
    PayPal.

   Anti-Phishing Working Group Global Phishing
    Survey 1H2010: last true homograph attack
    was in 2009. A “hotmail.net” look-alike:
    xn--hotmal-t9a.net
    Global Phishing Survey 1H2010: http://tinyurl.com/2ch5o87
Not All Homographs Are Bad.
Clever Homograph: xakep.ru
Special Topic:
Character Encodings
Code Pages /Character
Encodings
   Examples:
       Arabic: Windows 1256, IBM 864
       Cyrillic: IBM 855, KOI8-R, Windows 1251
       Hebrew: IBM 862, Windows 1255
       See also http://en.wikipedia.org/wiki/Code_pages
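To see why code pages can be confusing, the Python sketch below decodes one and the same byte under each of the code pages listed above. The codec names are Python's spellings of those code pages, and the byte value 0xE0 is an arbitrary pick for illustration. Bytes a code page does not define show up as the replacement character.

```python
# Python codec names for the code pages listed on this slide.
CODE_PAGES = ["cp1256", "cp864", "cp855", "koi8_r", "cp1251", "cp862", "cp1255"]

def decode_byte_everywhere(value):
    """Decode a single byte value under every code page in CODE_PAGES."""
    raw = bytes([value])
    return {name: raw.decode(name, errors="replace") for name in CODE_PAGES}

for name, char in decode_byte_everywhere(0xE0).items():
    print(name, "U+%04X" % ord(char), repr(char))
```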
Character Encoding in Internet
documents
   If page doesn’t render properly:
       Check HTML source for clues like
        <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=KOI8-R">

       Server’s country location might be a clue.
       Try browser’s character encoding tools. (Firefox
        example)
       For Cyrillic, check out these tools:
           Universal Cyrillic Decoder page http://2cyr.com/decode/
           Russian Anywhere (re) package for many Linux distros.
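The first two checks can also be scripted. A minimal Python sketch, assuming the raw bytes of the page are already saved: one helper looks for a charset declaration like the META example above, the other simply tries a few candidate encodings so the results can be compared by eye. The function names and the candidate list are choices made for this note.

```python
import re

# Matches charset=KOI8-R and similar declarations in the raw page bytes.
META_CHARSET = re.compile(rb'''charset\s*=\s*["']?([A-Za-z0-9_.:-]+)''', re.IGNORECASE)

def declared_charset(raw_html):
    """Return the charset named in the page source, or None if none is found."""
    match = META_CHARSET.search(raw_html)
    return match.group(1).decode("ascii") if match else None

def try_decodings(raw, candidates=("utf-8", "koi8_r", "cp1251", "cp855")):
    """Decode the same bytes under each candidate; encodings that fail are skipped."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            pass
    return results

head = b'<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=KOI8-R">'
body = "Привет".encode("koi8_r")  # sample text standing in for real page content
print(declared_charset(head + body))
for name, text in try_decodings(body).items():
    print(name, "->", text)
```

Only one of the candidates yields readable Russian here; the others produce the kind of garbage seen when a browser guesses wrong.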
Example




http://www.lena.ru/songs.html
Firefox – Character Encoding
Set to Auto Detect
In recent versions of Firefox,
Firefox button
-> Web Developer
-> Character Encoding
-> Auto Detect

In some cases, trial & error is
needed.

This method also can work
for local files.




  http://www.lena.ru/songs.html
Resources
       ICANN
          IDN Info: http://www.icann.org/en/topics/idn/
          Blog: http://blog.icann.org/
          IDN Wiki: http://idn.icann.org/
          IDN TLD Map: http://www.icann.org/en/maps/idntld.htm
       IDN Blog
        http://idnblog.com/
       Verisign IDN FAQ
        http://www.verisigninc.com/en_US/products-and-services/domain-name-services/domain-information-center/idn-resources/idn-faq/index.xhtml
       This Domain Name is Greek to Me: An Introduction to Internationalized
        Domain Names for Investigators (DFI News)
        http://www.dfinews.com/article/domain-name-greek-me-introduction-internationalized-domain-names-investigators?page=0,1
       Internationalized Domain Names & Investigations in the Networked World
        (one of the DojoCon 2010 videos)
        http://www.irongeek.com/i.php?page=videos/dojocon-2010-videos
Resources (cont)
   XN—ICANN
    http://www.hackerfactor.com/blog/index.php?/archives/321-xn-ICANN.html
   IDNForums.Com
    Emphasis upon buying & selling IDN domains.
    http://www.idnforums.com/
   IANA ccTLDs Database
    http://www.iana.org/domains/root/db/#
   Strathclyde Forensics – IDN Homograph Attacks
    http://www.computerforensicsglasgow.info/IDN_Homograph_Attacks.htm
   New Arrival in Russian Spam – .РФ
    http://www.thesecurityblog.com/2011/02/new-arrival-in-russian-spam-%D1%80%D1%84/
   An IDN – Punycode Converter
    http://idnaconv.phlymail.de/
   How to say an e-mail address in Russian
    http://www.themoscowtimes.com/opinion/article/the-really-cool-people-say-dot/439857.html
Resources (cont)
Keyboard Setup
   How to Change Keyboard Language
    http://www.lib.uchicago.edu/e/using/catalog/inputoptions.html
    http://tlt.its.psu.edu/suggestions/international/keyboards/winkey.html
    http://www.al-bab.com/arab/comp.htm



Translation and Language Issues
   American Translators Association: Getting It Right (insights into translation
    issues)
    http://www.atanet.org/publications/getting_it_right.php
   Basis Technology – Excellent papers & presentations on language issues.
    http://www.basistech.com/resources/
    (The links on the left have more papers on topics such as Middle Eastern Languages, Digital Forensics,
    etc.)
Resources: Google Searches
for Some IDN ccTLDs
   Republic of Korea: 한국
    http://www.google.com/search?q=site%3A.한국
   Serbia: СРБ
    http://www.google.com/search?q=site%3A%D0%A1%D0%A0%D0%91
   People’s Republic of China: 中国
    http://www.google.com/search?q=site%3A.%E4%B8%AD%E5%9B%BD
    http://www.google.com/search?q=site%3A.%E4%B8%AD%E5%9C%8B
   Hong Kong SAR: 香港
    http://www.google.com/search?q=site%3A.%E9%A6%99%E6%B8%AF
   Taiwan: 台湾
    http://www.google.com/search?q=site%3A.%E5%8F%B0%E6%B9%BE
    http://www.google.com/search?q=site%3A.%E5%8F%B0%E7%81%A3
   Egypt: ‫ﻣﺻر‬
    http://www.google.com/search?q=site%3A.‫ﻣﺻر‬
   Jordan: ‫اﻻردن‬
    http://www.google.com/search?q=site%3A.%D8%A7%D9%84%D8%A7%D8%B1%D8%AF%D9%86
   Saudi Arabia: ‫اﻟﺳﻌودﯾﺔ‬
    http://www.google.com/search?q=site%3A.%D8%A7%D9%84%D8%B3%D8%B9%D9%88%D8%AF%D9%8A%D8%A9
   Russian Federation: РФ
    http://www.google.com/search?q=site%3A.%D0%A0%D0%A4
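The percent-encoded search links above can be generated rather than typed. A short Python sketch that builds the same kind of URL from an IDN ccTLD written in its own script; the function name site_search_url is made up for this note, and the URL pattern is simply the one used on this slide.

```python
from urllib.parse import quote

def site_search_url(tld):
    """Build a Google site: search URL for a TLD, percent-encoding the UTF-8 bytes."""
    query = "site:." + tld.lstrip(".")
    return "http://www.google.com/search?q=" + quote(query, safe="")

print(site_search_url("РФ"))
print(site_search_url("中国"))
```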
Thank you.

• Jon.Abolins@gmail.com
• Twitter: @jabolins

• Web: idn.MeydaOnline.com

Internationalised Domain Names & Internet Investigations

  • 1. Internationalised Domain Names, Foreign Language Websites, & Investigations Jonathan D. Abolins Thu, 28 July 2011 11:00 AM - 12:00 PM PDT (GMT-08:00) Post-Webinar Version with additional notes.
  • 2. Introduction  About me  Why this topic  Some notes about this presentation’s approach.
  • 3. Note About Translation Tools  Machine translation tools help a lot.  But they can also leave out much or mislead.  Helps to know the languages involved or work with a competent translator.  But the translators might not know about some recent Internet developments.
  • 4. Quick Overview of Terms  Labels – example: www.veresoftware.com Label 1 Label 2 Label 3  TLD – Top Level Domain (e.g., .com or .uk)  ccTLD – Country Code TLD (e.g., .uk, .ru)  IDN – Internationalised Domain Name  Unicode  ACE – ASCII Compatible Encoding  Punycode (RFC 3492), a form of ACE
  • 5. OSINT in an Alphabet Soup of the Networked World But see http://www.cartoonistgroup.com/store/add.php?iid=8381 Sometimes, alphabet soup is soup, not a coded message.
  • 6. A couple of Examples of non- English Windows 7 Desktops  First is Russian.  Second is Arabic. Note the shift to the right.  They were done by switching the languages on one of my Windows 7 Ultimate PCs.  The GUI labels for My Documents, My Music, etc. are localised. But the underlying directory names, as seen via dir command in a CMD window, did not change.
  • 7.
  • 8.
  • 9. The Net No Longer “Speaks” Primarily English  Old days  Had to use code pages (character encodings) for non-Latin text. Can be confusing.  Difficult to mix languages.  Now  Unicode covers most of the world’s writing systems. 90+ scripts.  Still encounter code pages.
  • 10. But Underlying Code is Universal  Bits & Bytes  Programming languages  HTML codes  IP Addresses  Etc.  This can work to your advantage!
  • 11. If a foreign site offers English, why read the foreign language version? http://krebsonsecurity.com/2010/12/russian-police-only-translate-the-good-news/
  • 12. What if you can’t read Russian?
  • 13. File/Pathnames May Have Clues… http://www.mvd.ru/news/
  • 14. File/Pathnames May Have Clues… http://www.mvd.ru/presscenter/
  • 15. Note for the Previous Slides…  Sometimes the foreign site might be using a site structure developed in the English-speaking world. This is particularly the case with some Web forums.  Other times, the Web designers are trying to avoid problems with mixing scripts in directory and file names.  In any case, the file path info often can be a help.
  • 16. Tip: Google Chrome Has Built-in Translation Function http://habrahabr.ru/blogs/DIY/
  • 17. Search Tip: A Picture is Worth 1K Words  An image search might help to zero in on the entries of interest.  Especially useful if you want to save time wading through foreign language hits.  Example search for the RASKAT (Раскат) data destruction device from Russia. Look for images that look “computerish”.
  • 18. Google Translate Annoyance: URL Conversion Tried to type in “http://www.xakep.ru” but Google “Russified” it. Uncheck the Phonetic Typing box before entering URLs for site translation.
  • 19.
  • 20. Internationalised Domain Names (IDN)  Intro – The Phonebook Analogy Imagine a phonebook where people could have entries in their preferred scripts. Mr. Wong could have his in Chinese. Ms. Romanov could have hers in Russian. And so on. Many people will choose to have both Latin text and foreign text entries for the same phone number. Makes it easier for their family and friends to find them. But others fret about the different texts. Underneath it all, however, the phone system hardware, networks, and the phone numbers remain the same. Something like this is happening with the Internet.
  • 21. The First Four IDN ccTLDs In May 2010  United Arab Emirates: ‫.اﻣﺎرات‬  Saudi Arabia: ‫.اﻟﺳﻌودﯾﺔ‬  Russian Federation: .рф  Egypt: ‫.ﻣﺻر‬ More IDN ccTLDs have been launched. Remember, IDNs can also exist under non-IDN ccTLDs. Example: ‫.גינדי‬com or bücher.com http://blog.icann.org/2010/05/idn-cctlds-%E2%80%93-the-first-four/
  • 22. Examples of IDNs & Punycode  ‫.גינדי‬com  스타벅스코리아.com  газпром.рф  ‫ﺳﺟل.ﻣﺻر‬  汕头大学.中国  xn--pssza05mm53a.xn--fiqs8s/
  • 25. Starbucks Korea 스타벅스코리아.com Punycode: http://xn--oy2b35ckwhba574atvuzkc.com/
  • 26. Shantou University (PRC) 汕头大学.中国/ Same as http://stu.edu.cn Punycode: http://xn--pssza05mm53a.xn--fiqs8s/
  • 28. Fun with Arabic & Other RTL (right to Left) IDN URLs  Reading direction can switch.  Example URL. http://‫/ﺳﺟل.ﻣﺻر‬Files/GeneralPolicy.pdf 1 ----> <----------2 3 --------------------------------------------->  The direction changes can cause problems in various tools and procedures.  This is where Punycode really helps. http://xn--rgbn6c.xn--wgbh1c/Files/GeneralPolicy.pdf
  • 29. Punycode  DNS works with Punycode for IDN labels Example: ‫ﺳﺟل.ﻣﺻر‬ Punycode: xn--rgbn6c.xn--wgbh1c  .xn--wgbh1c is Punycode for the ‫ ﻣﺻر‬IDN ccTLD.  Note the distinctive xn-- prefix.  Much safer way to store & use IDNs.  Various online and offline tools for conversion.  Conversion works in both directions. Unicode IDN <-> Punycode.
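A minimal Python sketch of the two-way conversion described on this slide. It relies on Python's built-in `idna` codec, which follows the older IDNA 2003 rules, so results for some names may differ from tools built on IDNA 2008. The helper names `to_ace` and `to_unicode` are illustrative only, not part of the presentation.

```python
def to_ace(domain):
    """Unicode IDN -> ASCII Compatible Encoding (xn-- labels)."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain):
    """ACE / Punycode form -> Unicode IDN."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    # Unicode -> Punycode (bücher.com is the example from slide 21)
    print(to_ace("bücher.com"))
    # Punycode -> Unicode for two IDN ccTLD labels
    for label in ("xn--wgbh1c", "xn--fiqs8s"):
        print(label, "<->", to_unicode(label))
```

Storing the `xn--` form next to the Unicode form in case notes keeps a record that survives tools which mangle non-ASCII text.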
  • 31. idn: An Offline IDN Converter (Linux)
  • 32. Challenges with IDNs  Recognising what it is. (domain name, URL, e-mail address).  Which end is the ccTLD?  What language is it?  What country of registry?  Sad 'cause I can't find the ‫( ص‬Saad) key. (How do I enter the IDN?)  Some characters have multiple codes.  Many tools don't work correctly with IDNs.  Homograph (Look-alike) Attacks
  • 33. Recognising IDNs. Not just URLs. How About IDN E-mail Addresses?  What if you found a note with this: ваше_имя@письмо.рф ?  Would you know it’s an e-mail address?  Would your translator recognise it as an e-mail address?
  • 34. By the Way, What About Vocalisation of URLs & e-Mail Addresses in Foreign Languages?  The way a URL or an email address – IDN or not – is said can differ across languages.  How is the “at” symbol or the “dot” said?  Example with Russian and “Ivan@pochta.ru”: “Ivan sobachka pochta tochka ru” or “Ivan sobachka pochta dot ru”  Sobachka (собачка – “little dog”) is a popular Russian way of vocalising the “@” sign.  Tochka (точка – “point”) or Dot (дот) used for the “.” mark. How to say an e-mail address in Russian: http://www.themoscowtimes.com/opinion/article/the-really-cool-people-say-dot/439857.html
  • 35. What Does the IDN URL Mean?
  • 36. How Do I Type the IDN?  Copy & Paste  Directly from page  Google Translate  Wikipedia  Keyboard input  Need the right keyboard or keytops.  System setup for allowing the foreign language input.  Character map tools
  • 37. One Character, Multiple Codes http://singapore41.icann.org/meetings/singapore2011/presentation-idn-variant-tlds-update-20jun11-en.pdf
  • 38. Common Net Commands & IDN  Windows cmd CLI a problem w/o modification  Tools have to be able to handle Unicode.  ping  nslookup  dig  Whois (can be tricky at times)  Punycode is more reliable.
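One way to act on "Punycode is more reliable" is to convert the hostname to its ASCII form before handing it to any command-line tool. The sketch below only builds the argument lists and does not run anything. The function names are hypothetical, and the `-c 1` and `+short` options assume Unix-style `ping` and `dig`; Windows `ping` uses `-n` instead.

```python
def ace_host(host):
    """Return the ASCII (xn--) form of a hostname via the built-in idna codec."""
    return host.encode("idna").decode("ascii")


def build_lookup_commands(host):
    """Build argument lists for common lookup tools using the ACE hostname."""
    ace = ace_host(host)
    return [
        ["ping", "-c", "1", ace],   # assumption: Unix-style ping
        ["nslookup", ace],
        ["dig", "+short", ace],
        ["whois", ace],
    ]


if __name__ == "__main__":
    for command in build_lookup_commands("bücher.com"):
        print(" ".join(command))
```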
  • 39. Not All Our Tools Are Unicode or IDN-Ready
  • 40. Whois & IDN ccTLD Domains  Whois on the domain name might not always work well with some IDN ccTLD domains.  But there are options, including:  Get and lookup IP address  Use IANA db & Delegation Record
  • 41. IANA Root Zone db http://www.iana.org/domains/root/db/#
  • 44. Are These Sets The Same? АаВьСсЕеНКкМРрОоТуХхЗ AaBbCcEeHKkMPpOoTyXx3
  • 45. Looking at the Underlying Code  АаВьСсЕеНКкМРрОоТуХхЗ <-Cyrillic 0410 0430 0412 044C 0421 0441 0415 0435 041D 041A 043A 041C 0420 0440 041E 043E 0422 0443 0425 0445 0417  AaBbCcEeHKkMPpOoTyXx3 <-ASCII 0041 0061 0042 0062 0043 0063 0045 0065 0048 004B 006B 004D 0050 0070 004F 006F 0054 0079 0058 0078 0033
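The code point listing on this slide can be reproduced with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the sample strings hold just the first three characters of each set, written as escapes so the difference is not hidden by look-alike glyphs.

```python
import unicodedata


def code_points(text):
    """Return the hexadecimal Unicode code point of each character."""
    return [f"{ord(ch):04X}" for ch in text]


def describe(text):
    """Print code point, glyph, and official Unicode name for each character."""
    for ch in text:
        print(f"U+{ord(ch):04X}  {ch}  {unicodedata.name(ch, 'UNKNOWN')}")


if __name__ == "__main__":
    cyrillic = "\u0410\u0430\u0412"   # Cyrillic A, a, Ve (looks like Latin B)
    latin = "AaB"
    print(code_points(cyrillic))
    print(code_points(latin))
    describe(cyrillic)
```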
  • 46. Homographs for Fraud & Punycode for Detection  http://www.facebook.com/ Really is http://www.facebook.com/  http://www.facebοok.com/ http://www.xn--facebok-dpf.com/  http://www.faceboοk.com/ http://www.xn--facebok-epf.com/  http://www.facebοοk.com/ http://www.xn--facebk-m0ea.com/ http://idnaconv.phlymail.de/
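A rough Python sketch of the detection idea: convert the hostname to Punycode and flag any label that mixes scripts. The script is guessed from the first word of each character's Unicode name, which is a simplification, and the function names are hypothetical. A label written entirely in one non-Latin script is not flagged by this check, so it is a hint, not a verdict.

```python
import unicodedata


def scripts_in_label(label):
    """Collect a rough script tag (first word of the Unicode name) per letter."""
    scripts = set()
    for ch in label:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_mixed_script(hostname):
    """True if any single label mixes letters from more than one script."""
    return any(len(scripts_in_label(label)) > 1 for label in hostname.split("."))


def to_ace(hostname):
    """ASCII (xn--) form of the hostname, using the built-in idna codec."""
    return hostname.encode("idna").decode("ascii")


if __name__ == "__main__":
    # The second name has a Greek omicron (U+03BF) in place of the first "o".
    for host in ("www.facebook.com", "www.faceb\u03bfok.com"):
        print(to_ace(host), "mixed scripts:", looks_mixed_script(host))
```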
  • 47. Homograph Attack Concerns  Raised by various people, including 3ric Johanson at Shmoocon in 2005.  He registered www.xn--pypal-4ve.com to spoof Paypal.  Anti-Phishing Working Group Global Phishing Survey 1H2010: last true homograph attack was in 2009. A “hotmail.net” look-alike: xn--hotmal-t9a.net Global Phishing Survey 1H2010: http://tinyurl.com/2ch5o87
  • 48. Not All Homographs Are Bad. Clever Homograph: xakep.ru
  • 50. Code Pages /Character Encodings  Examples:  Arabic: Windows 1256, IBM 864  Cyrillic: IBM 855, KOI8-R, Windows 1251  Hebrew: IBM 862, Windows 1255  See also http://en.wikipedia.org/wiki/Code_pages
  • 51. Character Encoding in Internet documents  If page doesn’t render properly:  Check HTML source for clues like <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=KOI8-R">  Server’s country location might be a clue.  Try browser’s character encoding tools. (Firefox example)  For Cyrillic, check out these tools:  Universal Cyrillic Decoder page http://2cyr.com/decode/  Russian Anywhere (re) package for many Linux distros.
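The same two steps (look for a declared charset, then fall back to trial and error) can be sketched in Python for saved pages. The candidate list and function names below are assumptions chosen for a Cyrillic page; other languages would need the code pages from slide 50.

```python
import re

# Assumption: candidate encodings for a page suspected to be Cyrillic.
CANDIDATES = ("utf-8", "koi8_r", "cp1251", "cp855")


def declared_charset(raw):
    """Return the charset named in the raw HTML bytes, or None if absent."""
    match = re.search(rb"charset\s*=\s*[\x22\x27]?([\w-]+)", raw, re.IGNORECASE)
    return match.group(1).decode("ascii") if match else None


def try_decodings(raw, candidates=CANDIDATES):
    """Decode the bytes with each candidate; None marks a failed decode."""
    results = {}
    for encoding in candidates:
        try:
            results[encoding] = raw.decode(encoding)
        except UnicodeDecodeError:
            results[encoding] = None
    return results


if __name__ == "__main__":
    page = (b'<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=KOI8-R">'
            + "Привет".encode("koi8_r"))
    print("declared:", declared_charset(page))
    for encoding, text in try_decodings(page).items():
        print(encoding, "->", text)
```

A decode that succeeds is not necessarily the right one: the wrong code page often yields readable-looking but meaningless letters, so the output still needs a human eye.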
  • 53. Firefox – Character Encoding Set to Auto Detect In recent versions of Firefox, Firefox button -> Web Developer -> Character Encoding -> Auto Detect In some cases, trial & error is needed. This method also can work for local files. http://www.lena.ru/songs.html
  • 54. Resources  ICANN  IDN Info: http://www.icann.org/en/topics/idn/  Blog: http://blog.icann.org/  IDN Wiki: http://idn.icann.org/  IDN TLD Map: http://www.icann.org/en/maps/idntld.htm  IDN Blog http://idnblog.com/  Verisign IDN FAQ http://www.verisigninc.com/en_US/products-and-services/domain-name-services/domain-information-center/idn-resources/idn-faq/index.xhtml  This Domain Name is Greek to Me: An Introduction to Internationalized Domain Names for Investigators (DFI News) http://www.dfinews.com/article/domain-name-greek-me-introduction-internationalized-domain-names-investigators?page=0,1  Internationalized Domain Names & Investigations in the Networked World (one of the DojoCon 2010 videos) http://www.irongeek.com/i.php?page=videos/dojocon-2010-videos
  • 55. Resources (cont)  XN--ICANN http://www.hackerfactor.com/blog/index.php?/archives/321-xn-ICANN.html  IDNForums.Com Emphasis upon buying & selling IDN domains. http://www.idnforums.com/  IANA ccTLDs Database http://www.iana.org/domains/root/db/#  Strathclyde Forensics – IDN Homograph Attacks http://www.computerforensicsglasgow.info/IDN_Homograph_Attacks.htm  New Arrival in Russian Spam – .РФ http://www.thesecurityblog.com/2011/02/new-arrival-in-russian-spam-%D1%80%D1%84/  An IDN – Punycode Converter http://idnaconv.phlymail.de/  How to say an e-mail address in Russian http://www.themoscowtimes.com/opinion/article/the-really-cool-people-say-dot/439857.html
  • 56. Resources (cont) Keyboard Setup  How to Change Keyboard Language http://www.lib.uchicago.edu/e/using/catalog/inputoptions.html http://tlt.its.psu.edu/suggestions/international/keyboards/winkey.html http://www.al-bab.com/arab/comp.htm Translation and Language Issues  American Translators Association: Getting It Right (insights into translation issues) http://www.atanet.org/publications/getting_it_right.php  Basis Technology – Excellent papers & presentations on language issues. http://www.basistech.com/resources/ (The links on the left have more papers on topics such as Middle Eastern Languages, Digital Forensics, etc.)
  • 57. Resources: Google Searches for Some IDN ccTLDs  Republic of Korea: 한국 http://www.google.com/search?q=site%3A.한국  Serbia: СРБ http://www.google.com/search?q=site%3A%D0%A1%D0%A0%D0%91  People's Republic of China: 中国 http://www.google.com/search?q=site%3A.%E4%B8%AD%E5%9B%BD http://www.google.com/search?q=site%3A.%E4%B8%AD%E5%9C%8B  Hong Kong SAR: 香港 http://www.google.com/search?q=site%3A.%E9%A6%99%E6%B8%AF  Taiwan: 台湾 http://www.google.com/search?q=site%3A.%E5%8F%B0%E6%B9%BE http://www.google.com/search?q=site%3A.%E5%8F%B0%E7%81%A3  Egypt: ‫ﻣﺻر‬ http://www.google.com/search?q=site%3A.‫ﻣﺻر‬  Jordan: ‫اﻻردن‬ http://www.google.com/search?q=site%3A.%D8%A7%D9%84%D8%A7%D8%B1%D8%AF%D9%86  Saudi Arabia: ‫اﻟﺳﻌودﯾﺔ‬ http://www.google.com/search?q=site%3A.%D8%A7%D9%84%D8%B3%D8%B9%D9%88%D8%AF%D9%8A%D8%A9  Russian Federation: РФ http://www.google.com/search?q=site%3A.%D0%A0%D0%A4
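The percent-encoded search URLs on this slide can be generated rather than typed by hand. A small sketch using Python's standard `urllib.parse.quote`, which encodes the text as UTF-8 bytes; the function name `site_search_url` is hypothetical.

```python
from urllib.parse import quote


def site_search_url(tld):
    """Build a Google site: search URL for an IDN ccTLD given in Unicode."""
    query = f"site:.{tld}"
    # safe="" so that ":" is escaped as %3A, matching the URLs on the slide
    return "http://www.google.com/search?q=" + quote(query, safe="")


if __name__ == "__main__":
    for tld in ("РФ", "中国", "香港"):
        print(site_search_url(tld))
```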
  • 58. Thank you. • Jon.Abolins@gmail.com • Twitter: @jabolins • Web: idn.MeydaOnline.com