Habash: Arabic Natural Language Processing

3,023 views

Published on

Published in: Technology, Education

Habash: Arabic Natural Language Processing

  1. 1. 1 Arabic Natural Language Processing Nizar Habash Columbia University Center for Computational Learning Systems With modifications
  2. 2. 2 Road Map • Introduction • Orthography – Arabic Script – MSA Phonology and Spelling – Encoding Issues • Morphology • Syntax
  3. 3. 3 Introduction • What is ‘Arabic’? – A Semitic language – Arabic Script • With or without diacritics • More ambiguity! – Arabic Language • Modern Standard Arabic (MSA) • Arabic Dialects
  4. 4. 4 Road Map • Introduction • Orthography – Arabic Script – MSA Phonology and Spelling – Encoding Issues • Morphology • Syntax
  5. 5. 5 Arabic Script Arabic script is an alphabet with allographic variants, optional zero-width diacritics and common ligatures. Arabic script is used to write many languages: Arabic, Persian, Kurdish, Urdu, Pashto, etc. PDF Files problem ‫طا‬ُ ‫ا‬ ‫خ‬َ‫ط‬ ‫ال‬ ‫ب

×