Named Entity Recognition (NER)
• Al-Shehri, Aisha
• Almutairi, Shaikhah
• Alswelim, Haya
KINGDOM OF SAUDI ARABIA
Ministry of Higher Education
Al-Imam Muhammad Ibn Saud Islamic University
College of Computer and Information Sciences
Abstract
Named Entity Recognition is an important part of many natural language processing tasks. There are different types of named entities, such as people, locations, and organizations.
Introduction
• Named Entity Recognition is the identification and classification of named entities within open-domain text.
• The task of named entity recognition was defined as three subtasks: ENAMEX (names of persons, locations, and organizations), TIMEX (temporal expressions), and NUMEX (numeric expressions).
• We present an attempt at recognizing and extracting the most important proper name entity, the person name, for the Arabic language (PERA).
Components of an Arabic Full Name
An Arabic full name can be divided into five main categories, Ibn Auda (2003):
1. An ism (pronounced IZM).
2. A kunya (pronounced COON-yah).
3. A nasab (pronounced NAH-sahb).
4. A laqab (pronounced LAH-kahb).
5. A nisba (pronounced NISS-bah).
Methodology
1- Parallel corpora
a- Reliability
b- Representativeness
2- Previously developed tools for other languages
a- Person names
b- Location names (Geographical locations and Toponyms)
c- Organizations (Political or Administrative Entities)
d- Position (job titles)
e- Acronyms
Challenges
• 1- Arabic has no capital letters or other specific orthographic signal to mark names, unlike many other languages.
• 2- The same Arabic word can have different meanings.
• 3- Ambiguity
System Architecture and Implementation
1) Gazetteers:
Gazetteers contain lists of known named entities.
White list:
The white list plays the role of fixed static dictionaries of various NEs.
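A white-list lookup can be sketched as a longest-match search over token sequences. This is a minimal illustration, not the system's actual implementation: the names `WHITE_LIST`, `build_index`, and `gazetteer_match` are hypothetical, and the entries are toy English stand-ins for the Arabic lists.

```python
# Hypothetical white list: entity type -> known names (toy English entries).
WHITE_LIST = {
    "PERSON": {"Khaled Shaalan", "Mona Diab"},
    "LOCATION": {"Saudi Arabia", "Asia"},
    "ORGANIZATION": {"Ministry of Higher Education"},
}


def build_index(white_list):
    """Map each name, as a tuple of tokens, to its entity type."""
    index = {}
    for entity_type, names in white_list.items():
        for name in names:
            index[tuple(name.split())] = entity_type
    return index


def gazetteer_match(tokens, index):
    """Return (start, end, type) spans, trying the longest entry first."""
    max_len = max((len(key) for key in index), default=0)
    spans = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate window first, then shrink it.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            entity_type = index.get(tuple(tokens[i:i + n]))
            if entity_type is not None:
                spans.append((i, i + n, entity_type))
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return spans


tokens = "Mona Diab visited Saudi Arabia".split()
spans = gazetteer_match(tokens, build_index(WHITE_LIST))
print(spans)
```

Trying the longest window first keeps a multi-word entry such as "Saudi Arabia" from being split into shorter matches.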
System Architecture and Implementation
2) Grammar:
The grammar performs recognition and extraction of Arabic named entities from the input text based on derived rules.
The following are examples of indicators used within rules:
• Job title: (the doctor), (the sciences professor).
• Person title: (Mr.), (Mrs.).
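One way to picture an indicator-based rule is: when a job title or person title appears, treat the tokens that follow as a candidate person name. The sketch below assumes this rule shape. The indicator table uses the English glosses from the slide, and the stop-word set, the length cap, and the function name `extract_person_names` are illustrative assumptions rather than the system's actual grammar.

```python
# Indicators given as English glosses; the Arabic forms are not reproduced.
INDICATORS = {
    ("Mr.",): "PERSON_TITLE",
    ("Mrs.",): "PERSON_TITLE",
    ("the", "doctor"): "JOB_TITLE",
    ("the", "sciences", "professor"): "JOB_TITLE",
}
STOP_WORDS = {"said", "visited", "in", "and", "the", "of"}  # assumed
MAX_NAME_TOKENS = 3  # assumed cap on the length of a name


def extract_person_names(tokens):
    """Return (indicator_kind, name) pairs for names that follow an indicator."""
    results = []
    i = 0
    while i < len(tokens):
        hit = None
        # Prefer the longest indicator that matches at position i.
        for indicator in sorted(INDICATORS, key=len, reverse=True):
            if tuple(tokens[i:i + len(indicator)]) == indicator:
                hit = indicator
                break
        if hit is None:
            i += 1
            continue
        start = i + len(hit)
        end = start
        # Collect following tokens until a stop word or the length cap.
        while (end < len(tokens)
               and end - start < MAX_NAME_TOKENS
               and tokens[end] not in STOP_WORDS):
            end += 1
        if end > start:
            results.append((INDICATORS[hit], " ".join(tokens[start:end])))
        i = end
    return results


tokens = "the doctor Khaled Shaalan said that Mrs. Mona Diab visited Asia".split()
names = extract_person_names(tokens)
print(names)
```

Because Arabic has no capitalization to mark where a name ends, this sketch relies on a stop-word list and a length cap to close the name. A real grammar would likely use richer cues.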
System Architecture and Implementation
3) Filter:
Filter rules help in dealing with recognition ambiguity between named entities.
A filtration mechanism is used that serves two different purposes:
• revision of the NE extractor results, and
• disambiguation of matches returned by different NE extractors.
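The two purposes of the filter can be sketched as two passes over the candidate matches. This is an assumption-laden illustration: the reject list, the type priority order, and the "longer span wins" policy are hypothetical choices, not rules taken from the system.

```python
from typing import NamedTuple


class Match(NamedTuple):
    start: int        # index of the first token
    end: int          # index after the last token (exclusive)
    entity_type: str
    text: str


REJECT_LIST = {"the doctor"}  # hypothetical: strings that are never entities
TYPE_PRIORITY = {"PERSON": 0, "ORGANIZATION": 1, "LOCATION": 2}  # assumed order


def revise(matches):
    """Pass 1: drop extractor results that are known to be invalid."""
    return [m for m in matches if m.text not in REJECT_LIST]


def disambiguate(matches):
    """Pass 2: among overlapping matches keep the longer one, then by priority."""
    ranked = sorted(
        matches,
        key=lambda m: (-(m.end - m.start),
                       TYPE_PRIORITY.get(m.entity_type, 99),
                       m.start),
    )
    kept = []
    for m in ranked:
        # Keep m only if it does not overlap any match that was already kept.
        if all(m.end <= k.start or m.start >= k.end for k in kept):
            kept.append(m)
    return sorted(kept, key=lambda m: m.start)


def apply_filter(matches):
    return disambiguate(revise(matches))


candidates = [
    Match(0, 2, "PERSON", "the doctor"),
    Match(2, 4, "PERSON", "Khaled Shaalan"),
    Match(5, 6, "PERSON", "Saudi"),
    Match(5, 7, "LOCATION", "Saudi Arabia"),
]
filtered = apply_filter(candidates)
print(filtered)
```

Here "the doctor" is removed in the revision pass, and the overlapping "Saudi" (person) candidate loses to the longer "Saudi Arabia" (location) candidate in the disambiguation pass.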
Example: typographic variation
Typographic variation | Entity type | English translation | Arabic example
Two dots removed from taa marbouta | Location | Saudi Arabia |
Drop of the letter madda from the aleph | Location | Asia |
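The two variations in the table can be handled by normalizing characters before any list lookup, so that both spellings map to the same form. The sketch below is one possible approach, not necessarily the one used in the system. The Arabic strings are illustrative spellings chosen for this example, not the original examples from the slide, and `normalize` is a hypothetical helper.

```python
# Collapse variant-prone characters to one canonical form.
NORMALIZATION_MAP = str.maketrans({
    "\u0629": "\u0647",  # taa marbouta -> haa (taa marbouta without its two dots)
    "\u0622": "\u0627",  # alef with madda -> bare alef
})


def normalize(text):
    """Return text with the two typographic variations collapsed."""
    return text.translate(NORMALIZATION_MAP)


# Illustrative spellings (assumed): with and without the two dots on the
# final taa marbouta, and with and without the madda on the initial alef.
saudi_standard = "السعودية"
saudi_variant = "السعوديه"
asia_standard = "آسيا"
asia_variant = "اسيا"

print(normalize(saudi_standard) == normalize(saudi_variant))
print(normalize(asia_standard) == normalize(asia_variant))
```

Applying the same normalization to both the gazetteer entries and the input text lets a single list entry match either spelling.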
Conclusion
• 1- In the majority of cases, we tried to follow more general criteria, applicable to English-Arabic or French-Arabic transliteration.
• 2- This work is part of a new system for Arabic NER, which has several ongoing activities.
References
• Sherief Abdallah, Khaled Shaalan, and Muhammad Shoaib, Integrating Rule-Based System with Classification for Arabic Named Entity Recognition, 2012
• Yassine Benajiba, Mona Diab, and Paolo Rosso, Using Language Independent and Language Specific Features to Enhance Arabic Named Entity Recognition, 2009
• Yassine Benajiba, Mona Diab, and Paolo Rosso, Arabic Named Entity Recognition: An SVM-Based Approach, 2009
• Doaa Samy, Antonio Moreno, and José Mª Guirao, A Proposal for an Arabic Named Entity Tagger Leveraging a Parallel Corpus, 2005
• Khaled Shaalan and Hafsa Raza, Person Name Entity Recognition for Arabic, 2009