alKhawarizmy Language Software 2009

Published in: Technology

  1. 2. 15 Years of Work before Deciding to Establish "alKhawarizmy" <ul><li>"alKhawarizmy Language Software" (established in January 2006) </li></ul><ul><li>Although the company is recent, the roots of its concept go back 15 years. </li></ul><ul><li>The founder of the company, Dr. Hossam ElDin Mahgoub, together with a team of researchers, developers and linguists, was engaged in NLP research applied to the Arabic language. </li></ul><ul><li>Dr. Hossam established the company in order to invest his experience and research in NLP for the benefit of the Arabic language and the Arab user. </li></ul>The greatest challenge was to make the computer "understand" the Arabic language and process it as simply as possible, in spite of its unique and special features.
  2. 3. <ul><li>KSearch, from alKhawarizmy, is an Arabic search engine for websites, companies and organizations. It can search through thousands of Arabic web pages or documents, benefiting your business through the following features: </li></ul><ul><ul><li>Speed: KSearch indexes web pages and documents at a rate of about 20,000 words/sec. </li></ul></ul><ul><ul><li>Automatic Indexing: KSearch's indexing engine automatically indexes web pages and documents at an interval that you select. </li></ul></ul><ul><ul><li>Accuracy: KSearch's primary aim is to make information easy to retrieve for your website's visitors or your company's employees by providing fast, comprehensive and accurate information retrieval. </li></ul></ul><ul><ul><li>Productivity: Search accuracy, fast retrieval of results and automatic indexing are all features that will make your Arabic content more effective. The information you retrieve will be more reliable, since it will be easier to reach than before. </li></ul></ul>Discover how KSearch can benefit your business!
  3. 4. <ul><li>Arabic NLP Research </li></ul><ul><li>Arabic Applications based on NLP components </li></ul><ul><li>Stress on software quality (targeting "zero defect" software). </li></ul><ul><li>Cooperate with the community, e.g. research students at universities (forming partnerships). </li></ul><ul><li>Promote widespread use of affordable applications that take the special features of the Arabic language into account. </li></ul><ul><li>Effectively serve the Arab region by catering for its users' needs => impact the way an Arabic user searches. </li></ul>
  4. 5. <ul><li>The number of Arab Internet Users is growing </li></ul><ul><ul><li>22 million users in 2006 </li></ul></ul><ul><ul><li>43 million expected in 2008 </li></ul></ul><ul><li>The volume of Arabic e-content is increasing (on the web and in companies’ intranets): </li></ul><ul><li>Around 100 million Arabic web pages </li></ul><ul><li>About 5 million Arabic web sites </li></ul>
  5. 6. <ul><li>Arabic is a highly inflected language </li></ul><ul><li>Arabic morphology has a set of unique features </li></ul><ul><li>Proper processing of Arabic e-content is lacking </li></ul><ul><li>Consequently, Arab users are unable to take full advantage of Arabic e-content, compared with users of other languages </li></ul><ul><li>As an example, consider searching through Arabic content … </li></ul>
  6. 7. Using : - Search for “ الحائزون على جوائز نوبل ” ("the winners of Nobel Prizes") produces about 238 results
  7. 8. Using : - Search for “ الحائزون على جائزة نوبل ” ("the winners of the Nobel Prize") produces about 684 results
  8. 9. Using : - Search for “ حاز على جائزة نوبل ” ("won the Nobel Prize") produces about 16,700 results
  9. 10. <ul><li>When used for Arabic search, traditional search engines produce </li></ul><ul><ul><li>Incomplete results, i.e. not all inflected forms are found => a lot of useful information is missing </li></ul></ul><ul><ul><li>Redundant results, i.e. some results are inaccurate => they bear no relation in form or in meaning to the search word(s) </li></ul></ul>
  10. 12. An Arabic Search Model that: (A) Provides Morphological Search => Comprehensive; (B) Differentiates between Meanings of Arabic Words => Improves Accuracy. In other words… Let us see the same example, using KSearch …
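The difference between exact-form matching and morphological search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LEMMA_OF` table is a tiny hand-written stand-in for a real morphological analyzer, and the function names are hypothetical. The slides do not describe how KSearch or KMorph work internally.

```python
# Toy lexicon mapping surface forms to a lemma. It is a hypothetical,
# hand-written stand-in for a real Arabic morphological analyzer.
LEMMA_OF = {
    "الحائزون": "حائز",
    "الحائز": "حائز",
    "حائز": "حائز",
    "جوائز": "جائزة",
    "الجائزة": "جائزة",
    "جائزة": "جائزة",
}


def lemma(word):
    """Return the lemma of a word, or the word itself if it is unknown."""
    return LEMMA_OF.get(word, word)


def exact_match(query, docs):
    """Ids of documents containing every query word in its exact surface form."""
    words = query.split()
    return [i for i, doc in enumerate(docs)
            if all(w in doc.split() for w in words)]


def morphological_match(query, docs):
    """Ids of documents containing some inflected form of every query word."""
    wanted = {lemma(w) for w in query.split()}
    return [i for i, doc in enumerate(docs)
            if wanted <= {lemma(w) for w in doc.split()}]


DOCS = [
    "الحائزون على جوائز نوبل",
    "الحائز على جائزة نوبل",
    "قائمة جوائز الأدب",
]
QUERY = "الحائزون على جائزة نوبل"

print(exact_match(QUERY, DOCS))          # no document has all four exact forms
print(morphological_match(QUERY, DOCS))  # documents 0 and 1 match by lemma
```

Matching on lemmas rather than surface forms is what lets the singular and plural variants of the same query retrieve the same documents.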
  11. 13. <ul><li>Search </li></ul><ul><li>Arabic Morphological Search (to produce comprehensive search results). </li></ul><ul><li>Document, as well as Database Search. </li></ul><ul><li>Differentiation between Word Meanings (to increase accuracy of search results, i.e. reduce redundancy). </li></ul><ul><li>Search using Logical Operators ( و – أو - ليس, i.e. AND, OR, NOT). </li></ul><ul><li>Adjacency (Proximity) Search, with or without preserving the order of the query words. </li></ul><ul><li>Search using Wildcards (for proper nouns). </li></ul><ul><li>Search words are highlighted in the results pages. </li></ul><ul><li>Latin character support (English words). </li></ul><ul><li>Spell checking of query words. </li></ul><ul><li>Stem and Thesaurus Search. </li></ul>= NEW (After Incubation Funding)
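As a rough illustration of the logical-operator and adjacency features listed above, the following Python sketch evaluates them over a small positional index. All function names are hypothetical, the reading of ليس as "a but not b" is an assumption, and nothing here reflects KSearch's actual query engine.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each word to {doc_id: [positions of the word in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for pos, word in enumerate(text.split()):
            index[word].setdefault(doc_id, []).append(pos)
    return index


def docs_with(index, word):
    """Set of document ids that contain the word."""
    return set(index.get(word, {}))


def search_and(index, a, b):  # و
    return docs_with(index, a) & docs_with(index, b)


def search_or(index, a, b):  # أو
    return docs_with(index, a) | docs_with(index, b)


def search_not(index, a, b):  # ليس, read here as "a but not b" (assumption)
    return docs_with(index, a) - docs_with(index, b)


def search_near(index, a, b, max_gap=1, ordered=False):
    """Documents where b lies within max_gap words of a.

    With ordered=True, b must come after a.
    """
    hits = set()
    for doc_id in search_and(index, a, b):
        for pa in index[a][doc_id]:
            for pb in index[b][doc_id]:
                gap = pb - pa
                distance = gap if ordered else abs(gap)
                if 0 < distance <= max_gap:
                    hits.add(doc_id)
    return hits


DOCS = ["arabic search engine", "search engine for arabic", "arabic morphology"]
INDEX = build_positional_index(DOCS)
```

Storing word positions, not just document ids, is what makes the adjacency query possible; the Boolean operators only need the document sets.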
  12. 14. <ul><li>Indexing </li></ul><ul><li>Comprehensive dictionary of contemporary Arabic (approximately 78,000 entries). </li></ul><ul><li>Document, as well as Database Indexing. </li></ul><ul><li>Fast Indexing Engine (≈ 20,000-56,000 words/sec on a PC with an Intel Core 2 Duo CPU running at 2.33 GHz, SATA HDD, 3 GB RAM). </li></ul><ul><li>Uses 64-bit Technology => Unlimited Index Size. </li></ul><ul><li>Comprehensive Index Management: deleting, updating and merging indexes. </li></ul><ul><li>The following document formats are supported, including Unicode-encoded documents: Text, RTF, MS Office, PDF. </li></ul>
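The index management operations named above (deleting, updating and merging) can be pictured with a minimal in-memory inverted index. This is a sketch under assumptions: the data layout (a dictionary from word to a set of document ids) and all function names are hypothetical, and a production engine would store its index on disk in a very different form.

```python
def build_index(docs):
    """Build {word: set of doc ids} from a {doc_id: text} mapping."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(word, set()).add(doc_id)
    return index


def merge_indexes(a, b):
    """Return a new index holding the postings of both inputs."""
    merged = {word: set(ids) for word, ids in a.items()}
    for word, ids in b.items():
        merged.setdefault(word, set()).update(ids)
    return merged


def delete_document(index, doc_id):
    """Remove doc_id from every posting set, dropping words left empty."""
    for word in list(index):
        index[word].discard(doc_id)
        if not index[word]:
            del index[word]


def update_document(index, doc_id, new_text):
    """Replace a document's postings with those of its new text."""
    delete_document(index, doc_id)
    for word in new_text.split():
        index.setdefault(word, set()).add(doc_id)


first = build_index({1: "arabic search"})
second = build_index({2: "arabic lexicon"})
merged = merge_indexes(first, second)
```

Here `merge_indexes` copies the posting sets, so the two source indexes stay usable after the merge.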
  13. 15. Architecture diagram components: Arabic Morphological Analyzer; Comprehensive + Contemporary Arabic Lexicon; Arabic Data Source (Database, Document, etc.); Indexing Engine; Meta Data Repository; Search Engine; Search Results; Arabic Lexical Semantic Analyzer
  14. 16. <ul><li>Component Oriented Architecture: </li></ul><ul><ul><li>Software Integrated in: </li></ul></ul><ul><ul><li>Websites => Web Edition. </li></ul></ul><ul><ul><li>Enterprises (Intranets) => Enterprise Edition. </li></ul></ul><ul><ul><li>Single PCs => Desktop Edition. </li></ul></ul><ul><li>Software as a Service (SaaS) – Future Direction: </li></ul><ul><ul><li>On Dedicated Web Server. </li></ul></ul>
  15. 17. <ul><li>Employs KMorph, a fast Arabic morphological analyzer. </li></ul><ul><li>Uses a comprehensive Arabic lexicon of contemporary words. </li></ul><ul><li>KSpell Engine: Provides APIs for spelling verification and correction; for example, it may be integrated with content management systems to produce correctly spelled Arabic web content. </li></ul>
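To make the idea of spelling verification and correction concrete, here is a minimal Python sketch that checks a word against a lexicon and suggests nearby entries by Levenshtein distance. It is not the KSpell API, which the slides do not document: the four-word `LEXICON`, the function names and the distance threshold are all assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed one dynamic-programming row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


# Hypothetical miniature lexicon; a real one would hold tens of thousands of entries.
LEXICON = {"جائزة", "جوائز", "نوبل", "حائز"}


def is_correct(word):
    """Verification: the word is accepted if the lexicon contains it."""
    return word in LEXICON


def suggest(word, max_distance=1):
    """Correction: lexicon entries within max_distance edits, closest first."""
    scored = sorted((edit_distance(word, entry), entry) for entry in LEXICON)
    return [entry for distance, entry in scored if distance <= max_distance]


print(is_correct("نوبيل"))  # False: not in the lexicon
print(suggest("نوبيل"))     # one deletion away from the entry "نوبل"
```

Scanning the whole lexicon for every word is fine for a demonstration; with a lexicon of realistic size, a precomputed structure for candidate lookup would be the usual choice.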
  16. 18. <ul><li>Target Audience: </li></ul><ul><li>1- e-Government. </li></ul><ul><li>2- Web Publishers (news sites, web developers, etc.). </li></ul><ul><li>3- Web Content Management (CMS, e-library systems, helpdesk, etc.). </li></ul><ul><li>4- Arabic and Arabic-enabled Internet search sites. </li></ul>
  17. 19. <ul><li>Competitive Advantage: </li></ul><ul><li>Price </li></ul><ul><li>Off-the-Shelf Installation </li></ul><ul><li>Online Demonstration </li></ul>