K Search Al Khawarizmy Language Software

ARABIC SEARCH ENGINE (KSearch)

Published in: Technology

  1. Monday 12/05/2008
  2. <ul><li>Arabic NLP research</li><li>Arabic applications based on NLP components</li><li>Emphasis on software quality (targeting ‘zero-defect’ software)</li><li>Cooperation with the community, e.g. forming partnerships with research students at universities</li><li>Promote widespread use of affordable applications that take the special features of the Arabic language into account</li><li>Effectively serve the Arab region by catering for its users’ needs</li></ul>
  3. <ul><li>1st Nov. 2007 – 31st Dec. 2007: 3 Developers + 1 Product Manager => Small (borrowed) room.</li><li>1st Jan. 2008 – 31st Jan. 2008: 1 Linguist => Home Office.</li><li>1st Feb. 2008 – 31st Mar. 2008: 1 Linguist => Smart Village Incubation.</li><li>1st Apr. 2008 – Present: 3 Developers + 1 Linguist + 1 Business Development Manager + 1 Office Manager => Smart Village Incubation.</li></ul>
  4. <ul><li>The number of Arab Internet users is growing<ul><li>22 million users in 2006</li><li>43 million expected in 2008</li></ul></li><li>The volume of Arabic e-content is increasing (on the web and in companies’ intranets):<ul><li>Around 100 million Arabic web pages</li><li>About 5 million Arabic web sites</li></ul></li></ul>
  5. <ul><li>Arabic is a highly inflected language</li><li>Arabic morphology has a set of unique features</li><li>Proper processing of Arabic e-content is lacking</li><li>Consequently, Arab users cannot take full advantage of Arabic e-content, compared with what is possible in other languages</li><li>As an example, consider searching through Arabic content …</li></ul>
  6. Using a traditional search engine: a search for “ الحائزون على جوائز نوبل ” (“winners of Nobel Prizes”) produces about 238 results
  7. Using a traditional search engine: a search for “ الحائزون على جائزة نوبل ” (“winners of the Nobel Prize”) produces about 684 results
  8. Using a traditional search engine: a search for “ حاز على جائزة نوبل ” (“won the Nobel Prize”) produces about 16,700 results
  9. <ul><li>When used for Arabic search, traditional search engines produce:<ul><li>Incomplete results, i.e. not all inflected forms are found => a lot of useful information is missing</li><li>Redundant results, i.e. some results are inaccurate => they ‘bear no relation’ in form or in meaning to the search word(s)</li></ul></li></ul>
  10. An Arabic Search Model that: <ul><li>Provides morphological search => Comprehensive results</li><li>Differentiates between meanings of Arabic words => Improved accuracy</li><li>In other words…</li><li>Let us see the same example, using KSearch …</li></ul>
  11. [Slide with no transcribed text]
  12. <ul><li>Arabic Morphological Search (to produce comprehensive search results).</li><li>Differentiation between Word Meanings (to increase accuracy of search results, i.e. reduce redundancy).</li><li>Search using Logical Operators: و (AND), أو (OR), ليس (NOT).</li><li>Adjacency (Proximity) Search.</li><li>Search using Wildcards (for proper nouns and Latin text).</li><li>Search words are highlighted in the results pages.</li><li>Over 200 document formats are supported, including Unicode-encoded documents.</li><li>Comprehensive dictionary of contemporary Arabic (approximately 78,000 entries).</li><li>Fast Indexing Engine (25,000–30,000 words/sec on a PC with AMD Athlon 3800+ CPU, IDE HDD, 1 GB RAM).</li><li>Uses 64-bit Technology => Unlimited Index Size.</li><li>Comprehensive Index Management: indexes can be deleted, updated and merged.</li></ul>
  13. Diagram labels: <ul><li>Arabic Morphological Analyzer</li><li>Comprehensive + Contemporary Arabic Lexicon</li><li>Arabic Data Source (Database, Document, etc.)</li><li>Fast Indexing Engine</li><li>Meta Data Repository</li><li>Search Engine</li><li>Search Results</li><li>Arabic Lexical Semantic Analyzer</li></ul>
  14. <ul><li>Employs KMorph, a fast Arabic morphological analyzer</li><li>Uses a comprehensive Arabic lexicon of contemporary words</li><li>KSpell Engine: provides APIs for spelling verification and correction; e.g. it may be integrated with content management systems to produce correctly spelled Arabic web content</li></ul>
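
The slides contrast exact-match search, where each inflected form of the Nobel Prize query returns a different result count, with morphological search, where all inflected forms are found. The idea can be illustrated with a minimal sketch, assuming a tiny hand-written lexicon that maps each surface form to a shared base key. The lexicon, the base keys, and the function names below are all hypothetical and for illustration only; this is not KMorph or KSearch, whose internals the slides do not describe.

```python
from collections import defaultdict

# Hypothetical toy lexicon (an assumption, not KMorph output):
# each inflected surface form maps to a shared base key.
TOY_LEXICON = {
    "الحائزون": "حوز",   # "the winners" (nominative plural)
    "الحائزين": "حوز",   # "the winners" (accusative/genitive plural)
    "حاز": "حوز",        # "won"
    "جوائز": "جائزة",    # "prizes"
    "جائزة": "جائزة",    # "prize"
}


def normalize(token):
    """Return the base key for a token; unknown tokens are kept as-is."""
    return TOY_LEXICON.get(token, token)


def identity(token):
    """Exact-match baseline: no morphological normalization."""
    return token


def build_index(docs, key=normalize):
    """Build an inverted index: key(token) -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[key(token)].add(doc_id)
    return index


def search(index, query, key=normalize):
    """Return ids of documents that contain every query term (AND semantics)."""
    keys = [key(token) for token in query.split()]
    if not keys:
        return set()
    result = set(index.get(keys[0], set()))
    for k in keys[1:]:
        result &= index.get(k, set())
    return result


# Three documents using the three query variants from the slides,
# plus one unrelated document.
DOCS = {
    1: "الحائزون على جوائز نوبل",
    2: "الحائزون على جائزة نوبل",
    3: "حاز على جائزة نوبل",
    4: "أخبار الطقس اليوم",
}

QUERY = "الحائزون على جوائز نوبل"

exact_index = build_index(DOCS, key=identity)
morph_index = build_index(DOCS, key=normalize)

exact_hits = search(exact_index, QUERY, key=identity)
morph_hits = search(morph_index, QUERY, key=normalize)

print("exact match:", sorted(exact_hits))
print("morphological:", sorted(morph_hits))
```

In this sketch the exact-match index finds only the document that repeats the query word for word, while the normalized index finds all three Nobel Prize documents. A real analyzer would derive the base forms from the lexicon and morphological rules rather than from a lookup table, and would also need to tell apart different meanings of the same form, which this sketch does not attempt.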
