Application of Ontology in Semantic Information Retrieval by Prof Shahrul Azman from FSTM, UKM

Application of Ontology in Semantic Information Retrieval
by Prof Shahrul Azman from FSTM, UKM
Presentation for MyREN Seminar 2014
Berjaya Hotel, Kuala Lumpur
27 November 2014

  1. Application of Ontology in Semantic Information Retrieval
     Presentation for MyREN Seminar
     Berjaya Hotel, Kuala Lumpur
     27 November 2014
  2. Brief speaker’s info
     Shahrul Azman Mohd. Noah, Ph.D.
     Knowledge Technology Research Group
     Center for AI Technology (CAIT)
     shahrul@ukm.edu.my
     • Graduated in BSc (Mathematics) from UKM
     • Graduated in MSc (IS) from Sheffield U.
     • Graduated in PhD (IS) from Sheffield U.: knowledge-based systems
     • From Muar, Johor
  3. ONTOLOGY
  4. What is ontology?
     • An ontology may be considered a method of representing knowledge.
     • As a philosophical discipline: the science of “what is”; the kinds and structures of objects, properties, events, processes and relations in every area of reality.
     • Aristotle’s classification of animals is one of the first ontologies developed.
  5. Ontology in Computing
     • An ontology is an engineering artifact:
       – It is constituted by a specific vocabulary used to describe a certain reality, plus
       – A set of explicit assumptions regarding the intended meaning of the vocabulary.
     • Thus, an ontology describes a formal specification of a certain domain:
       – Shared understanding of a domain of interest
       – Formal and machine-manipulable model of a domain of interest
  6. Ontology Definition
     “Formal, explicit specification of a shared conceptualization” [Gruber93]
     Annotations on the definition:
     • commonly accepted understanding
     • conceptual model of a domain (ontological theory)
     • unambiguous terminology definitions
     • machine-readability with computational semantics
  7. An ontology is… (figure, items placed along a “Complexity” axis; source: Smith & Welty (2001))
     • a catalog
     • a set of text files
     • a glossary
     • a thesaurus
     • a collection of taxonomies
     • a set of general logical constraints
     • a collection of frames
  8. Various approaches to classify ontologies
     • Classify ontologies according to the information the ontology needs to express and the richness of its internal structure (Lassila & McGuinness, 2001)
     • Classify into 2 orthogonal dimensions: the amount and type of structure, and the subject (Van Heijst et al., 1997)
     • Classify ontologies according to their level of dependence on a particular task (Guarino, 1998)
  9. Ontology language
     • Ontology languages are formal languages used to construct ontologies
       – allow the encoding of knowledge about specific domains and often
       – include reasoning rules that support the processing of that knowledge
     • Various languages have been proposed: CycL, KL-One, Ontolingua, F-Logic, OCML, LOOM, Telos, RDF(S), OIL, DAML+OIL, XOL, SHOE, OWL etc.
     • Usually based on Description Logic (DL).
     • Summarised as (Kalibatiene & Vasilecas, 2011): (figure)
  10. Example of ontologies
      • Top-level ontology: Suggested Upper Merged Ontology (SUMO)
  11. Portion of SUMO ontology with USGS Geo-concepts inserted (figure)
  12. Example of ontologies (cont.)
      • Lexical ontology: WordNet
  13. Example of ontologies (cont.)
      • Domain ontology: Simple News and Press Ontologies (SNaP)
  14. Linked Data…?
  15. Applications of ontology
      • Searching & browsing
      • Decision support system
      • Question answering system
      • Recommendation
      • Data integration
      • Etc.
  16. INFORMATION RETRIEVAL
  17. Concepts
      • “Information retrieval (IR) is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.” (Salton, 1968).
      • Applications of IR: recommendations, Q&A, filtering… and of course searching.
  18. Issues in IR
      • Some issues in IR:
        – Relevance
        – Evaluation
        – Users and information needs
      • Context-based search
      • Semantic search
      • Etc.
  19. IR process (figure)
  20. ONTOLOGY + INFORMATION RETRIEVAL
  21. Ontology and semantic search
      • Various ways to support semantic search:
        – Query expansion: the user’s query is expanded with related terms
        – Disambiguation: resolving terms or concepts that refer to more than one topic
        – Classifying: classify documents, such as ads, into ontological topics to support semantic search
        – Enhanced IR model: embed an ontology into an existing IR model, resulting in a modified IR model
  22. Query Expansion
      • Query expansion (QE) is needed due to the ambiguity of natural language.
      • Main aim of QE: to add new meaningful terms to the initial query.
      Bhogal, J., Macfarlane, A. & Smith, A. 2007. A review of ontology based query expansion. Information Processing and Management, 43: 866-886.
  23. Query Expansion (figure)
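The query expansion idea on the slides above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy `ONTOLOGY` dictionary, its `synonyms`/`narrower` keys, and the function name `expand_query` are hypothetical and do not come from the presentation, which does not specify a data structure or an expansion strategy.

```python
from typing import Dict, List

# Hypothetical toy ontology: each term maps to related terms by relation type.
ONTOLOGY: Dict[str, Dict[str, List[str]]] = {
    "car": {"synonyms": ["automobile"], "narrower": ["sedan", "hatchback"]},
    "theft": {"synonyms": ["stealing"], "narrower": ["burglary"]},
}


def expand_query(query: str, ontology: Dict[str, Dict[str, List[str]]] = ONTOLOGY) -> List[str]:
    """Return the original query terms followed by their related terms, without duplicates."""
    expanded: List[str] = []
    for term in query.lower().split():
        entry = ontology.get(term, {})
        candidates = [term] + entry.get("synonyms", []) + entry.get("narrower", [])
        for candidate in candidates:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    print(expand_query("car theft"))
```

Terms that are missing from the ontology pass through unchanged, so the expanded query always contains at least the user's original terms.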
  24. Semantic index
      • Textual documents are indexed according to some ontology model.
      • Remember the concept of vocabulary in IR?
      Diagram: a computer science collection goes through Extract and Indexing to produce the index terms or vocabulary of the collection (architecture, bus, computer, database, …, xml).
  25. Semantic index
      • Textual documents are indexed according to some ontology model.
      • Remember the concept of vocabulary in IR?
      Diagram: the same Extract and Indexing steps over the computer science collection (architecture, bus, computer, database, …, xml), with the note “Replace the index with ontological-index”.
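One way to picture "replace the index with ontological-index" is an inverted index keyed by ontology concepts instead of raw terms. The sketch below is an assumption-laden illustration: the `TERM_TO_CONCEPT` mapping, the `cs:` concept names and the function `build_concept_index` are hypothetical, and a real system would use an actual ontology and proper tokenisation.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical mapping from vocabulary terms to ontology concepts.
TERM_TO_CONCEPT: Dict[str, str] = {
    "architecture": "cs:ComputerArchitecture",
    "bus": "cs:ComputerArchitecture",
    "database": "cs:Database",
    "xml": "cs:MarkupLanguage",
}


def build_concept_index(docs: Dict[str, str]) -> Dict[str, List[str]]:
    """Build an inverted index from ontology concepts to the documents that mention them."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            concept = TERM_TO_CONCEPT.get(token)
            if concept is not None:
                index[concept].add(doc_id)
    # Sort document ids so the result is deterministic.
    return {concept: sorted(doc_ids) for concept, doc_ids in index.items()}


if __name__ == "__main__":
    docs = {"d1": "bus architecture of a computer", "d2": "xml database"}
    print(build_concept_index(docs))
```

Because "bus" and "architecture" map to the same concept, a query on that concept finds `d1` regardless of which of the two words the document used.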
  26. Examples
      • Three research projects that illustrate the applications of ontology-based IR:
        – Semantic digital library
        – Crime news retrieval
        – Multi-modality ontology-based image retrieval
  27. Semantic digital library
      • Proposed an approach for managing, organizing and populating an ontology for document collections in a digital library.
      • The document metadata and content are inserted and populated into a knowledge base, which allows sophisticated querying and searching.
      • Firstly, to propose an ontology-based information retrieval model, based on the classic vector space model, which includes document annotation, instance-based weighting and concept-based ranking.
  28. Semantic digital library
      • General architecture (figure)
  29. Semantic digital library
      • Involved three ontologies: ACM Topic hierarchies, Geo ontology and Dublin Core metadata
      • Portion of domain ontology focusing on academic thesis (figure)
  30. Semantic digital library
      • Document annotation (figure)
  31. Semantic digital library
      • The process (figure)
  32. VSM Index

      <!-- create Class Person -->
      <!-- create instance of Class Student -->
      <Student rdf:ID="Student1">
        <rdfs:label>ArifahAlhadi</rdfs:label>
      </Student>
      <Student rdf:ID="Student2">
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">AsyrafArifin</rdfs:label>
      </Student>
      <!-- create instance of Class Supervisor -->
      <Supervisor rdf:ID="Supervisor1">
        <rdfs:label>PM Dr ShahrulAzman</rdfs:label>
        <rdfs:label>Prof. MadyaDr. ShahrulAzmanMohdNoah</rdfs:label>
      </Supervisor>
      <Supervisor rdf:ID="Supervisor2">
        <rdfs:label>Prof Aziz Deraman</rdfs:label>
      </Supervisor>

      Instance      Documents   Concept
      Supervisor1   Doc1        http://www.ukm.my/thesis/supervisor#, http://www.ukm.my/thesis/person#
      Student1      Doc1        http://ukm.my/thesis/student#, http://ukm.my/thesis/creator#, http://ukm.my/thesis/person#
      Student2      Doc1        http://ukm.my/thesis/student#, http://ukm.my/thesis/creator#, http://ukm.my/thesis/person#

      Id   Term                TFIDF   Frq   Doc Id
      1    ArifahAlhadi        0.11    2     Doc1
      2    AsyrafArifin        0.123   1     Doc1
      3    PMDr ShahrulAzman   0.45    1     Doc1
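The TFIDF column in the index above suggests that annotated ontology instances are weighted much like terms in a classic vector space model. The following sketch shows one textbook way to do that (relative term frequency times log inverse document frequency, ranked by cosine similarity). The exact weighting and ranking formulas of the presented system are not given on the slides, so the formulas, the function names and the sample annotations here are all assumptions for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(annotations: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Weight each ontology instance in each document by tf * log(N / df)."""
    n_docs = len(annotations)
    df = Counter()
    for instances in annotations.values():
        df.update(set(instances))
    vectors = {}
    for doc_id, instances in annotations.items():
        tf = Counter(instances)
        vectors[doc_id] = {
            inst: (count / len(instances)) * math.log(n_docs / df[inst])
            for inst, count in tf.items()
        }
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_instances: List[str], vectors: Dict[str, Dict[str, float]]) -> List[str]:
    """Return document ids ordered by decreasing similarity to the query instances."""
    query = {inst: 1.0 for inst in query_instances}
    scores = {doc_id: cosine(query, vec) for doc_id, vec in vectors.items()}
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


if __name__ == "__main__":
    # Hypothetical annotations: which ontology instances were found in each document.
    annotations = {
        "Doc1": ["Student1", "Student1", "Supervisor1"],
        "Doc2": ["Supervisor1", "Supervisor2"],
        "Doc3": ["Student2"],
    }
    vectors = tfidf_vectors(annotations)
    print(rank(["Student1"], vectors))
```

Note that an instance appearing in every document gets weight zero under this formula, which is the usual VSM behaviour rather than anything specific to ontologies.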
  33. Ontology-based IR for crime news retrieval
      • Each crime news item must be classified into categories: Traffic Violation, Theft, Sex Crime, Murder, Kidnap, Fraud, Drugs, Cybercrime, Arson and Gang (Chen et al. 2004)
      • Useful entities need to be identified: Person, Location, Organisation, Date/Time, Weapon, Amount, Vehicle, Drug, Personal properties, and Age.
      • Clustering of crime news into topics, e.g. Nurin Jazlin murder, Canny Ong, Sosilawati, etc.
      • Clustering of a specific topic into various chronological events.
      • Mapping of named entities into a news ontology to support semantic querying and retrieval.
  34. Example (figure)
      • Classification: Murder, Kidnap, Theft, Gang
      • Clustering (cluster into topics): Nurin Jazlin, Sosilawati, Canny Ong
      • Events shown for the Canny Ong topic:
        – Investigation into Canny Ong case, including medical report and trial
        – Evidence/Suspect in Canny Ong case, DNA test
        – Family reacts to Canny Ong case and negligence suit
        – Court sentence, plead guilty
        – ………………..
      • Counts shown in the diagram: (17) (6) (3) (9) (13)
  35. Required methods
      • In order to support the aforementioned requirements:
        – Conventional text processing: tokenizing, indexing, stopping, stemming etc.
        – Named entity recognition (NER)
        – Classification and clustering
        – Ontology mapping
  36. PRE-PROCESSING TASK + DOCUMENT REPRESENTATION + DOCUMENT ORGANIZATION
      • Pre-processing task: stopword removal, stemming, parsing, indexing
      • Document representation: bag of words, named entity recognition
      • Document organization: classification (AdaBoost), clustering (KNN), semantic mapping
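The pre-processing and bag-of-words stages listed above can be sketched as follows. This is a deliberately minimal illustration, not the pipeline used in the project: the stopword list is a tiny hypothetical sample, `naive_stem` is a crude suffix stripper standing in for a real stemmer, and named entity recognition, classification and clustering are left out.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a full list for the target language.
STOPWORDS = {"the", "a", "an", "of", "in", "was", "and", "to"}


def naive_stem(token: str) -> str:
    """Strip a few common English suffixes (a stand-in for a proper stemming algorithm)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def bag_of_words(text: str) -> Counter:
    """Tokenize, drop stopwords, stem, and count the remaining tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [naive_stem(token) for token in tokens if token not in STOPWORDS]
    return Counter(kept)


if __name__ == "__main__":
    print(bag_of_words("The suspects robbed a bank and the police arrested the suspects"))
```

The resulting counts are what a downstream classifier or clustering step would consume as document features.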
  37. Document representation
      • Documents will be presented in meaningful forms:
        – BoW: Bag of Words
        – Named Entity Recognition: used the GATE ANNIE and JAPE rules
        – Adopt the Vector Space Model (VSM), enhanced with an ontological model
  38. Document representation (figure)
  39. Document organization
      • Documents need to be organised into categories, topics and events.
        – Classification: AdaBoost algorithm
        – Clustering: used KNN clustering
        – Ontology mapping: we have developed a crime news ontology by extending the existing SNaP ontology. It includes classes/entities that are important to crime, such as classification of crimes, location and weapon.
  40. Asset ontology, Event ontology (figure)
  41. Extending the SNaP ontology and mapping to entities in news documents (figure)
      • Groups: SNaP, Crime
      • Classes and nodes: pne:Event, pna:Asset, pns:Stuff, pns:Tangible, pns:Location, pns:Organization, pns:Person, event:Event, pns:Weapon, pns:Vehicle, pnc:Classification, pnc:Classifiable, pnt:Tag
      • Instances: <Murder>, <Kidnap>, <Event 1>
      • Edge labels: rdfs:subClassOf, rdf:type, pne:subeventOf, pnc:isClassifiedBy, rdfs:domain, rdfs:range
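Mapping entities into a class hierarchy pays off at query time, because a search for a general class can also return instances of its subclasses. The sketch below shows that inference over plain triples. The class names are taken from the diagram labels, but the specific subclass edges and the `ex:` instances are hypothetical, since the flattened diagram does not show which nodes each edge connects.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

# Hypothetical triples: the edges and ex: instances are assumed for illustration only.
TRIPLES: List[Triple] = [
    ("pns:Tangible", SUBCLASS_OF, "pns:Stuff"),
    ("pns:Weapon", SUBCLASS_OF, "pns:Tangible"),
    ("pns:Vehicle", SUBCLASS_OF, "pns:Tangible"),
    ("ex:Knife1", TYPE, "pns:Weapon"),
    ("ex:Car1", TYPE, "pns:Vehicle"),
    ("ex:Murder", TYPE, "pnc:Classification"),
]


def subclasses(cls: str, triples: List[Triple]) -> Set[str]:
    """Return cls together with all of its direct and indirect subclasses."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if predicate == SUBCLASS_OF and obj == current and subject not in found:
                found.add(subject)
                frontier.append(subject)
    return found


def instances_of(cls: str, triples: List[Triple]) -> List[str]:
    """Return all instances typed with cls or with any of its subclasses."""
    classes = subclasses(cls, triples)
    return sorted(s for s, p, o in triples if p == TYPE and o in classes)


if __name__ == "__main__":
    print(instances_of("pns:Tangible", TRIPLES))
```

In practice this kind of transitive lookup would normally be delegated to an RDF store or reasoner; the loop here only makes the idea explicit.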
  42. The Application
      • What we need/desire. (figure)
  43. Ontology-based Image Retrieval
      • Rapid growth of visual information (VI) has led to difficulty in finding and accessing VI.
      • Inability to capture the semantic content.
      • The problem arises from a lack of coincidence between the information extracted from VI and user needs.
      • Conventional approaches to image retrieval (IMR), TBIR and CBIR, have reached their limit in attempting to solve this problem.
      • As a result: the SBIR approach, which is ontology-based, provides explicit domain-oriented semantics for concepts and relationships.
  44. Ontology-based Image Retrieval
      • Illustrate how images are described based on their visual, textual and domain semantic features.
      • Proposed a multi-modality ontology: visual ontology, textual ontology and domain ontology.
      • Illustrate how such an ontology can be integrated with an open knowledge base (DBpedia) to support a more comprehensive search.
  45. Proposed Approach (figure)
  46. Example of multi-modality ontology (figure)
  47. Example of multi-modality ontology with DBpedia (figure)
  48. Conclusion: practical implementation of ontology-based IR (figure)
      • Components: Ontology (TBox, ABox), Documents, Index, Query Processing
      • Labelled steps: Extraction, build, Population, Annotation
      • Input and output: query, ranked docs
  49. Research issues
      • Index representation: most still based on the conventional VSM.
      • Ranking: weighting and ranking mechanisms
      • Automatic population: supervised and unsupervised
      • Extraction & annotation
      • Multilingual and cross-language
  50. References
      • Castells, P., Fernandez, M., Vallet, D. 2007. An Adaptation of Vector Space Model for Ontology Based Information Retrieval. IEEE Transactions on Knowledge and Data Engineering, 19(2):
      • Shahrul Azman Noah, Nor Afni Raziah Alias, Nurul Aida Osman, Zuraidah Abdullah, Nazlia Omar, Yazrina Yahya, Maryati Mohd Yusof: Ontology-Driven Semantic Digital Library. AIRS 2010: 141-150.
      • Shahrul Azman Noah, Datul Aida Ali: The Role of Lexical Ontology in Expanding the Semantic Textual Content of On-Line News Images. AIRS 2010: 193-202.
      • Fernández, M., Cantador, I., López, V., Vallet, D., Castells, P., & Motta, E. 2011. Semantically enhanced information retrieval: an ontology-based approach. Web Semantics: Science, Services and Agents on the World Wide Web, 9: 434-452.
      • Kara, S., Alan, O., Sabuncu, O., Akpınar, S., Cicekli, N.K., & Alpaslan, F.N. 2012. An ontology-based retrieval system using semantic indexing. Information Systems, 37: 294-305.
      • Kohler, J., Philippi, S., Specht, M., & Ruegg, A. 2006. Ontology based text indexing and querying for the semantic web. Knowledge-Based Systems, 19: 744-754.
      • Etc.
  51. Example: advanced application of ontology
  52. Watson: the science behind an answer (figure)
  53. Group members:
      1. Shahrul Azman Mohd. Noah
      2. Juhana Salim
      3. Masnizah Mohd
      4. Nazlia Omar
      5. Mohd Juzaiddin Ab Aziz
      6. Nazlena Mohamad Ali
      7. Saidah Saad
      8. Shereena Mohd Arif
      9. Lailaltulqadri Zakaria
      10. Sabrina Tiun
      11. Maryati Mohd. Yusof
  54. END