Application of Ontology in Semantic Information Retrieval 
Presentation for MyREN Seminar
Berjaya Hotel, Kuala Lumpur 
27 November 2014 
Brief speaker’s info 
Shahrul Azman Mohd. Noah, Ph.D. 
Knowledge Technology Research Group 
Center for AI Technology (CAIT) 
shahrul@ukm.edu.my 
Graduated with a BSc (Mathematics) from UKM
Graduated with an MSc (IS) from Sheffield U.
Graduated with a PhD (IS) from Sheffield U. –knowledge-based systems
From Muar, Johor
ONTOLOGY 
What is ontology? 
•Ontology may be considered a method for representing knowledge.
•From the philosophical discipline –the science of “what is”; the kinds and structures of objects, properties, events, processes and relations in every area of reality.
•Aristotle’s classification of animals is one of the first ontologies developed.
Ontology in Computing 
•An ontology is an engineering artifact: 
–It is constituted by a specific vocabulary used to describe a certain reality, plus 
–A set of explicit assumptions regarding the intended meaning of the vocabulary. 
•Thus, an ontology describes a formal specification of a certain domain: 
–Shared understanding of a domain of interest 
–Formal and machine-manipulable model of a domain of interest
Ontology Definition 
“Formal, explicit specification of a shared conceptualization” [Gruber93]
–Formal: machine-readability with computational semantics
–Explicit: unambiguous terminology definitions
–Shared: commonly accepted understanding
–Conceptualization: conceptual model of a domain (ontological theory)
An ontology is… (Source: Smith & Welty, 2001)
In order of increasing complexity:
•a catalog
•a set of text files
•a glossary
•a thesaurus
•a collection of taxonomies
•a collection of frames
•a set of general logical constraints
Various approaches to classify ontologies 
•Classify ontologies according to the information the ontology needs to express and the richness of its internal structure (Lassila & McGuinness, 2001)
•Classify along two orthogonal dimensions: the amount and type of structure, and the subject (Van Heijst et al., 1997)
•Classify ontologies according to their level of dependence on a particular task (Guarino, 1998)
Ontology language 
• Ontology languages are formal languages used to construct ontologies 
– allow the encoding of knowledge about specific domains and often 
– include reasoning rules that support the processing of that knowledge 
• Various languages have been proposed: CycL, KL-One, Ontolingua, F-Logic, OCML, LOOM, Telos, RDF(S), OIL, DAML+OIL, XOL, SHOE, OWL etc.
• Usually based on Description Logic (DL).
• Summarised as (Kalibatiene & Vasilecas, 2011): [summary figure not reproduced]
Example of ontologies 
•Top-level ontology –Suggested Upper Merged Ontology (SUMO)
[Figure: portion of the SUMO ontology with USGS geo-concepts inserted]
Example of ontologies (cont.) 
•Lexical ontology –WordNet
Example of ontologies (cont.) 
•Domain ontology –Simple News and Press Ontologies (SNaP)
Linked Data…? 
Applications of ontology 
•Searching & browsing 
•Decision support system 
•Question answering system 
•Recommendation 
•Data integration 
•Etc. 
INFORMATION RETRIEVAL 
Concepts 
•“Information retrieval (IR) is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.” (Salton, 1968).
•Applications of IR: recommendations, Q&A, filtering… and of course searching.
Issues in IR 
•Some issues in IR: 
–Relevance 
–Evaluation 
–Users and information needs 
•Context based search 
•Semantic search 
•Etc. 
IR process 
ONTOLOGY + INFORMATION RETRIEVAL 
Ontology and semantic search 
•Various ways to support semantic search: 
–Query expansion –user queries are expanded with related terms
–Disambiguation –resolving terms or concepts when they refer to more than one topic
–Classifying –classify documents such as ads into ontological topics to support semantic search
–Enhanced IR model –embed the ontology into an existing IR model, resulting in a modified IR model
Query Expansion 
•Query expansion (QE) is needed due to the ambiguity of natural language. 
•Main aim of QE –to add new meaningful terms to the initial query. 
Bhogal, J., Macfarlane, A. & Smith, A. 2007. A review of ontology based query expansion. Information Processing and Management, 43: 866-886.
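The idea of adding meaningful terms to the initial query can be sketched as follows. This is a minimal illustration, not the method reviewed by Bhogal et al.: the toy ontology, the `expand_query` function and its `include_narrower` parameter are all assumptions made for the example.

```python
# Toy ontology (hypothetical): each term maps to synonyms and narrower terms.
ONTOLOGY = {
    "car": {"synonyms": ["automobile"], "narrower": ["sedan", "hatchback"]},
    "crime": {"synonyms": ["offence"], "narrower": ["theft", "murder"]},
}


def expand_query(query, ontology, include_narrower=True):
    """Return the query terms followed by related ontology terms, without duplicates."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        entry = ontology.get(term)
        if entry is None:
            continue  # term not covered by the ontology: keep it unchanged
        related = list(entry["synonyms"])
        if include_narrower:
            related += entry["narrower"]
        for candidate in related:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


print(expand_query("car crime", ONTOLOGY))
```

Original terms stay at the front of the expanded list, so a ranking function could still weight them higher than the added terms.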
Semantic index 
• Textual documents are indexed according to some ontology model.
• Remember the concept of vocabulary in IR?
[Figure: a computer science collection passes through Extract and Indexing to produce the index terms, or vocabulary, of the collection: architecture, bus, computer, database, …, xml]
Semantic index (cont.)
• Replace the index with an ontological index.
[Figure: the same Extract and Indexing steps over the computer science collection, with the term index (architecture, bus, computer, database, …, xml) replaced by an ontological index]
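The difference between a conventional term index and an ontological index can be sketched as below. The concept identifiers (`cs:Database` and so on), the label table and the two sample documents are hypothetical; the sketch only shows that several surface terms can be indexed under one concept.

```python
from collections import defaultdict

# Hypothetical concept labels: several surface terms map to one concept.
CONCEPT_LABELS = {
    "database": "cs:Database",
    "dbms": "cs:Database",
    "xml": "cs:XML",
    "bus": "cs:ComputerBus",
}

DOCS = {
    "doc1": "database architecture and xml",
    "doc2": "dbms bus architecture",
}


def build_term_index(docs):
    """Conventional inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def build_concept_index(docs, labels):
    """Ontological index: concept -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            concept = labels.get(term)
            if concept is not None:
                index[concept].add(doc_id)
    return index


term_index = build_term_index(DOCS)
concept_index = build_concept_index(DOCS, CONCEPT_LABELS)
print(sorted(term_index["database"]))        # keyword match only
print(sorted(concept_index["cs:Database"]))  # matches both label variants
```

A keyword lookup for "database" misses the document that says "dbms", while the concept lookup finds both.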
Examples 
•Three research projects that illustrate the applications of ontology-based IR: 
–Semantic digital library 
–Crime news retrieval 
–Multi-modality ontology-based image retrieval
Semantic digital library 
•Proposed an approach for managing, organizing and populating an ontology for document collections in a digital library.
•The document metadata and content are inserted into a knowledge base, which allows sophisticated querying and searching.
•Proposed an ontology-based information retrieval model, built on the classic vector space model, which includes document annotation, instance-based weighting and concept-based ranking.
Semantic digital library 
•General architecture 
Semantic digital library 
•Involved three ontologies –ACM Topic hierarchies, Geo ontology and Dublin Core metadata
•Portion of domain ontology focusing on academic thesis
Semantic digital library 
•Document annotation 
Semantic digital library 
•The process 
VSM Index 
<!-- Create class Person -->
<!-- Create instances of class Student -->
<Student rdf:ID="Student1">
  <rdfs:label>Arifah Alhadi</rdfs:label>
</Student>
<Student rdf:ID="Student2">
  <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Asyraf Arifin</rdfs:label>
</Student>
<!-- Create instances of class Supervisor -->
<Supervisor rdf:ID="Supervisor1">
  <rdfs:label>PM Dr Shahrul Azman</rdfs:label>
  <rdfs:label>Prof. Madya Dr. Shahrul Azman Mohd Noah</rdfs:label>
</Supervisor>
<Supervisor rdf:ID="Supervisor2">
  <rdfs:label>Prof Aziz Deraman</rdfs:label>
</Supervisor>

Concept | Instance | Documents
http://www.ukm.my/thesis/supervisor#, http://www.ukm.my/thesis/person# | Supervisor1 | Doc1
http://ukm.my/thesis/student#, http://ukm.my/thesis/creator#, http://ukm.my/thesis/person# | Student1 | Doc1
http://ukm.my/thesis/student#, http://ukm.my/thesis/creator#, http://ukm.my/thesis/person# | Student2 | Doc1

Id | Term | TFIDF | Frq | Doc Id
1 | Arifah Alhadi | 0.11 | 2 | Doc1
2 | Asyraf Arifin | 0.123 | 1 | Doc1
3 | PM Dr Shahrul Azman | 0.45 | 1 | Doc1
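An instance-based index of the kind tabulated above could be built roughly as follows. This is a sketch under stated assumptions: the sample documents are invented, the weighting is the textbook tf × log(N/df) rather than the exact scheme used in the project, and the resulting numbers are not meant to reproduce the TFIDF values on the slide.

```python
import math

# Hypothetical instance labels, loosely modelled on the RDF labels above.
INSTANCE_LABELS = {
    "Student1": ["arifah alhadi"],
    "Supervisor1": ["pm dr shahrul azman", "shahrul azman mohd noah"],
}

# Invented documents for illustration only.
DOCS = {
    "Doc1": "thesis by arifah alhadi supervised by pm dr shahrul azman; "
            "arifah alhadi acknowledges the faculty",
    "Doc2": "progress report reviewed by pm dr shahrul azman",
}


def instance_frequency(text, labels):
    """Count how often any label of an instance occurs in the text."""
    text = text.lower()
    return sum(text.count(label) for label in labels)


def tfidf_index(docs, instance_labels):
    """Map (instance, doc id) -> frequency and tf-idf weight."""
    n_docs = len(docs)
    index = {}
    for instance, labels in instance_labels.items():
        freqs = {d: instance_frequency(t, labels) for d, t in docs.items()}
        df = sum(1 for f in freqs.values() if f > 0)
        if df == 0:
            continue  # instance never mentioned in the collection
        idf = math.log(n_docs / df)
        for doc_id, freq in freqs.items():
            if freq > 0:
                index[(instance, doc_id)] = {"freq": freq, "tfidf": freq * idf}
    return index


index = tfidf_index(DOCS, INSTANCE_LABELS)
for key, entry in sorted(index.items()):
    print(key, entry)
```

Because all labels of an instance are counted together, a document that uses either form of a supervisor's name contributes to the same index entry.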
Ontology-based IR for crime news retrieval 
•Each crime news item must be classified into categories: Traffic Violation, Theft, Sex Crime, Murder, Kidnap, Fraud, Drugs, Cybercrime, Arson and Gang (Chen et al. 2004)
•Useful entities need to be identified: Person, Location, Organisation, Date/Time, Weapon, Amount, Vehicle, Drug, Personal properties, and Age.
•Clustering of crime news into topics, e.g. Nurin Jazlin murder, Canny Ong, Sosilawati etc.
•Clustering of a specific topic into various chronological events.
•Mapping of named entities into the news ontology to support semantic querying and retrieval.
Example 
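[Figure: news is first classified into categories (Murder, Kidnap, Theft, Gang, …), then clustered into topics (Nurin Jazlin, Sosilawati, Canny Ong), and each topic is clustered into events.]
Events shown for the Canny Ong topic:
•Investigation into the Canny Ong case, including medical report and trial
•Evidence/suspect in the Canny Ong case
•DNA test
•Family reaction to the Canny Ong case and negligence suit
•Court sentence, plead guilty
Counts shown in the figure: (17), (6), (3), (9), (13)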
Required methods 
•In order to support the aforementioned requirements: 
–Conventional text processing –tokenizing, indexing, stopping, stemming etc.
–Named entity recognition (NER)
–Classification and clustering
–Ontology mapping
PRE-PROCESSING TASK
•Stopword removal
•Stemming
•Parsing
•Indexing
+
DOCUMENT REPRESENTATION
•Bag of words
•Named entity recognition
+
DOCUMENT ORGANIZATION
•Classification –AdaBoost
•Clustering –KNN
•Semantic mapping
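The pre-processing and bag-of-words steps listed above can be sketched in a few lines. The stopword list is a tiny illustrative one, and `crude_stem` is a deliberately simple suffix stripper standing in for a real stemmer such as Porter's; neither is claimed to be what the project used.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption, not the project's list).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "was", "were", "by", "to"}


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Strip a few common suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def bag_of_words(text):
    """Tokenize, remove stopwords, stem, and count the remaining terms."""
    tokens = [t for t in tokenize(text) if t not in STOPWORDS]
    return Counter(crude_stem(t) for t in tokens)


bow = bag_of_words("The suspects were arrested and the suspect was charged")
print(bow)
```

Stemming merges "suspects" and "suspect" into one term, which is what lets the later classification and clustering steps treat them as the same feature.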
Document representation 
•Documents are represented in meaningful forms:
–BoW –Bag of Words
–Named Entity Recognition –used GATE ANNIE and JAPE rules
–Adopted the Vector Space Model (VSM), enhanced with an ontological model
Document organization 
•Documents need to be organised into categories, topics and events.
–Classification –AdaBoost algorithm
–Clustering –used the KNN clustering
–Ontology mapping –we have developed a crime news ontology by extending the existing SNaP ontology. It includes classes/entities which are important to crime, such as classification of crimes, location and weapon.
Asset ontology 
Event ontology
Extending the SNaP ontology and mapping to entities in news documents
[Figure: class diagram with two regions, SNaP and Crime.]
•Classes: pne:Event, event:Event, pna:Asset, pns:Stuff, pns:Tangible, pns:Location, pns:Organization, pns:Person, pns:Weapon, pns:Vehicle, pnt:Tag, pnc:Classification, pnc:Classifiable
•Instances: <Murder>, <Kidnap>, <Event 1>
•Properties: pne:subeventOf, pnc:isClassifiedBy (each drawn with rdfs:domain and rdfs:range)
•Edge labels: rdfs:subClassOf, rdf:type
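The kind of semantic query such an extended ontology supports can be sketched with plain triples and a transitive walk over rdfs:subClassOf. The class and property names reuse prefixes from the figure, but the specific edges, the `ex:` instances and the `ex:involves` property are assumptions made for the example, since the figure's exact structure is not recoverable from the slide text.

```python
# Triples (subject, predicate, object). The edges below are hypothetical.
TRIPLES = [
    ("pns:Tangible", "rdfs:subClassOf", "pns:Stuff"),
    ("pns:Weapon", "rdfs:subClassOf", "pns:Tangible"),
    ("pns:Vehicle", "rdfs:subClassOf", "pns:Tangible"),
    ("ex:knife1", "rdf:type", "pns:Weapon"),
    ("ex:car1", "rdf:type", "pns:Vehicle"),
    ("ex:event1", "rdf:type", "pne:Event"),
    ("ex:event1", "pnc:isClassifiedBy", "ex:Murder"),
    ("ex:event1", "ex:involves", "ex:knife1"),
]


def superclasses(cls, triples):
    """All classes reachable from cls via rdfs:subClassOf (transitive)."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def instances_of(cls, triples):
    """Instances typed with cls or with any of its subclasses."""
    return {
        s
        for s, p, o in triples
        if p == "rdf:type" and (o == cls or cls in superclasses(o, triples))
    }


def events_involving(cls, classification, triples):
    """Events with the given classification that involve an instance of cls."""
    members = instances_of(cls, triples)
    classified = {
        s for s, p, o in triples
        if p == "pnc:isClassifiedBy" and o == classification
    }
    return {
        s for s, p, o in triples
        if s in classified and p == "ex:involves" and o in members
    }


print(sorted(instances_of("pns:Tangible", TRIPLES)))
print(sorted(events_involving("pns:Tangible", "ex:Murder", TRIPLES)))
```

A query for tangible things finds the knife even though it is only typed as a weapon, which a plain keyword index could not do.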
The Application 
•What we need/desire. 
Ontology-based Image Retrieval 
•Rapid growth of visual information (VI) has led to difficulty in finding and accessing VI.
•Inability to capture the semantic content.
•The problem that arises –a lack of coincidence between the information extracted from VI and user needs.
•Conventional approaches to image retrieval (IMR), text-based (TBIR) and content-based (CBIR), have reached their limits in attempting to solve this problem.
•As a result, the semantic-based (SBIR) approach emerged: ontology-based methods provide explicit, domain-oriented semantics for concepts and relationships.
Ontology-based Image Retrieval 
•Illustrate how images are described based on their visual, textual and domain semantic features.
•Proposed a multi-modality ontology: visual ontology, textual ontology and domain ontology.
•Illustrate how such an ontology can be integrated with an open-source knowledge base (DBpedia) to support a more comprehensive search.
Proposed Approach 
Example of multi-modality ontology 
Example of Multi-modality ontology with DBpedia 
Conclusion -Practical implementation of ontology-based IR 
[Figure: architecture diagram.]
•Components: Ontology (TBox and ABox), Documents, Index
•Processes: Extraction, Population, Annotation, Query Processing
•Flow labels: build, query, ranked docs
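The flow from annotation through indexing to ranked results can be sketched end to end as below. The label table, the sample documents and the use of raw concept counts with cosine similarity are assumptions chosen to keep the example short; a fuller implementation would typically use weighted vectors such as TF-IDF.

```python
import math
from collections import Counter

# ABox-style labels (hypothetical): surface form -> ontology concept.
LABELS = {
    "murder": "crime:Murder",
    "homicide": "crime:Murder",
    "kidnap": "crime:Kidnap",
    "knife": "crime:Weapon",
}

# Invented documents for illustration only.
DOCS = {
    "d1": "police report homicide with knife",
    "d2": "kidnap suspect arrested",
    "d3": "murder trial murder verdict",
}


def annotate(text):
    """Annotation step: count the ontology concepts mentioned in the text."""
    return Counter(LABELS[t] for t in text.lower().split() if t in LABELS)


def cosine(a, b):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Index step: one concept vector per document.
INDEX = {doc_id: annotate(text) for doc_id, text in DOCS.items()}


def search(query):
    """Query processing: annotate the query and return matching docs, best first."""
    q = annotate(query)
    scored = [(cosine(q, vec), doc_id) for doc_id, vec in INDEX.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


print(search("murder"))
print(search("homicide"))
```

Both queries return the same ranking because "murder" and "homicide" are annotated with the same concept.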
Research issues 
•Index representation –most still based on the conventional VSM. 
•Ranking –weighting and ranking mechanisms 
•Automatic population –supervised and unsupervised 
•Extraction & annotation 
•Multilingual and cross-language 
References 
•Castells, P., Fernandez, M., & Vallet, D. 2007. An Adaptation of the Vector Space Model for Ontology-Based Information Retrieval. IEEE Transactions on Knowledge and Data Engineering, 19(2):
•Shahrul Azman Noah, Nor Afni Raziah Alias, Nurul Aida Osman, Zuraidah Abdullah, Nazlia Omar, Yazrina Yahya, Maryati Mohd Yusof: Ontology-Driven Semantic Digital Library. AIRS 2010: 141-150.
•Shahrul Azman Noah, Datul Aida Ali: The Role of Lexical Ontology in Expanding the Semantic Textual Content of On-Line News Images. AIRS 2010: 193-202.
•Fernández, M., Cantador, I., López, V., Vallet, D., Castells, P., & Motta, E. 2011. Semantically enhanced information retrieval: an ontology-based approach. Web Semantics: Science, Services and Agents on the World Wide Web, 9: 434-452.
•Kara, S., Alan, O., Sabuncu, O., Akpınar, S., Cicekli, N.K., & Alpaslan, F.N. 2012. An ontology-based retrieval system using semantic indexing. Information Systems, 37: 294-305.
•Kohler, J., Philippi, S., Specht, M., & Ruegg, A. 2006. Ontology based text indexing and querying for the semantic web. Knowledge-Based Systems, 19: 744-754.
•Etc.
Example –advanced application of ontology
Watson –the science behind an answer 
Group members:
1. Shahrul Azman Mohd. Noah
2. Juhana Salim
3. Masnizah Mohd
4. Nazlia Omar
5. Mohd Juzaiddin Ab Aziz
6. Nazlena Mohamad Ali
7. Saidah Saad
8. Shereena Mohd Arif
9. Lailaltulqadri Zakaria
10. Sabrina Tiun
11. Maryati Mohd. Yusof
END 
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 

Recently uploaded (20)

Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 

Application of Ontology in Semantic Information Retrieval by Prof Shahrul Azman from FSTM, UKM

  • 1. Application of Ontology in Semantic Information Retrieval. Presentation for MyREN Seminar, Berjaya Hotel, Kuala Lumpur, 27 November 2014.
  • 2. Brief speaker’s info: Shahrul Azman Mohd. Noah, Ph.D., Knowledge Technology Research Group, Center for AI Technology (CAIT), shahrul@ukm.edu.my. BSc (Mathematics) from UKM; MSc (IS) from Sheffield U.; PhD (IS) from Sheffield U., on knowledge-based systems. From Muar, Johor.
  • 4. What is ontology? • Ontology may be considered a method of representing knowledge. • The term comes from philosophy: the science of “what is”, covering the kinds and structures of objects, properties, events, processes and relations in every area of reality. • Aristotle's classification of animals is one of the first ontologies developed.
  • 5. Ontology in Computing • An ontology is an engineering artifact: – It is constituted by a specific vocabulary used to describe a certain reality, plus – A set of explicit assumptions regarding the intended meaning of the vocabulary. • Thus, an ontology describes a formal specification of a certain domain: – Shared understanding of a domain of interest – Formal and machine-manipulable model of a domain of interest
  • 6. Ontology Definition: “Formal, explicit specification of a shared conceptualization” [Gruber93]. Formal: machine-readability with computational semantics. Explicit: unambiguous terminology definitions. Shared: commonly accepted understanding. Conceptualization: conceptual model of a domain (ontological theory).
  • 7. An ontology is… a catalog, a set of text files, a glossary, a thesaurus, a collection of taxonomies, a set of general logical constraints, a collection of frames (arranged along a Complexity axis). Source: Smith & Welty (2001)
  • 8. Various approaches to classify ontologies • Classify ontologies according to the information the ontology needs to express and the richness of its internal structure (Lassila & McGuinness, 2001) • Classify into 2 orthogonal dimensions: the amount and type of structure, and the subject (Van Heijst et al., 1997) • Classify ontologies according to their level of dependence on a particular task (Guarino, 1998)
  • 9. Ontology language • Ontology languages are formal languages used to construct ontologies: – they allow the encoding of knowledge about specific domains, and – they often include reasoning rules that support the processing of that knowledge. • Various languages have been proposed: CycL, KL-One, Ontolingua, F-Logic, OCML, LOOM, Telos, RDF(S), OIL, DAML+OIL, XOL, SHOE, OWL etc. • Usually based on Description Logic (DL). • Summarised as (Kalibatiene & Vasilecas, 2011):
  • 10. Example of ontologies • Top-level ontology: Suggested Upper Merged Ontology (SUMO)
  • 11. Portion of SUMO ontology with USGS Geo-concepts inserted
  • 12. Example of ontologies (cont.) • Lexical ontology: WordNet
  • 13. Example of ontologies (cont.) • Domain ontology: Simple News and Press Ontologies (SNaP)
  • 15. Applications of ontology • Searching & browsing • Decision support system • Question answering system • Recommendation • Data integration • Etc.
  • 17. Concepts • “Information retrieval (IR) is a field concerned with the structure, analysis, organization, storage, searching, and retrieval of information.” (Salton, 1968). • Applications of IR: recommendations, Q&A, filtering… and of course searching.
  • 18. Issues in IR • Some issues in IR: – Relevance – Evaluation – Users and information needs • Context-based search • Semantic search • Etc.
  • 20. ONTOLOGY + INFORMATION RETRIEVAL
  • 21. Ontology and semantic search • Various ways to support semantic search: – Query expansion: user queries are expanded with related terms from the ontology – Disambiguation: resolving terms or concepts when they refer to more than one topic – Classifying: classify documents such as ads into ontological topics to support semantic search – Enhanced IR model: embed the ontology into an existing IR model, resulting in a modified IR model
  • 22. Query Expansion • Query expansion (QE) is needed due to the ambiguity of natural language. • The main aim of QE is to add new meaningful terms to the initial query. Source: Bhogal, J., Macfarlane, A. & Smith, A. 2007. A review of ontology based query expansion. Information Processing and Management, 43: 866-886.
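To make the idea of adding meaningful terms to the initial query concrete, here is a minimal sketch of ontology-based query expansion. The toy ontology, its entries and the function name `expand_query` are illustrative assumptions, not taken from the presentation; a real system would draw related terms from a resource such as WordNet or a domain ontology.

```python
# Toy ontology: every entry below is a made-up illustration (assumption).
TOY_ONTOLOGY = {
    "car": {"synonyms": ["automobile"], "narrower": ["taxi", "sedan"]},
    "theft": {"synonyms": ["stealing"], "narrower": ["burglary"]},
}


def expand_query(query, ontology=TOY_ONTOLOGY, relations=("synonyms",)):
    """Return the original query terms followed by ontology-derived terms."""
    terms = query.lower().split()
    expanded = list(terms)  # keep the user's own terms first
    for term in terms:
        entry = ontology.get(term, {})
        for relation in relations:
            for related in entry.get(relation, []):
                if related not in expanded:  # avoid duplicate terms
                    expanded.append(related)
    return expanded


print(expand_query("car theft"))
print(expand_query("car theft", relations=("synonyms", "narrower")))
```

Restricting `relations` to synonyms keeps the expansion conservative; adding narrower terms widens recall at the risk of drifting away from the user's intent.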
  • 24. Semantic index • Textual documents are indexed according to some ontology model. • Remember the concept of vocabulary in IR? [Figure: a computer science collection goes through Extract and Indexing to produce the index terms or vocabulary of the collection: architecture, bus, computer, database, …, xml]
  • 25. Semantic index • Textual documents are indexed according to some ontology model. • Remember the concept of vocabulary in IR? [Figure: the same Extract and Indexing flow over the computer science collection (architecture, bus, computer, database, …, xml), with the note: replace the index with an ontological index]
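The difference between a conventional term index and an ontological index can be sketched as follows. This is only an illustration under stated assumptions: the term-to-concept mapping and the concept names are hypothetical, and tokenization is reduced to whitespace splitting.

```python
from collections import defaultdict

# Hypothetical mapping from index terms to ontology concepts (assumption).
TERM_TO_CONCEPT = {
    "xml": "MarkupLanguage",
    "database": "DataManagement",
    "sql": "DataManagement",
    "bus": "ComputerArchitecture",
}


def build_indexes(docs):
    """Build a term-based and a concept-based inverted index side by side."""
    term_index = defaultdict(set)
    concept_index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            term_index[token].add(doc_id)
            concept = TERM_TO_CONCEPT.get(token)
            if concept is not None:
                # Different surface terms collapse onto one concept entry.
                concept_index[concept].add(doc_id)
    return term_index, concept_index


docs = {"d1": "database design", "d2": "sql tuning"}
term_index, concept_index = build_indexes(docs)
print(sorted(term_index["database"]))
print(sorted(concept_index["DataManagement"]))
```

A search for "database" in the term index misses d2, while the concept entry `DataManagement` retrieves both documents.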
  • 26. Examples • Three research projects that illustrate the applications of ontology-based IR: – Semantic digital library – Crime news retrieval – Multi-modality ontology-based image retrieval
  • 27. Semantic digital library • Proposed an approach for managing, organizing and populating an ontology for document collections in a digital library. • The document metadata and content are inserted and populated into a knowledge base, which allows sophisticated querying and searching. • It proposes an ontology-based information retrieval model, built on the classic vector space model, that includes document annotation, instance-based weighting and concept-based ranking.
  • 28. Semantic digital library • General architecture
  • 29. Semantic digital library • Involved three ontologies: ACM Topic hierarchies, Geo ontology and Dublin Core metadata • Portion of domain ontology focusing on academic thesis
  • 30. Semantic digital library • Document annotation
  • 31. Semantic digital library • The process
  • 32. VSM Index
    # create Class Person
    # create instance of Class Student
    <Student rdf:ID="Student1">
      <rdfs:label>ArifahAlhadi</rdfs:label>
    </Student>
    <Student rdf:ID="Student2">
      <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">AsyrafArifin</rdfs:label>
    </Student>
    # create instance of Class Supervisor
    <Supervisor rdf:ID="Supervisor1">
      <rdfs:label>PM Dr ShahrulAzman</rdfs:label>
      <rdfs:label>Prof. MadyaDr. ShahrulAzmanMohdNoah</rdfs:label>
    </Supervisor>
    <Supervisor rdf:ID="Supervisor2">
      <rdfs:label>Prof Aziz Deraman</rdfs:label>
    </Supervisor>

    Concept | Instance | Documents
    http://www.ukm.my/thesis/supervisor#, http://www.ukm.my/thesis/person# | Supervisor1 | Doc1
    http://ukm.my/thesis/student#, http://ukm.my/thesis/creator#, http://ukm.my/thesis/person# | Student1 | Doc1
    http://ukm.my/thesis/student#, http://ukm.my/thesis/creator#, http://ukm.my/thesis/person# | Student2 | Doc1

    Id | Term | TFIDF | Frq | Doc Id
    1 | ArifahAlhadi | 0.11 | 2 | Doc1
    2 | AsyrafArifin | 0.123 | 1 | Doc1
    3 | PMDr ShahrulAzman | 0.45 | 1 | Doc1
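The slide lists a TFIDF weight per instance label but does not give the formula used. The sketch below shows one common way such instance-based weights could be computed, counting every label of an instance as an occurrence of that instance. The smoothed form `freq * log(1 + N / df)` and the sample document texts are assumptions, so the numbers it produces are not the ones on the slide.

```python
import math

# Instance labels copied from the RDF fragment on the slide.
INSTANCE_LABELS = {
    "Student1": ["ArifahAlhadi"],
    "Supervisor1": ["PM Dr ShahrulAzman", "Prof. MadyaDr. ShahrulAzmanMohdNoah"],
}


def instance_frequency(text, labels):
    """Count occurrences of any label of one instance in a document text."""
    return sum(text.count(label) for label in labels)


def instance_weights(docs, instance_labels):
    """Return {(instance, doc_id): weight} using an assumed TF-IDF variant."""
    n_docs = len(docs)
    weights = {}
    for instance, labels in instance_labels.items():
        freqs = {d: instance_frequency(t, labels) for d, t in docs.items()}
        doc_freq = sum(1 for f in freqs.values() if f > 0)
        for doc_id, freq in freqs.items():
            if freq > 0:
                # Assumed weighting: frequency times smoothed inverse doc frequency.
                weights[(instance, doc_id)] = freq * math.log(1 + n_docs / doc_freq)
    return weights


# Hypothetical document texts, for illustration only.
docs = {
    "Doc1": "Thesis by ArifahAlhadi, supervised by PM Dr ShahrulAzman",
    "Doc2": "Technical report by ArifahAlhadi",
}
for key, weight in sorted(instance_weights(docs, INSTANCE_LABELS).items()):
    print(key, round(weight, 3))
```

Because an instance can carry several labels, matching on the instance rather than on a single string lets differently written names contribute to the same weight.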
  • 33. Ontology-based IR for crime news retrieval • Each crime news item must be classified into categories: Traffic Violation, Theft, Sex Crime, Murder, Kidnap, Fraud, Drugs, Cybercrime, Arson and Gang (Chen et al. 2004) • Useful entities need to be identified: Person, Location, Organisation, Date/Time, Weapon, Amount, Vehicle, Drug, Personal properties, and Age. • Clustering of crime news into topics, e.g. Nurin Jazlin murder, Canny Ong, Sosilawati etc. • Clustering of a specific topic into various chronological events. • Mapping of named entities into the news ontology to support semantic querying and retrieval.
  • 34. Example [Figure: Classification of news into categories (Murder, Kidnap, Theft, Gang, …), then Clustering into topics (Nurin Jazlin, Sosilawati, Canny Ong), then clustering of a topic into events. Events shown for the Canny Ong case: investigation, including medical report and trial; evidence/suspect; DNA test; family reacts and negligence suit; court sentence, plead guilty. Cluster sizes shown: (17) (6) (3) (9) (13)]
  • 35. Required methods • In order to support the aforementioned requirements: – Conventional text processing: tokenizing, indexing, stopping, stemming etc. – Named entity recognition (NER) – Classification and clustering – Ontology mapping
  • 36. PRE-PROCESSING TASK + DOCUMENT REPRESENTATION + DOCUMENT ORGANIZATION. Pre-processing task: • Stopword removal • Stemming • Parsing • Indexing. Document representation: • Bag of words • Named entity recognition. Document organization: • Classification: AdaBoost • Clustering: KNN • Semantic mapping
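The pre-processing and bag-of-words steps named above can be sketched in a few lines. This is a simplified illustration, not the project's pipeline: the stopword list is a tiny assumed sample and `naive_stem` is a crude suffix stripper standing in for a real stemmer such as Porter's.

```python
import re
from collections import Counter

# A tiny assumed stopword list; real systems use a much longer one.
STOPWORDS = {"the", "a", "an", "of", "in", "was", "and", "to"}


def naive_stem(token):
    """Strip one common suffix; a stand-in for a proper stemming algorithm."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def bag_of_words(text):
    """Tokenize, drop stopwords, stem, and count the remaining terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(naive_stem(t) for t in tokens if t not in STOPWORDS)


print(bag_of_words("The suspect was arrested in the robbery and the suspects fled"))
```

The resulting counts are the raw material for the vector space representation; named entity recognition would add typed entities on top of these plain terms.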
  • 37. Document representation • Documents will be represented in meaningful forms: – BoW: Bag of Words – Named Entity Recognition: used the GATE ANNIE and JAPE rules – Adopt the Vector Space Model (VSM), enhanced with an ontological model
  • 39. Document organization • Documents need to be organised into categories, topics and events. – Classification: AdaBoost algorithm – Clustering: used the KNN clustering – Ontology mapping: we have developed a crime news ontology by extending the existing SNaP ontology. It includes classes/entities which are important to crime, such as classification of crimes, location and weapon.
  • 40. [Figure: Asset ontology and Event ontology]
  • 41. Extending the SNaP ontology and mapping to entities in news documents [Figure: SNaP classes (pne:Event, pna:Asset, pns:Stuff, pns:Tangible, pns:Location, pns:Organization, pns:Person, event:Event, pnt:Tag, pnc:Classifiable, pnc:Classification) and the Crime extension (pns:Weapon, pns:Vehicle), with instances <Murder>, <Kidnap> and <Event 1>. Edges are labelled rdfs:subClassOf, rdf:type, rdfs:domain and rdfs:range; properties shown are pne:subeventOf and pnc:isClassifiedBy]
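How subclass links in such an extended ontology support semantic querying can be sketched with plain triples. The class names come from the figure, but the specific subclass edges and the `ex:knife1` instance below are assumptions made for illustration, since the figure's exact edges are not recoverable from the transcript.

```python
# Assumed triples: the subclass chain and the instance are hypothetical.
TRIPLES = [
    ("pns:Weapon", "rdfs:subClassOf", "pns:Tangible"),
    ("pns:Tangible", "rdfs:subClassOf", "pns:Stuff"),
    ("ex:knife1", "rdf:type", "pns:Weapon"),
]


def superclasses(cls, triples):
    """Collect all direct and indirect superclasses of a class."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "rdfs:subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def instances_of(cls, triples):
    """Return instances typed with cls or with any of its subclasses."""
    result = set()
    for subject, predicate, obj in triples:
        if predicate == "rdf:type" and (obj == cls or cls in superclasses(obj, triples)):
            result.add(subject)
    return result


print(sorted(superclasses("pns:Weapon", TRIPLES)))
print(sorted(instances_of("pns:Stuff", TRIPLES)))
```

A query for the general class returns entities annotated only with a more specific class, which is the behaviour that mapping named entities into the news ontology is meant to enable.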
  • 42. The Application • What we need/desire.
  • 43. Ontology-based Image Retrieval • The rapid growth of visual information (VI) makes it difficult to find and access VI. • Existing approaches are unable to capture the semantic content. • The problem arises from a lack of coincidence between the information extracted from VI and user needs. • Conventional approaches to image retrieval (IMR), TBIR and CBIR, have reached their limit in attempting to solve this problem. • As a result, the ontology-based SBIR approach provides explicit, domain-oriented semantics for concepts and relationships.
  • 44. Ontology-based Image Retrieval • Illustrates how images are described based on their visual, textual and domain semantic features. • Proposes a multi-modality ontology: visual ontology, textual ontology and domain ontology. • Illustrates how such an ontology can be integrated with an open source knowledge base (DBpedia) to support a more comprehensive search.
  • 47. Example of Multi-modality ontology with DBpedia
  • 48. Conclusion: Practical implementation of ontology-based IR [Figure: components and flows labelled TBox, ABox, Ontology, Documents, Index, Extraction, build, Population, Annotation, Query Processing, query, ranked docs]
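The last step of the figure, a query going in and ranked docs coming out, is commonly realised in a vector space model with cosine similarity. The sketch below assumes documents and queries are already represented as concept-weight vectors; the concept names, weights and function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(key, 0.0) for key, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs):
    """Return ids of documents with non-zero similarity, best match first."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by doc id
    return [doc_id for score, doc_id in scored if score > 0]


# Hypothetical concept vectors for three documents.
doc_vecs = {
    "Doc1": {"Murder": 1.0, "Weapon": 1.0},
    "Doc2": {"Theft": 1.0},
    "Doc3": {"Murder": 1.0},
}
print(rank({"Murder": 1.0}, doc_vecs))
```

Swapping the dictionary keys from plain terms to ontology concepts or instances is what turns this conventional ranking into the ontology-enhanced variant discussed in the talk.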
  • 49. Research issues • Index representation: most are still based on the conventional VSM. • Ranking: weighting and ranking mechanisms • Automatic population: supervised and unsupervised • Extraction & annotation • Multilingual and cross-language
  • 50. References • Castells, P., Fernandez, M., Vallet, D. 2007. An Adaptation of Vector Space Model for Ontology Based Information Retrieval. IEEE Transaction on Knowledge and Data Engineering, 19(2): • Shahrul Azman Noah, Nor Afni Raziah Alias, Nurul Aida Osman, Zuraidah Abdullah, Nazlia Omar, Yazrina Yahya, Maryati Mohd Yusof: Ontology-Driven Semantic Digital Library. AIRS 2010: 141-150. • Shahrul Azman Noah, Datul Aida Ali: The Role of Lexical Ontology in Expanding the Semantic Textual Content of On-Line News Images. AIRS 2010: 193-202. • Fernández, M., Cantador, I., López, V., Vallet, D., Castells, P., & Motta, E. 2011. Semantically enhanced information retrieval: an ontology-based approach. Web Semantics: Science, Services and Agents on the World Wide Web, 9: 434-452. • Kara, S., Alan, O., Sabuncu, O., Akpınar, S., Cicekli, N.K., & Alpaslan, F.N. 2012. An ontology-based retrieval system using semantic indexing. Information Systems, 37: 294-305. • Kohler, J., Philippi, S., Specht, M., & Ruegg, A. 2006. Ontology based text indexing and querying for the semantic web. Knowledge-Based Systems, 19: 744-754. • Etc.
  • 52. Watson: the science behind an answer
  • 53. Group members: 1. Shahrul Azman Mohd. Noah 2. Juhana Salim 3. Masnizah Mohd 4. Nazlia Omar 5. Mohd Juzaiddin Ab Aziz 6. Nazlena Mohamad Ali 7. Saidah Saad 8. Shereena Mohd Arif 9. Lailaltulqadri Zakaria 10. Sabrina Tiun 11. Maryati Mohd. Yusof