Towards an Empirical Semantic
Web Science: Knowledge Pattern
Extraction and Usage
           Andrea Nuzzolese
                    Ph.D. Student
            Università di Bologna
               STLab, ISTC-CNR
Outline




•   Empirical Semantic Web Science and Knowledge Patterns (KPs)

•   A possible methodology for making KPs emerge from the Web of
    Data

•   The work done so far in KP extraction

•   Evaluating KPs' efficacy through Exploratory Search




Does a Web science exist?


•   A science is usually applied to well-defined research objects
    ✦   The physical and biological sciences analyze the natural world and try to find
        microscopic laws that, extrapolated to the macroscopic realm, would
        generate the observed behavior

•   The Web is an engineered space created through formally
    specified languages and protocols

•   Web pages with their content and links are created by humans
    with a particular task governed by social conventions and laws

•   A Web science exists [Berners-Lee et al., 2006] and is oriented
    to:
    ✦   Growth of the engineered space;
    ✦   Human-web interaction patterns
What about a Web of Data science?

•   Linked Data offers a huge amount of data for empirical research




What are the research objects of the empirical
                SW science?




 •   The Semantic Web and Linked Data give us the chance to
     study empirically which patterns are used to organize and
     represent knowledge

 •   The research objects of the Semantic Web as an empirical science
     are Knowledge Patterns (KPs)




Knowledge Patterns




•   KPs are small, well-connected units of meaning, which are
    ✦   task-based
    ✦   well-grounded
    ✦   cognitively sound

•   KPs find their theoretical grounding in frames
    ✦   “… a frame is a data-structure for representing a stereotyped
        situation.” [Minsky 1975]
    ✦   “...the availability of global patterns of knowledge cuts down on non-determinacy
        enough to offset idiosyncratic bottom-up input that might otherwise be
        confusing.” [Beaugrande 1980]



An example of KP




Empirical Semantic Web and KPs




•   KPs emerge from the knowledge soup deriving from the Web

•   A methodology for KP extraction from the Web




KP extraction



•   The Web is populated by heterogeneous sources

•   We can classify sources into two categories
    ✦   Formal and semi-formal sources modeled by adopting a top-down approach
        ✴   e.g., foundational ontologies, frames, thesauri, etc.
    ✦   Non-formal sources modeled by adopting a bottom-up approach
        ✴   e.g., RDBs, Linked Data, Web pages, XML documents, etc.

•   Our KP extraction methodology is based on two complementary
    approaches
    ✦   A top-down approach
    ✦   A bottom-up approach


KP boundary




KP detection and discovery




•   The top-down approach aims to extract KPs that already
    exist in a formal or semi-formal structure
    ✦   Possible techniques: reengineering, refactoring based on association rules,
        key concept identification, ontology mapping, etc.

•   The bottom-up approach aims to discover or detect
    KPs from data
    ✦   Possible techniques: inductive techniques, machine learning, data mining,
        ontology mining, etc.




KP validation



•   The top-down and the bottom-up approaches concur in the
    validation of KPs

•   KP extraction is a matter of understanding how the world or
    specific domains have been described from different perspectives
    ✦   The perspective of domain experts, ontologists, etc., who try to
        formalize either the world or specific domains
    ✦   The perspective of users, data-entry operators, etc., who actually populate and
        manage the data that report facts about the world

•   For example, it would be cognitively relevant if an occurrence of
    a KP emerged from both the top-down and the bottom-up
    approach

KP extraction methodology




KP reengineering from FrameNet’s frames




•   FrameNet is a cognitively sound lexical knowledge base, which is
    grounded in a large corpus

•   FrameNet consists of a set of frames, which have frame elements,
    lexical units that pair words (lexemes) with frames, and relations
    to corpus elements
    ✦   Each frame can be interpreted as a class of situations




An example of frame




Using Semion for reengineering and
                refactoring FrameNet’s frame

(Figure: Semion reengineering and refactoring layers, labeled Structural Schema,
Linguistic Schema, Linguistic Data, Corpus Data and Referential Data)



FrameNet as LOD




FrameNet as KPs




KP discovery from Wikipedia links




•   Hypothesis
    ✦   the types of linked resources that occur most often for a certain type of
        resource constitute its KP
    ✦   since we expect that any cognitive invariance in explaining/describing things
        is reflected in the wikilink graph, discovered KPs are cognitively sound

•   Contribution
    ✦   a discovery procedure for Encyclopedic Knowledge Patterns (EKPs)
    ✦   184 EKPs published in OWL 2




Collecting paths from wikilinks

(Figure: the resources db:Mickey_Mouse and db:Minnie_Mouse, typed
dbpo:FictionalCharacter, and db:The_Walt_Disney_Company, typed dbpedia:Company.
Type hierarchies shown: dbpo:FictionalCharacter, dbpo:Person, owl:Thing and
dbpedia:Company, dbpedia:Organisation, owl:Thing.
Legend: dbpo:wikiPageWikiLink, rdf:type, rdfs:subClassOf, Path)
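A minimal sketch of how type-level paths might be collected from wikilinks. The resource and type names come from the slide; the specific links, the `dbpo:` prefix used for every class, and the choice to pair subject and object types level by level up the hierarchy are assumptions made for illustration, not the author's documented procedure.

```python
from collections import Counter

# Toy data modeled on the slide; real input would come from DBpedia.
# Most specific type of each resource (rdf:type).
RESOURCE_TYPE = {
    "db:Mickey_Mouse": "dbpo:FictionalCharacter",
    "db:Minnie_Mouse": "dbpo:FictionalCharacter",
    "db:The_Walt_Disney_Company": "dbpo:Company",
}

# Direct superclass of each type (rdfs:subClassOf); owl:Thing is the root.
SUPERCLASS = {
    "dbpo:FictionalCharacter": "dbpo:Person",
    "dbpo:Person": "owl:Thing",
    "dbpo:Company": "dbpo:Organisation",
    "dbpo:Organisation": "owl:Thing",
}

# Hypothetical wikilinks between resources (dbpo:wikiPageWikiLink).
WIKILINKS = [
    ("db:Mickey_Mouse", "db:Minnie_Mouse"),
    ("db:Mickey_Mouse", "db:The_Walt_Disney_Company"),
    ("db:Minnie_Mouse", "db:The_Walt_Disney_Company"),
]


def type_chain(resource):
    """Return the resource's type followed by its superclasses up to the root."""
    chain = []
    current = RESOURCE_TYPE.get(resource)
    while current is not None:
        chain.append(current)
        current = SUPERCLASS.get(current)
    return chain


def collect_paths(wikilinks):
    """Count (subject type, object type) paths, pairing types level by level."""
    paths = Counter()
    for subject, obj in wikilinks:
        for s_type, o_type in zip(type_chain(subject), type_chain(obj)):
            paths[(s_type, o_type)] += 1
    return paths


paths = collect_paths(WIKILINKS)
for path, count in paths.most_common():
    print(path, count)
```

Each wikilink contributes one path at the most specific type level and one at each more general level, so the same link is visible both as FictionalCharacter to Company and as Person to Organisation.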
Path popularity


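(Figure: graph of wikilinked music resources, including Dave_Grohl, Nirvana,
Foo Fighters, Michael_Jackson, Jackson_5, Jackie_Jackson, Madonna, Prince,
Charlie_Parker, Keith_Jarrett, Beatles, John_Lennon and Paul_McCartney)

pathPopularity(Pi,j, Si) = nSubjectRes(Pi,j) / nRes(Si)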
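The ratio nSubjectRes(Pi,j)/nRes(Si) can be sketched in a few lines. This assumes that nSubjectRes counts the distinct subject resources taking part in a path and that nRes counts all resources of the subject type. The type names (`MusicalArtist`, `Band`) and the links below are invented for illustration and are not real DBpedia data.

```python
from collections import defaultdict


def path_popularity(links, resource_type):
    """Return {(subject_type, object_type): popularity}.

    popularity = nSubjectRes(P_ij) / nRes(S_i): distinct subject resources
    that take part in the path, divided by all resources of the subject type.
    """
    subjects_per_path = defaultdict(set)
    for subject, obj in links:
        path = (resource_type[subject], resource_type[obj])
        subjects_per_path[path].add(subject)

    resources_per_type = defaultdict(int)
    for r_type in resource_type.values():
        resources_per_type[r_type] += 1

    return {
        path: len(subjects) / resources_per_type[path[0]]
        for path, subjects in subjects_per_path.items()
    }


# Hypothetical toy graph using names from the slide.
resource_type = {
    "Dave_Grohl": "MusicalArtist",
    "Paul_McCartney": "MusicalArtist",
    "John_Lennon": "MusicalArtist",
    "Keith_Jarrett": "MusicalArtist",
    "Nirvana": "Band",
    "Foo_Fighters": "Band",
    "Beatles": "Band",
}
links = [
    ("Dave_Grohl", "Nirvana"),
    ("Dave_Grohl", "Foo_Fighters"),
    ("Paul_McCartney", "Beatles"),
    ("John_Lennon", "Beatles"),
    ("John_Lennon", "Paul_McCartney"),
]

popularity = path_popularity(links, resource_type)
print(popularity)
```

In this toy graph three of the four artists link to at least one band, so the artist-to-band path has popularity 0.75, however many bands each artist links to.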



Boundaries of KPs




•   A KP(Si) is a set of paths, such that


                  Pi,j ∈ KP(Si) ⇔ pathPopularity(Pi,j, Si) ≥ t



•   t is a threshold, under which a path is not included in a KP

•   How to get a good value for t?
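Whatever value is chosen for t, the membership rule itself is simple. The sketch below assumes popularity values are already available as a dictionary keyed by (subject type, object type). The types and numbers are hypothetical.

```python
def knowledge_pattern(popularity, subject_type, threshold):
    """Return the paths of subject_type whose popularity is at least threshold."""
    return {
        path
        for path, value in popularity.items()
        if path[0] == subject_type and value >= threshold
    }


# Hypothetical popularity values; not taken from the source.
popularity = {
    ("MusicalArtist", "Band"): 0.75,
    ("MusicalArtist", "Album"): 0.60,
    ("MusicalArtist", "City"): 0.10,
    ("Band", "MusicalArtist"): 0.90,
}

kp = knowledge_pattern(popularity, "MusicalArtist", threshold=0.5)
print(sorted(kp))
```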



Boundary induction


Step   Description

 1     For each path, calculate the path popularity

 2     For each subject type, get the 40 top-ranked path popularity values*

 3     Apply multiple correlation (Pearson ρ) between the paths of all
       subject types by rank, and check for homogeneity of ranks
       across subject types

 4     For each of the 40 path popularity ranks, calculate its mean
       across all subject types

 5     Apply k-means clustering on the 40 ranks

 6     Decide threshold(s) based on k-means as well as other
       indicators (e.g. FrameNet roles distribution)
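Steps 2, 4 and 5 can be sketched with the standard library alone. This is an illustration under stated assumptions, not the author's implementation: the popularity values are invented, only 6 ranks are used instead of 40 to keep the example short, the k-means is a plain one-dimensional Lloyd iteration with evenly spread initial centroids, and the "candidate threshold" (the lowest mean in the highest cluster) is only one possible reading of step 6. Step 3 (the correlation check) is omitted.

```python
from statistics import mean


def top_ranked(values, n):
    """Step 2: the n highest path popularity values, in descending order."""
    return sorted(values, reverse=True)[:n]


def mean_by_rank(ranked_by_type):
    """Step 4: mean popularity at each rank across all subject types."""
    return [mean(column) for column in zip(*ranked_by_type.values())]


def kmeans_1d(values, k, iterations=100):
    """Step 5: Lloyd's k-means on scalars (requires k >= 2).

    Returns one cluster index per value plus the final centroids.
    """
    ordered = sorted(values)
    # Spread the initial centroids evenly over the sorted values.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        updated = []
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical popularity values per subject type.
popularity_by_type = {
    "MusicalArtist": [0.90, 0.80, 0.75, 0.20, 0.10, 0.05, 0.01],
    "Film": [0.95, 0.85, 0.70, 0.25, 0.15, 0.05, 0.02],
}
TOP_N = 6  # the slide uses 40

ranked = {t: top_ranked(v, TOP_N) for t, v in popularity_by_type.items()}
means = mean_by_rank(ranked)
labels, centroids = kmeans_1d(means, k=2)

# Assumed reading of step 6: lowest mean that still falls in the top cluster.
top_cluster = max(range(len(centroids)), key=lambda c: centroids[c])
candidate_threshold = min(m for m, l in zip(means, labels) if l == top_cluster)
print(labels, candidate_threshold)
```

The clustering only proposes where the popularity curve drops. The slide leaves the final decision to a combination of k-means and other indicators, which this sketch does not model.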
Boundary induction




How can KPs be evaluated and used?




•   KPs should be evaluated on how cognitively sound they are
    in capturing and representing knowledge

•   One scenario for evaluating the efficacy of KPs
    is exploratory search combined with user studies




Why exploratory search?



•   Exploratory search is characterized “by uncertainty about the space
    being searched and the nature of the problem that motivates the
    search” [White et al., 2005]

•   KPs can be used for supporting exploratory search
    ✦   They can be used in order to filter knowledge by drawing a meaningful
        boundary around the retrieved data
    ✦   They make it possible to suggest exploratory paths based on cognitive criteria of
        relevance

•   We can investigate how KPs help users in exploratory search
    tasks
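The filtering role described above can be illustrated with a small sketch in which a KP is reduced to a set of object types and acts as a boundary over the entities linked from a resource. The function name `apply_lens`, the KP contents and the retrieved links are all hypothetical. How a tool such as Aemoo actually applies KPs is not specified in these slides.

```python
from collections import defaultdict


def apply_lens(kp_types, linked_entities):
    """Group linked entities by type, keeping only types inside the KP boundary."""
    view = defaultdict(list)
    for entity, entity_type in linked_entities:
        if entity_type in kp_types:
            view[entity_type].append(entity)
    return dict(view)


# Hypothetical KP for a musician and hypothetical retrieved links.
musician_kp = {"Band", "Album", "MusicGenre"}
retrieved = [
    ("Nirvana", "Band"),
    ("Seattle", "City"),
    ("Foo_Fighters", "Band"),
    ("Nevermind", "Album"),
]

view = apply_lens(musician_kp, retrieved)
print(view)
```

Entities whose type falls outside the pattern (here the city) are left out of the summary, while the remaining ones are grouped by type and can be offered as exploratory paths.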


Aemoo: KP-based exploratory search




•   A Web application that supports exploratory search on the Web
    based on KPs extracted from Wikipedia links

•   It aggregates knowledge from Linked Data, Wikipedia, Twitter and
    Google News by applying KPs as knowledge lenses over data

•   It provides an effective summary of knowledge about an entity,
    including explanations




Exploring knowledge with Aemoo (1)




Exploring knowledge with Aemoo (2)




Conclusions


•   We want to contribute to the realization of the Semantic Web as
    an empirical science by providing a methodology for KP
    extraction

•   Our methodology for extracting KPs is based on two approaches
    ✦   a top-down approach
    ✦   a bottom-up approach

•   We have presented our experience with KP extraction so far
    ✦   KPs from FrameNet’s frames
    ✦   KPs from Wikipedia links

•   The evaluation we have in mind should be performed by means of
    exploratory search tasks
    ✦   Aemoo
Thanks





Towards an Empirical Semantic Web Science: Knowledge Pattern Extraction and Usage

  • 1. Towards an Empirical Semantic Web Science: Knowledge Pattern Extraction and Usage Andrea Nuzzolese Ph.D. Student Università di Bologna STLab, ISTC-CNR
  • 2. Outline • Empirical Semantic Web Science and Knowledge Patterns (KPs) • A possible methodology for making KPs emerge from the Web of Data • The work done so far in KP extraction • Evaluating KPs' efficacy through Exploratory Search
  • 3. Does a Web science exist? • A science is usually applied to clear research objects ✦ The physical and biological sciences analyze the natural world and try to find microscopic laws that, extrapolated to the macroscopic realm, would generate the observed behavior • The Web is an engineered space created through formally specified languages and protocols • Web pages, with their content and links, are created by humans for a particular task, governed by social conventions and laws • A Web science exists [Berners-Lee et al., 2006] and is oriented to: ✦ Growth of the engineered space ✦ Human-Web interaction patterns
  • 4. What about a Web of Data science? • Linked Data offers a huge amount of data for empirical research
  • 5. What are the research objects of the empirical SW science? • The Semantic Web and Linked Data give us the chance to study empirically which patterns are used in organizing and representing knowledge • The research objects of the Semantic Web as an empirical science are Knowledge Patterns (KPs)
  • 6. Knowledge Patterns • KPs are small, well-connected units of meaning, which are ✦ task-based ✦ well-grounded ✦ cognitively sound • KPs find their theoretical grounding in frames ✦ “… a frame is a data-structure for representing a stereotyped situation.” [Minsky 1975] ✦ “...the availability of global patterns of knowledge cuts down on non-determinacy enough to offset idiosyncratic bottom-up input that might otherwise be confusing.” [Beaugrande 1980]
  • 8. Empirical Semantic Web and KPs • KPs emerge from the knowledge soup deriving from the Web • A methodology for KP extraction from the Web
  • 9. KP extraction • The Web is populated by heterogeneous sources • We can classify sources in two categories ✦ Formal and semi-formal sources modeled by adopting a top-down approach ✴ e.g., foundational ontologies, frames, thesauri, etc. ✦ Non-formal sources modeled by adopting a bottom-up approach ✴ e.g., RDBs, Linked Data, Web pages, XML documents, etc. • Our KP extraction methodology is based on two complementary approaches ✦ A top-down approach ✦ A bottom-up approach
  • 11. KP detection and discovery • The top-down approach aims to extract KPs that already exist in a formal or semi-formal structure ✦ Possible techniques: reengineering, refactoring based on association rules, key concept identification, ontology mapping, etc. • The bottom-up approach aims to discover or detect KPs from data ✦ Possible techniques: inductive techniques, machine learning, data mining, ontology mining, etc.
  • 12. KP validation • The top-down and the bottom-up approaches concur in the validation of KPs • KP extraction is a matter of understanding how the world or specific domains have been described from different perspectives ✦ The perspective of domain experts, ontologists, etc., who try to formalize either the world or specific domains ✦ The perspective of users, data-entry operators, etc., who actually populate and manage the data that report facts about the world • For example, it would be cognitively relevant if an occurrence of a KP emerged from both the top-down and the bottom-up approach
  • 14. KP reengineering from FrameNet’s frames • FrameNet is a cognitively sound lexical knowledge base, which is grounded in a large corpus • FrameNet consists of a set of frames, which have frame elements; lexical units, which pair words (lexemes) with frames; and relations to corpus elements ✦ Each frame can be interpreted as a class of situations
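The structure described on this slide can be sketched as a small data model. This is only an illustration, not FrameNet's actual schema or the output of Semion: the class and field names are assumptions, and the sample values use the well-known Commerce_buy frame purely as an example.

```python
from dataclasses import dataclass, field


@dataclass
class LexicalUnit:
    """Pairs a word (lexeme) with the frame it evokes."""
    lexeme: str
    pos: str  # part of speech, e.g. "v" for verb


@dataclass
class Frame:
    """A frame, read as a class of situations (hypothetical model)."""
    name: str
    frame_elements: list[str] = field(default_factory=list)
    lexical_units: list[LexicalUnit] = field(default_factory=list)

    def evoked_by(self, lexeme: str) -> bool:
        """True if one of the frame's lexical units uses this lexeme."""
        return any(lu.lexeme == lexeme for lu in self.lexical_units)


# Illustrative instance; the selection of elements and units is not exhaustive.
commerce_buy = Frame(
    name="Commerce_buy",
    frame_elements=["Buyer", "Goods", "Seller"],
    lexical_units=[LexicalUnit("buy", "v"), LexicalUnit("purchase", "v")],
)
```

Relations to corpus elements (annotated sentences) are left out of the sketch to keep it short.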
  • 15. An example of a frame
  • 16. Using Semion for reengineering and refactoring FrameNet’s frames [Figure labels: Structural Schema, Linguistic Schema, Linguistic Data, Corpus Data, Referential Data]
  • 19. KP discovery from Wikipedia links • Hypothesis ✦ the types of linked resources that occur most often for a certain type of resource constitute its KP ✦ since we expect that any cognitive invariance in explaining/describing things is reflected in the wikilink graph, discovered KPs are cognitively sound • Contribution ✦ an EKP discovery procedure ✦ 184 EKPs published in OWL2
  • 20. Collecting paths from wikilinks [Figure: the resources db:Mickey_Mouse and db:Minnie_Mouse (rdf:type dbpo:FictionalCharacter) and db:The_Walt_Disney_Company (rdf:type Company) are connected by dbpo:wikiPageWikiLink; rdfs:subClassOf edges lead from these types up to dbpo:Person, Organisation and owl:Thing; a Path connects a subject type to an object type]
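A minimal sketch of how such type paths could be collected, assuming a path is a (subject type, object type) pair obtained from a wikilink and generalized along rdfs:subClassOf. The toy triples below mirror the labels in the figure, but the exact links and the helper names `type_chain` and `collect_paths` are assumptions made for illustration.

```python
from collections import defaultdict

# Toy data (assumed): rdf:type assertions, rdfs:subClassOf edges, wikilinks.
TYPES = {
    "db:Mickey_Mouse": "dbpo:FictionalCharacter",
    "db:Minnie_Mouse": "dbpo:FictionalCharacter",
    "db:The_Walt_Disney_Company": "dbpo:Company",
}
SUPERCLASS = {
    "dbpo:FictionalCharacter": "dbpo:Person",
    "dbpo:Person": "owl:Thing",
    "dbpo:Company": "dbpo:Organisation",
    "dbpo:Organisation": "owl:Thing",
}
WIKILINKS = [
    ("db:Mickey_Mouse", "db:The_Walt_Disney_Company"),
    ("db:Minnie_Mouse", "db:The_Walt_Disney_Company"),
]


def type_chain(resource):
    """Return the asserted type of a resource followed by its superclasses."""
    chain = []
    current = TYPES.get(resource)
    while current is not None:
        chain.append(current)
        current = SUPERCLASS.get(current)
    return chain


def collect_paths(wikilinks):
    """Map each (subject type, object type) path to the subjects that use it."""
    paths = defaultdict(set)
    for subject, obj in wikilinks:
        for subject_type in type_chain(subject):
            for object_type in type_chain(obj):
                paths[(subject_type, object_type)].add(subject)
    return paths


paths = collect_paths(WIKILINKS)
```

Keeping the set of subject resources per path, rather than a bare count, makes it straightforward to count distinct subjects later when computing path popularity.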
  • 21. Path popularity • pathPopularity(Pi,j, Si) = nSubjectRes(Pi,j) / nRes(Si) [Figure: example resources Jackson_5, Dave_Grohl, Michael_Jackson, Jackie_Jackson, Nirvana, Madonna, Prince, Charlie_Parker, Keith_Jarrett, Foo Fighters, Beatles, John_Lennon, Paul_McCartney]
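The ratio on this slide can be illustrated with a short sketch. It assumes that nSubjectRes(Pi,j) counts the distinct resources of subject type Si that have at least one link along path Pi,j, and that nRes(Si) counts all resources of type Si; the slide itself does not spell this out. The resource names and object types are invented toy data.

```python
from collections import Counter


def path_popularities(linked_types_by_resource):
    """Compute nSubjectRes / nRes for every object type linked from one subject type.

    `linked_types_by_resource` maps each resource of the subject type to the
    object types it links to (an assumed input format).
    """
    n_res = len(linked_types_by_resource)
    if n_res == 0:
        return {}
    n_subject_res = Counter()
    for object_types in linked_types_by_resource.values():
        # Each resource counts once per path, however many links it has.
        n_subject_res.update(set(object_types))
    return {obj_type: count / n_res for obj_type, count in n_subject_res.items()}


# Four hypothetical resources of one subject type.
toy = {
    "res_a": ["Band", "Album", "Band"],
    "res_b": ["Band"],
    "res_c": ["City"],
    "res_d": [],
}
popularity = path_popularities(toy)
```

Resources with no outgoing links, such as `res_d`, still count in the denominator, so the popularity of every path stays between 0 and 1.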
  • 22. Boundaries of KPs • A KP(Si) is a set of paths such that Pi,j ∈ KP(Si) ⇔ pathPopularity(Pi,j, Si) ≥ t • t is a threshold below which a path is not included in a KP • How to get a good value for t?
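The membership condition above amounts to a simple filter. The sketch below assumes the path popularities for one subject type have already been computed; the values and the threshold 0.2 are arbitrary illustrations, not figures from the talk.

```python
def knowledge_pattern(popularity_by_path, t):
    """Return the paths whose popularity is at least the threshold t."""
    return {path for path, value in popularity_by_path.items() if value >= t}


# Hypothetical popularities for one subject type.
toy_popularity = {"Band": 0.5, "Album": 0.25, "City": 0.05}
kp = knowledge_pattern(toy_popularity, t=0.2)
```

Raising t shrinks the pattern toward its most typical paths, while lowering it admits rarer ones, which is why the choice of t matters.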
  • 23. Boundary induction
      Step 1: For each path, calculate the path popularity
      Step 2: For each subject type, get the 40 top-ranked path popularity values*
      Step 3: Apply multiple correlation (Pearson ρ) between the paths of all subject types by rank, and check for homogeneity of ranks across subject types
      Step 4: For each of the 40 path popularity ranks, calculate its mean across all subject types
      Step 5: Apply k-means clustering on the 40 ranks
      Step 6: Decide threshold(s) based on k-means as well as other indicators (e.g. FrameNet roles distribution)
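Steps 2, 4 and 5 can be sketched as follows. This is a simplified illustration under several assumptions: it uses 4 ranks instead of 40, a hand-written one-dimensional k-means with evenly spread initial centroids, and it takes the lowest value of the highest cluster as a candidate threshold. The correlation check of step 3 and the additional indicators of step 6 are omitted, and all numbers are invented.

```python
from statistics import mean


def top_ranked(popularities, n):
    """Step 2: the n highest path popularity values, in descending order."""
    return sorted(popularities, reverse=True)[:n]


def mean_by_rank(ranked_lists):
    """Step 4: mean popularity at each rank across all subject types."""
    n = min(len(ranked) for ranked in ranked_lists)
    return [mean(ranked[k] for ranked in ranked_lists) for k in range(n)]


def kmeans_1d(values, k, iterations=100):
    """Step 5: plain 1-D k-means (requires k >= 2), deterministic start."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        updated = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Invented popularity values for two subject types.
by_subject_type = {
    "TypeA": [0.05, 0.9, 0.1, 0.8, 0.01],
    "TypeB": [0.6, 0.1, 0.7, 0.05],
}
ranked = [top_ranked(values, n=4) for values in by_subject_type.values()]
rank_means = mean_by_rank(ranked)
centroids, clusters = kmeans_1d(rank_means, k=2)

# Assumed rule: the threshold is the lowest mean in the top cluster.
top_cluster = clusters[centroids.index(max(centroids))]
threshold = min(top_cluster)
```

With real data, the number of clusters and the way a threshold is read off them would need to be checked against the other indicators mentioned in step 6.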
  • 25. How can KPs be evaluated and used? • KPs should be evaluated in terms of their capability to be cognitively sound in capturing and representing knowledge • A scenario that can be used for evaluating the efficacy of KPs is exploratory search combined with user studies
  • 26. Why exploratory search? • Exploratory search is characterized “by uncertainty about the space being searched and the nature of the problem that motivates the search” [White et al., 2005] • KPs can be used for supporting exploratory search ✦ They can be used to filter knowledge by drawing a meaningful boundary around the retrieved data ✦ They make it possible to suggest exploratory paths based on cognitive criteria of relevance • We can investigate how KPs help users in exploratory search tasks
  • 27. Aemoo: KP-based exploratory search • A Web application that supports exploratory search on the Web based on KPs extracted from Wikipedia links • It aggregates knowledge from Linked Data, Wikipedia, Twitter and Google News by applying KPs as knowledge lenses over data • It provides an effective summary of knowledge about an entity, including explanations
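The idea of applying a KP as a lens over data can be sketched as a filter-and-group step. This is not Aemoo's implementation: the function name, the input format and the sample entities are assumptions made for illustration only.

```python
def apply_lens(kp_types, linked_entities):
    """Keep only linked entities whose type belongs to the KP, grouped by type.

    `kp_types` is the set of object types in the KP of the focus entity's type;
    `linked_entities` is a list of (entity, type) pairs (assumed format).
    """
    grouped = {}
    for entity, entity_type in linked_entities:
        if entity_type in kp_types:
            grouped.setdefault(entity_type, []).append(entity)
    return grouped


# Hypothetical KP and links for one focus entity.
kp_types = {"Band", "Album"}
links = [
    ("entity_1", "Band"),
    ("entity_2", "City"),
    ("entity_3", "Album"),
    ("entity_4", "Band"),
]
summary = apply_lens(kp_types, links)
```

Entities whose type falls outside the KP, such as `entity_2`, are left out of the summary rather than deleted from the underlying data, so they remain available for other views.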
  • 28. Exploring knowledge with Aemoo (1)
  • 29. Exploring knowledge with Aemoo (2)
  • 30. Conclusions • We want to contribute to the realization of the Semantic Web as an empirical science by providing a methodology for KP extraction • Our methodology for extracting KPs is based on two approaches ✦ a top-down approach ✦ a bottom-up approach • We have presented our experience in KP extraction so far ✦ KPs from FrameNet’s frames ✦ KPs from Wikipedia links • The evaluation we have in mind should be performed by means of exploratory search tasks ✦ Aemoo