Using Semantics and Statistics to
Turn Data into Knowledge
Based on: Pujara, Jay, et al. "Using Semantics and Statistics to Turn Data into Knowledge." AI
Magazine 36.1 (2015): 65-74.
And
Pujara, Jay, et al. "Knowledge graph identification." International Semantic Web Conference.
Springer Berlin Heidelberg, 2013.
Presented By: Sarasi Lalithsena
Semantic-Cognitive-Perceptual Computing Class -
Summer 2016
Knowledge graphs (KGs)
A graph representation of facts in which entities are connected by relationships
Image credit: http://searchengineland.com/laymans-visual-guide-googles-knowledge-graph-search-api-241935
Examples: Google Knowledge Graph; NELL (Mitchell, T., et al. AAAI 2015)
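As a minimal sketch of the definition above, a KG can be held as a list of (subject, relation, object) triples plus an index of outgoing edges. The facts and the `outgoing` index are illustrative assumptions, not data from NELL or the Google Knowledge Graph.

```python
from collections import defaultdict

# Hypothetical facts stored as (subject, relation, object) triples.
triples = [
    ("Kyrgyzstan", "LocatedIn", "Asia"),
    ("Bishkek", "CapitalOf", "Kyrgyzstan"),
    ("Kyrgyzstan", "type", "country"),
]

# Index outgoing edges by subject so an entity's facts can be looked up directly.
outgoing = defaultdict(list)
for subj, rel, obj in triples:
    outgoing[subj].append((rel, obj))

print(outgoing["Kyrgyzstan"])
```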
Automatic Knowledge Graph Construction
Existing work on KG construction falls broadly into these groups:
• systems built on Wikipedia infoboxes and other structured data
sources: YAGO, DBpedia, Freebase, Wikidata
• systems that extract information from the entire web using a
fixed ontology/schema: NELL, Knowledge Vault
• systems that extract information from the entire web without a
schema: ReVerb, OLLIE
• systems that construct taxonomies: Probase
Automatic Knowledge Graph Extraction
Jay Pujara, Hui Miao, Lise Getoor, William Cohen, "Knowledge Graph Identification", Research Talk ISWC 2013
Challenge
Examples of NELL errors
• Entity co-reference errors
• Missing and incorrect types (labels)
• Missing and incorrect relations
Violations of schema knowledge
• Equivalence of co-referent entities (owl:sameAs)
– sameEntity(Kyrgyzstan, Kyrgyz Republic)
• Mutual exclusion of types (disjoint)
– MUT(country, bird)
• Constraint on relations (domain and range)
– LocatedIn(country, continent)
Requires reasoning jointly over the candidates
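The three constraint types above can be checked mechanically against a set of candidate facts. The following is a minimal sketch, not the authors' method: the candidate facts, the tiny schema, and the `find_violations` helper are all made up for illustration, and the code only flags conflicts rather than resolving them jointly.

```python
# Hypothetical candidate extractions (illustrative data, not from NELL).
labels = {("Kyrgyzstan", "country"), ("Kyrgyzstan", "bird"), ("Asia", "continent")}
relations = {("Kyrgyz Republic", "LocatedIn", "Asia")}

# Schema knowledge in the three forms listed above.
same_entity = {("Kyrgyzstan", "Kyrgyz Republic")}
mutex = {("country", "bird")}
domain_range = {"LocatedIn": ("country", "continent")}


def find_violations(labels, relations, same_entity, mutex, domain_range):
    """Return human-readable descriptions of violated constraints."""
    # Copy labels across co-referent entities (sameEntity is symmetric).
    merged = set(labels)
    for a, b in same_entity:
        for entity, label in labels:
            if entity == a:
                merged.add((b, label))
            elif entity == b:
                merged.add((a, label))

    violations = []
    # Mutual exclusion: an entity may not carry two disjoint types.
    for entity in sorted({e for e, _ in merged}):
        types = {l for e, l in merged if e == entity}
        for t1, t2 in mutex:
            if t1 in types and t2 in types:
                violations.append(f"MUT({t1}, {t2}) violated by {entity}")
    # Domain and range: subject and object must carry the required types.
    for subj, rel, obj in sorted(relations):
        if rel not in domain_range:
            continue
        dom, rng = domain_range[rel]
        if (subj, dom) not in merged:
            violations.append(f"domain of {rel} violated by {subj}")
        if (obj, rng) not in merged:
            violations.append(f"range of {rel} violated by {obj}")
    return violations


violations = find_violations(labels, relations, same_entity, mutex, domain_range)
for v in violations:
    print(v)
```

In this toy data the relation only passes its domain check because the `country` label is carried over from the co-referent entity, which is one reason the candidates have to be reasoned about together.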
Problem Revisited
Approach – In a nutshell
Approach – Probabilistic Soft Logic (PSL)
Statistical learning approach
• Captures both the structure of the knowledge graph and the
logical dependencies between the facts
• Unlike traditional reasoning systems, it can treat ontological
constraints as weighted rules, using them as hints rather than
hard requirements
• Models are specified using predicates and rules written in
first-order logic syntax and are translated into a probabilistic
graphical model
Approach: Probabilistic Soft Logic
• A PSL model is composed of a set of weighted first-order logic
rules, where each rule defines a set of features
• PSL associates a truth value with each ground rule
Probabilistic Soft Logic Rule
w is the weight of the rule
Approach: Probabilistic Soft Logic
• Fact extraction can be done with multiple extractors, e.g.
structural elements and pattern-based classifiers
W_CR-T : CandREL_T(E1, E2, R) => REL(E1, E2, R)
W_CL-T : CandLBL_T(E, L) => LBL(E, L)
Each fact generated by an extractor T carries a weight
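To see how the weighted candidate rules above pull a fact's truth value toward the extractors' evidence, here is a minimal sketch. It assumes the implication is scored with the Lukasiewicz relaxation that PSL uses, under which the distance to satisfaction of `body => head` is `max(0, body - head)`. The extractor names, confidences, and weights are invented for illustration.

```python
# Hypothetical extractor confidences for one candidate fact,
# i.e. CandREL_T(Kyrgyzstan, Asia, LocatedIn) for each extractor T.
candidate_confidence = {"structural": 0.9, "pattern": 0.6}
# Hypothetical rule weights W_CR-T, one per extractor.
rule_weight = {"structural": 2.0, "pattern": 1.0}


def implication_distance(body: float, head: float) -> float:
    """Distance to satisfaction of body => head (Lukasiewicz relaxation)."""
    return max(0.0, body - head)


def candidate_penalty(rel_truth: float) -> float:
    """Total weighted distance of all candidate rules for a proposed REL truth value."""
    return sum(
        rule_weight[t] * implication_distance(conf, rel_truth)
        for t, conf in candidate_confidence.items()
    )


for rel_truth in (0.0, 0.6, 0.9, 1.0):
    print(rel_truth, round(candidate_penalty(rel_truth), 3))
```

The penalty falls to zero once the proposed truth value reaches the highest extractor confidence. Other rules, such as schema constraints, can push in the opposite direction.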
Approach: Probabilistic Soft Logic
• Incorporate co-referent entities
Uses a soft-logic formulation
Truth values are relaxed to the interval [0,1]
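The relaxation can be sketched with the Lukasiewicz operators, which is how PSL extends Boolean connectives to [0,1]. The example applies them to a co-reference rule of the shape SameEnt(E1, E2) AND Lbl(E1, L) => Lbl(E2, L). That rule shape and all truth values below are assumptions for illustration, since the rule image on the slide did not survive extraction.

```python
def soft_and(a: float, b: float) -> float:
    """Lukasiewicz conjunction."""
    return max(0.0, a + b - 1.0)


def soft_or(a: float, b: float) -> float:
    """Lukasiewicz disjunction."""
    return min(1.0, a + b)


def soft_not(a: float) -> float:
    """Lukasiewicz negation."""
    return 1.0 - a


def soft_implies(body: float, head: float) -> float:
    """Truth value of body => head, i.e. soft_or(soft_not(body), head)."""
    return min(1.0, 1.0 - body + head)


# Hypothetical truth values for one grounding of the co-reference rule.
same_ent = 0.8  # SameEnt(Kyrgyzstan, Kyrgyz Republic)
lbl_e1 = 0.9    # Lbl(Kyrgyzstan, country)
lbl_e2 = 0.4    # Lbl(Kyrgyz Republic, country)

body = soft_and(same_ent, lbl_e1)
truth = soft_implies(body, lbl_e2)
distance = 1.0 - truth  # distance to satisfaction of this ground rule

print(round(body, 3), round(truth, 3), round(distance, 3))
```

On Boolean inputs (0 or 1) these operators agree with ordinary logic. In between, a rule can be partially satisfied.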
Pujara, Jay, et al. "Using Semantics and Statistics to Turn Data into Knowledge." AI Magazine 36.1 (2015): 65-74.
Approach: Probabilistic Soft Logic
• Incorporate schema constraints
Jiang et al., ICDM 2012
Approach: Putting it all together
Pujara, Jay, et al. "Using Semantics and Statistics to Turn Data into Knowledge." AI Magazine 36.1 (2015): 65-74.
Represent the KG as a graphical model: each possible fact is a variable, and
dependencies exist between facts
Approach: Putting it all together
• Each ground rule has a weighted distance to satisfaction
derived from the formula's truth value
• Out of all possible KGs, it finds the best KG using the joint
distribution
• Uses a convex objective function to keep inference scalable
Rule satisfaction; weighted distance
Joint probability distribution over all variables in a KG
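The equations these labels referred to did not survive extraction. As a hedged reconstruction, the form below is the one commonly given for PSL in the literature. The notation is an assumption and may differ from the slide: I is an interpretation (an assignment of truth values in [0,1] to all candidate facts), R is the set of ground rules, w_r and \phi_r are a rule's weight and distance to satisfaction, Z is the normalizing constant, and p is the exponent.

```latex
% Distance to satisfaction of a ground rule r : body => head
\phi_r(I) = \max\{0,\; I(r_{\mathrm{body}}) - I(r_{\mathrm{head}})\}

% Joint distribution over interpretations I
P(I) = \frac{1}{Z} \exp\Big[-\sum_{r \in R} w_r \,\phi_r(I)^{p}\Big], \qquad p \in \{1, 2\}

% Most probable KG: maximizing P(I) is minimizing the weighted distances
I^{*} = \arg\max_{I} P(I) = \arg\min_{I} \sum_{r \in R} w_r \,\phi_r(I)^{p}
```

Under the Lukasiewicz relaxation each \phi_r is a hinge over a linear function of the truth values, so the minimization is convex. This is what the scalability bullet refers to.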
Evaluation – NELL experiments
• Full KG from uncertain extractions
Baseline: NELL with ontology consistency
Compare this to the KG created with PSL
Running time: the PSL approach completes in 130 minutes for 4.3M facts
Conclusions
• Probabilistic soft logic is a promising tool for combining
statistics and semantics
• It works well for identifying an accurate KG from a noisy KG
Thank You !!!!!
