Contextualized Knowledge Graph
from two perspectives
Semantic Web and Graph Database
with an application in PubChem
Presenter: Vinh Nguyen
What is a Knowledge Graph?
10/25/2018
What is a Contextualized Knowledge Graph?
A contextualized knowledge graph is a knowledge graph in which
every fact is qualified with a set of contextual properties.
Motivation Scenario
Facts:
Subject | Predicate | Object
Bob Dylan | marriedTo | Sara Lownds
Bob Dylan | marriedTo | Carolyn Dennis
Facts qualified with contextual properties:
Subject | Predicate | Object | Starts | Ends
Bob Dylan | marriedTo | Sara Lownds | 1965-11-22 | 1977-06-29
Bob Dylan | marriedTo | Carolyn Dennis | 1986-06-## | 1992-10-##
Meta Queries:
Query type | Sample query
Provenance | P1. Where is this fact from?
Provenance | P2. When was it created?
Provenance | P3. Who created this fact?
Time | T1. When did this fact occur?
Time | T2. What is the time span of this fact?
Time | T3. Which events happened in the same year?
Location | L1. What is the location associated with this fact?
Location | L2. Which events happened at the same place?
Certainty | C1. What is the author's confidence in this fact?
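As a minimal sketch of the definition above, a contextualized fact can be modeled as a triple plus a dictionary of contextual properties. The class name `ContextualizedFact` and the helper functions are hypothetical names chosen for this illustration, not part of the presentation; the two helpers correspond loosely to meta queries T2 and T3.

```python
from dataclasses import dataclass, field


@dataclass
class ContextualizedFact:
    """A triple plus its contextual properties (hypothetical model)."""
    subject: str
    predicate: str
    obj: str
    context: dict = field(default_factory=dict)


# The two facts from the motivation scenario; "##" marks an unknown day.
FACTS = [
    ContextualizedFact("Bob Dylan", "marriedTo", "Sara Lownds",
                       {"starts": "1965-11-22", "ends": "1977-06-29"}),
    ContextualizedFact("Bob Dylan", "marriedTo", "Carolyn Dennis",
                       {"starts": "1986-06-##", "ends": "1992-10-##"}),
]


def time_span(fact):
    """T2-style query: the (starts, ends) pair of a fact, or None if missing."""
    ctx = fact.context
    if "starts" in ctx and "ends" in ctx:
        return ctx["starts"], ctx["ends"]
    return None


def facts_starting_in_year(facts, year):
    """T3-style query: facts whose 'starts' value falls in the given year."""
    prefix = f"{year:04d}-"
    return [f for f in facts if f.context.get("starts", "").startswith(prefix)]
```

Provenance, location, and certainty queries would follow the same pattern with other keys in `context`.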
Contextualized Knowledge Graph
from
Semantic Web perspective
2973 datasets with 149 billion triples
Linked Data principles
1. Use URIs as names for things
2. Use HTTP URIs so that those names can be looked up
3. When a URI is looked up, provide useful information using the standards
4. Include links to other URIs so that more things can be discovered
Form of Triples: RDF Reification
Time-aware Facts:
Subject | Predicate | Object | Starts | Ends
Bob Dylan | marriedTo | Sara Lownds | 1965-11-22 | 1977-06-29
RDF Reification
Subject | Predicate | Object
#stmt1 | type | Statement
#stmt1 | hasSubject | BobDylan
#stmt1 | hasProperty | marriedTo
#stmt1 | hasObject | SaraLownds
BobDylan | marriedTo | SaraLownds
#stmt1 | starts | 1965-11-22
#stmt1 | ends | 1977-06-29
Pros:
1. Intuitive, easy to understand
Cons:
1. Takes 3N triples (4N if the Statement typing is included) to represent N statements => Not scalable
2. No formal semantics defined => Semantics is unclear
3. Discouraged in LOD!
RDF Reification vs. Singleton Property
Singleton Property
Subject | Predicate | Object
marriedTo#1 | rdf:sp | marriedTo
BobDylan | marriedTo#1 | SaraLownds
marriedTo#1 | starts | 1965-11-22
marriedTo#1 | ends | 1977-06-29
Vinh Nguyen, Olivier Bodenreider, and Amit Sheth. "Don't like RDF reification?: making statements about statements
using singleton property." In Proceedings of the 23rd international conference on World wide web, pp. 759-770. ACM,
2014.
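The difference in triple count between the two representations can be sketched as follows. This is an illustration only: the function names are hypothetical, the reification terms use the standard RDF vocabulary (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`) rather than the simplified labels in the table, and `rdf:sp` is kept as the slide's shorthand for the singleton-property link.

```python
def reify(subject, predicate, obj, metadata, stmt_id):
    """RDF reification: 4 triples describe the statement, then one per metadata item."""
    triples = [
        (stmt_id, "rdf:type", "rdf:Statement"),
        (stmt_id, "rdf:subject", subject),
        (stmt_id, "rdf:predicate", predicate),
        (stmt_id, "rdf:object", obj),
    ]
    triples += [(stmt_id, key, value) for key, value in metadata.items()]
    return triples


def singleton_property(subject, predicate, obj, metadata, index):
    """Singleton property: a unique predicate carries the fact and its metadata."""
    sp = f"{predicate}#{index}"
    triples = [
        (sp, "rdf:sp", predicate),  # links the singleton property to the generic one
        (subject, sp, obj),         # the fact itself, asserted with the unique predicate
    ]
    triples += [(sp, key, value) for key, value in metadata.items()]
    return triples


meta = {"starts": "1965-11-22", "ends": "1977-06-29"}
reified = reify("BobDylan", "marriedTo", "SaraLownds", meta, "#stmt1")
singleton = singleton_property("BobDylan", "marriedTo", "SaraLownds", meta, 1)
print(len(reified), len(singleton))
```

With two metadata values, the reified form needs six triples (seven if the original fact is also asserted, as in the table) while the singleton-property form needs four.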
Form of Triples: PaCE
Provenance-aware Facts:
Subject | Predicate | Object | Source | DateExtracted
Bob Dylan | marriedTo | Sara Lownds | wikipage:Bob_Dylan | 2009-06-07
Provenance Context Entity (PaCE)
Subject | Predicate | Object
BobDylan_wp | rdf:type | BobDylan
SaraLownds_wp | rdf:type | SaraLownds
BobDylan_wp | marriedTo | SaraLownds_wp
BobDylan_wp | hasSource | wiki:Bob_Dylan
BobDylan_wp | hasDateExt | 2009-06-07
Pros:
1. Saves ~50% of the triples compared to reification by reusing the repeated subject, predicate, and object
Cons:
1. Not intuitive, hard to understand
2. Limited expressiveness
Satya S. Sahoo, Olivier Bodenreider, Pascal Hitzler, Amit Sheth, and Krishnaprasad Thirunarayan. 2010. Provenance
context entity (PaCE): scalable provenance tracking for scientific RDF data. In Proceedings of the 22nd international
conference on Scientific and statistical database management (SSDBM'10),
PaCE vs. Singleton Property
Singleton Property
Subject | Predicate | Object
marriedTo#1 | rdf:sp | marriedTo
BobDylan | marriedTo#1 | SaraLownds
marriedTo#1 | hasSource | wp:Bob_Dylan
marriedTo#1 | hasDateExt | 2009-06-07
Form of Quadruples: Named Graph
Time-aware Facts:
Subject | Predicate | Object | Starts | Ends
Bob Dylan | marriedTo | Sara Lownds | 1965-11-22 | 1977-06-29
Named Graph
Subject | Predicate | Object | NG
BobDylan | marriedTo | SaraLownds | ng_1
ng_1 | starts | 1965-11-22 | Prov_graph
ng_1 | ends | 1977-06-29 | Prov_graph
Pros:
1. Intuitive: N named graphs are created for N sources
2. Metadata can be attached to a set of triples
3. Supported by SPARQL
Cons:
1. Defined for provenance only
2. Ambiguous semantics when different types of metadata are associated at the triple level
* Carroll, Jeremy J., et al. "Named graphs, provenance and trust." Proceedings of the 14th international conference on World Wide Web. ACM, 2005.
Named Graph vs. Singleton Property
Singleton Property
Subject | Predicate | Object
marriedTo#1 | rdf:sp | marriedTo
BobDylan | marriedTo#1 | SaraLownds
marriedTo#1 | starts | 1965-11-22
marriedTo#1 | ends | 1977-06-29
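The named-graph form can be sketched with plain tuples, which also makes the granularity issue visible. This is a hedged illustration: it assumes both metadata quads describe the graph `ng_1` that holds the fact, and `metadata_for` is a hypothetical helper, not an API of any RDF store.

```python
# Quads are (subject, predicate, object, graph).
QUADS = [
    ("BobDylan", "marriedTo", "SaraLownds", "ng_1"),
    ("ng_1", "starts", "1965-11-22", "Prov_graph"),
    ("ng_1", "ends", "1977-06-29", "Prov_graph"),
]


def metadata_for(quads, subject, predicate, obj, meta_graph="Prov_graph"):
    """Collect metadata attached to the named graph(s) containing the given triple."""
    graphs = {g for s, p, o, g in quads if (s, p, o) == (subject, predicate, obj)}
    return {p: o for s, p, o, g in quads if g == meta_graph and s in graphs}


print(metadata_for(QUADS, "BobDylan", "marriedTo", "SaraLownds"))
```

Because the metadata is attached to the graph rather than to the triple, any other triple placed in `ng_1` would inherit the same `starts` and `ends` values, which is why per-triple metadata needs one named graph per triple.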
Form of Quintuples: RDF+
Facts and Temporal Information:
Subject | Predicate | Object | Starts | Ends
Bob Dylan | marriedTo | Sara Lownds | 1965-11-22 | 1977-06-29
RDF+:
Subject | Predicate | Object | Meta Property | Meta Value
Bob Dylan | marriedTo | Sara Lownds | starts | 1965-11-22
Bob Dylan | marriedTo | Sara Lownds | ends | 1977-06-29
Cons:
1. The representation is not in the form of RDF. Statement identifiers are used internally, so mappings from RDF to RDF+ and vice versa are required.
2. The SPARQL query syntax and semantics need to be extended to support RDF+.
* Dividino, Renata, et al. "Querying for provenance, trust, uncertainty and other meta knowledge in RDF." Web Semantics: Science, Services and Agents on the World Wide Web 7.3 (2009): 204-219.
Experiment: BKR with Provenance
• Five data sets generated from the same seed BKR
  ▪ Singleton Property (SP)
  ▪ Reification (R)
  ▪ PaCE C1 (C1)
  ▪ PaCE C2 (C2)
  ▪ PaCE C3 (C3)
All datasets are available at http://wiki.knoesis.org/index.php/Singleton_Property
Experiment Results
(A) Random-value queries vs. fixed-value queries, in msec.
(B) Query length and execution time, in msec.
External Evaluation
• Gang Fu, Evan Bolton, Núria Queralt Rosinach, Laura I Furlong, Vinh Nguyen, Amit Sheth, Olivier Bodenreider, Michel Dumontier. Exposing provenance metadata using different RDF models. In Proceedings of Semantic Web Applications and Tools for Life Science (SWAT4LS), 2016. https://pubchem.ncbi.nlm.nih.gov/
• Hernández, Daniel, Aidan Hogan, and Markus Krötzsch. "Reifying RDF: What works well with wikidata?" SSWS@ISWC 1457 (2015): 32-47.
• Frey, Johannes, Kay Müller, Sebastian Hellmann, Erhard Rahm, and Maria-Esther Vidal. "Evaluation of Metadata Representations in RDF stores."
• Daniel Hernández, Aidan Hogan, Cristian Riveros, Carlos Rojas, Enzo Zerega: Querying Wikidata: Comparing SPARQL, Relational and Graph Databases. International Semantic Web Conference (2) 2016: 88-103
Exposing provenance metadata using different RDF models
Gang Fu, Evan Bolton, Núria Queralt Rosinach, Laura I Furlong, Vinh Nguyen, Amit Sheth, Olivier Bodenreider, Michel Dumontier
Subject | Predicate | Object | Source | FromDataset | Confidence
CID5280961 (Genistein) | inhibits | GID2100 (ESR2) | PMID12502307 | ChEMBL |
CID5757 (Estradiol) | activates | GID2100 (ESR2) | PMID19128016 | ChEMBL |
PubChem
• Five data sets generated from the same seed
  ▪ N-ary with cardinal assertion (Model I)
  ▪ N-ary without cardinal assertion (Model II)
  ▪ Singleton property with cardinal assertion (Model III)
  ▪ Singleton property without cardinal assertion (Model IV)
  ▪ NanoPublication (Model V)
• Comparing sizes of generated datasets
Model I | Model II | Model III | Model IV | Model V
22,787,218 | 21,445,348 | 19,575,298 | 17,239,427 | 27,605,782
  ▪ SP datasets are the most compact ones
Gang Fu, Evan Bolton, Núria Queralt Rosinach, Laura I Furlong, Vinh Nguyen, Amit Sheth, Olivier Bodenreider, Michel Dumontier. Exposing provenance metadata using different RDF models. In Proceedings of Semantic Web Applications and Tools for Life Science (SWAT4LS), 2016.
PubChem (cont.)
• Query performance, in secs
  ▪ SP models (III and IV) outperform the other models in Virtuoso
Wikidata
• Four data sets generated from the same seed
  ▪ Standard Reification (SR)
  ▪ N-ary relation (NR)
  ▪ Singleton property (SP)
  ▪ Named Graph (NG)
• Comparing sizes of generated datasets
  ▪ SP dataset is the most compact one
Hernández, Daniel, Aidan Hogan, and Markus Krötzsch. "Reifying RDF: What works well with wikidata?" SSWS@ISWC 1457 (2015): 32-47.
Wikidata
• Query performance in 4store and GraphDB
  ▪ SP models are not supported by 4store and GraphDB
• Query performance in Virtuoso and Blazegraph
  ▪ Reification and NG are well supported by Virtuoso and Blazegraph
  ▪ SP is a little faster than NR in Virtuoso, but slower in Blazegraph
Wikidata
• Six data sets generated from the same seed
  ▪ Standard Reification (stdreif)
  ▪ N-ary relation (naryrel)
  ▪ Singleton property (sgprop)
  ▪ Companion property (cpprop)
  ▪ Named Graph (ngraphs)
  ▪ RDF* (rdr)
• Comparing sizes of generated datasets
  ▪ SP dataset is the most compact triple representation
  ▪ Fastest loading time for Wikidata
  ▪ Best query performance in Stardog in all cases
  ▪ Slowest in Virtuoso, but not by much, for Wikidata queries
  ▪ No performance issues encountered with SP
Frey, Johannes, Kay Müller, Sebastian Hellmann, Erhard Rahm, and Maria-Esther Vidal. "Evaluation of Metadata Representations in RDF stores."
Experimental Comparison
• Dataset size
  ▪ SP offers the most concise representation in all cases
• Query performance
  ▪ SP performs reasonably well in Virtuoso, best in Stardog, OK in Blazegraph
  ▪ SP may offer further performance gains if it is supported and optimized by the query engines
Is the SP representation optimal?
Contextualized Knowledge Graph
from
Graph Database perspective
Property Graph
Facts:
Subject | Predicate | Object
Bob Dylan | marriedTo | Sara Lownds
Bob Dylan | marriedTo | Carolyn Dennis
Subject | Predicate | Object | Starts | Ends
Bob Dylan | marriedTo | Sara Lownds | 1965-11-22 | 1977-06-29
Bob Dylan | marriedTo | Carolyn Dennis | 1986-06-## | 1992-10-##
Property graph:
Node 1: Name: BobDylan
Nodes 2 and 3: Name: CarolynDennis, Name: SaraLownds
Edge BobDylan -> SaraLownds: marriedTo (Starts: 1965-11-22, Ends: 1977-06-29)
Edge BobDylan -> CarolynDennis: marriedTo (Starts: 1986-06-##, Ends: 1992-10-##)
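In a property graph the contextual properties sit directly on the edge, so no extra statements are needed. The sketch below models this in plain Python; the `PropertyGraph` class is a hypothetical stand-in for a graph database, and the ids given to the two spouse nodes are assigned arbitrarily for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: int
    target: int
    label: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Minimal in-memory property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, properties)

    def add_edge(self, source, target, label, **properties):
        self.edges.append(Edge(source, target, label, properties))

    def edges_from(self, node_id, label):
        """Outgoing edges of a node that carry the given label."""
        return [e for e in self.edges if e.source == node_id and e.label == label]


graph = PropertyGraph()
graph.add_node(1, Name="BobDylan")
graph.add_node(2, Name="SaraLownds")     # ids 2 and 3 are assumed for this sketch
graph.add_node(3, Name="CarolynDennis")
graph.add_edge(1, 2, "marriedTo", Starts="1965-11-22", Ends="1977-06-29")
graph.add_edge(1, 3, "marriedTo", Starts="1986-06-##", Ends="1992-10-##")

# Each marriage is one edge; its time span is read from the edge properties.
marriages = [
    (graph.nodes[e.target].properties["Name"], e.properties["Starts"], e.properties["Ends"])
    for e in graph.edges_from(1, "marriedTo")
]
```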
Contextualized Knowledge Graph
with an application in PubChem
PubChem Neighbor
Neighbor: only available through the REST interface
Current PubChem Neighbor
• Number of links
  ▪ 92,000,000 * 92,000,000 / 2 = 4.232 * 10^15
  ▪ 4 quadrillion
• Challenges
  ⨯ The number of triples increases to quadrillions
  ⨯ SPARQL query processing over quadrillions of triples
• Is it worth it?
  ✓ Chemical similarity is one of the most important concepts in chemoinformatics
  ✓ Similar compounds have similar properties
Current PubChem Neighbor
Subject | Predicate | Object
nbr:CID1_CID2_2DSim | has_measurement_value | nbr:CID1_CID2_2DTanimotoScore
nbr:CID1_CID2_2DSim | refers_to | compound:CID1
nbr:CID1_CID2_2DSim | refers_to | compound:CID2
nbr:CID1_CID2_2DSim | type | pcvocab:PC2D_structural_similarity
nbr:CID1_CID2_2DTanimotoScore | has_value | 0.91^^xsd:float
nbr:CID1_CID2_2DTanimotoScore | is_output_of | sio:CHEMINF_000333
nbr:CID1_CID2_2DTanimotoScore | type | pcvocab:PC2D_Fingerprint_TanimotorScore
1 neighbor link: 7 triples
compound:CID1 | sio:CHEMINF_000482 | compound:CID2
4 quadrillion x 7 = 28 quadrillion triples
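The scale estimate can be checked with a few lines of arithmetic. The sketch follows the slide's approximation of n * n / 2 links for n = 92,000,000 compounds; the exact number of unordered pairs, n * (n - 1) / 2, is shown for comparison. The variable names are illustrative.

```python
import math

COMPOUNDS = 92_000_000
TRIPLES_PER_LINK = 7  # current representation: 7 triples per neighbor link

# Approximation used on the slide: n * n / 2 potential neighbor links.
approx_links = COMPOUNDS * COMPOUNDS // 2

# Exact number of unordered compound pairs, n * (n - 1) / 2.
exact_links = math.comb(COMPOUNDS, 2)

print(f"approximate links: {approx_links:.3e}")
print(f"exact links:       {exact_links:.3e}")
print(f"triples needed:    {approx_links * TRIPLES_PER_LINK:.3e}")
```

Rounding the link count down to 4 quadrillion gives the 28 quadrillion triples quoted above; the unrounded product is closer to 29.6 quadrillion.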
PubChem Neighbor using CKG Model
Subject | Predicate | Object
compound:CID1 | has_structural_similarity?sp=1&ds=pc&is_output_of=sio:CHEMINF_000333&has_2d_tanimoto_score=0.91^^xsd | compound:CID2
1 neighbor link: 1 triple
4 quadrillion x 1 = 4 quadrillion triples
< 20 billion CKG triples
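The CKG triple above packs the contextual properties into the predicate as a query string. A rough sketch of how such a predicate could be built and decoded with Python's standard `urllib.parse` module follows; the two function names are hypothetical, and the literal's datatype suffix is left out because it is truncated in the source.

```python
from urllib.parse import parse_qsl, urlencode


def make_singleton_property(base, context):
    """Encode contextual properties as a query string on the base property name."""
    # safe=":" keeps prefixed names such as sio:CHEMINF_000333 readable.
    return f"{base}?{urlencode(context, safe=':')}"


def parse_singleton_property(sp):
    """Split a singleton property into its base name and contextual properties."""
    base, _, query = sp.partition("?")
    return base, dict(parse_qsl(query))


context = {
    "sp": 1,
    "ds": "pc",
    "is_output_of": "sio:CHEMINF_000333",
    "has_2d_tanimoto_score": "0.91",
}
predicate = make_singleton_property("has_structural_similarity", context)
triple = ("compound:CID1", predicate, "compound:CID2")

base, decoded = parse_singleton_property(triple[1])
```

Decoding returns every value as a string, so a consumer would still need to convert the score to a number before comparing similarities.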

Contextualized Knowledge Graph from two perspectives: Semantic Web and Graph Database with an application in PubChem

  • 1. Contextualized Knowledge Graph from two perspectives Semantic Web and Graph Database with an application in Presenter: Vinh Nguyen
  • 2. 2
  • 3. 3
  • 4. What is Knowledge Graph? 10/25/2018 4
  • 5. What is Knowledge Graph? 10/25/2018 5
  • 6. What is Contextualized Knowledge Graph? 10/25/2018 6 A contextualized knowledge graph is a knowledge graph in which every fact is qualified with a set of contextual properties.
  • 7. Subject Predicate Object Starts Ends Bob Dylan marriedTo Sarah Lownds 1965-11-22 1977-06-29 Bob Dylan marriedTo Carolyn Dennis 1986-06-## 1992-10-## Motivation Scenario Facts: Meta Queries: Query type Sample query Provenance P1. Where is this fact from? P2. When was it created? P3. Who created this fact? Time T1. When did this fact occur? T2. What is the time span of this fact? T3. Which events happened in the same year? Location L1. What is the location associated with this fact? L2. Which events happened at the same place? Certainty C1. What is the author confidence of this fact? 7 Subject Predicate Object Bob Dylan marriedTo Sarah Lownds Bob Dylan marriedTo Carolyn Dennis
  • 9. 2973 datasets with 149 billion triples
      Linked Data principles:
        1. Use URIs as names.
        2. Use HTTP URLs so that the names can be looked up.
        3. When a URI is looked up, provide useful information using the standards.
        4. Include links to other URIs to discover more.
  • 11. Form of Triples: RDF Reification
      Time-aware facts:
        Subject | Predicate | Object | Starts | Ends
        Bob Dylan | marriedTo | Sarah Lownds | 1965-11-22 | 1977-06-29
      RDF Reification:
        Subject | Predicate | Object
        #stmt1 | type | Statement
        #stmt1 | hasSubject | BobDylan
        #stmt1 | hasProperty | marriedTo
        #stmt1 | hasObject | Sara Lownds
        Bob Dylan | marriedTo | Sarah Lownds
        #stmt1 | starts | 1965-11-22
        #stmt1 | ends | 1977-06-29
      Pros:
        1. Intuitive, easy to understand
      Cons:
        1. Takes 3N triples (4N if including Statement typing) to represent N statements => not scalable
        2. No formal semantics defined => semantics is unclear
        3. Discouraged in LOD!
  • 12. RDF Reification vs. Singleton Property
      Time-aware facts:
        Subject | Predicate | Object | Starts | Ends
        Bob Dylan | marriedTo | Sarah Lownds | 1965-11-22 | 1977-06-29
      RDF Reification:
        Subject | Predicate | Object
        #stmt1 | type | Statement
        #stmt1 | hasSubject | BobDylan
        #stmt1 | hasProperty | marriedTo
        #stmt1 | hasObject | Sara Lownds
        Bob Dylan | marriedTo | Sarah Lownds
        #stmt1 | starts | 1965-11-22
        #stmt1 | ends | 1977-06-29
      Singleton Property:
        Subject | Predicate | Object
        marriedTo#1 | rdf:sp | marriedTo
        BobDylan | marriedTo#1 | Sarah Lownds
        marriedTo#1 | starts | 1965-11-22
        marriedTo#1 | ends | 1977-06-29
      Reference: Vinh Nguyen, Olivier Bodenreider, and Amit Sheth. "Don't like RDF reification?: making statements about statements using singleton property." In Proceedings of the 23rd international conference on World wide web, pp. 759-770. ACM, 2014.
  • 13. Form of Triples: PaCE
      Provenance-aware facts:
        Subject | Predicate | Object | Source | DateExtracted
        Bob Dylan | marriedTo | Sarah Lownds | wikipage:Bob_Dylan | 2009-06-07
      Provenance-aware Context Entity:
        Subject | Predicate | Object
        BobDylan_wp | rdf:type | Bob Dylan
        SaraLownds_wp | rdf:type | Sara Lownds
        BobDylan_wp | marriedTo | SaraLownds_wp
        BobDylan_wp | hasSource | wiki:Bob_Dylan
        BobDylan_wp | hasDateExt | 2009-06-07
      Pros:
        1. Saves ~50% of the triples compared to reification, thanks to the repeated subject, predicate, and object
      Cons:
        1. Not intuitive, hard to understand
        2. Limited expressiveness
      Reference: Satya S. Sahoo, Olivier Bodenreider, Pascal Hitzler, Amit Sheth, and Krishnaprasad Thirunarayan. 2010. Provenance context entity (PaCE): scalable provenance tracking for scientific RDF data. In Proceedings of the 22nd international conference on Scientific and statistical database management (SSDBM'10).
  • 14. PaCE vs. Singleton Property
      Facts and provenance:
        Subject | Predicate | Object | Source | DateExtracted
        Bob Dylan | marriedTo | Sarah Lownds | wikipage:Bob_Dylan | 2009-06-07
      Provenance-aware Context Entity:
        Subject | Predicate | Object
        BobDylan_wp | rdf:type | Bob Dylan
        SaraLownds_wp | rdf:type | Sara Lownds
        BobDylan_wp | marriedTo | SaraLownds_wp
        BobDylan_wp | hasSource | wiki:Bob_Dylan
        BobDylan_wp | hasDateExt | 2009-06-07
      Singleton Property:
        Subject | Predicate | Object
        marriedTo#1 | rdf:sp | marriedTo
        BobDylan | marriedTo#1 | Sarah Lownds
        marriedTo#1 | hasSource | wp:Bob_Dylan
        marriedTo#1 | hasDateExt | 2009-06-07
  • 15. Form of Quadruples: Named Graph
      Time-aware facts:
        Subject | Predicate | Object | Starts | Ends
        Bob Dylan | marriedTo | Sarah Lownds | 1965-11-22 | 1977-06-29
      Named Graph:
        Subject | Predicate | Object | NG
        Bob Dylan | marriedTo | Sarah Lownds | ng_1
        ng_1 | starts | 1965-11-22 | Prov_graph
        ng_2 | ends | 1977-06-29 | Prov_graph
      Pros:
        1. Intuitive: creating N named graphs for N sources
        2. Attaches metadata to a set of triples
        3. Supported by SPARQL
      Cons:
        1. Defined for provenance only
        2. Ambiguous semantics when associating different types of metadata at the triple level
      Reference: Carroll, Jeremy J., et al. "Named graphs, provenance and trust." Proceedings of the 14th international conference on World Wide Web. ACM, 2005.
  • 16. Named Graph vs. Singleton Property
      Time-aware facts:
        Subject | Predicate | Object | Starts | Ends
        Bob Dylan | marriedTo | Sarah Lownds | 1965-11-22 | 1977-06-29
      Named Graph:
        Subject | Predicate | Object | NG
        Bob Dylan | marriedTo | Sarah Lownds | ng_1
        ng_1 | starts | 1965-11-22 | Prov_graph
        ng_2 | ends | 1977-06-29 | Prov_graph
      Singleton Property:
        Subject | Predicate | Object
        marriedTo#1 | rdf:sp | marriedTo
        Bob Dylan | marriedTo#1 | Sarah Lownds
        marriedTo#1 | starts | 1965-11-22
        marriedTo#1 | ends | 1977-06-29
  • 17. Form of Quintuples: RDF+
      Facts and temporal information:
        Subject | Predicate | Object | Starts | Ends
        Bob Dylan | marriedTo | Sarah Lownds | 1965-11-22 | 1977-06-29
      RDF+:
        Subject | Predicate | Object | Meta Property | Meta value
        Bob Dylan | marriedTo | Sarah Lownds | starts | 1965-11-22
        Bob Dylan | marriedTo | Sarah Lownds | ends | 1977-06-29
      Cons:
        1. The representation is not in the form of RDF. Statement identifiers are used internally, which requires mappings from RDF to RDF+ and vice versa.
        2. The SPARQL query syntax and semantics need to be extended to support RDF+.
      Reference: Dividino, Renata, et al. "Querying for provenance, trust, uncertainty and other meta knowledge in RDF." Web Semantics: Science, Services and Agents on the World Wide Web 7.3 (2009): 204-219.
  • 18. Experiment: BKR with Provenance
      • Five datasets generated from the same seed BKR:
          • Singleton Property (SP)
          • Reification (R)
          • PaCE C1 (C1)
          • PaCE C2 (C2)
          • PaCE C3 (C3)
      All datasets are available at http://wiki.knoesis.org/index.php/Singleton_Property
  • 19. Experiment Results: (A) random-value queries vs. fixed-value queries, in msec; (B) query length and execution time, in msec.
  • 20. External Evaluation
      • Gang Fu, Evan Bolton, Núria Queralt Rosinach, Laura I Furlong, Vinh Nguyen, Amit Sheth, Olivier Bodenreider, Michel Dumontier. Exposing provenance metadata using different RDF models. In Proceedings of Semantic Web Applications and Tools for Life Science (SWAT4LS), 2016. https://pubchem.ncbi.nlm.nih.gov/
      • Hernández, Daniel, Aidan Hogan, and Markus Krötzsch. "Reifying RDF: What works well with wikidata?" SSWS@ISWC 1457 (2015): 32-47.
      • Frey, Johannes, Kay Müller, Sebastian Hellmann, Erhard Rahm, and Maria-Esther Vidal. "Evaluation of Metadata Representations in RDF stores."
      • Daniel Hernández, Aidan Hogan, Cristian Riveros, Carlos Rojas, Enzo Zerega: Querying Wikidata: Comparing SPARQL, Relational and Graph Databases. International Semantic Web Conference (2) 2016: 88-103.
  • 21. Exposing provenance metadata using different RDF models
      Gang Fu, Evan Bolton, Núria Queralt Rosinach, Laura I Furlong, Vinh Nguyen, Amit Sheth, Olivier Bodenreider, Michel Dumontier
        Subject | Predicate | Object | Source | FromDataset | Confidence
        CID5280961(Genistein) | inhibits | GID2100(ESR2) | PMID12502307 | ChemBL |
        CID5757(Estradiol) | activates | GID2100(ESR2) | PMID19128016 | ChemBL |
  • 22. PubChem
      • Five datasets generated from the same seed:
          • N-ary with cardinal assertion (Model I)
          • N-ary without cardinal assertion (Model II)
          • Singleton property with cardinal assertion (Model III)
          • Singleton property without cardinal assertion (Model IV)
          • NanoPublication (Model V)
      • Comparing sizes of generated datasets:
          Model I | Model II | Model III | Model IV | Model V
          22,787,218 | 21,445,348 | 19,575,298 | 17,239,427 | 27,605,782
          • SP datasets are the most compact ones
      Reference: Gang Fu, Evan Bolton, Núria Queralt Rosinach, Laura I Furlong, Vinh Nguyen, Amit Sheth, Olivier Bodenreider, Michel Dumontier. Exposing provenance metadata using different RDF models. In Proceedings of Semantic Web Applications and Tools for Life Science (SWAT4LS), 2016.
  • 23. PubChem
      • Query performance in secs
          • SP models (III and IV) outperform the other models in Virtuoso
  • 25. WikiData
      • Four datasets generated from the same seed:
          • Standard Reification (SR)
          • N-ary relation (NR)
          • Singleton property (SP)
          • Named Graph (NG)
      • Comparing sizes of generated datasets:
          • SP dataset is the most compact one
      Reference: Hernández, Daniel, Aidan Hogan, and Markus Krötzsch. "Reifying RDF: What works well with wikidata?" SSWS@ISWC 1457 (2015): 32-47.
  • 26. WikiData
      • Query performance in 4store and GraphDB
          • SP models are not supported by 4store and GraphDB
      • Query performance in Virtuoso and BlazeGraph
          • Reification and NG are well supported by Virtuoso and BlazeGraph
          • SP is a little faster than NR in Virtuoso, slower in BlazeGraph
  • 27. WikiData
      • Six datasets generated from the same seed:
          • Standard Reification (stdreif)
          • N-ary relation (naryrel)
          • Singleton property (sgprop)
          • Companion property (cpprop)
          • Named Graph (ngraphs)
          • RDF* (rdr)
      • Comparing sizes of generated datasets:
          • SP dataset is the most compact triple representation
          • Fastest in loading time for WikiData
          • Best query performance for StarDog in all cases
          • Slowest in Virtuoso, but not by much, for WikiData queries
          • No performance issues encountered with SP
      Reference: Frey, Johannes, Kay Müller, Sebastian Hellmann, Erhard Rahm, and Maria-Esther Vidal. "Evaluation of Metadata Representations in RDF stores."
  • 28. Experimental Comparison
      • Dataset size
          • SP offers the most concise representation in all cases
      • Query performance
          • SP performs reasonably well in Virtuoso, best in StarDog, OK in BlazeGraph
          • SP has the potential for performance gains if supported and optimized by the query engines
      Is the SP representation optimal?
  • 30. Property Graph
      Facts:
        Subject | Predicate | Object
        Bob Dylan | marriedTo | Sarah Lownds
        Bob Dylan | marriedTo | Carolyn Dennis
      Time-aware facts:
        Subject | Predicate | Object | Starts | Ends
        Bob Dylan | marriedTo | Sarah Lownds | 1965-11-22 | 1977-06-29
        Bob Dylan | marriedTo | Carolyn Dennis | 1986-06-## | 1992-10-##
      Property graph (diagram):
        Nodes: Name: BobDylan (1); Name: SaraLownds and Name: CarolynDennis (2, 3)
        Edge marriedTo with properties Starts: 1965-11-22, Ends: 1977-06-29
        Edge marriedTo with properties Starts: 1986-06-##, Ends: 1992-10-##
  • 32. Neighbor: only available through REST interface
  • 34. Current PubChem Neighbor
      • Number of links
          • 92,000,000 * 92,000,000 / 2 = 4.232 * 10^15
          • About 4 quadrillion
      • Challenges
          ⨯ Number of triples increases to quadrillions
          ⨯ SPARQL query processing for quadrillions of triples
      • Is it worth it?
          • Chemical similarity is one of the most important concepts in chemoinformatics
          • Similar compounds have similar properties
  • 36. Current PubChem Neighbor
        Subject | Predicate | Object
        nbr:CID1_CID2_2DSim | has_measurement_value | nbr:CID1_CID2_2DTanimotoScore
        nbr:CID1_CID2_2DSim | refers_to | compound:CID1
        nbr:CID1_CID2_2DSim | refers_to | compound:CID2
        nbr:CID1_CID2_2DSim | type | pcvocab:PC2D_structural_similarity
        nbr:CID1_CID2_2DTanimotoScore | has_value | 0.91^^xsd:float
        nbr:CID1_CID2_2DTanimotoScore | Is_output_of | sio:CHEMINF_000333
        nbr:CID1_CID2_2DTanimotoScore | type | pcvocab:PC2D_Fingerprint_TanimotorScore
      1 neighbor link: 7 triples
        compound:CID1 | sio:CHEMINF_000482 | compound:CID2
      4 quadrillion x 7 = 28 quadrillion triples
  • 37. PubChem Neighbor using CKG Model
      Current model:
        Subject | Predicate | Object
        nbr:CID1_CID2_2DSim | has_measurement_value | nbr:CID1_CID2_2DTanimotoScore
        nbr:CID1_CID2_2DSim | refers_to | compound:CID1
        nbr:CID1_CID2_2DSim | refers_to | compound:CID2
        nbr:CID1_CID2_2DSim | type | pcvocab:PC2D_structural_similarity
        nbr:CID1_CID2_2DTanimotoScore | has_value | 0.91^^xsd:float
        nbr:CID1_CID2_2DTanimotoScore | Is_output_of | sio:CHEMINF_000333
        nbr:CID1_CID2_2DTanimotoScore | type | pcvocab:PC2D_Fingerprint_TanimotorScore
      CKG model:
        Subject | Predicate | Object
        compound:CID1 | has_structural_similarity?sp=1&ds=pc&is_output_of=sio:CHEMINF_000333&has_2d_tanimoto_score=0.91^^xsd | compound:CID1
      1 neighbor link: 1 triple
      4 quadrillion x 1 = 4 quadrillion triples
  • 38. < 20 billion CKG triples
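
The representations compared in the slides can be sketched side by side for the Bob Dylan example. This is a minimal illustration with plain Python tuples and dicts, not the API of any RDF library or graph database; the function names `reify`, `singleton_property`, and `property_graph_edge` are hypothetical, and `rdf:sp`, `hasSubject`, `hasProperty`, and `hasObject` are simply the labels used on the slides.

```python
def reify(fact, meta, stmt_id="#stmt1"):
    """RDF reification as laid out on the slides: a statement node with
    typing and subject/property/object triples, the asserted fact, and
    one extra triple per metadata property."""
    s, p, o = fact
    triples = [
        (stmt_id, "type", "Statement"),
        (stmt_id, "hasSubject", s),
        (stmt_id, "hasProperty", p),
        (stmt_id, "hasObject", o),
        (s, p, o),  # the asserted fact itself
    ]
    triples += [(stmt_id, key, value) for key, value in meta.items()]
    return triples


def singleton_property(fact, meta, index=1):
    """Singleton property: a unique predicate (e.g. marriedTo#1) stands for
    this one statement and carries the metadata directly."""
    s, p, o = fact
    sp = f"{p}#{index}"
    triples = [
        (sp, "rdf:sp", p),  # links the singleton property to the generic one
        (s, sp, o),
    ]
    triples += [(sp, key, value) for key, value in meta.items()]
    return triples


def property_graph_edge(fact, meta):
    """Property graph view: one edge whose properties hold the context."""
    s, p, o = fact
    return {"from": s, "label": p, "to": o, "properties": dict(meta)}


fact = ("Bob Dylan", "marriedTo", "Sarah Lownds")
meta = {"starts": "1965-11-22", "ends": "1977-06-29"}

reified = reify(fact, meta)
singleton = singleton_property(fact, meta)
edge = property_graph_edge(fact, meta)

print(len(reified), len(singleton))  # 7 4
```

For this single fact the reified form needs seven triples and the singleton property form four, which matches the row counts in the slide tables; the property graph keeps the same context as properties on one edge.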

Editor's Notes

  1. Semantic Web Technology, enhanced by a massive use of open linked data, plays a crucial role in the overall Deep QA architecture
  2. CEO Sundar Pichai led the charge here, noting that Google's Knowledge Graph (the easily accessible information that pops up under the search bar for certain queries) now encompasses 70 billion facts
  3. 1163 datasets. Using Semantic Web technologies: 149,423,660,620 triples from 2973 datasets (retrieved Dec 14). Linked Data principles: use URIs as names for things; use HTTP URIs so that people can look up those names; when someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL); include links to other URIs, so that they can discover more things.
  4. Five datasets
  5. One slide shows the graph database approach. One slide compares the SP and property graph.
  6. One slide shows the schema. One slide shows the similarity score file. One slide shows the numbers in the schema. One slide shows the numbers for all approaches.
  7. nbr:CID1_CID2_2DSim has_measurement_value nbr:CID1_CID2_2dTani .
     nbr:CID1_CID2_2DSim refers_to compound:CID1 .
     nbr:CID1_CID2_2DSim refers_to compound:CID2 .
     nbr:CID1_CID2_2DSim type pcvocab:PC2D_structural_similarity .
     nbr:CID1_CID2_2dTani has_value 0.91^^xsd:float .
     nbr:CID1_CID2_2dTani is_output_of sio:CHEMINF_000333 .
     nbr:CID1_CID2_2dTani type pcvocab:PC2D_Fingerprint_TanimotorScore .
  9. 96280729533 / 7 ≈ 13754389933
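
The neighbor-link arithmetic from the slides and notes can be checked with a short script. This is a sketch under stated assumptions: `ckg_predicate` and `neighbor_triple` are hypothetical helpers, the key=value layout of the predicate follows the pattern shown on the CKG slide, the datatype suffix that is cut off there is left out, and the link is assumed to run from compound:CID1 to compound:CID2 as in the seven-triple pattern.

```python
COMPOUNDS = 92_000_000          # compound count used on the slide
TRIPLES_PER_LINK_CURRENT = 7    # current PubChem neighbor pattern
TRIPLES_PER_LINK_CKG = 1        # one singleton-property triple per link

# The slide's estimate of possible neighbor links: n * n / 2.
links = COMPOUNDS * COMPOUNDS // 2
current_triples = links * TRIPLES_PER_LINK_CURRENT
ckg_triples = links * TRIPLES_PER_LINK_CKG

# The slide rounds links to 4 quadrillion, giving 28 and 4 quadrillion triples.
print(f"{links:.3e} links")                         # 4.232e+15 links
print(f"{current_triples:.3e} triples (7 per link)")
print(f"{ckg_triples:.3e} triples (1 per link)")


def ckg_predicate(base, meta):
    """Hypothetical helper: pack metadata into the predicate as key=value
    pairs after '?', following the pattern shown on the CKG slide."""
    query = "&".join(f"{key}={value}" for key, value in meta.items())
    return f"{base}?{query}"


def neighbor_triple(cid_a, cid_b, score):
    """Build the single CKG triple for one neighbor link (assumed direction:
    cid_a to cid_b)."""
    predicate = ckg_predicate(
        "has_structural_similarity",
        {
            "sp": 1,
            "ds": "pc",
            "is_output_of": "sio:CHEMINF_000333",
            "has_2d_tanimoto_score": score,  # datatype suffix omitted
        },
    )
    return (f"compound:{cid_a}", predicate, f"compound:{cid_b}")


triple = neighbor_triple("CID1", "CID2", "0.91")
print(triple)

# The division in the last note, taken as an integer quotient; the note does
# not say what the numerator counts.
note_quotient = 96_280_729_533 // 7
print(note_quotient)  # 13754389933
```

Using the unrounded link count, the seven-triple pattern gives about 2.96 * 10^16 triples, slightly above the slide's rounded 28 quadrillion.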