The power of the
Cognitive Probability Graph
(aka Cognitive Computing)
June 2016
Jans Aasman
ja@franz.com
10 years ago
Structured Data
7 years ago
Structured Data
Unstructured Data
4 to 5 years ago
Structured Data
Unstructured Data
Knowledge
Domain knowledge
Linked Open Data
Vocabularies
Taxonomies/Ontologies
New #1: Learning. Feed the output of data science back into the data infrastructure
Structured Data
Unstructured Data
Knowledge
Domain knowledge
Linked Open Data
Vocabularies
Taxonomies/Ontologies
Probabilistic Inferences
New #2: Everything in one (distributed) semantic graph
Structured Data
Unstructured Data
Knowledge
Domain knowledge
Linked Open Data
Vocabularies
Taxonomies/Ontologies
Probabilistic Inferences
AKA: Cognitive Computing
Structured Data
Unstructured Data
Knowledge
Domain knowledge
Linked Open Data
Vocabularies
Taxonomies/Ontologies
Probabilistic Inferences
Examples
• Healthcare: if I have this class of diagnostics and I get this procedure, what new symptoms might I get in the next two years?
• eCommerce and brand protection: find all my products based on product similarity.
• Logistics: what can I statistically predict about part P breaking down, and what other parts do I usually buy after that part breaks down?
• Police intelligence: find the most plausible story as a temporally ordered shortest path between two criminals through observed (hard) facts and inferred (soft) facts.
• Fraud detection: find links between your local chamber of commerce and the Panama Papers through similar names and addresses.
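The fraud-detection bullet relies on matching similar names and addresses. The deck does not say how that matching is done, so the following is only a minimal sketch using Python's standard `difflib`. The records, the `normalize` and `candidate_links` helpers, and the 0.85 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and replace punctuation with spaces so trivial differences do not matter."""
    kept = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(kept.split())


def similarity(a, b):
    """Similarity in [0, 1] between two strings after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_links(local_records, leaked_records, threshold=0.85):
    """Pair up records whose averaged name and address similarity reaches the threshold."""
    links = []
    for local in local_records:
        for leaked in leaked_records:
            name_score = similarity(local["name"], leaked["name"])
            address_score = similarity(local["address"], leaked["address"])
            score = (name_score + address_score) / 2
            if score >= threshold:
                links.append((local["name"], leaked["name"], round(score, 2)))
    return links


# Hypothetical records, invented for illustration only.
LOCAL = [{"name": "Acme Holdings B.V.", "address": "12 Harbour Street, Rotterdam"}]
LEAKED = [
    {"name": "ACME Holdings BV", "address": "12 Harbour St., Rotterdam"},
    {"name": "Zenith Shipping Ltd", "address": "4 Palm Road, Panama City"},
]

LINKS = candidate_links(LOCAL, LEAKED)
for local_name, leaked_name, score in LINKS:
    print(local_name, "<->", leaked_name, score)
```

Each pair above the threshold would become a soft (inferred) link in the graph, to be checked by an investigator rather than treated as a fact.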
Example healthcare
• Franz and Montefiore are partners in the Semantic Data Lake project.
One cognitive computing platform for all healthcare analytics
Example healthcare
• We created a single data-centric platform that can serve any type of
analytic without building a new data mart for every new question.
• Currently 2.7 million patients with 10 years of data
• All data captured in a Unified Clinical Event model with 350 classes of
events.
Healthcare: structured and unstructured data
Structured patient data combined with complex
integrated terminology
Provenance for every value
Healthcare: the knowledge bases
• More than 180 vocabularies and terminology systems integrated in one
unified terminology system (MeSH, SNOMED, UMLS, RxNorm, LOINC, etc.)
• External databases
• Linked Open Data
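To make the idea of one unified terminology system concrete, here is a small sketch of how SKOS links let a lookup in one vocabulary reach hierarchy from another. The identifiers are borrowed from the deck's diagram, but their assignment to vocabularies, the prefixes, and the edges between them are assumptions made for illustration; the triple store is a plain Python list, not the platform's API.

```python
from collections import defaultdict, deque

# (subject, predicate, object) triples. The edges are assumed, not taken from real data.
TRIPLES = [
    ("snomed:9209005", "skos:exactMatch", "mesh:M0024135"),
    ("mesh:M0024135", "skos:exactMatch", "umls:C0000737"),
    ("snomed:9209005", "skos:broader", "snomed:118948005"),
    ("mesh:M0024135", "skos:broader", "mesh:M0015742"),
]


def equivalents(concept, triples):
    """Concepts reachable via skos:exactMatch in either direction, including the start."""
    neighbours = defaultdict(set)
    for s, p, o in triples:
        if p == "skos:exactMatch":
            neighbours[s].add(o)
            neighbours[o].add(s)
    seen, queue = {concept}, deque([concept])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen


def broader_in_any_vocabulary(concept, triples):
    """Broader concepts of the given concept or of any of its exact matches."""
    same = equivalents(concept, triples)
    return {o for s, p, o in triples if p == "skos:broader" and s in same}


print(sorted(equivalents("umls:C0000737", TRIPLES)))
print(sorted(broader_in_any_vocabulary("umls:C0000737", TRIPLES)))
```

The traversal treats skos:exactMatch as symmetric and transitive, which is how the SKOS specification defines it.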
[Diagram: the concept "Abdominal Pain" (identifiers such as "9209005", "M0024135" and "C0000737") represented across OMOP, LOINC, SNOMEDCT, MeSH, MedDRA and UMLS (MTH). Concepts within a vocabulary are linked by skos:broader, skos:narrower and skos:semanticRelation, matched across vocabularies by skos:exactMatch, and labelled via skos-xl:prefLabel, skos-xl:label and skos:notation; rdfs:subClassOf and rdf:type links also appear.]
Everything linked through SKOS
[Diagram: SKOS/SKOS-XL (ConceptScheme, Concept, Label) linked to the UMLS Semantic Net (Entity, Event).]
SDL Paradigm:
[Diagram: patients (Pt.) placed in population, community and time, linked to diagnosis codes, disease classification, OMIM, GONG, genetic profile, procedure codes (HCPC), manufacturers, PharmGKB, drug classification, drug codes, DrugBank, ClinicalTrials, CER and PubMed.]
Analytic Tapestry
(closed loop analytics)
Healthcare: probabilistic inferences
Why is this so important?
• Usually the output of data science ends up in reports and publications, but:
• There is no formal trace of where the data came from
• There is no formal link to the methods you used, who ran them, or when
• The output cannot be compared to earlier results
• The output cannot be used as a building block for further research
• In general: the output is not queryable
• This is bad for delivery of care, reproducibility of research findings,
security and compliance; it loses value-added information and enterprise
intellectual property and assets, and it causes unnecessary duplication
of effort
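One way to picture "queryable output" is to store each analytic result in the graph together with its provenance. The sketch below is an assumption about what that could look like, not the Semantic Data Lake's actual schema: the `ex:` predicates, the identifiers and the values are all hypothetical.

```python
def result_as_triples(result_id, method, value, source, analyst, run_date):
    """Describe one analytic result, plus where it came from, as graph triples."""
    return [
        (result_id, "ex:method", method),
        (result_id, "ex:value", value),
        (result_id, "ex:derivedFrom", source),
        (result_id, "ex:computedBy", analyst),
        (result_id, "ex:computedOn", run_date),
    ]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the given subject, predicate and object (None is a wildcard)."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical results, invented for illustration only.
GRAPH = []
GRAPH += result_as_triples("ex:result1", "odds_ratio", 2.25,
                           "ex:cohort2015", "ex:analystA", "2016-05-01")
GRAPH += result_as_triples("ex:result2", "k_means", 4,
                           "ex:cohort2015", "ex:analystB", "2016-05-20")

# Which stored results are odds ratios, and which data did the first one come from?
odds_ratio_results = [s for s, _, _ in match(GRAPH, p="ex:method", o="odds_ratio")]
source_of_first = match(GRAPH, s="ex:result1", p="ex:derivedFrom")[0][2]
print(odds_ratio_results, source_of_first)
```

Because the result, its method, its author and its source data sit in the same graph, an earlier finding can be retrieved, compared and reused instead of living only in a report.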
Odds ratio
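As a reminder of what an odds ratio measures, here is a minimal sketch. The deck does not show how its odds ratios were computed, so the 2x2 table layout, the function names and the counts are assumptions; the confidence interval uses the standard log (Woolf) approximation.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    return (a * d) / (b * c)


def confidence_interval(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval computed on the log odds ratio."""
    log_or = math.log(odds_ratio(a, b, c, d))
    standard_error = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * standard_error), math.exp(log_or + z * standard_error)


# Hypothetical counts: 20 of 100 exposed patients and 10 of 100 unexposed patients
# have the outcome.
OR = odds_ratio(20, 80, 10, 90)
LOW, HIGH = confidence_interval(20, 80, 10, 90)
print(OR, LOW, HIGH)
```

An odds ratio above 1 means the outcome is more common in the exposed group; an interval that includes 1 means the data do not clearly support an association.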
Association rules
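For readers new to association rules, the sketch below computes the usual support, confidence and lift of a rule "antecedent implies consequent" over sets of items. The patient records and item names are hypothetical, and nothing here is taken from the deck's own implementation.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    s_antecedent = support(transactions, antecedent)
    s_consequent = support(transactions, consequent)
    if s_antecedent == 0 or s_consequent == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    s_both = support(transactions, set(antecedent) | set(consequent))
    confidence = s_both / s_antecedent
    return {
        "support": s_both,
        "confidence": confidence,
        "lift": confidence / s_consequent,
    }


# Hypothetical per-patient sets of recorded conditions.
PATIENTS = [
    {"diabetes", "retinopathy"},
    {"diabetes", "retinopathy", "hypertension"},
    {"diabetes"},
    {"hypertension"},
]

METRICS = rule_metrics(PATIENTS, {"diabetes"}, {"retinopathy"})
print(METRICS)
```

A lift above 1 indicates that the consequent appears with the antecedent more often than it would if the two were independent.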
K-means clustering
And then a query you could never do before
• Using the Knowledge Base, the Structured Data and the Probabilistic
inferences all at the same time.
• To find the statistical links between Diabetes and Vision problems in our
Semantic Data Lake
• Find the set of ICD9s that are connected via one or more steps to
concepts in the KB that mention Diabetes
• Find the set of ICD9s that are connected via one or more steps to
vision* or eye* or retinal*
• And show how those two sets are related in the space of odds ratios
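The steps of this query can be sketched over a toy in-memory graph. This is only an illustration of the logic, not the actual query run against the Semantic Data Lake: the ICD9 placeholders, the knowledge-base concepts, the edges and the odds-ratio values are all invented.

```python
import re
from collections import defaultdict

# Hypothetical knowledge-base concepts and their labels.
LABELS = {
    "kb:dm2": "Type 2 diabetes",
    "kb:diabetes": "Diabetes mellitus",
    "kb:retina": "Retinal disorder",
    "kb:eye": "Eye disease",
    "kb:vision": "Vision impairment",
    "kb:fracture": "Fracture",
}
# Hypothetical directed links: ICD9 code -> concept -> broader concept.
EDGES = [
    ("icd9:D1", "kb:dm2"), ("kb:dm2", "kb:diabetes"),
    ("icd9:D2", "kb:diabetes"),
    ("icd9:V1", "kb:retina"), ("kb:retina", "kb:eye"),
    ("icd9:V2", "kb:vision"),
    ("icd9:X1", "kb:fracture"),
]
ICD9_CODES = ["icd9:D1", "icd9:D2", "icd9:V1", "icd9:V2", "icd9:X1"]
# Hypothetical stored odds ratios between pairs of ICD9 codes.
ODDS_RATIOS = {
    ("icd9:D1", "icd9:V1"): 3.1,
    ("icd9:D1", "icd9:V2"): 1.4,
    ("icd9:D2", "icd9:V1"): 2.2,
    ("icd9:D1", "icd9:X1"): 1.0,
}


def reachable(start, edges):
    """All nodes reachable from `start` in one or more steps."""
    graph = defaultdict(set)
    for s, o in edges:
        graph[s].add(o)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def icd9_mentioning(pattern, codes, edges, labels):
    """ICD9 codes connected to at least one concept whose label matches `pattern`."""
    rx = re.compile(pattern, re.IGNORECASE)
    return {
        code for code in codes
        if any(rx.search(labels.get(node, "")) for node in reachable(code, edges))
    }


def related(set_a, set_b, odds_ratios):
    """Pairs (a, b, odds ratio) with a stored odds ratio, strongest first."""
    rows = [
        (a, b, odds_ratios[(a, b)])
        for a in sorted(set_a) for b in sorted(set_b)
        if (a, b) in odds_ratios
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


DIABETES = icd9_mentioning(r"diabet", ICD9_CODES, EDGES, LABELS)
VISION = icd9_mentioning(r"\b(vision|eye|retinal)", ICD9_CODES, EDGES, LABELS)
ROWS = related(DIABETES, VISION, ODDS_RATIOS)
for row in ROWS:
    print(row)
```

The point of the sketch is that the knowledge base (labels and hierarchy), the structured data (ICD9 codes) and the probabilistic inferences (odds ratios) are all consulted in a single pass.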
And then just a few other examples
In the ecommerce world: find similar objects based on more than 10
criteria, including description, product codes, pictures, etc.
Returns a Graph in a Table (ughh)
But powerful when visualized
Or like this
And linking with the Panama Papers
And now the researchers can start investigating
Summary: this is the new paradigm of computing
Structured Data
Unstructured Data
Knowledge
Domain knowledge
Linked Open Data
Vocabularies
Taxonomies/Ontologies
Probabilistic Inferences

AllegroGraph - Cognitive Probability Graph webcast