Prof. Dr. Sören Auer
Faculty of Electrical Engineering & Computer Science
Leibniz University of Hannover
TIB Technische Informationsbibliothek
Knowledge Graphs for
Scholarly Communication
Page 2
Zuse Z3: the beginning of Computing – close to the hardware
Photo: Konrad Zuse Internet Archiv/Deutsches Museum/DFG
Page 3
Page 4
We can make things
more intuitive
Picture: The illustrated recipes of Lucy Eldridge
http://thefoxisblack.com/2013/07/18/the-illustrated-recipes-of-lucy-eldridge/
Computing more intuitive: procedural programming
Page 6
Computing more intuitive: OO programming
Page 8
Page 9
Computing even more intuitive: with cognitive data?!
Page 10
Linked Data Principles
Addressing the neglected third V (Variety)
1. Use URIs to identify the “things” in your data
2. Use http:// URIs so people (and machines) can look them up on the web
3. When a URI is looked up, return a description of the thing in the W3C Resource Description Framework (RDF)
4. Include links to related things
http://www.w3.org/DesignIssues/LinkedData.html
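Principles 2 and 3 can be illustrated with a small sketch: a client looks up an http:// URI and asks for an RDF description via the HTTP Accept header. The helper name `build_lookup_request`, the media-type table, and the expansion of dbpedia:Leipzig to a full URI are assumptions made for this illustration; the request is only prepared, not sent.

```python
from urllib.request import Request

# Media types commonly used for RDF serialisations.
RDF_MEDIA_TYPES = {
    "turtle": "text/turtle",
    "rdfxml": "application/rdf+xml",
    "jsonld": "application/ld+json",
}


def build_lookup_request(uri: str, syntax: str = "turtle") -> Request:
    """Prepare (but do not send) an HTTP GET that asks for an RDF description."""
    if not uri.startswith(("http://", "https://")):
        raise ValueError("Linked Data principle 2 expects an http(s) URI")
    request = Request(uri, method="GET")
    # Content negotiation: the server may answer with RDF instead of HTML.
    request.add_header("Accept", RDF_MEDIA_TYPES[syntax])
    return request


# dbpedia:Leipzig written out in full (assumed prefix expansion).
request = build_lookup_request("http://dbpedia.org/resource/Leipzig")
print(request.full_url, request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual lookup; whether a given server honours the Accept header depends on that server.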
Page 11
RDF & Linked Data in a Nutshell
1. Graph-based RDF data model consisting of S-P-O (subject-predicate-object) statements (facts)
2. Serialised as RDF triples:
InfAI conf:organizes LSWT2019 .
LSWT2019 conf:starts "2019-05-22"^^xsd:date .
LSWT2019 conf:takesPlaceAt dbpedia:Leipzig .
3. Publication under a URL on the Web, intranet, or extranet
Graph view (Subject, Predicate, Object):
InfAI --conf:organizes--> LSWT2019
LSWT2019 --conf:starts--> 22.05.2019
LSWT2019 --conf:takesPlaceAt--> dbpedia:Leipzig
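The S-P-O model can be sketched in a few lines of plain Python: each statement is a tuple, and a query is a pattern with wildcards. This is a minimal illustration, not an RDF library; the `match` helper is hypothetical, and the start date follows the 22.05.2019 shown in the slide's graph view.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# The three statements from the slide as (subject, predicate, object) tuples.
TRIPLES: List[Triple] = [
    ("InfAI", "conf:organizes", "LSWT2019"),
    ("LSWT2019", "conf:starts", "2019-05-22"),
    ("LSWT2019", "conf:takesPlaceAt", "dbpedia:Leipzig"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


# Everything stated about the event, and who organizes it.
facts_about_event = list(match(TRIPLES, s="LSWT2019"))
organizers = [s for s, _, _ in match(TRIPLES, p="conf:organizes", o="LSWT2019")]
print(facts_about_event)
print(organizers)
```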
Page 12
Creating Knowledge Graphs with RDF
Linked Data
DHL --full name--> DHL International GmbH
DHL --industry--> Logistics
DHL --headquarters--> Post Tower
Post Tower --height--> 162.5 m
Post Tower --located in--> Bonn
Logistics --label--> Logistik
Logistics --label--> 物流
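Once facts are linked, questions can be answered by following edges across several nodes. The sketch below assumes one plausible reading of the diagram (DHL has its headquarters in the Post Tower, which is located in Bonn, and the Logistics node carries labels in several languages); the edge list, the language codes, and the helper names are assumptions for illustration.

```python
# Assumed reading of the slide's diagram as (subject, predicate, object) edges.
EDGES = [
    ("DHL", "full name", "DHL International GmbH"),
    ("DHL", "industry", "Logistics"),
    ("DHL", "headquarters", "Post Tower"),
    ("Post Tower", "height", "162.5 m"),
    ("Post Tower", "located in", "Bonn"),
]

# Multilingual labels; the language codes are assumptions.
LABELS = {"Logistics": {"en": "Logistics", "de": "Logistik", "zh": "物流"}}


def objects(subject, predicate):
    """All objects reachable from subject over one edge with this predicate."""
    return [o for s, p, o in EDGES if s == subject and p == predicate]


def follow(start, *path):
    """Walk a chain of predicates starting at one node."""
    frontier = [start]
    for predicate in path:
        frontier = [o for node in frontier for o in objects(node, predicate)]
    return frontier


def label(node, lang):
    """Label in the requested language, falling back to the node name."""
    return LABELS.get(node, {}).get(lang, node)


city = follow("DHL", "headquarters", "located in")
industry_de = [label(node, "de") for node in follow("DHL", "industry")]
print(city, industry_de)
```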
Page 13
Knowledge Graphs – A definition
• Fabric of concepts, classes, properties, relationships, and entity descriptions
• Uses a knowledge representation formalism (RDF, OWL)
• Holistic knowledge (multi-domain, source, granularity):
  • instance data (ground truth),
  • open (e.g. DBpedia, WikiData), private (e.g. supply chain data), closed data (product models),
  • derived, aggregated data,
  • schema data (vocabularies, ontologies)
  • meta-data (e.g. provenance, versioning, documentation, licensing)
  • comprehensive taxonomies to categorize entities
  • links between internal and external data
  • mappings to data stored in other systems and databases
Page 14
Source: https://cacm.acm.org/system/assets/0002/2618/021116_Google_KnowledgeGraph.large.jpg?1476779500&1455222197
Source: https://pic2.zhimg.com/v2-878ad2a55c440b18c889394a7abaa5d3_1200x500.jpg
WDAqua project vision
● Answer natural language questions
● Exploit knowledge encoded in the Web of Data
● Provide QA services to citizens, communities, and industry
Question (Q) and answer (A) over the Web of Data
Example question: Who is the director of Clockwork Orange?
1. Understand a spoken question
2. Analyse question
3. Find data to answer the question
4. Present the answer
Demo: http://wdaqua.eu/qa
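The analyse, find, and present steps can be sketched as a toy pipeline (speech recognition is left out). This is not how the WDAqua system works internally; the regular expression, the alias table, and the single stored fact are assumptions chosen only to make the example question answerable.

```python
import re

# Toy "Web of Data": a single fact keyed by (entity, relation).
FACTS = {("A Clockwork Orange", "director"): "Stanley Kubrick"}

# Maps surface forms found in questions to entity names (toy entity linking).
ALIASES = {"clockwork orange": "A Clockwork Orange"}

QUESTION = re.compile(
    r"^who is the (?P<relation>\w+) of (?P<entity>.+?)\??$", re.IGNORECASE
)


def analyse(question):
    """Analyse question: extract the relation and the entity mention."""
    found = QUESTION.match(question.strip())
    if found is None:
        return None
    return found.group("relation").lower(), found.group("entity").strip().lower()


def answer(question):
    """Find data for the analysed question and present the answer as text."""
    parsed = analyse(question)
    if parsed is None:
        return "Sorry, I cannot parse this question."
    relation, mention = parsed
    entity = ALIASES.get(mention)
    value = FACTS.get((entity, relation))
    if value is None:
        return "Sorry, I found no data for this question."
    return f"The {relation} of {entity} is {value}."


print(answer("Who is the director of Clockwork Orange?"))
```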
Page 23
How did information flows
change in the digital era?
Page 24
Computer
Source: http://todde.bplaced.net/c64otto2.jpg
Page 25
Road Maps
Source: http://www.stanfords.co.uk/content/images/thumbs/013/0136835_sample_171296_LondonMiniStreetAtlas_A-Zpbk_carto.jpeg
Source: https://images-na.ssl-images-amazon.com/images/I/5135QBYNWHL._SX348_BO1,204,203,200_.jpg
Page 26
Phone Books
Source: http://www.stanfords.co.uk/content/images/thumbs/013/0136835_sample_171296_LondonMiniStreetAtlas_A-Zpbk_carto.jpeg
Source: https://i0.wp.com/media.boingboing.net/wp-content/uploads/2017/06/Batman.jpg?w=640&ssl=1
Page 27
How does it
work today?
Page 28
Source: https://www.idealo.de/preisvergleich/ProductCategory/19116.html?qd=smartphone, 04.2019
Page 29
Source: https://www.google.de/maps/place/Atlanta,+Georgia,+USA/@33.756009,-84.4151149,13.5z/data=!4m5!3m4!1s0x88f5045d6993098d:0x66fede2f990b630b!8m2!3d33.7489954!4d-84.3879824, 04.2019
Page 30
The World of Publishing & Communication has profoundly changed
• New means adapted to the new possibilities were developed, e.g. "zooming", dynamics
• Business models changed completely
• More focus on data, interlinking of data / services and search in the data
• Integration and crowdsourcing play an important role
Page 31
What about
Scholarly
Communication?
Page 32
Scholarly Communication has not changed (much)
17th century, 19th century, 20th century, 21st century
Meanwhile, other information-intensive domains were completely disrupted:
mail order catalogs, street maps, phone books, …
Page 33
Challenges we are facing:
Digitalisation of Science
• Data integration and analysis
• Digital collaboration
Monopolisation by commercial actors
• Publisher lock-in effects
• Maximization of profits [1]
Reproducibility Crisis
• Majority of experiments are hard or impossible to reproduce [2]
Proliferation of publications
• Publication output doubled within a decade
• Continues to rise [3]
Deficiency of Peer Review
• Deteriorating quality [4]
• Predatory publishing
We need to rethink the way research is represented and communicated
[1] http://thecostofknowledge.com, https://www.projekt-deal.de
[2] M. Baker: 1,500 scientists lift the lid on reproducibility, Nature, 2016.
[3] Science and Engineering Publication Output Trends, National Science Foundation, 2018.
[4] J. Couzin-Frankel: Secretive and Subjective, Peer Review Proves Resistant to Study. Science, 2013.
Page 34
Search for CRISPR:
> 9.000 Results
Source: https://www.tib.eu/de/suchen/?id=198&tx_tibsearch_search%5Bquery%5D=CRISPR&tx_tibsearch_search%5Bsrt%5D=rk&tx_tibsearch_search%5Bcnt%5D=20, 04.2019
Page 35
How good is CRISPR (with respect to precision, safety, cost)?
What is specific about genome editing in insects?
Who has applied it to butterflies?
Search for CRISPR:
> 238.000 Results
Source: https://scholar.google.de/scholar?hl=de&as_sdt=0%2C5&q=CRISPR&btnG=, 04.2019
Page 36
How can
we fix it?
Page 37
Realizing Vannevar Bush's vision of Memex
Source: http://photos1.blogger.com/blogger/5874/1071/1600/Memex.jpg
Source: http://tntindex.blogspot.com/2014/10/tabletalk-vannevar-bushs-memex.html
Page 38
Domain-specific Concepts
Mathematics: Definitions, Theorems, Proofs, Methods, …
Physics: Experiments, Data, Models, …
Chemistry: Substances, Structures, Reactions, …
Computer Science: Concepts, Implementations, Evaluations, …
Technology: Standards, Processes, Elements, Units, Sensor data
Architecture: Regulations, Elements, Models, …
Overarching Concepts
Concepts
• Research problems
• Definitions
• Research approaches
• Methods
Artefacts
• Publications
• Data
• Software
• Image/Audio/Video
• Knowledge Graphs / Ontologies
Page 39
Chemistry Example: CRISPR Genome Editing
Source: https://cacm.acm.org/system/assets/0002/2618/021116_Google_KnowledgeGraph.large.jpg?1476779500&1455222197
Page 40
Chemistry Example: Populating the Graph
1. Original Publication
2. Adaptive Graph Curation & Completion
Author: Robert Reed
Research Problem: Genome editing in Lepidoptera
Methods: CRISPR / cas9
Applied on: Lepidoptera
Experimental Data: https://doi.org/10.5281/zenodo.896916
3. Graph representation
Nodes:
CRISPR / cas9 editing in Lepidoptera (https://doi.org/10.1101/130344)
Robert Reed (https://orcid.org/0000-0002-6065-6728)
Genome editing in Lepidoptera
Experimental Data (https://doi.org/10.5281/zenodo.896916)
CRISPR/cas9
Genome editing (https://www.wikidata.org/wiki/Q24630389)
Edge labels: addresses, isEvaluatedWith
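Populating the graph from the structured description in step 2 can be sketched as a conversion from field/value pairs to statements about the publication. The predicate names are derived mechanically from the field names and are assumptions; the slide itself only names the edge labels "addresses" and "isEvaluatedWith".

```python
# Identifier of the publication node, taken from the slide.
PAPER = "https://doi.org/10.1101/130344"

# The curated description from step 2 as field/value pairs.
DESCRIPTION = {
    "Author": "Robert Reed",
    "Research Problem": "Genome editing in Lepidoptera",
    "Methods": "CRISPR / cas9",
    "Applied on": "Lepidoptera",
    "Experimental Data": "https://doi.org/10.5281/zenodo.896916",
}


def to_predicate(field):
    """Turn a field name such as 'Research Problem' into 'researchProblem'."""
    first, *rest = field.split()
    return first.lower() + "".join(word.capitalize() for word in rest)


def to_triples(subject, description):
    """One (subject, predicate, object) statement per described field."""
    return [(subject, to_predicate(field), value) for field, value in description.items()]


triples = to_triples(PAPER, DESCRIPTION)
for triple in triples:
    print(triple)
```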
Page 41
Exploration and Question Answering
Research Challenge:
• Intuitive exploration leveraging the rich semantic representations
• Answer natural language questions
Pipeline: Question parsing → Named Entity Recognition (NER) & Linking (NEL) → Relation extraction → Query construction → Query execution → Result rendering
Q: How do different genome editing techniques compare?
SELECT ?approach ?feature WHERE {
  ?approach :addresses :GenomeEditing .
  ?approach :hasFeature ?feature }
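What the query above asks for can be sketched without a triple store: find every approach that addresses the research problem, then collect its features. The triples below are toy data, and `OtherMethod` and `OtherProblem` are hypothetical names added only to show that unrelated approaches are filtered out.

```python
# Toy statements; OtherMethod / OtherProblem are hypothetical filler.
TRIPLES = [
    ("CRISPR/cas9", "addresses", "GenomeEditing"),
    ("CRISPR/cas9", "hasFeature", "Site-specificity"),
    ("CRISPR/cas9", "hasFeature", "Safety"),
    ("TALENs", "addresses", "GenomeEditing"),
    ("TALENs", "hasFeature", "Site-specificity"),
    ("OtherMethod", "addresses", "OtherProblem"),
    ("OtherMethod", "hasFeature", "Speed"),
]


def select_approach_feature(triples, problem):
    """Join the two patterns of the query on the shared approach variable."""
    approaches = {s for s, p, o in triples if p == "addresses" and o == problem}
    return sorted(
        (s, o) for s, p, o in triples if p == "hasFeature" and s in approaches
    )


rows = select_approach_feature(TRIPLES, "GenomeEditing")
for approach, feature in rows:
    print(approach, feature)
```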
Page 42
Result: Automatic Generation of Comparisons / Surveys
Q: How do different genome editing techniques compare?
Engineered nucleases | Site-specificity | Safety | Ease-of-use / costs / speed
zinc finger nucleases (ZFN) | ++ (9-18 nt) | + | -- ($$$: screening, testing to define efficiency)
transcription activator-like effector nucleases (TALENs) | +++ (9-16 nt) | ++ | ++ (easy to engineer; 1 week / a few hundred dollars)
engineered meganucleases | +++ (12-40 nt) | 0 | -- ($$$: protein engineering, high-throughput screening)
CRISPR system/cas9 | ++ (5-12 nt) | - | +++ (easy to engineer; a few days / less than 200 dollars)
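Generating such a comparison amounts to pivoting per-approach statements into one row per approach and one column per criterion. The sketch below uses the ratings from the table in shortened form; the function names and the "n/a" placeholder for missing values are assumptions.

```python
# (approach, criterion, rating) statements, shortened from the comparison table.
RATINGS = [
    ("ZFN", "Site-specificity", "++"),
    ("ZFN", "Safety", "+"),
    ("ZFN", "Ease-of-use", "--"),
    ("TALENs", "Site-specificity", "+++"),
    ("TALENs", "Safety", "++"),
    ("TALENs", "Ease-of-use", "++"),
    ("Meganucleases", "Site-specificity", "+++"),
    ("Meganucleases", "Safety", "0"),
    ("Meganucleases", "Ease-of-use", "--"),
    ("CRISPR/cas9", "Site-specificity", "++"),
    ("CRISPR/cas9", "Safety", "-"),
    ("CRISPR/cas9", "Ease-of-use", "+++"),
]


def pivot(ratings):
    """Group ratings by approach and record criteria in first-seen order."""
    table, criteria = {}, []
    for approach, criterion, rating in ratings:
        table.setdefault(approach, {})[criterion] = rating
        if criterion not in criteria:
            criteria.append(criterion)
    return table, criteria


def render(ratings):
    """Render the pivoted data as pipe-separated text rows."""
    table, criteria = pivot(ratings)
    lines = [" | ".join(["Approach", *criteria])]
    for approach, row in table.items():
        lines.append(" | ".join([approach, *(row.get(c, "n/a") for c in criteria)]))
    return "\n".join(lines)


print(render(RATINGS))
```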
Page 49
Facilitating Comparisons of Research Contributions
Page 50
High-level Data Model: RDF + Metadata
A statement connects a resource via a predicate to another resource or a literal; every element carries its own metadata:
Resource: resource_id: R1, date: 2019-01-23, user: 1234
Resource: resource_id: R5, date: 2019-01-23, user: 6789
Literal: literal_id: L17, value: "ORKG", date: 2019-01-23, user: 1234
Predicate: predicate_id: P4, date: 2019-01-23, user: 6789
Statement: statement_id: S2, date: 2019-01-23, user: 6789
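The data model can be sketched with dataclasses in which every element, including the statement itself, carries creation date and user. The class and field names below are assumptions, as is the pairing of R1, P4, and L17 into statement S2; the slide does not say which elements S2 connects.

```python
from dataclasses import dataclass
from datetime import date
from typing import Union


@dataclass(frozen=True)
class Provenance:
    """Metadata attached to every element: creation date and user id."""
    created: date
    user: int


@dataclass(frozen=True)
class Resource:
    resource_id: str
    meta: Provenance


@dataclass(frozen=True)
class Literal:
    literal_id: str
    value: str
    meta: Provenance


@dataclass(frozen=True)
class Predicate:
    predicate_id: str
    meta: Provenance


@dataclass(frozen=True)
class Statement:
    """A subject-predicate-object statement with its own provenance."""
    statement_id: str
    subject: Resource
    predicate: Predicate
    obj: Union[Resource, Literal]
    meta: Provenance


day = date(2019, 1, 23)
r1 = Resource("R1", Provenance(day, 1234))
p4 = Predicate("P4", Provenance(day, 6789))
l17 = Literal("L17", "ORKG", Provenance(day, 1234))
# Assumed pairing: the slide does not state which elements S2 links.
s2 = Statement("S2", r1, p4, l17, Provenance(day, 6789))
print(s2.statement_id, s2.subject.resource_id, s2.predicate.predicate_id, s2.obj.value)
```

Keeping provenance on the statement separately from its parts makes it possible to record that one user created a resource while another user asserted something about it.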
Page 51
High-Level Architecture: Neo4j Graph Application
Layers: User Interface, Application, Domain, Persistence
User-facing functions: Contribute, Curate, Explore; third-party apps
REST API (interface to the outside world)
GraphQL (API query language)
SPARQL (data query language)
AuthN & AuthZ (ORCID or other SSO)
Business Logic (data input/output, consistency)
Domain Model (Statements, Resources, etc.)
Neo4j (Labeled Property Graph)
Virtuoso (Triple Store)
? (Other Database)
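The separation between business logic and exchangeable persistence backends can be sketched with a small interface. All names here (`StatementStore`, `InMemoryStore`, `GraphService`) are hypothetical; the in-memory store merely stands in for a backend such as Neo4j or Virtuoso and does not use their APIs.

```python
from typing import List, Protocol, Tuple

Triple = Tuple[str, str, str]


class StatementStore(Protocol):
    """Persistence interface that the business logic depends on."""

    def add(self, triple: Triple) -> None: ...

    def all(self) -> List[Triple]: ...


class InMemoryStore:
    """Stand-in for a persistence backend such as Neo4j or Virtuoso."""

    def __init__(self) -> None:
        self._triples: List[Triple] = []

    def add(self, triple: Triple) -> None:
        self._triples.append(triple)

    def all(self) -> List[Triple]:
        return list(self._triples)


class GraphService:
    """Business logic: checks consistency before data reaches persistence."""

    def __init__(self, store: StatementStore) -> None:
        self._store = store

    def contribute(self, s: str, p: str, o: str) -> Triple:
        triple = (s.strip(), p.strip(), o.strip())
        if not all(triple):
            raise ValueError("subject, predicate and object must be non-empty")
        if triple in self._store.all():
            raise ValueError("duplicate statement")
        self._store.add(triple)
        return triple

    def explore(self, subject: str) -> List[Triple]:
        return [t for t in self._store.all() if t[0] == subject]


service = GraphService(InMemoryStore())
service.contribute("R1", "P4", "ORKG")
print(service.explore("R1"))
```

Because the service only relies on the two methods of the interface, another backend could be swapped in without touching the consistency checks.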
Contact
Prof. Dr. Sören Auer
TIB & Leibniz University of Hannover
auer@tib.eu
https://de.linkedin.com/in/soerenauer
https://twitter.com/soerenauer
https://www.xing.com/profile/Soeren_Auer
http://www.researchgate.net/profile/Soeren_Auer

Knowledge Graphs for Scholarly Communication

  • 1. Prof. Dr. Sören Auer Faculty of Electrical Engineering & Computer Science Leibniz University of Hannover TIB Technische Informationsbibliothek Knowledge Graphs for Scholarly Communication
  • 2. Page 2 Zuse Z3: the beginning of Computing – close to the hardware Foto: Konrad Zuse Internet Archiv/Deutsches Museum/DFG
  • 4. Page 4 We can make things more intuitive Picture: The illustrated recipes of lucy eldridge http://thefoxisblack.com/2013/ 07/18/the-illustrated-recipes- of-lucy-eldridge/
  • 5. Computing more inuitive: procedural programming
  • 7. Computing more inuitive: OO programming
  • 9. Page 9 Computing even more inuitive: with cognitive data?! Sören Auer 9
  • 10. Page 10 Linked Data Principles Addressing the neglected third V (Variety) 1. Use URIs to identify the “things” in your data 2. Use http:// URIs so people (and machines) can look them up on the web 3. When a URI is looked up, return a description of the thing in the W3C Resource Description Format (RDF) 4. Include links to related things http://www.w3.org/DesignIssues/LinkedData.html
  • 11. Page 11 1. Graph based RDF data model consisting of S-P-O statements (facts) 2. Serialised as RDF Triples: InfAI conf:organizes LSWT2019 . LSWT2019 conf:starts “2019-22-07”^^xsd:date . LSWT2019 conf:takesPlaceAt dbpedia:Leipzig . 3. Publication under URL in Web, Intranet, Extranet RDF & Linked Data in a Nutshell LSWT2019 dbpedia:Leipzig 22.05.2019 Infai conf:organizes conf:starts conf:takesPlaceInSubject Predicate Object
  • 12. Page 12 Creating Knowledge Graphs with RDF Linked Data DHL Post Tower 162.5 m Bonn Logistics Logistik DHL International GmbH ?? located in label industry headquarters height label full name located in label industry headquarters full nameDHL Post Tower 162.5 m Bonn Logistics Logistik DHL International GmbH height 物流 label
  • 13. Page 13  Fabric of concept, class, property, relationships, entity desc.  Uses a knowledge representation formalism (RDF, OWL)  Holistic knowledge (multi-domain, source, granularity):  instance data (ground truth),  open (e.g. DBpedia, WikiData), private (e.g. supply chain data), closed data (product models),  derived, aggregated data,  schema data (vocabularies, ontologies)  meta-data (e.g. provenance, versioning, documentation licensing)  comprehensive taxonomies to categorize entities  links between internal and external data  mappings to data stored in other systems and databases Knowledge Graphs – A definition
  • 15. WDAqua project vision ● Answer natural language questions ● Exploit knowledge encoded in the Web of Data ● Provide QA services to citizens, communities, and industry 15 Q A Web of Data
  • 16. Who is the director of Clockwork Orange?
  • 17. Who is the director of Clockwork Orange? ● Understand a spoken question
  • 18. Who is the director of Clockwork Orange? ● Understand a spoken question ● Analyse question
  • 19. Who is the director of Clockwork Orange? ● Understand a spoken question ● Analyse question ● Find data to answer the question
  • 20. Who is the director of Clockwork Orange? ● Understand a spoken question ● Analyse question ● Find data to answer the question ● Present the answer
  • 21. Who is the director of Clockwork Orange? ● Understand a spoken question ● Analyse question ● Find data to answer the question ● Present the answer ● Data source:
  • 22. Who is the director of Clockwork Orange? ● Understand a spoken question ● Analyse question ● Find data to answer the question ● Present the answer ● Demo: http://wdaqua.eu/qa
  • 23. Page 23 How did information flows change in the digital era?
  • 25. Page 25 Road Maps Source: http://www.stanfords.co.uk/content/images/thumbs/013/0136835_sample_171296_LondonMiniStreetAtlas_A-Zpbk_carto.jpeg Source: https://images-na.ssl-images-amazon.com/images/I/5135QBYNWHL._SX348_BO1,204,203,200_.jpg
  • 26. Page 26 Phone Books Source: http://www.stanfords.co.uk/content/images/thumbs/013/0136835_sample_171296_LondonMiniStreetAtlas_A-Zpbk_carto.jpeg Source: https://i0.wp.com/media.boingboing.net/wp-content/uploads/2017/06/Batman.jpg?w=640&ssl=1
  • 27. Page 27 How does it work today?
  • 30. Page 30 The World of Publishing & Communication has profoundly changed ● New means adapted to the new possibilities were developed, e.g. “zooming”, dynamics ● Business models changed completely ● More focus on data, interlinking of data / services and search in the data ● Integration and crowdsourcing play an important role
  • 32. Page 32 Scholarly Communication has not changed (much) [Figure: publications from the 17th, 19th, 20th and 21st century] Meanwhile, other information-intensive domains were completely disrupted: mail-order catalogs, street maps, phone books, …
  • 33. Page 33 Challenges we are facing: We need to rethink how research is represented and communicated ● Digitalisation of science: data integration and analysis; digital collaboration ● Monopolisation by commercial actors: publisher lock-in effects; maximization of profits [1] ● Reproducibility crisis: the majority of experiments are hard or impossible to reproduce [2] ● Proliferation of publications: publication output doubled within a decade and continues to rise [3] ● Deficiency of peer review: deteriorating quality [4]; predatory publishing. References: [1] http://thecostofknowledge.com, https://www.projekt-deal.de [2] M. Baker: 1,500 scientists lift the lid on reproducibility, Nature, 2016. [3] Science and Engineering Publication Output Trends, National Science Foundation, 2018. [4] J. Couzin-Frankel: Secretive and Subjective, Peer Review Proves Resistant to Study. Science, 2013.
  • 34. Page 34 Search for CRISPR: > 9.000 Results Source: https://www.tib.eu/de/suchen/?id=198&tx_tibsearch_search%5Bquery%5D=CRISPR&tx_tibsearch_search%5Bsrt%5D=rk&tx_tibsearch_search%5Bcnt%5D=20, 04.2019
  • 35. Page 35 Search for CRISPR: > 238.000 Results. How good is CRISPR (w.r.t. precision, safety, cost)? What are the specifics of genome editing in insects? Who has applied it to butterflies? Source: https://scholar.google.de/scholar?hl=de&as_sdt=0%2C5&q=CRISPR&btnG=, 04.2019
  • 37. Page 37 Realizing Vannevar Bush's vision of Memex Source: http://photos1.blogger.com/blogger/5874/1071/1600/Memex.jpg Source: http://tntindex.blogspot.com/2014/10/tabletalk-vannevar-bushs-memex.html
  • 38. Page 38 Concepts ● Overarching concepts: research problems; definitions; research approaches; methods ● Artefacts: publications; data; software; image/audio/video; knowledge graphs / ontologies ● Domain-specific concepts: Mathematics (definitions, theorems, proofs, methods, …); Physics (experiments, data, models, …); Chemistry (substances, structures, reactions, …); Computer Science (concepts, implementations, evaluations, …); Technology (standards, processes, elements, units, sensor data); Architecture (regulations, elements, models, …)
  • 39. Page 39 Chemistry Example: CRISPR Genome Editing Source: https://cacm.acm.org/system/assets/0002/2618/021116_Google_KnowledgeGraph.large.jpg?1476779500&1455222197
  • 40. Page 40 Chemistry Example: Populating the Graph 1. Original publication 2. Adaptive graph curation & completion: Author: Robert Reed; Research problem: Genome editing in Lepidoptera; Methods: CRISPR/cas9; Applied on: Lepidoptera; Experimental data: https://doi.org/10.5281/zenodo.896916 3. Graph representation: the publication “CRISPR/cas9 editing in Lepidoptera” (https://doi.org/10.1101/130344) by Robert Reed (https://orcid.org/0000-0002-6065-6728) addresses “Genome editing in Lepidoptera” and isEvaluatedWith the experimental data (https://doi.org/10.5281/zenodo.896916); further nodes: CRISPR/cas9, Genome editing (https://www.wikidata.org/wiki/Q24630389)
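The curation step on this slide can be read as turning the form fields into graph statements. The following is a rough sketch under stated assumptions: the function `describe_contribution` and the predicates `hasAuthor` and `usesMethod` are hypothetical, only the two edge labels for addressing a problem and for evaluation data appear on the slide (spelling normalised here), and plain strings stand in for real graph nodes.

```python
PAPER = "https://doi.org/10.1101/130344"
AUTHOR = "https://orcid.org/0000-0002-6065-6728"
DATASET = "https://doi.org/10.5281/zenodo.896916"


def describe_contribution(paper, author, problem, method, dataset):
    """Return the set of (subject, predicate, object) statements for one paper."""
    return {
        (paper, "hasAuthor", author),        # hypothetical predicate name
        (paper, "addresses", problem),       # edge label from the slide
        (paper, "usesMethod", method),       # hypothetical predicate name
        (paper, "isEvaluatedWith", dataset), # edge label from the slide
    }


graph = describe_contribution(
    PAPER, AUTHOR, "Genome editing in Lepidoptera", "CRISPR/cas9", DATASET
)

# Everything the graph states about the paper, in a stable order.
for s, p, o in sorted(graph):
    print(f"{p}: {o}")
```

Using persistent identifiers (DOI, ORCID) as node names, as the slide does, is what lets statements from different papers connect without any extra matching step.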
  • 41. Page 41 Exploration and Question Answering. Research challenge: ● Intuitive exploration leveraging the rich semantic representations ● Answer natural language questions. Pipeline: Question parsing → Named Entity Recognition (NER) & Linking (NEL) → Relation extraction → Query construction → Query execution → Result rendering. Q: How do different genome editing techniques compare? SELECT ?approach ?feature WHERE { ?approach :addresses :GenomeEditing . ?approach :hasFeature ?feature }
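The query execution step for the two-pattern query on this slide can be sketched as a tiny pattern matcher over triples. This is not how a SPARQL engine is implemented; it is a toy illustration in which terms starting with `?` are variables, the helper names `match` and `query` are hypothetical, the identifier spellings are normalised, and the sample triples (including the `OtherApproach` distractor) are made up for the example.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; return None on mismatch."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(patterns, triples):
    """Join all patterns; each result is a dict from variable name to value."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in triples
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return bindings


triples = [
    ("ZFN", "addresses", "GenomeEditing"),
    ("CRISPR/cas9", "addresses", "GenomeEditing"),
    ("ZFN", "hasFeature", "safety: +"),
    ("CRISPR/cas9", "hasFeature", "safety: -"),
    ("OtherApproach", "hasFeature", "unrelated"),  # hypothetical distractor
]

results = query(
    [("?approach", "addresses", "GenomeEditing"),
     ("?approach", "hasFeature", "?feature")],
    triples,
)
rows = sorted((b["?approach"], b["?feature"]) for b in results)
print(rows)
```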
  • 42. Page 42 Result: Automatic Generation of Comparisons / Surveys. Q: How do different genome editing techniques compare?
      Engineered nuclease | Site-specificity | Safety | Ease of use / cost / speed
      Zinc finger nucleases (ZFN) | ++ (9-18 nt) | + | -- ($$$: screening, testing to define efficiency)
      Transcription activator-like effector nucleases (TALENs) | +++ (9-16 nt) | ++ | ++ (easy to engineer; 1 week / a few hundred dollars)
      Engineered meganucleases | +++ (12-40 nt) | 0 | -- ($$$: protein engineering, high-throughput screening)
      CRISPR system/cas9 | ++ (5-12 nt) | - | +++ (easy to engineer; a few days / less than 200 dollars)
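A comparison like the one on this slide can be produced by pivoting graph statements into rows. The sketch below assumes the graph holds (approach, property, value) triples; the function name `comparison_table` and the property names are hypothetical, and the ratings are copied from the slide for three of the four approaches.

```python
def comparison_table(triples, properties):
    """Pivot (subject, property, value) triples into one text row per subject."""
    rows = {}
    for subject, prop, value in triples:
        if prop in properties:
            rows.setdefault(subject, {})[prop] = value
    lines = [" | ".join(["approach", *properties])]
    for subject in sorted(rows):
        # Missing values are shown as "?" rather than silently dropped.
        cells = [rows[subject].get(prop, "?") for prop in properties]
        lines.append(" | ".join([subject, *cells]))
    return "\n".join(lines)


ratings = [
    ("ZFN", "site-specificity", "++"),
    ("ZFN", "safety", "+"),
    ("TALENs", "site-specificity", "+++"),
    ("TALENs", "safety", "++"),
    ("CRISPR/cas9", "site-specificity", "++"),
    ("CRISPR/cas9", "safety", "-"),
]

table = comparison_table(ratings, ["site-specificity", "safety"])
print(table)
```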
  • 49. Page 49 Facilitating Comparisons of Research Contributions
  • 50. Page 50 High-level Data Model: RDF + Metadata. A Statement (statement_id: S2, date: 2019-01-23, user: 6789) connects a Resource (resource_id: R1, date: 2019-01-23, user: 1234) via a Predicate (predicate_id: P4, date: 2019-01-23, user: 6789) to another Resource (resource_id: R5, date: 2019-01-23, user: 6789) or to a Literal (literal_id: L17, value: "ORKG", date: 2019-01-23, user: 1234)
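The data model on this slide, RDF-style statements whose every element carries its own provenance metadata, might be sketched with dataclasses as follows. The class layout is an assumption drawn from the field names on the slide, not the actual implementation, and choosing R1 as subject and L17 as object of statement S2 is only one possible reading of the diagram.

```python
import datetime
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Resource:
    resource_id: str
    date: datetime.date
    user: int


@dataclass(frozen=True)
class Predicate:
    predicate_id: str
    date: datetime.date
    user: int


@dataclass(frozen=True)
class Literal:
    literal_id: str
    value: str
    date: datetime.date
    user: int


@dataclass(frozen=True)
class Statement:
    statement_id: str
    subject: Resource
    predicate: Predicate
    object: Union[Resource, Literal]  # a statement may point to either kind
    date: datetime.date
    user: int


day = datetime.date(2019, 1, 23)
stmt = Statement(
    statement_id="S2",
    subject=Resource("R1", day, 1234),
    predicate=Predicate("P4", day, 6789),
    object=Literal("L17", "ORKG", day, 1234),  # assumed object of S2
    date=day,
    user=6789,
)
print(stmt.subject.resource_id, stmt.predicate.predicate_id, stmt.object.value)
```

Keeping date and user on each element, rather than only on the statement, records who created a resource separately from who asserted a fact about it.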
  • 51. Page 51 High-Level Architecture: Neo4j Graph Application. Layers: User Interface, Application, Domain, Persistence. Components: Contribute, Curate, Explore, Third-party Apps; REST API (interface to the outside world); GraphQL (API query language); SPARQL (data query language); AuthN & AuthZ (ORCID or other SSO); Business Logic (data input/output, consistency); Domain Model (Statements, Resources, etc.); Neo4j (Labeled Property Graph); Virtuoso (Triple Store); ? (other database)
  • 52. Contact Prof. Dr. Sören Auer TIB & Leibniz University of Hannover auer@tib.eu https://de.linkedin.com/in/soerenauer https://twitter.com/soerenauer https://www.xing.com/profile/Soeren_Auer http://www.researchgate.net/profile/Soeren_Auer