
Towards Automating Data Narratives

dgarijo

We propose a new area of research on automating data narratives. Data narratives are containers of information about computationally generated research findings. They have three major components: 1) a record of events that describes a new result through a workflow and/or the provenance of all the computations executed; 2) persistent entries for the key entities involved, such as data, software versions, and workflows; and 3) a set of narrative accounts, which are automatically generated, human-consumable renderings of the record and entities that can be included in a paper. Different narrative accounts can be generated for different audiences, varying in content and detail according to the reader's level of interest or expertise. Data narratives can make science more transparent and reproducible, because they ensure that the text description of a computational experiment reflects with high fidelity what was actually done. They can be incorporated in papers, either in the methods section or as supplementary materials. We introduce DANA, a prototype that illustrates how to generate data narratives automatically, and describe the information it uses from the computational records. We also present a formative evaluation of our approach and discuss potential uses of automated data narratives.
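
The three components described above can be pictured with a minimal sketch. This is an illustration only, not DANA's implementation: the class names (`Step`, `DataNarrative`), the `account` method, the "summary"/"full" detail levels, and the example.org identifiers are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One executed computation in the record of events (hypothetical shape)."""
    name: str
    software: str                      # key into DataNarrative.entities
    inputs: list[str]                  # keys into DataNarrative.entities
    parameters: dict[str, str] = field(default_factory=dict)


@dataclass
class DataNarrative:
    record: list[Step]                 # 1) record of events
    entities: dict[str, str]           # 2) persistent identifier for each entity

    def account(self, detail: str = "summary") -> str:
        """3) Render the record as text at the requested level of detail."""
        sentences = []
        for step in self.record:
            text = f"{step.name} was run with {step.software} on {', '.join(step.inputs)}"
            if detail == "full":
                # The detailed account adds parameters and persistent identifiers.
                refs = [step.software, *step.inputs]
                ids = "; ".join(f"{r}: {self.entities[r]}" for r in refs)
                params = ", ".join(f"{k}={v}" for k, v in step.parameters.items())
                text += f" (parameters: {params or 'none'}; identifiers: {ids})"
            sentences.append(text + ".")
        return " ".join(sentences)


# Hypothetical record: one step, one input file, one software version.
narrative = DataNarrative(
    record=[Step("Step 1", "ToolA 1.0", ["Input.txt"], {"Param1": "2"})],
    entities={
        "ToolA 1.0": "https://example.org/software/tool-a/1.0",
        "Input.txt": "https://example.org/data/input",
    },
)

print(narrative.account())        # short account for a general reader
print(narrative.account("full"))  # detailed account for an expert reader
```

Because both accounts are rendered from the same record, the text cannot drift from what was executed; only the amount of detail changes with the audience.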

TOWARDS AUTOMATING DATA NARRATIVES
Yolanda Gil, Daniel Garijo
Information Sciences Institute
University of Southern California
@yolandagil, @dgarijov
{gil,dgarijo}@isi.edu
The Scientific Research Process
Towards Automating Data Narratives. Yolanda Gil and Daniel Garijo
Formulate hypothesis
Define the experiment (data + method)
Find data
Run experiments (methods)
Meta-analysis of results
Revise hypothesis
The products of scientific research
[Figure: the research cycle from the previous slide (formulate hypothesis, define the experiment, find data, run experiments, meta-analysis of results, revise hypothesis), annotated with its products: publication, methods, data, software, execution traces]
Reconstructing the Computations from the Text in the Paper
Comparison of Ligand Binding Sites
The SMAP software was used to compare the binding sites of the 749 M.tb protein structures plus 1,446 homology models (a total of 2,195 protein structures) with the 962 binding sites of 274 approved drugs, in an all-against-all manner. While the binding sites of the approved drugs were already defined by the bound ligand, the entire protein surface of each of the 2,195 M.tb protein structures was scanned in order to identify alternative binding sites. For each pairwise comparison, a P-value representing the significance of the binding site similarity was calculated.
“The Mycobacterium Tuberculosis Drugome and Its Polypharmacological Implications.” Kinnings, S. L.; Xie, L.; Fung, K. H.; Jackson, R. M.; Xie, L.; and Bourne, P. E. PLoS Computational Biology, 2011.
Problem with current approaches: what the paper said vs what the software did
Actual computation
Problem with current approaches
- Incomplete
  - Missing steps and intermediate data
- Ambiguous
  - Several interpretations about how computations are done
- Inconsistent level of detail
  - Mixing of general methods with execution details
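[Figure: diagrams illustrating each problem. Incomplete: Step 1, Step ??, Step 2. Ambiguous: Step 1, Step 2 versus Step 1', Step 2' (Implementation 1? Implementation 2?). Inconsistent level of detail: Step 1, Step 2 alongside Param1 = 2 and File = "Input.txt".]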
Towards Automating Data Narratives

  • 1. TOWARDS AUTOMATING DATA NARRATIVES. Yolanda Gil, Daniel Garijo. Information Sciences Institute, University of Southern California. @yolandagil, @dgarijov. {gil,dgarijo}@isi.edu
  • 2. The Scientific Research Process. Towards Automating Data Narratives. Yolanda Gil and Daniel Garijo.
      Steps: Formulate hypothesis; Define the experiment (data + method); Find data; Run experiments (methods); Meta-analysis of results; Revise hypothesis.
  • 3. The products of scientific research
      Steps: Formulate hypothesis; Define the experiment (data + method); Find data; Run experiments (methods); Meta-analysis of results; Revise hypothesis.
      Products: Publication; Methods; Data; Software; Execution traces.
  • 4. Reconstructing the Computations from the Text in the Paper
      Comparison of Ligand Binding Sites: "The SMAP software was used to compare the binding sites of the 749 M.tb protein structures plus 1,446 homology models (a total of 2,195 protein structures) with the 962 binding sites of 274 approved drugs, in an all-against-all manner. While the binding sites of the approved drugs were already defined by the bound ligand, the entire protein surface of each of the 2,195 M.tb protein structures was scanned in order to identify alternative binding sites. For each pairwise comparison, a P-value representing the significance of the binding site similarity was calculated."
      Source: "The Mycobacterium Tuberculosis Drugome and Its Polypharmacological Implications." Kinnings, S. L.; Xie, L.; Fung, K. H.; Jackson, R. M.; Xie, L.; and Bourne, P. E. PLoS Computational Biology, 2011.
  • 5. Problem with current approaches: what the paper said vs what the software did
      Figure label: "Actual computation".
      Source: "The Mycobacterium Tuberculosis Drugome and Its Polypharmacological Implications." Kinnings, S. L.; Xie, L.; Fung, K. H.; Jackson, R. M.; Xie, L.; and Bourne, P. E. PLoS Computational Biology, 2011.
  • 6. Problem with current approaches
      • Incomplete
          • Missing steps and intermediate data
      • Ambiguous
          • Several interpretations about how computations are done
      • Inconsistent level of detail
          • Mixing of general methods with execution details
      Figure labels: "Step1, Step ??, Step 2 ?"; "Step1, Step 2, Step1’, Step 2’, Implementation 1? Implementation 2?"; "Step1, Step 2, Param1 = 2, File = “Input.txt”".
  • 7. Our approach: From research outputs to text
      Steps: Formulate hypothesis; Define the experiment (data + method); Find data; Run experiments (methods); Meta-analysis of results; Revise hypothesis.
      Figure labels: Publication; Methods; Data; Execution traces; Report generation.
  • 8. Our approach: From research outputs to text
      Steps: Formulate hypothesis; Define the experiment (data + method); Find data; Run experiments (methods); Meta-analysis of results; Revise hypothesis.
      Figure labels: Publication; Methods; Data; Execution traces; Report generation.
      Reports must:
      • Be true to actual events
      • Enable inspection
      • Be human-understandable
      • Abstract details
  • 9. Data Narratives
      • Interlinked record of
          • High level workflows (methods)
          • Provenance of results (method executions)
          • Data
          • Software metadata
      • Persistent identifiers
      • Data narrative accounts
          • Alternative descriptions of a result with a different level of detail.
      Figure labels: Truth to actual records; Inspectability; Human readable, levels of abstraction.
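The three components listed on this slide can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class and field names (`Entity`, `Event`, `DataNarrative`, `unresolved`) are hypothetical and are not taken from DANA; only the dataset names and DOIs come from the example slides.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A key entity with a persistent identifier (hypothetical structure)."""
    name: str
    kind: str        # e.g. "data", "software", "workflow"
    identifier: str  # e.g. a DOI or a URI


@dataclass
class Event:
    """One executed step in the record of events (hypothetical structure)."""
    step: str
    used: list[str]
    generated: list[str]
    parameters: dict[str, object] = field(default_factory=dict)


@dataclass
class DataNarrative:
    """Container linking the record, the persistent entries and the accounts."""
    record: list[Event] = field(default_factory=list)
    entities: dict[str, Entity] = field(default_factory=dict)
    accounts: dict[str, str] = field(default_factory=dict)  # view name -> text

    def unresolved(self) -> set[str]:
        """Names used or generated by events that lack a persistent entry."""
        referenced = {n for e in self.record for n in e.used + e.generated}
        return referenced - set(self.entities)


narrative = DataNarrative(
    record=[Event("Train Topics", used=["Reuters R8"], generated=["Topics"],
                  parameters={"iterations": 100})],
    entities={"Reuters R8": Entity("Reuters R8", "data",
                                   "10.6084/m9.figshare.776887")},
)
missing = narrative.unresolved()  # entities still needing an identifier
```

Checking for unresolved names before generating text is one way to keep a narrative inspectable: every item mentioned in an account should resolve to a persistent entry.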
  • 10. Data Narrative Accounts: An example
      How was the dataset used in this visualization generated?
  • 11. Data Narrative Accounts: An example
      • Execution view
          • Inputs, parameters and main outputs
          • “Topic modeling was run on the Reuters R8 dataset (10.6084/m9.figshare.776887), and English Words dataset (10.6084/m9.figshare.776888), with iterations set to 100, stop word size set to 3, number of topics set to 10 and batch size set to 10. The results are at 10.6084/m9.figshare.776856”
      • Data view
          • Just the data that influenced the results
          • “The topics at 10.6084/m9.figshare.776856 were found in the Reuters R8 dataset (10.6084/m9.figshare.776887) and English Words dataset (10.6084/m9.figshare.776888)”
      • Method view
          • Main steps based on their functionality
          • “Topic training was run on the input dataset. The results are product of PlotTopics, a visualization step”
  • 12. Data Narrative Accounts: An example
      • Dependency view
          • How the steps depend on each other
          • “First, the input data is filtered by Stop Words, followed by Small Words, Format Dataset, and Train Topics. The final results are produced by Plot Topics”
      • Implementation view
          • How the steps were implemented in the execution
          • “Train topics was implemented using Latent Dirichlet allocation”
      • Software view
          • Details on the software used to implement the steps
          • “The train topics step was generated with Online LDA open source software, written in Java. Plot topics was generated with the Termite software.”
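A dependency view such as the one quoted above can be derived by ordering the workflow steps topologically and verbalizing the order. The sketch below is an illustration only, not DANA's method: the function names and the sentence wording are assumptions, and it uses Python's standard `graphlib` module on the step names from the slide.

```python
from graphlib import TopologicalSorter


def join_steps(steps: list[str]) -> str:
    """Join step names as 'A', 'A and B' or 'A, B, and C'."""
    if len(steps) <= 2:
        return " and ".join(steps)
    return ", ".join(steps[:-1]) + ", and " + steps[-1]


def dependency_view(depends_on: dict[str, set[str]]) -> str:
    """Verbalize the steps in an order that respects their dependencies.

    depends_on maps each step to the set of steps it depends on.
    """
    order = list(TopologicalSorter(depends_on).static_order())
    if len(order) == 1:
        return f"The final results are produced by {order[0]}."
    first, *middle, last = order
    text = f"First, the input data is processed by {first}"
    if middle:
        text += f", followed by {join_steps(middle)}"
    return f"{text}. The final results are produced by {last}."


# Step names taken from the example slide; the graph encoding is hypothetical.
workflow = {
    "Small Words": {"Stop Words"},
    "Format Dataset": {"Small Words"},
    "Train Topics": {"Format Dataset"},
    "Plot Topics": {"Train Topics"},
}
account = dependency_view(workflow)
```

For a linear chain the order is unique. For a workflow with parallel branches, any valid topological order would be verbalized, so a fuller generator would probably need to describe branches explicitly.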
  • 13. DANA: DAta NArratives
      Components: Experiment Records; Provenance Repository; Software registry; Query patterns; Data Narrative aggregator; Experiment-specific Knowledge Base; DANA Generator; Narrative accounts.
      Interactions: Input; Resource request / Response; Get pattern; Get query; Pattern result; Output.
      1. Identify which experiment records to describe
      2. Generation of an Experiment-specific knowledge base
      3. Creation of the Data Narrative from templates
      4. Produce narrative accounts
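Steps 1 and 2 above, together with the "query patterns" component, can be illustrated with a toy in-memory triple store. This is a sketch under stated assumptions rather than DANA's implementation: the predicate names (`partOf`, `implementedBy`, `writtenIn`), the run identifiers and the `SortTool` entry are hypothetical, and a real system would more likely query RDF provenance with SPARQL.

```python
Triple = tuple[str, str, str]


def build_knowledge_base(execution: str, provenance: list[Triple],
                         registry: list[Triple]) -> list[Triple]:
    """Collect the triples about one execution plus metadata on its software."""
    steps = {s for s, p, o in provenance if p == "partOf" and o == execution}
    kb = [t for t in provenance if t[0] in steps]
    software = {o for s, p, o in kb if p == "implementedBy"}
    return kb + [t for t in registry if t[0] in software]


def match(pattern: Triple, triples: list[Triple]):
    """Yield variable bindings for one pattern; terms starting with '?' are variables."""
    for triple in triples:
        binding: dict[str, str] = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if binding.setdefault(term, value) != value:
                    break  # same variable bound to two different values
            elif term != value:
                break      # constant term does not match
        else:
            yield binding


def query(patterns: list[Triple], triples: list[Triple]) -> list[dict[str, str]]:
    """Join several triple patterns into a small conjunctive query."""
    solutions: list[dict[str, str]] = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            # Substitute variables already bound by earlier patterns.
            bound = tuple(solution.get(term, term) for term in pattern)
            for binding in match(bound, triples):
                extended.append({**solution, **binding})
        solutions = extended
    return solutions


provenance = [
    ("run1/TrainTopics", "partOf", "run1"),
    ("run1/TrainTopics", "implementedBy", "OnlineLDA"),
    ("run2/Sort", "partOf", "run2"),
    ("run2/Sort", "implementedBy", "SortTool"),
]
registry = [("OnlineLDA", "writtenIn", "Java"), ("SortTool", "writtenIn", "C")]

kb = build_knowledge_base("run1", provenance, registry)
software_pattern = [("?step", "implementedBy", "?sw"), ("?sw", "writtenIn", "?lang")]
results = query(software_pattern, kb)
```

Restricting the knowledge base to one execution first keeps each query pattern simple, since patterns never need to filter out unrelated runs.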
  • 14. Generation of an experiment-specific knowledge base: scientific workflows
      WINGS workflow system
      • High level workflow templates that can be elaborated through component ontologies
      http://www.wings-workflows.org/
  • 15. Generation of an experiment-specific knowledge base: provenance records as RDF
      See a hyperlinked description/visualization at its persistent URL: https://goo.gl/v8EPg5
      http://www.opmw.org/export/page/resource/WorkflowExecutionAccount/ACCOUNT1348628778528
      10.6084/m9.figshare.776887
  • 16. Generation of an experiment-specific knowledge base: Software metadata
      • Catalog of motifs [Garijo et al 2013]
          • A catalog of common domain-independent workflow patterns based on the functionality of workflow steps
          • http://purl.org/net/wf-motifs
      • OntoSoft distributed software registry [Gil et al 2016]
          • Descriptions of hundreds of software components
          • Key metadata of software: License; Usage; Authors; Web page; Code repository; etc.
          • http://www.ontosoft.org/portal
      [Garijo et al 2013]: Common Motifs in Scientific Workflows: An Empirical Analysis. Garijo, D.; Alper, P.; Belhajjame, K.; Corcho, O.; Gil, Y.; and Goble, C. Future Generation Computer Systems, 2013.
      [Gil et al 2016]: OntoSoft: A Distributed Semantic Registry for Scientific Software. Gil, Y.; Garijo, D.; Mishra, S.; and Ratnakar, V. In Proceedings of the Twelfth IEEE Conference on eScience, Baltimore, MD, 2016.
  • 17. Generating narrative accounts
      Figure labels: RDF; Account template.
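The slide pairs RDF data with an account template. A minimal sketch of template filling is shown below, assuming the facts have already been extracted from the RDF into a plain dictionary. The template strings, the `facts` layout and the function names are hypothetical; the resulting wording is close to, but not identical to, the execution and data views quoted in the example slides.

```python
def join_items(items) -> str:
    """Join phrases as 'a', 'a and b' or 'a, b and c'."""
    items = list(items)
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]


# Hypothetical templates, one per narrative view.
ACCOUNT_TEMPLATES = {
    "execution": ("{method} was run on {inputs}, with {parameters}. "
                  "The results are at {outputs}."),
    "data": "The results at {outputs} were derived from {inputs}.",
}


def render_account(view: str, facts: dict) -> str:
    """Fill the template for the requested view with facts about one execution."""
    fields = {
        "method": facts["method"],
        "inputs": join_items(f"the {name} dataset ({doi})"
                             for name, doi in facts["inputs"]),
        "parameters": join_items(f"{name} set to {value}"
                                 for name, value in facts["parameters"].items()),
        "outputs": join_items(facts["outputs"]),
    }
    # Templates ignore fields they do not mention.
    return ACCOUNT_TEMPLATES[view].format(**fields)


# Values taken from the example slides; the dictionary layout is an assumption.
facts = {
    "method": "Topic modeling",
    "inputs": [("Reuters R8", "10.6084/m9.figshare.776887"),
               ("English Words", "10.6084/m9.figshare.776888")],
    "parameters": {"iterations": 100, "stop word size": 3,
                   "number of topics": 10, "batch size": 10},
    "outputs": ["10.6084/m9.figshare.776856"],
}
execution_account = render_account("execution", facts)
data_account = render_account("data", facts)
```

Keeping one template per view means the same facts can be rendered at several levels of detail, which matches the idea of offering different accounts to different audiences.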
  • 18. Formative evaluation
      • Survey with 6 target scenarios
      • Each scenario:
          • Description of a situation where a user has to do a task
  • 19. Formative evaluation
      • Survey with 6 target scenarios
      • Each scenario:
          • Description of a situation where a user has to do a task
          • A workflow sketch of the analysis done
          • Six candidate narratives of that workflow sketch
  • 20. Formative evaluation
      • Survey with 6 target scenarios
      • Each scenario:
          • Description of a situation where a user has to do a task
          • A workflow sketch of the analysis done
          • Six candidate narratives of that workflow sketch
      • 12 responses from users
      • Results
          • Each narrative is considered appropriate for describing some scenario
          • Different users chose different narratives for each scenario
  • 21. Summary: Benefits of Data Narratives
      Features                 | Data Narratives | Provenance Records | Visualizations | Articles | Electronic Notebooks
      Truth to actual records  | Y               | Y                  | Just data      | Maybe    | Maybe
      Enable inspection        | Y               | Y                  | Just data      | N        | Y
      Human understandable     | Y               | N                  | Y              | Y        | Y
      Abstract details         | Y               | N                  | Y              | Y        | N
      Part of papers           | Y               | N                  | Y              | Y        | Maybe
      Persistent               | Y               | Maybe              | N              | Y        | Maybe
      Different audiences      | Y               | N                  | N              | N        | N
      Automatically generated  | Y               | Y                  | Maybe          | N        | N
  • 22. Conclusions and future work
      • Data Narratives
          • Interlink data, software, workflows and provenance of a scientific experiment
          • Persistent identifiers
          • Narrative accounts
      • Future work:
          • Ease navigation through levels of detail
          • Mixing details of different narratives
          • Improve summarization of results
          • Additional evaluation of narrative usefulness
      See more: http://dgarijo.github.io/DataNarratives/
  • 23. TOWARDS AUTOMATING DATA NARRATIVES Yolanda Gil, Daniel Garijo Information Sciences Institute and Department of Computer Science @yolandagil, @dgarijov {gil,dgarijo}@isi.edu