Approaches for the Integration of Visual and Computational Analysis of Biomedical Data

Nils Gehlenborg, Assistant Professor of Biomedical Informatics
HARVARD MEDICAL SCHOOL
DEPARTMENT OF BIOMEDICAL INFORMATICS
NILS GEHLENBORG
@nils_gehlenborg
http://gehlenborglab.org
FRITZ LEKSCHAS
HARVARD MEDICAL SCHOOL
BIG PILES OF DATA …
Data Repositories
general: ArrayExpress, GEO, Metabolights, PRIDE, dbGAP, …
specialized: ENCODE, Roadmap Epigenomics, …
… OFFER OPPORTUNITIES …
SINGLE OR FEW DATA SETS
Test hypotheses without generating new data.
Use published data as supporting evidence for findings based on
your own data sets.
MANY DATA SETS
Conduct meta-analyses, e.g. to characterize expression patterns in
human tissues or to link diseases.
M. Lukk, et al., Nature Biotechnology, 28(4):322–324 (2010)
S. Suthram et al., PLoS Computational Biology, 6(2) (2010)
COMMON BEHAVIOR OF RESEARCH PARASITES!
N Gehlenborg et al., manuscript in preparation
DATA REPOSITORY
VISUALIZATION TOOLS
ANALYSIS PIPELINES
GALAXY Toolshed
Workflow Editor
Tools
REST API
Workflow Inputs
Workflow Outputs
http://www.refinery-platform.org
… BUT NOT SO FAST!
[Figure, built up over several slides. Labels: Data Sets (Metadata, Data Files); Free Text, Keywords, Text-Based Search; Annotation, Mapping, Ontologies (Root, Terminal, subclassof); Semantic Visual Exploration; SATORI]
SATORI: A System for Ontology-Guided Visual Exploration of Biomedical Data Repositories
http://satori.refinery-platform.org
D, R, C: Data Analyst, Group Leader, Data Curator
Need 1: Find data sets that match certain experimental characteristics.
Need 2: Find data sets that are similar (or dissimilar) to given data sets.
Need 3: Get an overview of the distribution of the experimental characteristics across a collection of data sets.
Need 4: Get an overview of the annotation term hierarchy and term usage.
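Needs 1 and 2 can be illustrated with a small sketch. It assumes each data set is annotated with a set of ontology terms and that the ontology is available as subclass-of edges. The term names, data set identifiers, and helper functions are hypothetical and are not taken from SATORI's implementation. Matching uses a term plus all of its subclasses, and similarity is the Jaccard index of the annotation sets, which is one common choice and not necessarily the measure used in the talk.

```python
from collections import defaultdict

# Hypothetical subclass-of edges (child, parent); not from a real ontology.
SUBCLASS_OF = [
    ("tissue", "anatomical entity"),
    ("liver", "tissue"),
    ("kidney", "tissue"),
    ("hepatocyte", "liver"),
]

# Hypothetical data sets and the terms they are annotated with.
ANNOTATIONS = {
    "ds1": {"liver"},
    "ds2": {"hepatocyte", "kidney"},
    "ds3": {"kidney"},
}


def build_children(edges):
    """Index the edges as parent -> set of direct subclasses."""
    children = defaultdict(set)
    for child, parent in edges:
        children[parent].add(child)
    return children


def descendants(term, children):
    """Return the term itself plus every term below it via subclass-of."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen


def find_data_sets(term, annotations, children):
    """Need 1: data sets annotated with the term or any of its subclasses."""
    terms = descendants(term, children)
    return sorted(ds for ds, annotated in annotations.items() if annotated & terms)


def jaccard(a, b):
    """Need 2: overlap of two annotation sets, between 0.0 and 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


CHILDREN = build_children(SUBCLASS_OF)
liver_sets = find_data_sets("liver", ANNOTATIONS, CHILDREN)
similarity = jaccard(ANNOTATIONS["ds2"], ANNOTATIONS["ds3"])
```

Querying for "liver" also returns the data set annotated only with "hepatocyte", which is something a plain keyword search on "liver" would miss.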
Peter Pirolli and Stu Card
[Figure: ontology terms A, B, C and their data sets shown as list graph, tree, and tree map for Scenarios 1 to 3; axis labels: Term, Annotations]
The Art Institute of Chicago
Acknowledgements
HARVARD MEDICAL SCHOOL
Peter J Park & all members of the Computational Genomics Lab
Fritz Lekschas, Jennifer K Marx, Scott Ouellette, Anton Xue, Psalm Haseley
HARVARD SCHOOL OF PUBLIC HEALTH Ilya Sytchev, Shannan Ho Sui
UNIVERSITY OF SHEFFIELD David R Jones, Winston Hide
JOHANNES KEPLER UNIVERSITY LINZ Stefan Luger, Holger Stitz, Marc Streit
Funding
NIH/NHGRI R00 HG007583, Harvard Stem Cell Institute
Web
http://satori.refinery-platform.org · http://refinery-platform.org
We are hiring postdocs & developers!
HARVARD MEDICAL SCHOOL
DEPARTMENT OF BIOMEDICAL INFORMATICS
See http://gehlenborglab.org or http://dbmi.med.harvard.edu for details.
Data visualization, analysis, and management for:
• genomic structural variants
• dynamics of the 3D genome
• cancer subtypes in patient cohorts
• exploration tools for data repositories
• provenance graphs
[Figure: ontology term graphs. Legend: Term, Terminal term, To be deleted, To be duplicated; node labels: Term size, Cumulative size; panels: 1. Global, 2. Tree Map, 3. Node-Link Diagram]
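The figure labels distinguish a term's size from its cumulative size. One plausible reading, stated here as an assumption and not as the talk's definition, is that term size counts the data sets annotated directly with a term, while cumulative size also counts the data sets annotated with any of its subclasses. The sketch below computes both on a hypothetical, acyclic hierarchy. All names in it are illustrative.

```python
from collections import defaultdict

# Hypothetical (child, parent) subclass-of edges; assumed to be acyclic.
EDGES = [("B", "A"), ("C", "A"), ("D", "C")]

# Hypothetical direct annotations: term -> data sets annotated with it.
DIRECT = {
    "A": set(),
    "B": {"ds1", "ds2"},
    "C": {"ds2"},
    "D": {"ds3", "ds4"},
}


def term_sizes(edges, direct):
    """Return {term: (term size, cumulative size)} for every term."""
    children = defaultdict(list)
    for child, parent in edges:
        children[parent].append(child)

    cache = {}

    def collect(term):
        # Union of the data sets on this term and on all of its subclasses.
        if term not in cache:
            data_sets = set(direct.get(term, ()))
            for child in children.get(term, ()):
                data_sets |= collect(child)
            cache[term] = data_sets
        return cache[term]

    terms = set(direct) | {term for edge in edges for term in edge}
    return {
        term: (len(direct.get(term, ())), len(collect(term)))
        for term in sorted(terms)
    }


SIZES = term_sizes(EDGES, DIRECT)
```

Working with sets instead of summing counts means a data set annotated with two sibling terms (here `ds2`) is counted only once at their common parent.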
1 of 42

Recommended

Data Visualization to Enhance our Understanding of the Cancer Genome by
Data Visualization to Enhance our Understanding of the Cancer GenomeData Visualization to Enhance our Understanding of the Cancer Genome
Data Visualization to Enhance our Understanding of the Cancer GenomeNils Gehlenborg
1K views68 slides
Visual Exploration of Clinical and Genomic Data for Patient Stratification by
Visual Exploration of Clinical and Genomic Data for Patient StratificationVisual Exploration of Clinical and Genomic Data for Patient Stratification
Visual Exploration of Clinical and Genomic Data for Patient StratificationNils Gehlenborg
1.2K views131 slides
Current advances to bridge the usability-expressivity gap in biomedical seman... by
Current advances to bridge the usability-expressivity gap in biomedical seman...Current advances to bridge the usability-expressivity gap in biomedical seman...
Current advances to bridge the usability-expressivity gap in biomedical seman...Maulik Kamdar
1.2K views51 slides
Visualizing Primary Data form Taxonomic Literature by
Visualizing Primary Data form Taxonomic LiteratureVisualizing Primary Data form Taxonomic Literature
Visualizing Primary Data form Taxonomic Literaturemillerjeremya
485 views1 slide
Emerging challenges in data-intensive genomics by
Emerging challenges in data-intensive genomicsEmerging challenges in data-intensive genomics
Emerging challenges in data-intensive genomicsmikaelhuss
5.5K views27 slides
Mikel egana itbam_2010_ogo_system by
Mikel egana itbam_2010_ogo_systemMikel egana itbam_2010_ogo_system
Mikel egana itbam_2010_ogo_systemMikel Egaña Aranguren, Ph.D.
376 views25 slides

More Related Content

What's hot

FedCentric_Presentation by
FedCentric_PresentationFedCentric_Presentation
FedCentric_PresentationYatpang Cheung
107 views1 slide
Bioinformatics Final Report by
Bioinformatics Final ReportBioinformatics Final Report
Bioinformatics Final ReportShruthi Choudary
9.1K views38 slides
Semantic approaches for biomedical knowledge discovery - Discovery Science 20... by
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...Michel Dumontier
1.3K views63 slides
Global phenotypic data sharing standards to maximize diagnostic discovery by
Global phenotypic data sharing standards to maximize diagnostic discoveryGlobal phenotypic data sharing standards to maximize diagnostic discovery
Global phenotypic data sharing standards to maximize diagnostic discoverymhaendel
817 views66 slides
Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery by
Data Translator: an Open Science Data Platform for Mechanistic Disease DiscoveryData Translator: an Open Science Data Platform for Mechanistic Disease Discovery
Data Translator: an Open Science Data Platform for Mechanistic Disease Discoverymhaendel
350 views26 slides
Bioinformatics Databases by
Bioinformatics DatabasesBioinformatics Databases
Bioinformatics Databasescschlos2
1.5K views11 slides

What's hot(20)

Semantic approaches for biomedical knowledge discovery - Discovery Science 20... by Michel Dumontier
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Michel Dumontier1.3K views
Global phenotypic data sharing standards to maximize diagnostic discovery by mhaendel
Global phenotypic data sharing standards to maximize diagnostic discoveryGlobal phenotypic data sharing standards to maximize diagnostic discovery
Global phenotypic data sharing standards to maximize diagnostic discovery
mhaendel817 views
Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery by mhaendel
Data Translator: an Open Science Data Platform for Mechanistic Disease DiscoveryData Translator: an Open Science Data Platform for Mechanistic Disease Discovery
Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery
mhaendel350 views
Bioinformatics Databases by cschlos2
Bioinformatics DatabasesBioinformatics Databases
Bioinformatics Databases
cschlos21.5K views
Data analysis & integration challenges in genomics by mikaelhuss
Data analysis & integration challenges in genomicsData analysis & integration challenges in genomics
Data analysis & integration challenges in genomics
mikaelhuss10.2K views
Ondex: Data integration and visualisation by Biogeeks
Ondex: Data integration and visualisationOndex: Data integration and visualisation
Ondex: Data integration and visualisation
Biogeeks1.5K views
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery by Michel Dumontier
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
Michel Dumontier716 views
An integrated dataset for in silico drug discovery by Simon Cockell
An integrated dataset for in silico drug discoveryAn integrated dataset for in silico drug discovery
An integrated dataset for in silico drug discovery
Simon Cockell1.1K views
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For... by dkNET
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET277 views
Claudia medina: Linking Health Records for Population Health Research in Brazil. by Flávio Codeço Coelho
Claudia medina: Linking Health Records for Population Health Research in Brazil.Claudia medina: Linking Health Records for Population Health Research in Brazil.
Claudia medina: Linking Health Records for Population Health Research in Brazil.
Opening up pharmacological space, the OPEN PHACTs api by Chris Evelo
Opening up pharmacological space, the OPEN PHACTs apiOpening up pharmacological space, the OPEN PHACTs api
Opening up pharmacological space, the OPEN PHACTs api
Chris Evelo687 views
Molecular scaffolds are special and useful guides to discovery by Jeremy Yang
Molecular scaffolds are special and useful guides to discoveryMolecular scaffolds are special and useful guides to discovery
Molecular scaffolds are special and useful guides to discovery
Jeremy Yang6K views
Behavior ontology workshop princeton by Cyndy Parr
Behavior ontology workshop princetonBehavior ontology workshop princeton
Behavior ontology workshop princeton
Cyndy Parr518 views
Pistoia Alliance-Elsevier Datathon by Pistoia Alliance
Pistoia Alliance-Elsevier DatathonPistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier Datathon
Pistoia Alliance3.1K views
is there life between standards? Data interoperability for AI. by Chris Evelo
is there life between standards? Data interoperability for AI.is there life between standards? Data interoperability for AI.
is there life between standards? Data interoperability for AI.
Chris Evelo77 views

Viewers also liked

Computational Analysis in an extended model of E. Coli by
Computational Analysis in an extended model of E. ColiComputational Analysis in an extended model of E. Coli
Computational Analysis in an extended model of E. ColiSteven Stadler
390 views41 slides
Computational Analysis NCP ICM - Copy by
Computational Analysis NCP ICM - CopyComputational Analysis NCP ICM - Copy
Computational Analysis NCP ICM - CopyVernon D Dutch Jr
198 views20 slides
Computational Biology - Signaling networks and drug repositioning by
Computational Biology - Signaling networks and drug repositioningComputational Biology - Signaling networks and drug repositioning
Computational Biology - Signaling networks and drug repositioningLars Juhl Jensen
1.4K views127 slides
A Computational Analysis of Agenda Setting Theory by
A Computational Analysis of Agenda Setting TheoryA Computational Analysis of Agenda Setting Theory
A Computational Analysis of Agenda Setting TheoryAlice Oh
2.6K views27 slides
Computational Analysis Of A Thin Plate by
Computational Analysis Of A Thin PlateComputational Analysis Of A Thin Plate
Computational Analysis Of A Thin PlateDavid Parker
1.9K views44 slides
Computational Drug Design by
Computational Drug DesignComputational Drug Design
Computational Drug Designbaoilleach
12.7K views20 slides

Viewers also liked(7)

Computational Analysis in an extended model of E. Coli by Steven Stadler
Computational Analysis in an extended model of E. ColiComputational Analysis in an extended model of E. Coli
Computational Analysis in an extended model of E. Coli
Steven Stadler390 views
Computational Biology - Signaling networks and drug repositioning by Lars Juhl Jensen
Computational Biology - Signaling networks and drug repositioningComputational Biology - Signaling networks and drug repositioning
Computational Biology - Signaling networks and drug repositioning
Lars Juhl Jensen1.4K views
A Computational Analysis of Agenda Setting Theory by Alice Oh
A Computational Analysis of Agenda Setting TheoryA Computational Analysis of Agenda Setting Theory
A Computational Analysis of Agenda Setting Theory
Alice Oh2.6K views
Computational Analysis Of A Thin Plate by David Parker
Computational Analysis Of A Thin PlateComputational Analysis Of A Thin Plate
Computational Analysis Of A Thin Plate
David Parker1.9K views
Computational Drug Design by baoilleach
Computational Drug DesignComputational Drug Design
Computational Drug Design
baoilleach12.7K views
COMPUTATIONAL ANALYSIS OF STEPPED AND STRAIGHT MICROCHANNEL HEAT SINK by IAEME Publication
COMPUTATIONAL ANALYSIS OF STEPPED AND STRAIGHT MICROCHANNEL HEAT SINK COMPUTATIONAL ANALYSIS OF STEPPED AND STRAIGHT MICROCHANNEL HEAT SINK
COMPUTATIONAL ANALYSIS OF STEPPED AND STRAIGHT MICROCHANNEL HEAT SINK
IAEME Publication216 views

Similar to Approaches for the Integration of Visual and Computational Analysis of Biomedical Data

Session III Census and registers - M. Scannapieco,The Italian Integrated Syst... by
Session III Census and registers - M. Scannapieco,The Italian Integrated Syst...Session III Census and registers - M. Scannapieco,The Italian Integrated Syst...
Session III Census and registers - M. Scannapieco,The Italian Integrated Syst...Istituto nazionale di statistica
114 views14 slides
Bioinformatics databases: Current Trends and Future Perspectives by
Bioinformatics databases: Current Trends and Future PerspectivesBioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future PerspectivesUniversity of Malaya
1.5K views37 slides
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks by
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksCarole Goble
1.3K views83 slides
The need for a transparent data supply chain by
The need for a transparent data supply chainThe need for a transparent data supply chain
The need for a transparent data supply chainPaul Groth
2.8K views20 slides
Computation and Knowledge by
Computation and KnowledgeComputation and Knowledge
Computation and KnowledgeIan Foster
2.3K views50 slides
Data retriveal ,srg and dbget by
Data retriveal ,srg and dbgetData retriveal ,srg and dbget
Data retriveal ,srg and dbgetSurendraKumar338
437 views35 slides

Similar to Approaches for the Integration of Visual and Computational Analysis of Biomedical Data(20)

Bioinformatics databases: Current Trends and Future Perspectives by University of Malaya
Bioinformatics databases: Current Trends and Future PerspectivesBioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future Perspectives
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks by Carole Goble
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Carole Goble1.3K views
The need for a transparent data supply chain by Paul Groth
The need for a transparent data supply chainThe need for a transparent data supply chain
The need for a transparent data supply chain
Paul Groth2.8K views
Computation and Knowledge by Ian Foster
Computation and KnowledgeComputation and Knowledge
Computation and Knowledge
Ian Foster2.3K views
2 Discovery and Acquisition of Data1.pptx by vijayapraba1
2 Discovery and Acquisition of Data1.pptx2 Discovery and Acquisition of Data1.pptx
2 Discovery and Acquisition of Data1.pptx
vijayapraba15 views
The Ondex Data Integration Framework by bosc
The Ondex Data Integration FrameworkThe Ondex Data Integration Framework
The Ondex Data Integration Framework
bosc4.9K views
The eCrystals Federation by ManjulaPatel
The eCrystals FederationThe eCrystals Federation
The eCrystals Federation
ManjulaPatel1.4K views
Nucl. Acids Res.-2014-Howe-nar-gku1244 by Yasel Cruz
Nucl. Acids Res.-2014-Howe-nar-gku1244Nucl. Acids Res.-2014-Howe-nar-gku1244
Nucl. Acids Res.-2014-Howe-nar-gku1244
Yasel Cruz220 views
ABSTAT: Ontology-driven Linked Data Summaries with Pattern Minimalization by Blerina Spahiu
ABSTAT: Ontology-driven Linked Data Summaries with Pattern MinimalizationABSTAT: Ontology-driven Linked Data Summaries with Pattern Minimalization
ABSTAT: Ontology-driven Linked Data Summaries with Pattern Minimalization
Blerina Spahiu540 views
Role of bioinformatics in life sciences research by Anshika Bansal
Role of bioinformatics in life sciences researchRole of bioinformatics in life sciences research
Role of bioinformatics in life sciences research
Anshika Bansal4K views
PMR metabolomics and transcriptomics database and its RESTful web APIs: A dat... by Araport
PMR metabolomics and transcriptomics database and its RESTful web APIs: A dat...PMR metabolomics and transcriptomics database and its RESTful web APIs: A dat...
PMR metabolomics and transcriptomics database and its RESTful web APIs: A dat...
Araport925 views

More from Nils Gehlenborg

HiGlass & Friends by
HiGlass & FriendsHiGlass & Friends
HiGlass & FriendsNils Gehlenborg
269 views20 slides
Power to the People: Data Visualization in Biology and Medicine by
Power to the People: Data Visualization in Biology and MedicinePower to the People: Data Visualization in Biology and Medicine
Power to the People: Data Visualization in Biology and MedicineNils Gehlenborg
303 views38 slides
Cancer Genomics Visualization across Scales: Nucleotides to Cohorts by
Cancer Genomics Visualization across Scales: Nucleotides to CohortsCancer Genomics Visualization across Scales: Nucleotides to Cohorts
Cancer Genomics Visualization across Scales: Nucleotides to CohortsNils Gehlenborg
402 views159 slides
A Unified Approach to Exploration, Authoring, and Communication with Reproduc... by
A Unified Approach to Exploration, Authoring, and Communication with Reproduc...A Unified Approach to Exploration, Authoring, and Communication with Reproduc...
A Unified Approach to Exploration, Authoring, and Communication with Reproduc...Nils Gehlenborg
238 views64 slides
EMBL John Kendrew Award Lecture 2018 by
EMBL John Kendrew Award Lecture 2018EMBL John Kendrew Award Lecture 2018
EMBL John Kendrew Award Lecture 2018Nils Gehlenborg
282 views112 slides
Mining Gems from the Data Visualization Literature by
Mining Gems from the Data Visualization LiteratureMining Gems from the Data Visualization Literature
Mining Gems from the Data Visualization LiteratureNils Gehlenborg
786 views58 slides

More from Nils Gehlenborg(20)

Power to the People: Data Visualization in Biology and Medicine by Nils Gehlenborg
Power to the People: Data Visualization in Biology and MedicinePower to the People: Data Visualization in Biology and Medicine
Power to the People: Data Visualization in Biology and Medicine
Nils Gehlenborg303 views
Cancer Genomics Visualization across Scales: Nucleotides to Cohorts by Nils Gehlenborg
Cancer Genomics Visualization across Scales: Nucleotides to CohortsCancer Genomics Visualization across Scales: Nucleotides to Cohorts
Cancer Genomics Visualization across Scales: Nucleotides to Cohorts
Nils Gehlenborg402 views
A Unified Approach to Exploration, Authoring, and Communication with Reproduc... by Nils Gehlenborg
A Unified Approach to Exploration, Authoring, and Communication with Reproduc...A Unified Approach to Exploration, Authoring, and Communication with Reproduc...
A Unified Approach to Exploration, Authoring, and Communication with Reproduc...
Nils Gehlenborg238 views
EMBL John Kendrew Award Lecture 2018 by Nils Gehlenborg
EMBL John Kendrew Award Lecture 2018EMBL John Kendrew Award Lecture 2018
EMBL John Kendrew Award Lecture 2018
Nils Gehlenborg282 views
Mining Gems from the Data Visualization Literature by Nils Gehlenborg
Mining Gems from the Data Visualization LiteratureMining Gems from the Data Visualization Literature
Mining Gems from the Data Visualization Literature
Nils Gehlenborg786 views
Patients, Genomes, Time: Visualizing Disease Cohorts by Nils Gehlenborg
Patients, Genomes, Time: Visualizing Disease CohortsPatients, Genomes, Time: Visualizing Disease Cohorts
Patients, Genomes, Time: Visualizing Disease Cohorts
Nils Gehlenborg303 views
Data Visualization in Biomedical Sciences: More than Meets the Eye by Nils Gehlenborg
Data Visualization in Biomedical Sciences: More than Meets the EyeData Visualization in Biomedical Sciences: More than Meets the Eye
Data Visualization in Biomedical Sciences: More than Meets the Eye
Nils Gehlenborg488 views
Visualizing Patient Cohorts: Integrating Data Types, Relationships, and Time by Nils Gehlenborg
Visualizing Patient Cohorts: Integrating Data Types, Relationships, and TimeVisualizing Patient Cohorts: Integrating Data Types, Relationships, and Time
Visualizing Patient Cohorts: Integrating Data Types, Relationships, and Time
Nils Gehlenborg548 views
Visualization of 3D Genome Data by Nils Gehlenborg
Visualization of 3D Genome DataVisualization of 3D Genome Data
Visualization of 3D Genome Data
Nils Gehlenborg1.2K views
HiGlass + HiPiler: Making Sense of Chromosome Interaction Data with Multi-Sca... by Nils Gehlenborg
HiGlass + HiPiler: Making Sense of Chromosome Interaction Data with Multi-Sca...HiGlass + HiPiler: Making Sense of Chromosome Interaction Data with Multi-Sca...
HiGlass + HiPiler: Making Sense of Chromosome Interaction Data with Multi-Sca...
Nils Gehlenborg647 views
Relaxation Techniques for the Upset Data Scientist by Nils Gehlenborg
Relaxation Techniques for the Upset Data ScientistRelaxation Techniques for the Upset Data Scientist
Relaxation Techniques for the Upset Data Scientist
Nils Gehlenborg776 views
Multi-Scale Visualization Tools for Exploration of Chromosome Interaction ... by Nils Gehlenborg
Multi-Scale  Visualization Tools for  Exploration of  Chromosome Interaction ...Multi-Scale  Visualization Tools for  Exploration of  Chromosome Interaction ...
Multi-Scale Visualization Tools for Exploration of Chromosome Interaction ...
Nils Gehlenborg438 views
SMC-RNA BioVis Data Visualization DREAM Challenge Preview by Nils Gehlenborg
SMC-RNA BioVis Data Visualization DREAM Challenge PreviewSMC-RNA BioVis Data Visualization DREAM Challenge Preview
SMC-RNA BioVis Data Visualization DREAM Challenge Preview
Nils Gehlenborg577 views
Tracing the Origins of Data and Ideas - Provenance Visualization for Biomedic... by Nils Gehlenborg
Tracing the Origins of Data and Ideas - Provenance Visualization for Biomedic...Tracing the Origins of Data and Ideas - Provenance Visualization for Biomedic...
Tracing the Origins of Data and Ideas - Provenance Visualization for Biomedic...
Nils Gehlenborg321 views
Visualization Tools for the Refinery Platform - Supporting reproducible resea... by Nils Gehlenborg
Visualization Tools for the Refinery Platform - Supporting reproducible resea...Visualization Tools for the Refinery Platform - Supporting reproducible resea...
Visualization Tools for the Refinery Platform - Supporting reproducible resea...
Nils Gehlenborg1K views
Visualization Approaches for Biomedical Omics Data: Putting It All Together by Nils Gehlenborg
Visualization Approaches for Biomedical Omics Data: Putting It All TogetherVisualization Approaches for Biomedical Omics Data: Putting It All Together
Visualization Approaches for Biomedical Omics Data: Putting It All Together
Nils Gehlenborg1.2K views
Biological Visualization Community Meetup 2014 by Nils Gehlenborg
Biological Visualization Community Meetup 2014Biological Visualization Community Meetup 2014
Biological Visualization Community Meetup 2014
Nils Gehlenborg513 views

Recently uploaded

A giant thin stellar stream in the Coma Galaxy Cluster by
A giant thin stellar stream in the Coma Galaxy ClusterA giant thin stellar stream in the Coma Galaxy Cluster
A giant thin stellar stream in the Coma Galaxy ClusterSérgio Sacani
19 views14 slides
Vegetable grafting: A new crop improvement approach.pptx by
Vegetable grafting: A new crop improvement approach.pptxVegetable grafting: A new crop improvement approach.pptx
Vegetable grafting: A new crop improvement approach.pptxHimul Suthar
8 views69 slides
Applications of Large Language Models in Materials Discovery and Design by
Applications of Large Language Models in Materials Discovery and DesignApplications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and DesignAnubhav Jain
14 views17 slides
2. Natural Sciences and Technology Author Siyavula.pdf by
2. Natural Sciences and Technology Author Siyavula.pdf2. Natural Sciences and Technology Author Siyavula.pdf
2. Natural Sciences and Technology Author Siyavula.pdfssuser821efa
11 views232 slides
Oral_Presentation_by_Fatma (2).pdf by
Oral_Presentation_by_Fatma (2).pdfOral_Presentation_by_Fatma (2).pdf
Oral_Presentation_by_Fatma (2).pdffatmaalmrzqi
8 views7 slides
Presentation on experimental laboratory animal- Hamster by
Presentation on experimental laboratory animal- HamsterPresentation on experimental laboratory animal- Hamster
Presentation on experimental laboratory animal- HamsterKanika13641
6 views8 slides

Recently uploaded(20)

A giant thin stellar stream in the Coma Galaxy Cluster by Sérgio Sacani
A giant thin stellar stream in the Coma Galaxy ClusterA giant thin stellar stream in the Coma Galaxy Cluster
A giant thin stellar stream in the Coma Galaxy Cluster
Sérgio Sacani19 views
Vegetable grafting: A new crop improvement approach.pptx by Himul Suthar
Vegetable grafting: A new crop improvement approach.pptxVegetable grafting: A new crop improvement approach.pptx
Vegetable grafting: A new crop improvement approach.pptx
Himul Suthar8 views
Applications of Large Language Models in Materials Discovery and Design by Anubhav Jain
Applications of Large Language Models in Materials Discovery and DesignApplications of Large Language Models in Materials Discovery and Design
Applications of Large Language Models in Materials Discovery and Design
Anubhav Jain14 views
2. Natural Sciences and Technology Author Siyavula.pdf by ssuser821efa
2. Natural Sciences and Technology Author Siyavula.pdf2. Natural Sciences and Technology Author Siyavula.pdf
2. Natural Sciences and Technology Author Siyavula.pdf
ssuser821efa11 views
Oral_Presentation_by_Fatma (2).pdf by fatmaalmrzqi
Oral_Presentation_by_Fatma (2).pdfOral_Presentation_by_Fatma (2).pdf
Oral_Presentation_by_Fatma (2).pdf
fatmaalmrzqi8 views
Presentation on experimental laboratory animal- Hamster by Kanika13641

Approaches for the Integration of Visual and Computational Analysis of Biomedical Data

  • 1. Approaches for the Integration of Visual and Computational Analysis of Biomedical Data HARVARD MEDICAL SCHOOL DEPARTMENT OF BIOMEDICAL INFORMATICS NILS GEHLENBORG @nils_gehlenborg http://gehlenborglab.org
  • 3. BIG PILES OF DATA …
  • 6. SINGLE OR FEW DATA SETS Test hypotheses without generating new data. Use published data as supporting evidence for findings based on your own data sets. MANY DATA SETS Conduct meta-analyses, e.g. to characterize expression patterns in human tissues or to link diseases.
  • 7. M. Lukk, et al., Nature Biotechnology, 28(4):322–324 (2010)
  • 8. S. Suthram et al., PLoS Computational Biology 6(2) (2010)
  • 9. SINGLE OR FEW DATA SETS Test hypotheses without generating new data. Use published data as supporting evidence for findings based on your own data sets. MANY DATA SETS Conduct meta-analyses, e.g. to characterize expression patterns in human tissues or to link diseases. COMMON BEHAVIOR OF RESEARCH PARASITES!
  • 10. N Gehlenborg et al. , manuscript in preparation | DATA REPOSITORY VISUALIZATION TOOLS ANALYSIS PIPELINES
  • 12. ANALYSIS PIPELINES N Gehlenborg et al. , manuscript in preparation | DATA REPOSITORY VISUALIZATION TOOLS ANALYSIS PIPELINES
  • 13. ANALYSIS PIPELINES N Gehlenborg et al. , manuscript in preparation | DATA REPOSITORY VISUALIZATION TOOLS ANALYSIS PIPELINES GALAXY Toolshed Workflow Editor Tools REST API
  • 14. ANALYSIS PIPELINES N Gehlenborg et al. , manuscript in preparation | DATA REPOSITORY VISUALIZATION TOOLS ANALYSIS PIPELINES GALAXY Toolshed Workflow Editor Tools REST API Workflow Inputs Workflow Outputs
  • 15. N Gehlenborg et al. , manuscript in preparation | DATA REPOSITORY VISUALIZATION TOOLS ANALYSIS PIPELINES http://www.refinery-platform.org
  • 16. … BUT NOT SO FAST!
  • 17. [Diagram labels: Data Sets, Metadata, Data Files, Free Text, Annotation, Mapping, Keywords]
  • 18-20, 25. [Same diagram, with added labels: Text-Based Search, Ontologies, Root, Terminal, subclassof]
  • 26. [Same diagram, with added labels: Semantic Visual Exploration, SATORI]
  • 27. SATORI: A System for Ontology-Guided Visual Exploration of Biomedical Data Repositories http://satori.refinery-platform.org
  • 28. D R C Data Analyst Group Leader Data Curator
  • 32. Need 1: find data sets that match certain experimental characteristics.
    Need 2: find data sets that are similar (or dissimilar) to given data sets.
    Need 3: get an overview of the distribution of the experimental characteristics across a collection of data sets.
    Need 4: get an overview of the annotation term hierarchy and term usage.
  • 33. Peter Pirolli and Stu Card
  • 34. SATORI: A System for Ontology-Guided Visual Exploration of Biomedical Data Repositories http://satori.refinery-platform.org
  • 35. [Figure labels: Data sets, Annotations, Term, List, Graph, Tree, Tree map; Scenario 1, Scenario 2, Scenario 3]
  • 39. The Art Institute of Chicago
  • 40. Acknowledgements
    HARVARD MEDICAL SCHOOL: Peter J Park & all members of the Computational Genomics Lab; Fritz Lekschas, Jennifer K Marx, Scott Ouellette, Anton Xue, Psalm Haseley
    JOHANNES KEPLER UNIVERSITY LINZ: Stefan Luger, Holger Stitz, Marc Streit
    HARVARD SCHOOL OF PUBLIC HEALTH: Ilya Sytchev, Shannan Ho Sui
    UNIVERSITY OF SHEFFIELD: David R Jones, Winston Hide
    Funding: NIH/NHGRI R00 HG007583, Harvard Stem Cell Institute
    Web: http://satori.refinery-platform.org · http://refinery-platform.org
  • 41. We are hiring postdocs & developers! HARVARD MEDICAL SCHOOL DEPARTMENT OF BIOMEDICAL INFORMATICS. See http://gehlenborglab.org or http://dbmi.med.harvard.edu for details. Data visualization, analysis, and management for:
      • genomic structural variants
      • dynamics of the 3D genome
      • cancer subtypes in patient cohorts
      • exploration tools for data repositories
      • provenance graphs
  • 42. [Figure labels: Term, Terminal term, To be deleted, To be duplicated, Term size, Cumulative size; panels: 1. Global, 2. Tree Map, 3. Node-Link Diagram]