Networks of proteins and diseases

Lars Juhl Jensen
three parts
protein networks
localization and diseases
disease networks
protein networks
STRING
Szklarczyk, Franceschini et al., Nucleic Acids Research, 2011
computational predictions
gene fusion
Korbel et al., Nature Biotechnology, 2004
experimental data
Jensen & Bork, Science, 2008
curated knowledge
Letunic & Bork, Trends in Biochemical Sciences, 2008
many databases
different formats
different identifiers
variable quality
not comparable
hard work
quality scores
von Mering et al., Nucleic Acids Research, 2005
calibrate vs. gold standard
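
The calibration step named on this slide can be sketched roughly as follows: rank the scored protein pairs, cut the ranking into bins, and measure in each bin the fraction of pairs that the gold standard confirms. That fraction can then stand in as a comparable quality score across evidence types. The function name `calibrate`, the fixed bin size, and the toy pairs are assumptions made for illustration; the slides do not spell out the actual procedure.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Pair = FrozenSet[str]


def pair(a: str, b: str) -> Pair:
    """Unordered protein pair."""
    return frozenset((a, b))


def calibrate(raw_scores: Dict[Pair, float],
              gold_positive: Set[Pair],
              gold_negative: Set[Pair],
              bin_size: int = 2) -> List[Tuple[float, float]]:
    """Return (lowest raw score in bin, fraction of gold positives) per bin."""
    # Only pairs covered by the gold standard can be used for calibration.
    labelled = sorted(
        ((score, p in gold_positive)
         for p, score in raw_scores.items()
         if p in gold_positive or p in gold_negative),
        key=lambda item: item[0],
        reverse=True,
    )
    curve = []
    for start in range(0, len(labelled), bin_size):
        chunk = labelled[start:start + bin_size]
        precision = sum(is_true for _, is_true in chunk) / len(chunk)
        curve.append((chunk[-1][0], precision))
    return curve


# Toy data (hypothetical): raw scores from one evidence channel.
raw = {
    pair("A", "B"): 0.9,
    pair("A", "C"): 0.8,
    pair("E", "F"): 0.5,  # not in the gold standard, so ignored
    pair("B", "D"): 0.3,
    pair("C", "D"): 0.2,
}
gold_pos = {pair("A", "B"), pair("A", "C"), pair("B", "D")}
gold_neg = {pair("C", "D")}

curve = calibrate(raw, gold_pos, gold_neg)
print(curve)
```

In practice one would fit a smooth curve through these points so that any raw score maps to an estimated probability; the binning here only shows the principle.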
missing most of the data
>10 km
too much to read
computer
as smart as a dog
teach it specific tricks
named entity recognition
comprehensive lexicon
cyclin dependent kinase 1
CDK1
CDC2
orthographic variation
cyclin-dependent kinase 1
hCdc2
“black list”
SDS
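
The named entity recognition slides above (lexicon, orthographic variation, black list) can be illustrated with a small dictionary-based tagger. This is a minimal sketch under stated assumptions, not the tagger used in the talk: the helper names `normalize`, `build_lexicon`, and `tag`, the rule that accepts an "h" species prefix, and the identifier `"CDK1"` are all chosen here for illustration.

```python
import re
from typing import Dict, Iterable, List, Tuple


def normalize(name: str) -> str:
    """Collapse case, hyphens, and whitespace so spelling variants compare equal."""
    return re.sub(r"[\s\-]+", "", name.lower())


def build_lexicon(names_by_id: Dict[str, List[str]],
                  black_list: Iterable[str]) -> Dict[str, str]:
    """Map normalized names to identifiers, skipping black-listed names."""
    blocked = {normalize(name) for name in black_list}
    lexicon: Dict[str, str] = {}
    for entity_id, names in names_by_id.items():
        for name in names:
            key = normalize(name)
            if key in blocked:
                continue
            lexicon.setdefault(key, entity_id)
            # Assumed rule: also accept an "h" (human) prefix, as in hCdc2.
            lexicon.setdefault("h" + key, entity_id)
    return lexicon


def tag(text: str, lexicon: Dict[str, str], max_words: int = 4) -> List[Tuple[str, str]]:
    """Return (matched text, identifier) pairs, preferring the longest match."""
    tokens = [(m.start(), m.end()) for m in re.finditer(r"[A-Za-z0-9]+", text)]
    hits = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            start, end = tokens[i][0], tokens[i + n - 1][1]
            key = normalize(text[start:end])
            if key in lexicon:
                hits.append((text[start:end], lexicon[key]))
                i += n
                break
        else:
            i += 1  # no name starts at this token
    return hits


names_by_id = {
    "CDK1": ["cyclin dependent kinase 1", "CDK1", "CDC2"],
    "SDS": ["SDS"],  # gene symbol that collides with a common lab abbreviation
}
lexicon = build_lexicon(names_by_id, black_list=["SDS"])
hits = tag("hCdc2 (cyclin-dependent kinase 1) was run on an SDS gel.", lexicon)
print(hits)
```

Normalizing both the lexicon and the text is what lets "cyclin-dependent kinase 1" match the lexicon entry "cyclin dependent kinase 1", while the black list keeps "SDS" from being tagged at all.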
information extraction
co-mentioning
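
Co-mentioning can be sketched as counting how often two tagged entities appear in the same document and comparing that count with what independent mentions would give. The observed/expected ratio below is one simple choice made for this example; the slides do not state the scoring formula, and the function name `comention_scores` and the toy documents are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def comention_scores(docs: List[List[str]]) -> Dict[Tuple[str, str], float]:
    """Observed/expected co-mention ratio for every pair seen together."""
    n_docs = len(docs)
    single: Counter = Counter()
    paired: Counter = Counter()
    for entities in docs:
        unique = sorted(set(entities))  # count each entity once per document
        single.update(unique)
        paired.update(combinations(unique, 2))
    return {
        (a, b): count * n_docs / (single[a] * single[b])
        for (a, b), count in paired.items()
    }


# Each inner list holds the entities tagged in one abstract (toy data).
docs = [
    ["CDK1", "CCNB1"],
    ["CDK1", "CCNB1", "TP53"],
    ["TP53"],
    ["CDK1"],
]
scores = comention_scores(docs)
for pair_key, score in sorted(scores.items()):
    print(pair_key, round(score, 3))
```

A ratio above 1 means the pair is mentioned together more often than chance would suggest; such raw scores would still need calibration before they can be compared with other evidence types.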
localization and disease
proteins
compartments
tissues
diseases
suite of web resources
jensenlab.org
text mining
curated knowledge
experimental data
computational predictions
quality scores
visualization
compartments.jensenlab.org
tissues.jensenlab.org
project onto networks
Szklarczyk, Franceschini et al., Nucleic Acids Research, 2011
compartments.jensenlab.org
tissues.jensenlab.org
diseases.jensenlab.org
disease networks
medical data
electronic health records
Jensen et al., Nature Reviews Genetics, 2012
structured data
Jensen et al., Nature Reviews Genetics, 2012
unstructured data
comorbidity
Jensen et al., Nature Reviews Genetics, 2012
in Danish
multiple testing
confounding factors
age and gender
reporting bias
comorbidity matrix
Roque et al., PLoS Computational Biology, 2011
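
A comorbidity matrix with a multiple-testing correction can be sketched as below. For every pair of diagnoses the example computes a relative risk (observed over expected co-occurrence) and a one-sided Fisher exact p-value, then applies a Bonferroni correction across all tested pairs. These particular statistics, the function names, and the toy patients are assumptions for illustration rather than the method of the cited study, and the sketch deliberately leaves out stratification by age and gender.

```python
from itertools import combinations
from math import comb
from typing import Dict, Set, Tuple


def fisher_upper_tail(k: int, n_a: int, n_b: int, n_total: int) -> float:
    """P(X >= k) under the hypergeometric null of independent diagnoses."""
    total = sum(
        comb(n_a, x) * comb(n_total - n_a, n_b - x)
        for x in range(k, min(n_a, n_b) + 1)
    )
    return total / comb(n_total, n_b)


def comorbidity_matrix(
    patients: Dict[str, Set[str]]
) -> Dict[Tuple[str, str], Tuple[float, float]]:
    """Map each diagnosis pair to (relative risk, Bonferroni-corrected p-value)."""
    n_total = len(patients)
    codes = sorted(set().union(*patients.values()))
    carriers = {c: {p for p, ds in patients.items() if c in ds} for c in codes}
    pairs = list(combinations(codes, 2))
    matrix = {}
    for a, b in pairs:
        k = len(carriers[a] & carriers[b])
        n_a, n_b = len(carriers[a]), len(carriers[b])
        relative_risk = k * n_total / (n_a * n_b)
        p_value = fisher_upper_tail(k, n_a, n_b, n_total)
        # Bonferroni: multiply by the number of pairs tested, cap at 1.
        matrix[(a, b)] = (relative_risk, min(1.0, p_value * len(pairs)))
    return matrix


# Toy records: patient -> set of diagnosis codes (hypothetical labels).
patients = {
    "p1": {"A", "B"},
    "p2": {"A", "B"},
    "p3": {"A", "B"},
    "p4": {"C"},
    "p5": {"C"},
    "p6": {"A", "C"},
}
matrix = comorbidity_matrix(patients)
for pair_key, (rr, p_adj) in matrix.items():
    print(pair_key, round(rr, 2), round(p_adj, 3))
```

With thousands of diagnosis codes the number of pairs grows quadratically, which is why the correction matters; a less conservative procedure such as false discovery rate control could be swapped in.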
comorbidity network
temporal correlation
Jensen et al., in preparation, 2012
disease trajectories
Jensen et al., in preparation, 2012
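
One simple way to look for temporal direction between two diagnoses is to count, across patients, how often the first record of one precedes the first record of the other. The sketch below assumes each patient has a date (here a day number) for the first occurrence of every diagnosis; it is an illustration only, since the slides do not describe the method of the cited work, and `direction_counts` and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Tuple


def direction_counts(first_seen: Dict[str, Dict[str, int]]) -> Counter:
    """Count patients in whom diagnosis a was first recorded before diagnosis b."""
    counts: Counter = Counter()
    for diagnoses in first_seen.values():
        for a, b in combinations(sorted(diagnoses), 2):
            if diagnoses[a] < diagnoses[b]:
                counts[(a, b)] += 1
            elif diagnoses[b] < diagnoses[a]:
                counts[(b, a)] += 1
            # Same-day diagnoses carry no direction and are skipped.
    return counts


# patient -> {diagnosis: day of first record} (toy data)
first_seen = {
    "p1": {"A": 1, "B": 5},
    "p2": {"A": 2, "B": 9},
    "p3": {"A": 4, "B": 3},
    "p4": {"A": 7, "B": 7},
}
counts = direction_counts(first_seen)
print(counts[("A", "B")], counts[("B", "A")])
```

Pairs with a strongly skewed direction could then be chained (A before B, B before C) into candidate trajectories, subject to the same multiple-testing and confounding caveats as the comorbidity analysis.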
molecular basis
protein networks
Acknowledgments
STRING: Christian von Mering, Damian Szklarczyk, Michael Kuhn, Manuel Stark, Samuel Chaffron, Chris Creevey, Jean Muller, Tobias Doerks, Philippe Julien, Alexander Roth, Milan Simonovic, Jan Korbel, Berend Snel, Martijn Huynen, Peer Bork
Text mining: Sune Frankild, Evangelos Pafilis, Janos Binder, Kalliopi Tsafou, Alberto Santos, Heiko Horn, Michael Kuhn, Nigel Brown, Reinhardt Schneider, Sean O’Donoghue
EHR mining: Anders Boeck Jensen, Peter Bjødstrup Jensen, Francisco S. Roque, Henriette Schmock, Marlene Dalgaard, Massimo Andreatta, Thomas Hansen, Karen Søeby, Søren Bredkjær, Anders Juul, Tudor Oprea, Pope Moseley, Thomas Werge, Søren Brunak
Thank you!
