Introduction to Data Integration in Bioinformatics
Yan Xu

Dec. 2013
Data Integration
[Figure: data types to be integrated: copy number, epigenome (methylation), miRNA, gene expression, clinical data, pathways]

Recent Publications
R. Louhimo, T. Lepikhova, O. Monni, and S. Hautaniemi, "Comparative analysis of algorithms for integration of copy number and expression data," Nature Methods, 2012.
The ENCODE Project Consortium, "An integrated encyclopedia of DNA elements in the human genome," Nature, 2012.
S. Aerts and J. Cools, "Cancer: Mutations close in on gene regulation," Nature, Jul. 2013.
V. J. H. Powell and A. Acharya, "Disease Prevention: Data Integration," Science, Dec. 2012.
A. Vinayagam, Y. Hu, M. Kulkarni, C. Roesel, R. Sopko, S. E. Mohr, and N. Perrimon, "Protein Complex–Based Analysis Framework for High-Throughput Data Sets," Science Signaling, Feb. 2013.

DNA: the molecule of life

Protein-coding DNA makes up barely 2% of the human genome. About 80% of the bases in the genome may be expressed without an identified function.

Gene Expression
DNA: two long biopolymer strands made of nucleotides, each carrying one of four nucleobases:
A: Adenine
T: Thymine
C: Cytosine
G: Guanine

[Figure: mRNA with cap, start codon, termination codon, and poly-A tail, translated into a sequence of amino acids]
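
The path from DNA to a sequence of amino acids can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `transcribe` and `translate`, the example DNA string, and the deliberately tiny codon table (the full genetic code has 64 codons) are all assumptions made for the example.

```python
# Small subset of the standard genetic code, enough for the toy example.
# "*" marks a termination (stop) codon.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the start codon
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "GCU": "A",  # alanine
    "AAA": "K",  # lysine
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def transcribe(coding_dna: str) -> str:
    """Turn a coding-strand DNA string into mRNA by replacing T with U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first start codon until a termination codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # "X" = not in toy table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical toy sequence: 2 leading bases, ATG ... TAA, 2 trailing bases.
mrna = transcribe("GGATGTTTGGCGCTAAATAACC")
protein = translate(mrna)
print(mrna, "->", protein)
```

The sketch ignores the cap, the poly-A tail, and splicing; it only shows how the start and termination codons delimit the translated region.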

Microarray

[Figure: microarray workflow, from reverse transcription to result]

Next generation RNA-sequencing
EST: Expressed Sequence Tag
[Figure (animation): reads of a single type of nucleotide at one moment; the number of nucleotide reads at one moment shown over time; reference: open reading frame]

DNA structural variation: Copy number
CNV (Copy Number Variation):
• Accounts for roughly 12% of human genomic DNA
• About 0.4% of the genomes of unrelated people differ with respect to copy number
• Ranges from 1,000 nucleotide bases to several megabases
• Inherited or caused by de novo mutation (not inherited from either parent)
Relation to disease:
Higher EGFR (epidermal growth factor receptor) copy number is found in non-small cell lung cancer (Cappuzzo et al., Journal of the National Cancer Institute, 2005).
Higher copy number of CCL3L1 decreases susceptibility to HIV (Gonzalez et al., Science, 2005).
Low copy number of FCGR3B increases susceptibility to inflammatory autoimmune disorders (Aitman et al., Nature, 2006).

Epigenome: DNA Methylation
Why do we look so different even when we have exactly identical genes?

[Figure: the epigenome gives the genome its "what, when and where" directions]

• Addition of a methyl group to the C (cytosine) or A (adenine) DNA nucleotides
• Typically permanent and unidirectional
• Can be copied across cell divisions or even passed on to offspring

miRNA (microRNA)
The genome has protein-coding genes, and it also has genes that code for small RNAs:
e.g., "transfer RNA", which is used in translation, is coded by genes
e.g., "ribosomal RNA", which forms part of the structure of the ribosome, is also coded by genes
miRNA: 21-22 nucleotide non-coding RNA

miRNA Pathway

• Perfect complementary binding leads to degradation of the target gene's mRNA
• Imperfect pairing inhibits translation of the mRNA to protein

RISC: RNA-induced silencing complex. It uses the miRNA as a template for recognizing complementary mRNA.
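
The two outcomes above (perfect pairing versus imperfect pairing) can be illustrated with a toy Python sketch. It is only a simplified model under stated assumptions: the miRNA and the target site are taken to have the same length, pairing is scored position by position, and the cutoff `MAX_MISMATCHES` is a hypothetical value chosen for the example. Real target prediction uses more than a mismatch count.

```python
# Watson-Crick pairing for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Hypothetical cutoff: above this many mismatches we assume no silencing.
MAX_MISMATCHES = 6


def reverse_complement(rna: str) -> str:
    """Return the antiparallel strand that would pair perfectly with `rna`."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_mismatches(mirna: str, target_site: str) -> int:
    """Count positions where the target site deviates from a perfect match."""
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have the same length")
    perfect_site = reverse_complement(mirna)
    return sum(a != b for a, b in zip(perfect_site, target_site))


def predicted_effect(mirna: str, target_site: str) -> str:
    """Map the pairing quality to the outcome described on the slide."""
    mismatches = count_mismatches(mirna, target_site)
    if mismatches == 0:
        return "mRNA degradation"
    if mismatches <= MAX_MISMATCHES:
        return "translation inhibition"
    return "no silencing"


mirna = "UGAGGUAGUAGGUUGUAUAGUU"  # toy 22-nucleotide sequence
perfect = reverse_complement(mirna)
# Flip four central bases to create an imperfectly paired site.
imperfect = perfect[:8] + "".join(COMPLEMENT[b] for b in perfect[8:12]) + perfect[12:]
# Flip every base to create a site with no pairing at all.
unrelated = "".join(COMPLEMENT[b] for b in perfect)

for site in (perfect, imperfect, unrelated):
    print(site, "->", predicted_effect(mirna, site))
```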

Clinical data
General clinical checkup data: temperature, blood pressure
Pathology: blood test, antibody test
Radiology: X-ray, CT (computed tomography), ultrasound, MRI (magnetic resonance imaging)

[Figure: imaging traits, each shown with a high-score and a low-score example: texture heterogeneity, internal arteries]

Challenges of data integration analysis
• Large, highly connected data sources and ontologies
• Heterogeneity: functions, structures, data access and analysis methods, dissemination formats
• Incomplete or overlapping data sources
• Frequent changes
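
The "incomplete or overlapping data sources" challenge can be made concrete with a small Python sketch. It assumes each data type is available as a mapping from sample ID to a single value; the function names `integrate` and `complete_cases`, the sample IDs, and all numbers are hypothetical and serve only as illustration.

```python
from typing import Dict, Optional

Table = Dict[str, float]                      # sample ID -> measurement
Merged = Dict[str, Dict[str, Optional[float]]]


def integrate(sources: Dict[str, Table]) -> Merged:
    """Outer-join the sources on sample ID; missing measurements become None."""
    sample_ids = sorted(set().union(*(table.keys() for table in sources.values())))
    return {
        sample_id: {name: table.get(sample_id) for name, table in sources.items()}
        for sample_id in sample_ids
    }


def complete_cases(merged: Merged) -> Merged:
    """Keep only the samples that were measured in every source."""
    return {
        sample_id: row
        for sample_id, row in merged.items()
        if all(value is not None for value in row.values())
    }


# Made-up data: the three sources overlap only partially.
sources = {
    "copy_number": {"S1": 2.0, "S2": 4.0, "S3": 1.0},
    "expression": {"S2": 8.5, "S3": 3.1, "S4": 5.0},
    "methylation": {"S1": 0.2, "S2": 0.7},
}

merged = integrate(sources)
complete = complete_cases(merged)
print(len(merged), "samples in total,", len(complete), "with all data types")
```

Keeping the outer join and the complete-case filter separate makes the loss of samples visible before any analysis method is chosen.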

Case I

E. Segal et al., "Decoding global gene expression programs in liver cancer by noninvasive imaging," Nature Biotechnology, May 2007.

E. Segal et al., "Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data," Nature Genetics, 2003.

Case II

O. Gevaert et al., "Non–Small Cell Lung Cancer: Identifying Prognostic Imaging Biomarkers
by Leveraging Public Gene Expression Microarray Data—Methods and Preliminary Results
," Radiology, Aug. 2012.

Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 


Introduction to data integration in bioinformatics

  • 1. Introduction to Data Integration in Bioinformatics Yan Xu Dec. 2013
  • 3. Recent Publications
      R. Louhimo, T. Lepikhova, O. Monni, and S. Hautaniemi, "Comparative analysis of algorithms for integration of copy number and expression data," Nature Methods, 2012.
      The ENCODE Project Consortium, "An integrated encyclopedia of DNA elements in the human genome," Nature, 2012.
      S. Aerts and J. Cools, "Cancer: Mutations close in on gene regulation," Nature, Jul. 2013.
      V. J. H. Powell and A. Acharya, "Disease Prevention: Data Integration," Science, Dec. 2012.
      A. Vinayagam, Y. Hu, M. Kulkarni, C. Roesel, R. Sopko, S. E. Mohr, and N. Perrimon, "Protein Complex–Based Analysis Framework for High-Throughput Data Sets," Science Signaling, Feb. 2013.
  • 4. DNA: the molecule of life. Protein-coding DNA makes up barely 2% of the human genome; about 80% of the bases in the genome may be expressed without an identified function.
  • 5. Gene Expression. DNA: two long biopolymers made of nucleotides, each carrying one of four nucleobases: A (adenine), T (thymine), C (cytosine), G (guanine). Figure labels: cap, start codon, termination codon, poly-A tail, sequence of amino acids.
  • 6. Microarray. Figure labels: Reverse Transcription; Result.
  • 7. Next-generation RNA sequencing. EST: Expressed Sequence Tag. Figure labels: reads of a single type of nucleotide at one moment (animation); the number of nucleotide reads at one moment; Reference: Open Reading Frame; axis: Time.
  • 8. DNA structural variation: Copy number
      CNV (Copy Number Variation):
      • 12% of human genomic DNA
      • 0.4% of the genomes of unrelated people differ with respect to copy number
      • Ranges from 1,000 nucleotide bases to several megabases
      • Inherited or caused by de novo mutation (not inherited from either parent)
      Relation to disease:
      • Higher EGFR (epidermal growth factor receptor) copy number is found in non-small cell lung cancer (Cappuzzo et al., Journal of the National Cancer Institute, 2005).
      • Higher copy number of CCL3L1 decreases susceptibility to HIV (Gonzalez et al., Nature, 2005).
      • Low copy number of FCGR3B increases susceptibility to inflammatory autoimmune disorders (Aitman et al., Nature, 2006).
  • 9. Epigenome: DNA Methylation. Why do we look so different even when we have exactly identical genes? Figure labels: Genome; Epigenome directions (what, when and where).
      • Addition of a methyl group to the cytosine (C) or adenine (A) nucleotides of DNA
      • Permanent and unidirectional
      • Can be copied across cell divisions or even passed on to offspring
  • 10. miRNA (microRNA). The genome has protein-coding genes and also genes that code for small RNAs: e.g., "transfer RNA", which is used in translation, is coded by genes; e.g., "ribosomal RNA", which forms part of the structure of the ribosome, is also coded by genes. miRNA: a 21-22 nucleotide non-coding RNA.
      miRNA pathway:
      • Perfect complementary binding leads to degradation of the target gene's mRNA
      • Imperfect pairing inhibits translation of the mRNA into protein
      RISC (RNA-induced silencing complex): uses the miRNA as a template for recognizing complementary mRNA.
  • 11. Clinical data. General clinical checkup data: temperature, blood pressure. Pathology: blood test, antibody test. Radiology: X-ray, CT (computed tomography), ultrasound, MRI (magnetic resonance imaging). Figure panels: Texture Heterogeneity (high score vs. low score); Internal Arteries (high score vs. low score).
  • 12. Challenges of data integration analysis
      • Large, highly connected data sources and ontologies
      • Heterogeneity: functions, structures, data access and analysis methods, dissemination formats
      • Incomplete or overlapping data sources
      • Frequent changes
  • 13. Case I
      E. Segal et al., "Decoding global gene expression programs in liver cancer by noninvasive imaging," Nature Biotechnology, May 2007.
      E. Segal et al., "Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data," Nature Genetics, 2003.
  • 14. Case II
      O. Gevaert et al., "Non–Small Cell Lung Cancer: Identifying Prognostic Imaging Biomarkers by Leveraging Public Gene Expression Microarray Data—Methods and Preliminary Results," Radiology, Aug. 2012.
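
The miRNA pathway rule from slide 10 (perfect complementarity leads to mRNA degradation, imperfect pairing inhibits translation) can be sketched in a few lines of Python. This is only a toy illustration, not the method of any cited paper: the function names, the example sequence, and the `min_fraction` cutoff are assumptions made here, and real target prediction considers much more than a simple count of paired bases.

```python
# Toy sketch of the slide-10 rule. All names and the 0.7 cutoff are
# hypothetical; only strict Watson-Crick pairs are counted (G:U wobble
# pairs are ignored for simplicity).

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_paired(mirna: str, site: str) -> int:
    """Count Watson-Crick pairs when the miRNA binds the site antiparallel.

    Both strings are written 5'->3', so the site is read backwards.
    """
    return sum(COMPLEMENT[m] == s for m, s in zip(mirna, reversed(site)))


def classify_site(mirna: str, site: str, min_fraction: float = 0.7) -> str:
    """Classify a candidate target site of the same length as the miRNA.

    min_fraction is an assumed, illustrative threshold for "imperfect
    pairing"; it does not come from the slides.
    """
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    paired = count_paired(mirna, site)
    if paired == len(mirna):
        return "mRNA degradation"          # perfect complementarity
    if paired / len(mirna) >= min_fraction:
        return "translational repression"  # imperfect pairing
    return "no binding"


if __name__ == "__main__":
    mirna = "UGAGGUAGUAGGUUGUAUAGUU"       # a 22-nucleotide example sequence
    perfect_site = reverse_complement(mirna)
    print(classify_site(mirna, perfect_site))
```

Counting only exact pairs keeps the sketch short; a stricter model would weight positions differently along the miRNA, which is outside what the slides describe.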

Editor's Notes

  1. Researchers are now learning that another level of information—the epigenome—controls gene expression in part by controlling access to DNA. The gene-reading machinery is blocked when methyl molecules bind to DNA or histones.