SlideShare a Scribd company logo
1 of 49
MSc GBE Course:
Genes: from sequence to function
Brief Introduction to
Systems Biology
Sven Bergmann
Department of Medical Genetics
University of Lausanne
Rue de Bugnon 27 - DGM 328
CH-1005 Lausanne
Switzerland
work: ++41-21-692-5452
cell: ++41-78-663-4980
http://serverdgm.unil.ch/bergmann
Course Overview
• Basics: What is Systems Biology?
• Standard analysis tools for large
datasets
• Advanced analysis tools
• Systems approach to “small”
networks
What is Systems Biology?
• To understand biology at the system level, we
must examine the structure and dynamics of
cellular and organismal function, rather than
the characteristics of isolated parts of a cell or
organism. Properties of systems, such as
robustness, emerge as central issues, and
understanding these properties may have an
impact on the future of medicine.
Hiroaki Kitano
What is Systems Biology?
• To me, systems biology seeks to explain
biological phenomenon not on a gene by gene
basis, but through the interaction of all the cellular
and biochemical components in a cell or an
organism. Since, biologists have always sought to
understand the mechanisms sustaining living
systems, solutions arising from systems biology
have always been the goal in biology. Previously,
however, we did not have the knowledge or the
tools.
Edison T Liu
Genome Institute of Singapore
What is Systems Biology?
• addresses the analysis of entire biological systems
• interdisciplinary approach to the investigation of all the
components and networks contributing to a biological
system
• [involves] new dynamic computer modeling programs
which ultimately might allow us to simulate entire
organisms based on their individual cellular components
• Strategy of Systems Biology is dependent on interactive
cycles of predictions and experimentation.
• Allow[s Biology] to move from the ranks of a descriptive
science to an exact science.
(Quotes from SystemsX.ch website)
What is Systems Biology?
 identify elements (genes, molecules, cells, …)
 ascertain their relationships (co-expressed, interacting, …)
 integrate information to obtain view of system as a whole
Large (genomic) systems
• many uncharacterized
elements
• relationships unknown
• computational analysis should:
 improve annotation
 reveal relations
 reduce complexity
Small systems
• elements well-known
• many relationships established
• quantitative modeling of
systems properties like:
 Dynamics
 Robustness
 Logics
What is Systems Biology?
Part 1: Basics
Motivation:
• What is a “systems biology approach”?
• Why to take such an approach?
• How can one study systems properties?
Practical Part:
• First look at a set of genomic expression data
• How to have a global look at such datasets?
• Distributions, mean-values, standard deviations, z-
scores
• T-tests and other statistical tests
• Correlations and similarity measures
• Simple Clustering
First look at a set of genomic
expression data
DNA microarray experiments monitor expression
levels of thousands of genes simultaneously:
• allows for studying the genome-wide transcriptional
response of a cell to interior and exterior changes
• provide us with a first step towards understanding
gene function and regulation on a global scale
test
control
Microarrays generate massive data
Log-ratios of expression values
Log ratios indicate differential expression!
)
log(
)
log(
)
log( control
test
control
test E
E
E
/
E
r 


0
+
-
control
test E
E  control
test E
E 
control
test E
~
E
Consolidate data from multiple chips into
one table and use color-coding
Many KOs
(conditions)
1 2 3 4 5
1000
2000
3000
4000
5000
6000
-4
-3
-2
-1
0
1
2
3
4
genes
conditions
log-
ratio
r
Knock Out (KO)
Rosetta data: The real world …
50 100 150 200 250 300
1000
2000
3000
4000
5000
6000 -6
-4
-2
0
2
4
6
genes
conditions
Most genes exhibit little differential expression!
r
-4 -2 0 2 4 6 8
0
500
1000
1500
2000
2500
Histogram shows distribution
ade1 deletion mutant exhibits small differential expression
in most genes!
#
)
log( control
test E
/
E
r 
158 160 162 164 166 168
1000
2000
3000
4000
5000
6000 -6
-4
-2
0
2
4
6
Rosetta data: Zooming in …
Only few genes exhibit large differential expression!
genes
conditions
-8 -6 -4 -2 0 2 4 6 8
0
100
200
300
400
500
600
700
Histogram shows distribution
ssn6 deletion mutant exhibits large differential expression
in many genes!
#
)
log( control
test E
/
E
r 
-8 -6 -4 -2 0 2 4 6 8
0
100
200
300
400
500
600
700
Quantification of distribution
Mean and Standard Deviation (Std) characterize distribution
#
)
log( control
test E
/
E
r 
Mean: μ= =x
Outliers
Std: =
-4 -2 0 2 4 6 8
0
500
1000
1500
2000
2500
-8 -6 -4 -2 0 2 4 6 8
0
100
200
300
400
500
600
700
Comparing distributions
Are the expression values of ade1 different from those of ssn6?
# #
)
log( control
test E
/
E
r  )
log( control
test E
/
E
r 
ade1 ssn1
μ = 0.2366
 = 1.9854
μ = -5.5 · 10-5
 = 0.1434 ?
Quantifying Significance
Student’s T-test
t-statistic: difference between means in units of average error
Significance can be translated into p-value (probability) assuming normal distributions
http://www.physics.csbsju.edu/stats/t-test.html
History: W. S. Gossett [1876-1937]
• The t-test was developed by W. S. Gossett, a statistician
employed at the Guinness brewery. However, because
the brewery did not allow employees to publish their
research, Gossett's work on the t-test appears under the
name "Student" (and the t-test is sometimes referred to
as "Student's t-test.") Gossett was a chemist and was
responsible for developing procedures for ensuring the
similarity of batches of Guinness. The t-test was
developed as a way of measuring how closely the yeast
content of a particular batch of beer corresponded to the
brewery's standard.
http://ccnmtl.columbia.edu/projects/qmss/t_about.html
Pearson correlations (Graphic)
Comparing profiles X and Y (not distributions!):
What is the tendency that high/low values in X match high/low values in Y?
http://davidmlane.com/hyperstat/A34739.html
Y
=
(y
i
)
X = (xi)
Each dot is
a pair (xi,yi)
Pearson correlations: Formulae
(simple version using z-scores)
(complicated version)
r
Similarity according to all conditions
(“Democratic vote”)
Clustering-
coefficient
conditions
1 2 3 4 5
1
2
r12 ~ -1
gene
1 2 3 4 5
1
2 r12 ~ 0
gene
1 2 3 4 5
1
2 r12 ~ 1
gene
Pearson correlations: Intuition
Pearson correlations: Caution!
High correlation does not necessarily mean co-linearity!
r=0.8 r=0.8
r=0.8 r=0.8
(Hierarchical Agglomerative) Clustering
Join most correlated samples and replace correlations
to remaining samples by average, then iterate …
http://gepas.bioinfo.cipf.es/cgi-bin/tutoX?c=clustering/clustering.config
Clustering of the real expression data
Further Reading
K-means Clustering
2. Assign each data point to
closest centroid
1. Start with random
positions of centroids ( )
“guess” k=3 (# of clusters)
http://en.wikipedia.org/wiki/K-means_algorithm
Hierachical Clustering
Plus:
• Shows (re-orderd) data
• Gives hierarchy
Minus:
• Does not work well for many genes
(usually apply cut-off on fold-change)
• Similarity over all genes/conditions
• Clusters do not overlap
Overview of “modular” analysis tools
• Cheng Y and Church GM. Biclustering of expression data.
(Proc Int Conf Intell Syst Mol Biol. 2000;8:93-103)
• Getz G, Levine E, Domany E. Coupled two-way clustering analysis of gene
microarray data. (Proc Natl Acad Sci U S A. 2000 Oct 24;97(22):12079-84)
• Tanay A, Sharan R, Kupiec M, Shamir R. Revealing modularity and organization
in the yeast molecular network by integrated analysis of highly heterogeneous
genomewide data. (Proc Natl Acad Sci U S A. 2004 Mar 2;101(9):2981-6)
• Sheng Q, Moreau Y, De Moor B. Biclustering microarray data by Gibbs sampling.
(Bioinformatics. 2003 Oct;19 Suppl 2:ii196-205)
• Gasch AP and Eisen MB. Exploring the conditional coregulation of yeast gene
expression through fuzzy k-means clustering.
(Genome Biol. 2002 Oct 10;3(11):RESEARCH0059)
• Hastie T, Tibshirani R, Eisen MB, Alizadeh A, Levy R, Staudt L, Chan WC, Botstein
D, Brown P. 'Gene shaving' as a method for identifying distinct sets of genes
with similar expression patterns. (Genome Biol. 2000;1(2):RESEARCH0003.)
… and many more! http://serverdgm.unil.ch/bergmann/Publications/review.pdf
How to “hear” the relevant genes?
Song A
Song B
Coupled two-way Clustering
Inside CTWC: Iterations
Depth Genes Samples
Init G1 S1
1 G1(S1) G2,G3,…G5 S1(G1) S2,S3
2 G1(S2)
G1(S3)
G6,G7,….G13
G14,…G21
S1(G2)
…
S1(G5)
S4,S5,S6
S10,S11
None
3 G2(S1)…G2(S3)
…
G5(S1)…G5(S3)
G22…
…
…G97
S2(G1)…S2(G5)
S3(G1)…S3(G5)
S12,…
…S51
4 G1(S4)
…
G1(S11)
G98,..G105
…
G151,..G160
S1(G6)
…
S1(G21)
S52,...
S67
5 G2(S4)...G2(S11)
…
G5(S4)...G5(S11)
G161…
…
…G216
S2(G6)...S2(G21)
S3(G6)…S3(G21)
S68…
…S113
Two-way clustering
• No need for correlations!
• decomposes data into “transcription modules”
• integrates external information
• allows for interspecies comparative analysis
One example in more detail:
The (Iterative) Signature Algorithm:
J Ihmels, G Friedlander, SB, O Sarig, Y Ziv & N Barkai Nature Genetics (2002)
Trip to the “Amazon”:
5 10 15 20 25 30 35 40 45 50
10
20
30
40
50
60
70
80
90
100
How to find related items?
items
customers
re-
commended
items
your
choice
customers
with
similar
choice
5 10 15 20 25 30 35 40 45 50
10
20
30
40
50
60
70
80
90
100
How to find related genes?
genes
conditions
similarly
expressed
genes
your
guess
relevant
conditions
J Ihmels, G Friedlander, SB, O Sarig, Y Ziv & N Barkai Nature Genetics (2002)
I
G
g
gc
G
c E
s


}
:
{ C
C
C
c
c
c
C t
s
s
C
c
S 



 
c
S
c
gc
C
c
g E
s
s


}
:
{ G
G
G
g
g
g
G t
s
s
G
g
S 





I
G
Signature Algorithm: Score definitions
initial guesses
(genes)
thresholding:
condition scores
How to find related genes? Scores and thresholds!
gene
scores
condition scores
thresholding:
How to find related genes? Scores and thresholds!
gene
scores
condition scores
thresholding:
How to find related genes? Scores and thresholds!
Iterative Signature Algorithm
INPUT OUTPUT
OUTPUT = INPUT
“Transcription Module”
SB, J Ihmels & N Barkai Physical Review E (2003)
Identification of transcription modules
using many random “seeds”
random
“seeds”
Transcription
modules
Independent
identification:
Modules may
overlap!
New Tools: Module Visualization
http://serverdgm.unil.ch/bergmann/Fibroblasts/visualiser.html
Gene enrichment analysis
The hypergeometric distribution f(M,A,K,T) gives the probability
that K out of A genes with a particular annotation match with a
module having M genes if there are T genes in total.
http://en.wikipedia.org/wiki/Hypergeometric_distribution
Decomposing expression data into
annotated transcriptional modules
identified >100
transcriptional
modules in
yeast:
high functional
consistency!
many functional
links “waiting” to
be verified
experimentally
J Ihmels, SB & N Barkai Bioinformatics 2005
Higher-order structure
correlated
anti-
correlated
C

More Related Content

Similar to Slides_SB3.ppt

Biological Significance of Gene Expression Data Using Similarity Based Biclus...
Biological Significance of Gene Expression Data Using Similarity Based Biclus...Biological Significance of Gene Expression Data Using Similarity Based Biclus...
Biological Significance of Gene Expression Data Using Similarity Based Biclus...CSCJournals
 
Introduction to biocomputing
 Introduction to biocomputing Introduction to biocomputing
Introduction to biocomputingNatalio Krasnogor
 
Personalized medicine
Personalized medicinePersonalized medicine
Personalized medicinecancerdrg
 
2 md2016 annotation
2 md2016 annotation2 md2016 annotation
2 md2016 annotationScott Dawson
 
Research in Progress April 2014
Research in Progress April 2014Research in Progress April 2014
Research in Progress April 2014Vanessa S
 
Basics of Data Analysis in Bioinformatics
Basics of Data Analysis in BioinformaticsBasics of Data Analysis in Bioinformatics
Basics of Data Analysis in BioinformaticsElena Sügis
 
A statistical physics approach to system biology
A statistical physics approach to system biologyA statistical physics approach to system biology
A statistical physics approach to system biologySamir Suweis
 
Interactomics, Integromics to Systems Biology: Next Animal Biotechnology Fron...
Interactomics, Integromics to Systems Biology: Next Animal Biotechnology Fron...Interactomics, Integromics to Systems Biology: Next Animal Biotechnology Fron...
Interactomics, Integromics to Systems Biology: Next Animal Biotechnology Fron...Varij Nayan
 
Proteomics - Analysis and integration of large-scale data sets
Proteomics - Analysis and integration of large-scale data setsProteomics - Analysis and integration of large-scale data sets
Proteomics - Analysis and integration of large-scale data setsLars Juhl Jensen
 
STRING/STITCH tutorial
STRING/STITCH tutorialSTRING/STITCH tutorial
STRING/STITCH tutorialbiocs
 
Basics of bioinformatics
Basics of bioinformaticsBasics of bioinformatics
Basics of bioinformaticsAbhishek Vatsa
 
Personalized medicine via molecular interrogation, data mining and systems bi...
Personalized medicine via molecular interrogation, data mining and systems bi...Personalized medicine via molecular interrogation, data mining and systems bi...
Personalized medicine via molecular interrogation, data mining and systems bi...Gerald Lushington
 
Tales from BioLand - Engineering Challenges in the World of Life Sciences
Tales from BioLand - Engineering Challenges in the World of Life SciencesTales from BioLand - Engineering Challenges in the World of Life Sciences
Tales from BioLand - Engineering Challenges in the World of Life SciencesStefano Di Carlo
 
2015 agbt sv_poster_justin
2015 agbt sv_poster_justin2015 agbt sv_poster_justin
2015 agbt sv_poster_justinGenomeInABottle
 
Correlation globes of the exposome 2016
Correlation globes of the exposome 2016Correlation globes of the exposome 2016
Correlation globes of the exposome 2016Chirag Patel
 
Network Biology: A paradigm for modeling biological complex systems
Network Biology: A paradigm for modeling biological complex systemsNetwork Biology: A paradigm for modeling biological complex systems
Network Biology: A paradigm for modeling biological complex systemsGanesh Bagler
 
Identification of Differentially Expressed Genes by unsupervised Learning Method
Identification of Differentially Expressed Genes by unsupervised Learning MethodIdentification of Differentially Expressed Genes by unsupervised Learning Method
Identification of Differentially Expressed Genes by unsupervised Learning Methodpraveena06
 

Similar to Slides_SB3.ppt (20)

Bioinformatica 08-12-2011-t8-go-hmm
Bioinformatica 08-12-2011-t8-go-hmmBioinformatica 08-12-2011-t8-go-hmm
Bioinformatica 08-12-2011-t8-go-hmm
 
Biological Significance of Gene Expression Data Using Similarity Based Biclus...
Biological Significance of Gene Expression Data Using Similarity Based Biclus...Biological Significance of Gene Expression Data Using Similarity Based Biclus...
Biological Significance of Gene Expression Data Using Similarity Based Biclus...
 
Introduction to biocomputing
 Introduction to biocomputing Introduction to biocomputing
Introduction to biocomputing
 
Personalized medicine
Personalized medicinePersonalized medicine
Personalized medicine
 
2 md2016 annotation
2 md2016 annotation2 md2016 annotation
2 md2016 annotation
 
ppt
pptppt
ppt
 
Research in Progress April 2014
Research in Progress April 2014Research in Progress April 2014
Research in Progress April 2014
 
Basics of Data Analysis in Bioinformatics
Basics of Data Analysis in BioinformaticsBasics of Data Analysis in Bioinformatics
Basics of Data Analysis in Bioinformatics
 
A statistical physics approach to system biology
A statistical physics approach to system biologyA statistical physics approach to system biology
A statistical physics approach to system biology
 
Gf o2014talk
Gf o2014talkGf o2014talk
Gf o2014talk
 
Interactomics, Integromics to Systems Biology: Next Animal Biotechnology Fron...
Interactomics, Integromics to Systems Biology: Next Animal Biotechnology Fron...Interactomics, Integromics to Systems Biology: Next Animal Biotechnology Fron...
Interactomics, Integromics to Systems Biology: Next Animal Biotechnology Fron...
 
Proteomics - Analysis and integration of large-scale data sets
Proteomics - Analysis and integration of large-scale data setsProteomics - Analysis and integration of large-scale data sets
Proteomics - Analysis and integration of large-scale data sets
 
STRING/STITCH tutorial
STRING/STITCH tutorialSTRING/STITCH tutorial
STRING/STITCH tutorial
 
Basics of bioinformatics
Basics of bioinformaticsBasics of bioinformatics
Basics of bioinformatics
 
Personalized medicine via molecular interrogation, data mining and systems bi...
Personalized medicine via molecular interrogation, data mining and systems bi...Personalized medicine via molecular interrogation, data mining and systems bi...
Personalized medicine via molecular interrogation, data mining and systems bi...
 
Tales from BioLand - Engineering Challenges in the World of Life Sciences
Tales from BioLand - Engineering Challenges in the World of Life SciencesTales from BioLand - Engineering Challenges in the World of Life Sciences
Tales from BioLand - Engineering Challenges in the World of Life Sciences
 
2015 agbt sv_poster_justin
2015 agbt sv_poster_justin2015 agbt sv_poster_justin
2015 agbt sv_poster_justin
 
Correlation globes of the exposome 2016
Correlation globes of the exposome 2016Correlation globes of the exposome 2016
Correlation globes of the exposome 2016
 
Network Biology: A paradigm for modeling biological complex systems
Network Biology: A paradigm for modeling biological complex systemsNetwork Biology: A paradigm for modeling biological complex systems
Network Biology: A paradigm for modeling biological complex systems
 
Identification of Differentially Expressed Genes by unsupervised Learning Method
Identification of Differentially Expressed Genes by unsupervised Learning MethodIdentification of Differentially Expressed Genes by unsupervised Learning Method
Identification of Differentially Expressed Genes by unsupervised Learning Method
 

More from AnandKumar459862

2.4 RANDOM NETWORKS - Copy.pptx
2.4 RANDOM NETWORKS - Copy.pptx2.4 RANDOM NETWORKS - Copy.pptx
2.4 RANDOM NETWORKS - Copy.pptxAnandKumar459862
 
McFaddenSymposium-04182016.pptx
McFaddenSymposium-04182016.pptxMcFaddenSymposium-04182016.pptx
McFaddenSymposium-04182016.pptxAnandKumar459862
 
PlantBreedingandGenetics.ppt
PlantBreedingandGenetics.pptPlantBreedingandGenetics.ppt
PlantBreedingandGenetics.pptAnandKumar459862
 
6_2019_10_12!09_16_04_AM.pptx
6_2019_10_12!09_16_04_AM.pptx6_2019_10_12!09_16_04_AM.pptx
6_2019_10_12!09_16_04_AM.pptxAnandKumar459862
 
year-5-lesson-4-animal-life-cycles.pptx
year-5-lesson-4-animal-life-cycles.pptxyear-5-lesson-4-animal-life-cycles.pptx
year-5-lesson-4-animal-life-cycles.pptxAnandKumar459862
 
powerpoint poresentation.ppt
powerpoint poresentation.pptpowerpoint poresentation.ppt
powerpoint poresentation.pptAnandKumar459862
 
lect_2_fox_study_populations.pptx
lect_2_fox_study_populations.pptxlect_2_fox_study_populations.pptx
lect_2_fox_study_populations.pptxAnandKumar459862
 
slideset-Navigating-NIH-Programs-Nov2021.pptx
slideset-Navigating-NIH-Programs-Nov2021.pptxslideset-Navigating-NIH-Programs-Nov2021.pptx
slideset-Navigating-NIH-Programs-Nov2021.pptxAnandKumar459862
 
fouad_kharroubi_PhD_Defense_05_06_2014.ppt
fouad_kharroubi_PhD_Defense_05_06_2014.pptfouad_kharroubi_PhD_Defense_05_06_2014.ppt
fouad_kharroubi_PhD_Defense_05_06_2014.pptAnandKumar459862
 
s15-miller-chap-6b-lecture.ppt
s15-miller-chap-6b-lecture.ppts15-miller-chap-6b-lecture.ppt
s15-miller-chap-6b-lecture.pptAnandKumar459862
 

More from AnandKumar459862 (18)

3.3 P-P NETWORKS.pptx
3.3 P-P NETWORKS.pptx3.3 P-P NETWORKS.pptx
3.3 P-P NETWORKS.pptx
 
2.1RANDOM NETWORKS.pptx
2.1RANDOM NETWORKS.pptx2.1RANDOM NETWORKS.pptx
2.1RANDOM NETWORKS.pptx
 
2.4 RANDOM NETWORKS - Copy.pptx
2.4 RANDOM NETWORKS - Copy.pptx2.4 RANDOM NETWORKS - Copy.pptx
2.4 RANDOM NETWORKS - Copy.pptx
 
McFaddenSymposium-04182016.pptx
McFaddenSymposium-04182016.pptxMcFaddenSymposium-04182016.pptx
McFaddenSymposium-04182016.pptx
 
PlantBreedingandGenetics.ppt
PlantBreedingandGenetics.pptPlantBreedingandGenetics.ppt
PlantBreedingandGenetics.ppt
 
6_2019_10_12!09_16_04_AM.pptx
6_2019_10_12!09_16_04_AM.pptx6_2019_10_12!09_16_04_AM.pptx
6_2019_10_12!09_16_04_AM.pptx
 
year-5-lesson-4-animal-life-cycles.pptx
year-5-lesson-4-animal-life-cycles.pptxyear-5-lesson-4-animal-life-cycles.pptx
year-5-lesson-4-animal-life-cycles.pptx
 
powerpoint poresentation.ppt
powerpoint poresentation.pptpowerpoint poresentation.ppt
powerpoint poresentation.ppt
 
Michael Hucka.ppt
Michael Hucka.pptMichael Hucka.ppt
Michael Hucka.ppt
 
BIOMED_presentation.ppt
BIOMED_presentation.pptBIOMED_presentation.ppt
BIOMED_presentation.ppt
 
lect_2_fox_study_populations.pptx
lect_2_fox_study_populations.pptxlect_2_fox_study_populations.pptx
lect_2_fox_study_populations.pptx
 
QuickStart.ppt
QuickStart.pptQuickStart.ppt
QuickStart.ppt
 
slideset-Navigating-NIH-Programs-Nov2021.pptx
slideset-Navigating-NIH-Programs-Nov2021.pptxslideset-Navigating-NIH-Programs-Nov2021.pptx
slideset-Navigating-NIH-Programs-Nov2021.pptx
 
BIOMEDICAL .ppt
BIOMEDICAL .pptBIOMEDICAL .ppt
BIOMEDICAL .ppt
 
fouad_kharroubi_PhD_Defense_05_06_2014.ppt
fouad_kharroubi_PhD_Defense_05_06_2014.pptfouad_kharroubi_PhD_Defense_05_06_2014.ppt
fouad_kharroubi_PhD_Defense_05_06_2014.ppt
 
Chem_Enzymes.pptx
Chem_Enzymes.pptxChem_Enzymes.pptx
Chem_Enzymes.pptx
 
s15-miller-chap-6b-lecture.ppt
s15-miller-chap-6b-lecture.ppts15-miller-chap-6b-lecture.ppt
s15-miller-chap-6b-lecture.ppt
 
enzyme_class2.ppt
enzyme_class2.pptenzyme_class2.ppt
enzyme_class2.ppt
 

Recently uploaded

(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝soniya singh
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...Call girls in Ahmedabad High profile
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 

Recently uploaded (20)

(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
Model Call Girl in Narela Delhi reach out to us at 🔝8264348440🔝
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
High Profile Call Girls Dahisar Arpita 9907093804 Independent Escort Service ...
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 

Slides_SB3.ppt

  • 1. MSc GBE Course: Genes: from sequence to function Brief Introduction to Systems Biology Sven Bergmann Department of Medical Genetics University of Lausanne Rue de Bugnon 27 - DGM 328 CH-1005 Lausanne Switzerland work: ++41-21-692-5452 cell: ++41-78-663-4980 http://serverdgm.unil.ch/bergmann
  • 2. Course Overview • Basics: What is Systems Biology? • Standard analysis tools for large datasets • Advanced analysis tools • Systems approach to “small” networks
  • 3. What is Systems Biology? • To understand biology at the system level, we must examine the structure and dynamics of cellular and organismal function, rather than the characteristics of isolated parts of a cell or organism. Properties of systems, such as robustness, emerge as central issues, and understanding these properties may have an impact on the future of medicine. Hiroaki Kitano
  • 4. What is Systems Biology? • To me, systems biology seeks to explain biological phenomenon not on a gene by gene basis, but through the interaction of all the cellular and biochemical components in a cell or an organism. Since, biologists have always sought to understand the mechanisms sustaining living systems, solutions arising from systems biology have always been the goal in biology. Previously, however, we did not have the knowledge or the tools. Edison T Liu Genome Institute of Singapore
  • 5. What is Systems Biology? • addresses the analysis of entire biological systems • interdisciplinary approach to the investigation of all the components and networks contributing to a biological system • [involves] new dynamic computer modeling programs which ultimately might allow us to simulate entire organisms based on their individual cellular components • Strategy of Systems Biology is dependent on interactive cycles of predictions and experimentation. • Allow[s Biology] to move from the ranks of a descriptive science to an exact science. (Quotes from SystemsX.ch website)
  • 6. What is Systems Biology?
  • 7.  identify elements (genes, molecules, cells, …)  ascertain their relationships (co-expressed, interacting, …)  integrate information to obtain view of system as a whole Large (genomic) systems • many uncharacterized elements • relationships unknown • computational analysis should:  improve annotation  reveal relations  reduce complexity Small systems • elements well-known • many relationships established • quantitative modeling of systems properties like:  Dynamics  Robustness  Logics What is Systems Biology?
  • 8. Part 1: Basics Motivation: • What is a “systems biology approach”? • Why to take such an approach? • How can one study systems properties? Practical Part: • First look at a set of genomic expression data • How to have a global look at such datasets? • Distributions, mean-values, standard deviations, z- scores • T-tests and other statistical tests • Correlations and similarity measures • Simple Clustering
  • 9. First look at a set of genomic expression data
  • 10. DNA microarray experiments monitor expression levels of thousands of genes simultaneously: • allows for studying the genome-wide transcriptional response of a cell to interior and exterior changes • provide us with a first step towards understanding gene function and regulation on a global scale test control
  • 12. Log-ratios of expression values Log ratios indicate differential expression! ) log( ) log( ) log( control test control test E E E / E r    0 + - control test E E  control test E E  control test E ~ E
  • 13. Consolidate data from multiple chips into one table and use color-coding Many KOs (conditions) 1 2 3 4 5 1000 2000 3000 4000 5000 6000 -4 -3 -2 -1 0 1 2 3 4 genes conditions log- ratio r Knock Out (KO)
  • 14. Rosetta data: The real world … 50 100 150 200 250 300 1000 2000 3000 4000 5000 6000 -6 -4 -2 0 2 4 6 genes conditions Most genes exhibit little differential expression! r
  • 15. -4 -2 0 2 4 6 8 0 500 1000 1500 2000 2500 Histogram shows distribution ade1 deletion mutant exhibits small differential expression in most genes! # ) log( control test E / E r 
  • 16. 158 160 162 164 166 168 1000 2000 3000 4000 5000 6000 -6 -4 -2 0 2 4 6 Rosetta data: Zooming in … Only few genes exhibit large differential expression! genes conditions
  • 17. -8 -6 -4 -2 0 2 4 6 8 0 100 200 300 400 500 600 700 Histogram shows distribution ssn6 deletion mutant exhibits large differential expression in many genes! # ) log( control test E / E r 
  • 18. -8 -6 -4 -2 0 2 4 6 8 0 100 200 300 400 500 600 700 Quantification of distribution Mean and Standard Deviation (Std) characterize distribution # ) log( control test E / E r  Mean: μ= =x Outliers Std: =
  • 19. -4 -2 0 2 4 6 8 0 500 1000 1500 2000 2500 -8 -6 -4 -2 0 2 4 6 8 0 100 200 300 400 500 600 700 Comparing distributions Are the expression values of ade1 different from those of ssn6? # # ) log( control test E / E r  ) log( control test E / E r  ade1 ssn1 μ = 0.2366  = 1.9854 μ = -5.5 · 10-5  = 0.1434 ?
  • 21. Student’s T-test t-statistic: difference between means in units of average error Significance can be translated into p-value (probability) assuming normal distributions http://www.physics.csbsju.edu/stats/t-test.html
  • 22. History: W. S. Gossett [1876-1937] • The t-test was developed by W. S. Gossett, a statistician employed at the Guinness brewery. However, because the brewery did not allow employees to publish their research, Gossett's work on the t-test appears under the name "Student" (and the t-test is sometimes referred to as "Student's t-test.") Gossett was a chemist and was responsible for developing procedures for ensuring the similarity of batches of Guinness. The t-test was developed as a way of measuring how closely the yeast content of a particular batch of beer corresponded to the brewery's standard. http://ccnmtl.columbia.edu/projects/qmss/t_about.html
  • 23. Pearson correlations (Graphic) Comparing profiles X and Y (not distributions!): What is the tendency that high/low values in X match high/low values in Y? http://davidmlane.com/hyperstat/A34739.html Y = (y i ) X = (xi) Each dot is a pair (xi,yi)
  • 24. Pearson correlations: Formulae (simple version using z-scores) (complicated version) r
  • 25. Similarity according to all conditions (“Democratic vote”) Clustering- coefficient conditions 1 2 3 4 5 1 2 r12 ~ -1 gene 1 2 3 4 5 1 2 r12 ~ 0 gene 1 2 3 4 5 1 2 r12 ~ 1 gene Pearson correlations: Intuition
  • 26. Pearson correlations: Caution! High correlation does not necessarily mean co-linearity! r=0.8 r=0.8 r=0.8 r=0.8
  • 27. (Hierarchical Agglomerative) Clustering Join most correlated samples and replace correlations to remaining samples by average, then iterate … http://gepas.bioinfo.cipf.es/cgi-bin/tutoX?c=clustering/clustering.config
  • 28. Clustering of the real expression data
  • 30. K-means Clustering 2. Assign each data point to closest centroid 1. Start with random positions of centroids ( ) “guess” k=3 (# of clusters) http://en.wikipedia.org/wiki/K-means_algorithm
  • 31. Hierachical Clustering Plus: • Shows (re-orderd) data • Gives hierarchy Minus: • Does not work well for many genes (usually apply cut-off on fold-change) • Similarity over all genes/conditions • Clusters do not overlap
  • 32. Overview of “modular” analysis tools • Cheng Y and Church GM. Biclustering of expression data. (Proc Int Conf Intell Syst Mol Biol. 2000;8:93-103) • Getz G, Levine E, Domany E. Coupled two-way clustering analysis of gene microarray data. (Proc Natl Acad Sci U S A. 2000 Oct 24;97(22):12079-84) • Tanay A, Sharan R, Kupiec M, Shamir R. Revealing modularity and organization in the yeast molecular network by integrated analysis of highly heterogeneous genomewide data. (Proc Natl Acad Sci U S A. 2004 Mar 2;101(9):2981-6) • Sheng Q, Moreau Y, De Moor B. Biclustering microarray data by Gibbs sampling. (Bioinformatics. 2003 Oct;19 Suppl 2:ii196-205) • Gasch AP and Eisen MB. Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering. (Genome Biol. 2002 Oct 10;3(11):RESEARCH0059) • Hastie T, Tibshirani R, Eisen MB, Alizadeh A, Levy R, Staudt L, Chan WC, Botstein D, Brown P. 'Gene shaving' as a method for identifying distinct sets of genes with similar expression patterns. (Genome Biol. 2000;1(2):RESEARCH0003.) … and many more! http://serverdgm.unil.ch/bergmann/Publications/review.pdf
  • 33. How to “hear” the relevant genes? Song A Song B
  • 35. Inside CTWC: Iterations Depth Genes Samples Init G1 S1 1 G1(S1) G2,G3,…G5 S1(G1) S2,S3 2 G1(S2) G1(S3) G6,G7,….G13 G14,…G21 S1(G2) … S1(G5) S4,S5,S6 S10,S11 None 3 G2(S1)…G2(S3) … G5(S1)…G5(S3) G22… … …G97 S2(G1)…S2(G5) S3(G1)…S3(G5) S12,… …S51 4 G1(S4) … G1(S11) G98,..G105 … G151,..G160 S1(G6) … S1(G21) S52,... S67 5 G2(S4)...G2(S11) … G5(S4)...G5(S11) G161… … …G216 S2(G6)...S2(G21) S3(G6)…S3(G21) S68… …S113 Two-way clustering
  • 36. • No need for correlations! • decomposes data into “transcription modules” • integrates external information • allows for interspecies comparative analysis One example in more detail: The (Iterative) Signature Algorithm: J Ihmels, G Friedlander, SB, O Sarig, Y Ziv & N Barkai Nature Genetics (2002)
  • 37. Trip to the “Amazon”:
  • 38. 5 10 15 20 25 30 35 40 45 50 10 20 30 40 50 60 70 80 90 100 How to find related items? items customers re- commended items your choice customers with similar choice
  • 39. 5 10 15 20 25 30 35 40 45 50 10 20 30 40 50 60 70 80 90 100 How to find related genes? genes conditions similarly expressed genes your guess relevant conditions J Ihmels, G Friedlander, SB, O Sarig, Y Ziv & N Barkai Nature Genetics (2002)
  • 40. I G g gc G c E s   } : { C C C c c c C t s s C c S       c S c gc C c g E s s   } : { G G G g g g G t s s G g S       I G Signature Algorithm: Score definitions
  • 41. initial guesses (genes) thresholding: condition scores How to find related genes? Scores and thresholds!
  • 42. gene scores condition scores thresholding: How to find related genes? Scores and thresholds!
  • 43. gene scores condition scores thresholding: How to find related genes? Scores and thresholds!
  • 44. Iterative Signature Algorithm INPUT OUTPUT OUTPUT = INPUT “Transcription Module” SB, J Ihmels & N Barkai Physical Review E (2003)
  • 45. Identification of transcription modules using many random “seeds” random “seeds” Transcription modules Independent identification: Modules may overlap!
  • 46. New Tools: Module Visualization http://serverdgm.unil.ch/bergmann/Fibroblasts/visualiser.html
  • 47. Gene enrichment analysis The hypergeometric distribution f(M,A,K,T) gives the probability that K out of A genes with a particular annotation match with a module having M genes if there are T genes in total. http://en.wikipedia.org/wiki/Hypergeometric_distribution
  • 48. Decomposing expression data into annotated transcriptional modules identified >100 transcriptional modules in yeast: high functional consistency! many functional links “waiting” to be verified experimentally J Ihmels, SB & N Barkai Bioinformatics 2005