1. “A cross-border region where rivers connect, not divide” –
Interreg V-A Hungary-Croatia Co-operation Programme 2014-2020
Lecture 1
1
The title of this lecture is 'Systems biology for medicine' and it deals with collecting and organizing
big data in such a way that we can extract diagnostically and therapeutically relevant information
from them.
We will start by explaining what the parts of biological systems are and how they assemble
to perform certain physiological functions.
2
You probably have this kind of small kitchen appliance and you know how it works.
3
Depending on the cutting piece inserted into the machine, you can cut a carrot in a few different ways.
After some practice you'll be able to predict what the carrot will look like if you use one or another
cutting piece.
In general, we can say that the function of this machine is to cut soft objects into predictable shapes.
4
It is easy to guess the function of simple machines, but if a machine is built out of many parts it
becomes harder and harder.
How likely is it that you would recognize a snow blower in this image?
5
The more complex a system gets, the harder it is to predict all the possible functions it performs.
For example, the human body consists of approximately 200 different types of cells, and each of
these types is built from a different number of molecules, which can be a couple of dozen or hundreds of
thousands, and the cells perform slightly different functions.
6
Systems biology is a scientific discipline that arose after the emergence of experimental methods
able to collect and analyse a large amount of data at the same time, driven especially by the emergence of
microarray techniques.
Systems biology tries to understand how molecules interact and associate to
give rise to subcellular machinery that forms functional units capable of the actions needed for
physiological functions at the cellular, tissue or organ level.
This definition was given by Ravi Iyengar, one of the pioneers in the field of systems biology.
7
Until the emergence of microarray techniques, scientists dealt with individual molecules and most
often investigated binary interactions between molecules.
A good example is the discovery of haemoglobin structure and function, which took 25 years.
The four peptide chains associate in order to bind, transport and release 4 molecules of oxygen under
specific physiological conditions.
How this happens was worked out using X-ray diffraction and protein crystallography.
Haemoglobin is crystallized and the position of each atom is determined from the diffraction of X-rays
passing through the crystals.
John Kendrew and Max Perutz received the Nobel Prize in 1962 for their discovery.
Their technique is now used worldwide to determine the structure of large molecules, especially to
determine the active site of enzymes and the interaction between enzyme and substrate.
8
Systems biology is the molecular level of physiology.
We can explain that by looking at the insulin signalling pathway.
After injecting insulin into the bloodstream of an animal, we can measure various physiological effects:
increased glucose uptake in skeletal muscle and fat tissue, increased glycogen synthesis in muscle,
decreased hepatic glucose production and decreased lipolysis in adipocytes.
In general, we can say that the purpose of insulin signalling is to enhance cellular glucose uptake and to
stimulate cell metabolism.
If we look at the molecular level, we see a flow of information from outside the cell to certain
intracellular metabolic pathways, which is executed through a number of molecular interactions.
9
Molecular interactions obey the law of mass action and proceed until chemical equilibrium is reached.
This means that each molecular interaction can be represented as a mathematical equation.
We have to know the initial concentrations of ligands/substrates and receptors/enzymes as well as the rates
of the forward and reverse reactions.
If we know all the initial parameters, we can compute how the product of an enzyme reaction is formed
over time.
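As a minimal sketch of this idea, the Python code below integrates the mass-action equations for a
single enzyme reaction, E + S <-> ES -> E + P, with a simple forward Euler step. The reaction scheme,
the rate constants (k_on, k_off, k_cat) and the initial concentrations are illustrative assumptions,
not values from this lecture.

```python
def simulate_enzyme_reaction(e0, s0, k_on, k_off, k_cat, dt=0.001, t_end=10.0):
    """Integrate E + S <-> ES -> E + P under the law of mass action.

    Returns two lists: time points and product concentration at each point.
    All parameter values are hypothetical and in arbitrary units.
    """
    e, s, es, p = e0, s0, 0.0, 0.0
    times, products = [0.0], [p]
    steps = int(round(t_end / dt))
    for i in range(steps):
        bind = k_on * e * s      # forward reaction: rate ~ product of concentrations
        unbind = k_off * es      # reverse reaction
        convert = k_cat * es     # catalytic step releasing product
        e += (-bind + unbind + convert) * dt
        s += (-bind + unbind) * dt
        es += (bind - unbind - convert) * dt
        p += convert * dt
        times.append((i + 1) * dt)
        products.append(p)
    return times, products


# Example run with assumed values: 1 unit of enzyme, 10 units of substrate.
times, products = simulate_enzyme_reaction(e0=1.0, s0=10.0, k_on=1.0, k_off=0.5, k_cat=1.0)
print(f"product formed at t = {times[-1]:.1f}: {products[-1]:.3f}")
```

Changing the initial concentrations or the rate constants changes the shape of the product curve,
which is the sense in which the outcome can be computed from the initial parameters.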
10
An entire signalling pathway can be written as a series of ordinary differential equations.
We can represent each molecular interaction mathematically so that the product of one reaction is used in
the following reaction; pathways can finish with an end product or feed back onto the receptor.
The entire signalling pathway can also be integrated into a computational model.
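To illustrate how the product of one reaction feeds the next, here is a small hypothetical
three-step cascade written as three coupled ordinary differential equations and integrated with
forward Euler. The number of steps, the rate constants and the form of the equations are assumptions
chosen for readability; they do not describe a specific pathway such as insulin signalling.

```python
def simulate_cascade(signal, k_act=(1.0, 1.0, 1.0), k_deact=(0.5, 0.5, 0.5),
                     dt=0.01, t_end=50.0):
    """Return the active fraction of three pathway components after t_end.

    Component 0 is activated by the external signal, component 1 by active
    component 0, and component 2 by active component 1. Each component obeys
    d(active)/dt = k_act * upstream * (1 - active) - k_deact * active.
    """
    active = [0.0, 0.0, 0.0]
    for _ in range(int(round(t_end / dt))):
        upstream = [signal, active[0], active[1]]
        rates = [
            k_act[i] * upstream[i] * (1.0 - active[i]) - k_deact[i] * active[i]
            for i in range(3)
        ]
        active = [a + r * dt for a, r in zip(active, rates)]
    return active


# Example: compare the pathway output with and without an external signal.
for signal in (0.0, 1.0):
    levels = simulate_cascade(signal)
    print(signal, [round(x, 3) for x in levels])
```

A feedback loop could be added to this sketch by letting the last component change the activation
or deactivation rate of the first one.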
11
The stepwise discovery of individual interactions is the so-called bottom-up approach.
This approach was widely used in the 20th century and resulted in a seminal understanding of biological
processes.
The bottom-up approach uses hypothesis-driven studies.
The success of this approach is reflected in the number of Nobel Prizes awarded for the
discoveries of individual molecules and their interactions in signalling pathways.
12
Each discovery further added to the complexity.
Pathways gradually grew into networks.
In order to predict the outcome of network functioning, systems biology uses computational models.
13
Up to now we can conclude:
1. Systems biology builds on molecular biology, biochemistry and cell biology and uses already
collected knowledge about biological interactions.
2. Systems biology integrates existing knowledge from many experiments into computational
models.
The purpose of using computational models is to find functions that emerge from complex
interactions and cannot be predicted by looking at individual components (e.g. the presence of a switch in a
signalling network, robustness…).
14
A switch in a signalling network was described for the first time in Bhalla and Iyengar's paper.
The authors investigated the mutually interacting PLCγ-PKC and Ras-Raf-MAPK pathways using
experimental proof and a computational model in parallel.
Both pathways start with the EGF receptor; after the ligand binds to the receptor, it causes a rise in
MAPK activity, which then gradually decreases.
Only if the level of the ligand EGF rises above 5 nM does MAPK activity not return to baseline;
instead it remains high and the cell starts to divide.
The switch is based on the integration of two pathways into a joint network, and its purpose is to switch
the cell from one mode of operation into another, from the resting state to division.
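The switch-like behaviour can be sketched with a toy model. The code below is not the Bhalla and
Iyengar model; it is a single hypothetical variable (think of it as pathway activity) with a
sigmoidal positive feedback on itself, stimulated by a temporary pulse. The equation, the parameters
and the arbitrary units are all assumptions made for illustration.

```python
def hill(x, k_half=0.5, n=4):
    """Sigmoidal positive feedback term (assumed form, arbitrary units)."""
    return x ** n / (k_half ** n + x ** n)


def final_activity(stimulus, pulse_length=20.0, t_end=60.0, dt=0.01):
    """Activity long after a stimulus pulse of the given strength has ended.

    Model: d(activity)/dt = stimulus(t) + hill(activity) - activity,
    where stimulus(t) is non-zero only while t < pulse_length.
    """
    x = 0.0
    for i in range(int(round(t_end / dt))):
        t = i * dt
        s = stimulus if t < pulse_length else 0.0
        x += (s + hill(x) - x) * dt
    return x


# A weak pulse lets activity fall back to baseline; a strong pulse flips the switch.
for stimulus in (0.1, 0.3):
    print(f"stimulus {stimulus}: activity after pulse = {final_activity(stimulus):.3f}")
```

In this sketch the activity stays high after a sufficiently strong pulse even though the stimulus
is gone, which is the qualitative property described above for MAPK activity.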
15
Another behaviour noticed after computing entire networks was the existence of positive and
negative feedback loops.
The presence of positive and negative regulators enables the cell to maintain the same level of physiological
functions under a wide range of external stimuli, which we recognize as flexibility and robustness.
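One way to see how a negative regulator contributes to robustness is a toy steady-state calculation.
In the sketch below the output inhibits its own production, so that at steady state
output = stimulus / (1 + feedback_strength * output). This equation and the numbers used are
assumptions for illustration only; the toy does not keep the output perfectly constant, it only
shows that the output varies much less than the stimulus.

```python
import math


def steady_output(stimulus, feedback_strength):
    """Steady state of: output = stimulus / (1 + feedback_strength * output)."""
    if feedback_strength == 0:
        return stimulus  # no feedback: output simply follows the stimulus
    k = feedback_strength
    # Positive root of k * y**2 + y - stimulus = 0
    return (-1.0 + math.sqrt(1.0 + 4.0 * k * stimulus)) / (2.0 * k)


def fold_change(feedback_strength, low=1.0, high=100.0):
    """How many times the output grows when the stimulus grows from low to high."""
    return steady_output(high, feedback_strength) / steady_output(low, feedback_strength)


# Compare a network without feedback to one with (hypothetical) strong negative feedback.
for k in (0.0, 10.0):
    print(f"feedback strength {k}: output fold change = {fold_change(k):.1f}")
```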
16
Besides building computational models, the field of systems biology relies heavily on experiments that
collect a large amount of data at the same time, measuring many molecular entities at once, as in
microarray studies.
This type of study is the so-called top-down approach and is not based on an initial hypothesis;
rather, hypotheses are generated from the big data.
17
One of the first papers using big data investigated the fibroblast response to serum.
The microarrays used for this study contained probes for 8613 human genes and measured
9996 elements at the same time.
The library of cDNA molecules from serum-deprived cells was labelled with Cy3, while the library of cDNA from
serum-exposed cells was labelled with Cy5.
An overlap of two equally represented probes, yellow in colour, reflected a lack of change, while
the appearance of a red or green spot was interpreted as a difference caused by induction.
A total of 517 genes changed expression after serum exposure and contributed to the change in
cellular operation from the resting state to proliferation.
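The colour of a spot can be turned into a number by taking the log2 ratio of the Cy5 (red) signal to
the Cy3 (green) signal. The sketch below shows that calculation on made-up intensities. The gene
labels, the intensity values and the two-fold cut-off are hypothetical and are not taken from the
study described above.

```python
import math


def classify_spots(spots, fold_threshold=2.0):
    """Classify two-colour microarray spots by their log2(Cy5 / Cy3) ratio.

    spots maps a gene label to a (cy3_intensity, cy5_intensity) pair, where
    Cy3 labels the serum-deprived sample and Cy5 the serum-exposed sample.
    """
    cutoff = math.log2(fold_threshold)
    result = {}
    for gene, (cy3, cy5) in spots.items():
        log_ratio = math.log2(cy5 / cy3)
        if log_ratio >= cutoff:
            label = "induced"      # red spot: more signal after serum exposure
        elif log_ratio <= -cutoff:
            label = "repressed"    # green spot: less signal after serum exposure
        else:
            label = "unchanged"    # yellow spot: similar signal in both samples
        result[gene] = (log_ratio, label)
    return result


# Hypothetical intensities for three spots.
example_spots = {
    "gene_A": (100.0, 800.0),
    "gene_B": (500.0, 520.0),
    "gene_C": (900.0, 150.0),
}
for gene, (log_ratio, label) in classify_spots(example_spots).items():
    print(f"{gene}: log2 ratio = {log_ratio:+.2f} ({label})")
```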
18
Molecules commonly used for generating big data are:
- DNA sequences represented as DNA probes for genes or probes for non-coding sequences
- various classes of RNAs
- proteins
- cellular lipids
- and small molecules produced by metabolism - so called metabolites.
19
The gathering of big data is usually recognized by the suffix 'omics'.
Omics is any experimental approach that simultaneously measures many individual entities at the
same time point.
Omics experiments use at least two sets of big data that were collected at different time
points or under different conditions, or that have any other difference measurable by a statistical method.
Representative examples of omics studies are genomics, which collects data about genes involved in
physiological processes, proteomics, which yields data about co-expressed proteins, and
metabolomics, which lists all metabolites produced by the cellular machinery at a given time point.
20
In order to find relevant data in a large database, the data should be organized according to certain
principles.
Bioinformatics is a scientific discipline that uses mostly mathematical tools to organize big data,
which enables further comparison or search.
A large number of databases are already publicly available, for example GEO (results of mRNA
profiling), TargetScan (data about microRNA), Swiss-Prot (a large protein database), OMIM (the list and
description of disease genes) and dbGaP (results of genome-wide association studies).
21
There are two possible ways of using data from a big database: the first is to generate a list of entities
that are statistically correlated within the same database (e.g. all co-expressed proteins at starvation);
the second is to generate a list of statistically correlated entities between different databases, e.g. all
expressed mRNAs in Down syndrome.
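The first way of using the data, listing entities that are correlated within one dataset, can be
sketched with a Pearson correlation between expression profiles. The protein labels, the measurements
and the correlation cut-off of 0.9 below are invented for illustration; real analyses would also
need a correction for the large number of comparisons.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def co_expressed_pairs(profiles, min_r=0.9):
    """Return (name_a, name_b, r) for every pair whose profiles correlate at least min_r."""
    pairs = []
    for (name_a, prof_a), (name_b, prof_b) in combinations(profiles.items(), 2):
        r = pearson(prof_a, prof_b)
        if r >= min_r:
            pairs.append((name_a, name_b, r))
    return pairs


# Hypothetical abundance of three proteins measured at five time points.
profiles = {
    "protein_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "protein_2": [2.0, 4.0, 6.0, 8.0, 10.0],
    "protein_3": [5.0, 3.0, 4.0, 1.0, 2.0],
}
for name_a, name_b, r in co_expressed_pairs(profiles):
    print(f"{name_a} and {name_b} are co-expressed (r = {r:.2f})")
```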
22
In conclusion:
1. Systems biology builds on biochemistry, molecular biology and cell biology.
2. Systems biology uses experiments designed in omics style.
3. The further analysis of the collected big data uses statistical methods, bioinformatics approaches and
different types of modelling.
4. The main goal of systems biology is to discover hidden mechanisms and functions that derive from
the system as a whole and which cannot be detected by observing individual parts.
23
Finally, I hope that you can guess the function of the machine in this image in the field of
systems biology.