Tales from BioLand
Engineering Challenges in
the World of Life Sciences
Prof. Alfredo Benso
Politecnico di Torino, Torino, Italy
www.sysbio.polito.it
ICIIBMS - International Conference on
Intelligent Informatics and BioMedical
Sciences
Engineering vs Life
Sciences
Engineering is for BUILDING systems
Life Sciences are for UNDERSTANDING
(REVERSE ENGINEERING) life
• “Reverse engineering is the process of discovering the
functional principles of a device, object, or system
through analysis of its structure, function, and
operation”.
• “Reverse engineering is the process of extracting
knowledge from something and reproducing it or
reproducing anything, based on the extracted
information”
Reverse Engineering
Example
Enigma machine
DESIGNING AND REVERSE ENGINEERING A
SYSTEM ARE TWO OPPOSITE TASKS
WHOSE COMPLEXITY MAY DIFFER BY
ORDERS OF MAGNITUDE
I was once asked “So is it easier to
synthesize an organism than
to understand it?”
“What I cannot create, I do not understand”, Feynman 1988
Reductionism (Bottom-up)
• Understanding of the parts means
understanding of the whole
• Focus on parts
The properties of the whole
system can be explained in terms
of its parts.
Reverse Engineering
Reductionism vs holism
Holism (Top-down)
• To understand the whole we must
understand also the relations
between the parts in the whole
• Focus on relationships
The system cannot be explained by
component parts alone.
Instead, the system as a whole
determines how the parts behave.
Reverse Engineering
BOTTOM-UP / REDUCTIONIST
DATA driven
SYSTEMIC / TOP-DOWN /
HOLISTIC
MODEL driven
Pathway / biological networks
Genes
OUTLINE
The Methodology Challenge
The Data Challenge
The Modelling Challenge
The Methodology
Challenge
Problem
DOES THE COMPLEXITY (IN MATHEMATICAL TERMS) OF
THE SYSTEM DRIVE THE METHODOLOGICAL APPROACH
TO BE USED TO UNDERSTAND (REVERSE ENGINEER)
IT?
The Methodology
Challenge
YES!!!!!!!
Linear vs Complex Systems
The Methodology
Challenge
• We can consider the effect of each system component
(“variable”) separately, because the sum of their
effects equals the effect of their sum.
Linear Systems
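The superposition property stated above can be checked numerically. The following is a minimal sketch, assuming two hypothetical two-input systems (the function names and coefficients are illustrative and do not come from the slides): one purely additive, one with an interaction term.

```python
def linear_response(dose_a, dose_b):
    """Hypothetical linear system: each input contributes independently."""
    return 2.0 * dose_a + 3.0 * dose_b


def nonlinear_response(dose_a, dose_b):
    """Hypothetical non-linear system: the inputs interact (product term)."""
    return 2.0 * dose_a + 3.0 * dose_b + 1.5 * dose_a * dose_b


def obeys_superposition(system, a, b, tol=1e-9):
    """Check whether effect(a, b) equals effect(a, 0) + effect(0, b)."""
    together = system(a, b)
    separately = system(a, 0.0) + system(0.0, b)
    return abs(together - separately) < tol


# The additive system can be studied one variable at a time;
# the system with an interaction term cannot.
print("linear obeys superposition:    ", obeys_superposition(linear_response, 1.0, 2.0))
print("non-linear obeys superposition:", obeys_superposition(nonlinear_response, 1.0, 2.0))
```

When the check fails, studying each variable in isolation misses the contribution that only appears when both inputs are present.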
The Methodology
Challenge
• Enzyme / Substrate
• Simple pharmacokinetics
• Newtonian Physics
Linear Systems
Examples
The Methodology
Challenge
• Linear systems are easy to understand also for
non-mathematicians and are also easy to visualize.
• For this reason a large part of the LS world (and the
medical one in particular) still reasons in linear
terms.
• Linearity is rare in biological systems!
Linear Systems
The Methodology
Challenge
• Properties emerge from the interaction of
their parts (and cannot be predicted only
from the properties of the parts).
Complex Systems
• Complex Systems’ dynamics heavily depend on initial
conditions and perturbations (the butterfly effect…)
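Sensitivity to initial conditions can be illustrated with a very small experiment. The sketch below uses the logistic map, a standard textbook example of chaotic dynamics that is not part of the original slides; the starting value, the perturbation size, and the function name are assumptions chosen for illustration.

```python
def logistic_trajectory(x0, r=4.0, steps=60):
    """Iterate x -> r * x * (1 - x) and return every visited value."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x = r * x * (1.0 - x)
        trajectory.append(x)
    return trajectory


# Two runs whose starting points differ by one part in ten billion.
a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-10)
gaps = [abs(p - q) for p, q in zip(a, b)]

print(f"initial gap:          {gaps[0]:.1e}")
print(f"largest gap observed: {max(gaps):.3f}")
```

The two trajectories start out indistinguishable and end up unrelated, which is why long-term prediction of such systems requires far more than knowledge of the update rule.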
The Methodology Challenge
Complex Systems
Examples
Is MUSIC
discoverable by
studying the record
and the player
separately?
Is cell LIFE
understandable by
separately studying
its internal
components?
• Can we learn how a car engine works just by studying
(some of) its individual components separately?
Issue
The Methodology
Challenge
Linear Systems ➜ Reductionism / Holism
Complex Systems ➜ Reductionism / Holism
The Methodology
Challenge
It would NOT BE ABLE TO
IDENTIFY properties that
emerge from the interactions
between its parts
Linear vs Complex Systems
The Methodology
Challenge
CHALLENGES
• Middle ground between data driven and model driven
approaches
• Cross-fertilization between biology and physics,
computer science, mathematics, chemistry, and
engineering.
The Methodology
Challenge
The role of
Systems Biology
The Methodology
Challenge
raw DATA MODEL
The Systems Biology
Lifecycle
decomposition
and
localization
Dynamic
modeling
The Methodology
Challenge
SIMULATION
New Hypothesis
New biological
questions
Model refinement
Data recomposition
• Systems in this context generally are modeled as large
networks of integrated components exhibiting non-linear
dynamical interactions.
• Protein-Protein-Interaction
• Gene Regulatory Networks
• Metabolic Networks
• Interactomes
A shift in perspective
The Methodology
Challenge
A shift in perspective
• In 1998, Oltvai, a cell biologist, and Barabasi, a physicist
who was studying the structure of the Internet, were
neighbors in Chicago
• At the time, Barabasi had already shown that the Internet is a
non-random network, and that its connectivity structure
influences its function
• One year later, in 1999, they showed that the metabolic
pathways of yeast define a network whose structure is
very similar to that of the Internet.
Then…
• HUBs
• P53
• Motifs
• ….
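How hubs can arise in a non-random network can be sketched with a toy growth process. The example below assumes a preferential-attachment rule in the spirit of the Barabasi-Albert model, which the slides do not name explicitly; the network size, the seed, and the function name are illustrative choices only.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes, seed=0):
    """Grow a network in which each new node links to one existing node,
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Every node appears here once per edge end, so a uniform draw from
    # this list is a degree-proportional draw over nodes.
    endpoints = [0, 1]
    for new_node in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.append((new_node, target))
        endpoints.extend((new_node, target))
    return edges


edges = preferential_attachment(2000)
degree = Counter(node for edge in edges for node in edge)
degrees = sorted(degree.values(), reverse=True)
median_degree = degrees[len(degrees) // 2]

print("highest degrees:", degrees[:5])
print("median degree:  ", median_degree)
```

A few nodes collect many links while the typical node keeps very few, which is the qualitative signature of a hub-dominated network.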
The Methodology
Challenge
• High throughput DATA (NGS, ‘omics’, imaging, …) is
the “facilitator”.
• The technology to create this data (Biotechnology) is the
key. The “wider” and the better we see, the better we can
understand how systems work.
The Methodology Challenge
Systems
Biology
Systems
Medicine
Personalized
Medicine
The Methodology
Challenge
The Data Challenge
• Size
• Heterogeneity
• Curse of dimensionality
• Ownership / Ethics
• Falsification
Issues
The Data Challenge
Size
• Heterogeneity
• Curse of dimensionality
• Ownership / Ethics
• Falsification
Issues
The Data Challenge
The Cost of
Sequencing DNA Has
Fallen Over 100,000x
in the Last Ten Years
The Data Challenge: size
• Enormous Density
⁃ 1000x Ocean Water
• Highly Dynamic Microbial Ecology
⁃ Hundreds to Thousands of Species
• Horizontal Gene Transfer
• Adaptive Selection Pressures (Immune System)
⁃ Innate and Adaptive Immune System
⁃ Macrophages and Antimicrobial proteins
• Constantly Changing Environmental Pressures
⁃ Diet
⁃ Antibiotics
⁃ Pharmaceuticals
The human Microbiome
The Data Challenge: size
Your Microbiome is
Your “Near-Body” Environment
and its Cells
Contain 200-2000x
as Many DNA Genes
As Your Human Cells
More Microbe Cells Than Human Cells
DNA-bearing Cells in Your Body
The Data Challenge: size
[Bar chart: Human Genome vs. Metagenomics; vertical axis from 0 to 3,500,000]
To Map Out the Dynamics of Autoimmune Microbiome Ecology
Couples Next Generation Genome Sequencers to Big Data Supercomputers
Source: Weizhong Li, UCSD
UCSD Team Used 25 CPU-years
to Compute
Comparative Gut Microbiomes
Starting From
2.7 Trillion DNA Bases
of Healthy and IBD Subjects samples
Illumina HiSeq 2000 at JCVI
SDSC Gordon Data Supercomputer
Using Machine Learning to Determine Major Differences
Between Gut Microbiome in Health and Disease,
Mehrdad Y. et al., IEEE International Conference on Big Data (December 5-8,
2016)
In a “Healthy” Gut Microbiome:
Large Taxonomy Variation, Low Protein Family Variation
Source: Nature, 486, 207-212 (2012)
Over 200 People
• Size
Heterogeneity
• Curse of dimensionality
• Ownership / Ethics
• Falsification
Issues
The Data Challenge
Issues
Sources
180+ Bio/related ONLINE
Databases [NAR 2016]
Custom / Proprietary data
Medical data
The Data Challenge: heterogeneity
• Many de-facto standards
• No compatibility
• Overlaps
• Validation / Quality
• Size
• Heterogeneity
Curse of dimensionality
• Ownership / Ethics
• Falsification
Issues
The Data Challenge
Overfitting
Many variables & Few
samples
Machine Learning issues
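The "many variables, few samples" problem can be made concrete with pure noise. The sketch below is an illustration under stated assumptions, not an analysis from the slides: labels and features are independent coin flips, and the sample counts, the seed, and the helper names are hypothetical.

```python
import random

rng = random.Random(42)
N_TRAIN, N_TEST, N_FEATURES = 8, 2000, 5000


def random_bits(n):
    """Return n independent fair coin flips as 0/1 values."""
    return [rng.randint(0, 1) for _ in range(n)]


def accuracy(predictions, labels):
    """Fraction of positions where prediction and label agree."""
    hits = sum(p == y for p, y in zip(predictions, labels))
    return hits / len(labels)


# Labels and features are unrelated noise: there is nothing to learn.
train_labels = random_bits(N_TRAIN)
train_features = [random_bits(N_TRAIN) for _ in range(N_FEATURES)]

# "Model selection": keep the single feature that best matches the labels.
best = max(range(N_FEATURES),
           key=lambda j: accuracy(train_features[j], train_labels))
train_acc = accuracy(train_features[best], train_labels)

# On fresh samples the selected feature is again an independent coin flip,
# so it is simulated as new noise.
test_labels = random_bits(N_TEST)
test_feature = random_bits(N_TEST)
test_acc = accuracy(test_feature, test_labels)

print(f"training accuracy of best feature: {train_acc:.2f}")
print(f"accuracy on fresh samples:         {test_acc:.2f}")
```

With thousands of candidate variables and only eight samples, some variable matches the labels by chance, yet it carries no predictive value on new data.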
• Size
• Heterogeneity
• Curse of dimensionality
Ownership / Ethics
• Falsification
Issues
The Data Challenge
Non reproducible results
No comparable
approaches
Difficult improvements
Private ≠ Not sharable
Many data sets are not
shared
• Size
• Heterogeneity
• Curse of dimensionality
• Ownership / Ethics
Falsification
Issues
The Data Challenge
• How does a biologist demonstrate in a paper that
he/she actually performed an experiment?
Falsification
The Data Challenge: falsification
What is a scientific fraud?
Experiment was never performed
Experiment was performed
Fictitious data
Altered data
Duplicated data
FABRICATION
FALSIFICATION
PLAGIARISM
The Data Challenge: falsification
Example 1: reusing panels
The Data Challenge: falsification
Example 2: ROI reusing
The Data Challenge: falsification
Example 3: repeated patterns
The Data Challenge: falsification
Image manipulation by country
The manipulation rate is around 6% in general, around 17% for papers containing at
least 1 gel image
[Scatter plot: Manipulated Papers (0 to 25) vs. Overall Examined Papers (0 to 300); R² = 0.7138]
Enrico Bucci, PhD, SCI 2017 – Paestum, September 11,
Experiment 1
1364 random papers
(5000 images) from
PMC, published in Jan
2014
Evolution of manipulation rate over 5 years
Manipulated image content increased from 6.5% to 13.1% over 5 years
Submission rate increases 10 times.
20 papers were found to reuse previously published images.
Experiment 2
1546 papers published
in Cell Death and
Disease (NPG) from
2010-2014
Manipulation frequency per
originating country
[Bar chart: % of submitted papers which were flagged (for countries
submitting at least 10 manuscripts); bars in order Country 1 to Country 6,
then Japan: 25.3%, 20.6%, 17.9%, 16.7%, 8.3%, 7.1%, 3.7%]
Experiment 3
Submitted manuscripts
from Jun to Aug 2017 to
5 independent journals.
20% of manipulated
images
WE DO
PRECISION GUESS WORK
BASED ON
(SOMETIMES) UNRELIABLE
DATA!!!
The Data Challenge
CHALLENGES
The Data Challenge
• De-facto standards are dangerous
The Data Challenge
What happens when
the standard is
“unilaterally” changed
by its “owner” ?
Past versions may
become unusable,
and published work
obsolete
The Data Challenge
BioPAX
• Standardization Authority (like IEEE, ISO, …) for
standardization of:
⁃ ACCESSION (Entrez Gene ID, miRBase accession, …)
⁃ NAMING (HUGO)
⁃ DATA EXCHANGE (MIAME for Microarrays)
⁃ …
The Data Challenge
raw DATA MODEL
The Systems Biology
Lifecycle
The Data Challenge
SIMULATION
decomposition
and
localization
Dynamic
modeling
New Hypothesis
New biological
questions
Model refinement
Data recomposition
• Common practice (in other fields) to compare
algorithms on the same data sets (e.g. ISCAS circuits,
SPEC software)
• Shared benchmarks increase competitiveness, comparability,
reproducibility, REUSABILITY
• There are efforts (BioPerf, BAliBASE, Affycomp), but
there is a general lack of “benchmarking culture”.
The Data Challenge
raw DATA MODEL
The Systems Biology
Lifecycle
The Data Challenge
SIMULATION
decomposition
and
localization
Dynamic
modeling
New Hypothesis
New biological
questions
Model refinement
Data recomposition
The Model Challenge
Model requirements
HIERARCHY
ENCAPSULATION
DYNAMICS!!!!
STOCHASTICITY
SPATIALITY
MOBILITY
The Modelling Challenge
SELECTIVE COMMUNICATION
“A Whole-Cell Computational Model Predicts Phenotype from Genotype”
A model of
Mycoplasma genitalium,
• 525 genes
• Using 1,900 experimental
observations from 900
studies, they created the
software model, which requires
128 computers to run.
KNOWLEDGE BASES vs. PREDICTION
DATA-DRIVEN vs. HYPOTHESIS-BASED
Modeling approaches
The Modelling Challenge
SYSTEMS BIOLOGY
• Granularity
⁃ Pathways (KEGG, Reactome, Ingenuity, …)
⁃ Only Gene-2-Gene networks. Missing miRNA interactions,
TF, lncRNA, …
• Scalability
⁃ Boolean Networks
⁃ Two-valued logic to model a continuous phenomenon
(expression)
⁃ But… simulation complexity grows exponentially:
100 genes = 2^100 ≈ 10^30 states ➔ N^100
• The Data challenge!
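The scalability issue of Boolean networks can be seen on a toy example. The following is a minimal sketch assuming a hypothetical three-gene network with synchronous updates; the gene names and update rules are invented for illustration and do not describe any real pathway.

```python
from itertools import product

# Hypothetical update rules (illustrative only).
RULES = {
    "A": lambda s: not s["C"],         # C represses A
    "B": lambda s: s["A"],             # A activates B
    "C": lambda s: s["A"] and s["B"],  # A and B jointly activate C
}
GENES = sorted(RULES)


def step(state):
    """Synchronous update: every gene reads the previous state."""
    current = dict(zip(GENES, state))
    return tuple(bool(RULES[g](current)) for g in GENES)


def attractor_from(state):
    """Follow the trajectory until a state repeats; return the cycle reached,
    rotated so that equal cycles compare equal."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    cycle = seen[seen.index(state):]
    start = cycle.index(min(cycle))
    return tuple(cycle[start:] + cycle[:start])


# Exhaustive analysis must visit all 2^n states.
all_states = list(product([False, True], repeat=len(GENES)))
attractors = {attractor_from(s) for s in all_states}

print(len(all_states), "states,", len(attractors), "attractor(s)")
print("cycle lengths:", sorted(len(c) for c in attractors))
print(f"100 genes would require 2**100 = {2**100:.3e} states")
```

Exhaustive enumeration is trivial for three genes and out of reach for a hundred, which is why sampling or structural analysis is typically needed at realistic network sizes.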
Issues
The Modelling Challenge
MODEL CHALLENGES
The Modelling Challenge
• Machine Learning is the ability of computer systems to
infer their own knowledge, by extracting patterns from
raw data.
• Deep Learning (DL) avoids the need for human
operators to formally specify all of the knowledge.
• DL achieves great power and flexibility by representing
the world as a self-generated hierarchy of concepts.
MODEL CHALLENGES
The Modelling Challenge
• Traditional ML approaches might have to be
optimized to adapt to the peculiar characteristics of
biological data (e.g. curse of dimensionality)
• Often parameter-driven
• Benchmarks are needed
MODEL CHALLENGES
The Modelling Challenge
SHAPE
Reinhardtius hippoglossoides
Pleuronectes platessa
How to make a decision?
MORPH. DETAILS etc ...
TEXTURE
The Modelling Challenge
F.I.S.HUB
Knowledge-Based: features are
hardcoded into the classifier.
Examples (m is the metric):
Sardina pilchardus vs. Sprattus sprattus: m = 1,129
Hippoglossus hippoglossus vs. Microstomus kitt: m = 1,612
Merlangius merlangus vs. Pollachius virens: m = 1,741
[In the figure, some results are marked "wrong"]
F.I.S.HUB – Classifier results
Deep Learning: features are discovered
by the neural network
The Modelling Challenge
25 species
> 15k
photos
UK & IT Acc. > 92%
Imagine ...
• The number of variables is so huge that we can
easily picture parts of the landscape that look (to
us) almost identical, but may be different in
small details.
The Modelling Challenge
raw DATA MODEL
The Systems Biology
Lifecycle
The Modelling Challenge
SIMULATION
decomposition
and
localization
Dynamic
modeling
New Hypothesis
New biological
questions
Model refinement
Data recomposition
• Multidisciplinary Teams/Individuals
• Planning Education
• Good/Bad Practices
• Fill the language gap
MODEL CHALLENGES
The Modelling Challenge
• Benso A.; Di Carlo S.; Politano G.; Savino A.; Bucci E.
Alice in "Bio-land": engineering challenges in the
world of Life-Sciences. IT Professional, Vol. 16,
pp. 38-47, ISSN: 1520-9202,
DOI: 10.1109/MITP.2014.45
Related readings
Questions?

 
Transposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.pptTransposable elements in prokaryotes.ppt
Transposable elements in prokaryotes.ppt
 

Tales from BioLand - Engineering Challenges in the World of Life Sciences

  • 1. Tales from BioLand Engineering Challenges in the World of Life Sciences Prof. Alfredo Benso Politecnico di Torino, Torino, Italy www.sysbio.polito.it ICIIBMS - International Conference on Intelligent Informatics and BioMedical Sciences
  • 3. Engineering is for BUILDING systems Life Sciences are for UNDERSTANDING (REVERSE ENGINEERING) life
  • 4. • “Reverse engineering is the process of discovering the functional principles of a device, object, or system through analysis of its structure, function, and operation”. • “Reverse engineering, is the processes of extracting knowledge from something and reproducing it or reproducing anything, based on the extracted information” Reverse Engineering
  • 5. Example Enigma machine DESIGNING AND REVERSE ENGINEERING A SYSTEM ARE TWO OPPOSITE TASKS WHOSE COMPLEXITY MAY DIFFER BY ORDERS OF MAGNITUDE
  • 6. I was once asked “So is it easier to synthesize an organism than to understand it?” “What I cannot create, I do not understand”, Feynman 1988
  • 7.
  • 8. Reductionism (Bottom-up) • Understanding the parts means understanding the whole • Focus on parts The properties of the whole system can be explained in terms of its parts. Reverse Engineering Reductionism vs holism Holism (Top-down) • To understand the whole we must also understand the relations between its parts • Focus on relationships The system cannot be explained by component parts alone. Instead, the system as a whole determines how the parts behave.
  • 9. Reverse Engineering BOTTOM-UP / REDUCTIONIST DATA driven SYSTEMIC / TOP-DOWN / HOLISTIC MODEL driven Pathway / biological networks Genes
  • 10. OUTLINE The Methodology Challenge The Data Challenge The Modelling Challenge
  • 12. Problem DOES THE COMPLEXITY (IN MATHEMATICAL TERMS) OF THE SYSTEM DRIVE THE METHODOLOGICAL APPROACH TO BE USED TO UNDERSTAND (REVERSE ENGINEER) IT? The Methodology Challenge YES!!!!!!!
  • 13. Linear vs Complex Systems The Methodology Challenge
  • 14. • We can consider the effect of each system component (“variable”) separately, because the sum of their effects equals the effect of their sum. Linear Systems The Methodology Challenge
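The superposition property on this slide can be checked numerically. The sketch below is only an illustration: both response functions and their constants are hypothetical and do not come from the talk. It compares the response to a summed input with the sum of the individual responses.

```python
def linear_response(dose: float) -> float:
    """Hypothetical linear system: output is proportional to input."""
    return 3.0 * dose


def saturating_response(dose: float) -> float:
    """Hypothetical nonlinear system: output levels off at high input."""
    return 10.0 * dose / (2.0 + dose)


def obeys_superposition(f, a: float, b: float, tol: float = 1e-9) -> bool:
    """Return True if f(a + b) equals f(a) + f(b) for this pair of inputs."""
    return abs(f(a + b) - (f(a) + f(b))) < tol


# The effect of the sum equals the sum of the effects only in the linear case.
print(obeys_superposition(linear_response, 1.0, 4.0))
print(obeys_superposition(saturating_response, 1.0, 4.0))
```

For the linear function the two inputs can be studied one at a time; for the saturating one, the combined response differs from the sum of the separate responses.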
  • 15. • Enzyme / Substrate • Simple pharmacokinetics • Newtonian Physics Linear Systems Examples The Methodology Challenge
  • 16. • Linear systems are easy to understand also for non-mathematicians and are also easy to visualize. • For this reason a large part of the LS world (and the medical one in particular) still reasons in linear terms. • Linearity is rare in biological systems! Linear Systems The Methodology Challenge
  • 17. • Properties emerge from the interaction of their parts (and cannot be predicted only from the properties of the parts). Complex Systems • Complex Systems’ dynamics heavily depend on initial conditions and perturbations (the butterfly effect…..) The Methodology Challenge
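Sensitivity to initial conditions can be made concrete with a standard textbook system. The logistic map used below is not mentioned in the talk; it is an assumption chosen only because it is the shortest well-known chaotic example. Two trajectories start 1e-10 apart and are compared step by step.

```python
def logistic_trajectory(x0: float, r: float = 4.0, steps: int = 100) -> list[float]:
    """Iterate the logistic map x -> r * x * (1 - x) starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ by a tiny perturbation.
a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-10)

# Distance between the two trajectories at every step.
gaps = [abs(x - y) for x, y in zip(a, b)]
print(f"initial gap: {gaps[0]:.1e}, largest gap: {max(gaps):.3f}")
```

The initial difference is far below any realistic measurement error, yet the trajectories become unrelated within a few dozen iterations.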
  • 18. Complex Systems Examples Is MUSIC discoverable by studying the record and the player separately? Is cell LIFE understandable by separately studying its internal components?
  • 19. • Can we learn how a car engine works just by studying (some of) its individual components separately? Issue The Methodology Challenge
  • 20. Linear Systems ➜ Reductionism / Holism Complex Systems ➜ Reductionism / Holism The Methodology Challenge It would NOT BE ABLE TO IDENTIFY properties that emerge from the interactions between its parts Linear vs Complex Systems
  • 22. • Middle ground between data driven and model driven approaches • Cross-fertilization between biology and physics, computer science, mathematics, chemistry, and engineering. The Methodology Challenge
  • 23. The role of Systems Biology The Methodology Challenge
  • 24. raw DATA MODEL The Systems Biology Lifecycle decomposition and localization Dynamic modeling The Methodology Challenge SIMULATION New Hypothesis New biological questions Model refinement Data recomposition
  • 25. • Systems in this context generally are modeled as large networks of integrated components exhibiting non-linear dynamical interactions. • Protein-Protein-Interaction • Gene Regulatory Networks • Metabolic Networks • Interactomes A shift in perspective The Methodology Challenge
  • 26. A shift in perspective • In 1998, Oltvai, a cell biologist, and Barabasi, a physicist who was studying the structure of the Internet, were neighbors in Chicago • At the time, Barabasi had already shown that the Internet is a non-random network, and that its connectivity structure influences its function • One year later, in 1999, they proved that the metabolic pathways of yeast define a network whose structure is very similar to that of the Internet. Then… • HUBs • P53 • Motifs • …. The Methodology Challenge
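How hubs arise in a non-random network can be sketched with a preferential-attachment growth rule (the Barabási–Albert idea, simplified here to one link per new node). The slide does not spell out a model, so the rule, the network size and the seed below are all assumptions made for illustration.

```python
import random
from collections import Counter


def preferential_attachment(n_nodes: int, seed: int = 0) -> Counter:
    """Grow a network where each new node links to one existing node,
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    degree = Counter({0: 1, 1: 1})  # start from a single edge 0-1
    endpoints = [0, 1]              # every node appears once per edge end
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)  # degree-proportional choice
        degree[new] += 1
        degree[target] += 1
        endpoints.extend([new, target])
    return degree


deg = preferential_attachment(2000)
mean_degree = sum(deg.values()) / len(deg)
print(f"mean degree: {mean_degree:.2f}, largest hub degree: {max(deg.values())}")
```

Most nodes keep a degree close to the mean, while a few early nodes collect many links and become hubs, which is the structural signature the slide refers to.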
  • 27. • High throughput DATA (NGS, ‘omics’, imaging, …) is the “facilitator”. • The technology to create this data (Biotechnology) is the key. The “wider” and the better we see, the better we can understand how systems work. The Methodology Challenge
  • 30. • Size • Heterogeneity • Curse of dimensionality • Ownership / Ethics • Falsification Issues The Data Challenge
  • 31. • Size • Heterogeneity • Curse of dimensionality • Ownership / Ethics • Falsification Issues The Data Challenge
  • 32. The Cost of Sequencing DNA Has Fallen Over 100,000x in the Last Ten Years The Data Challenge: size
  • 33. • Enormous Density ⁃ 1000x Ocean Water • Highly Dynamic Microbial Ecology ⁃ Hundreds to Thousands of Species • Horizontal Gene Transfer • Adaptive Selection Pressures (Immune System) ⁃ Innate and Adaptive Immune System ⁃ Macrophages and Antimicrobial proteins • Constantly Changing Environmental Pressures ⁃ Diet ⁃ Antibiotics ⁃ Pharmaceuticals The human Microbiome The Data Challenge: size
  • 34. Your Microbiome is Your “Near-Body” Environment and its Cells Contain 200-2000x as Many DNA Genes As Your Human Cells More Microbe Cells Than Human Cells DNA-bearing Cells in Your Body The Data Challenge: size [Bar chart: Human Genome vs. Metagenomics]
  • 35. To Map Out the Dynamics of Autoimmune Microbiome Ecology Couples Next Generation Genome Sequencers to Big Data Supercomputers Source: Weizhong Li, UCSD UCSD Team Used 25 CPU-years to Compute Comparative Gut Microbiomes Starting From 2.7 Trillion DNA Bases of Healthy and IBD Subjects samples Illumina HiSeq 2000 at JCVI SDSC Gordon Data Supercomputer Using Machine Learning to Determine Major Differences Between Gut Microbiome in Health and Disease, Mehrdad Y. et al., IEEE International Conference on Big Data (December 5-8, 2016)
  • 36. In a “Healthy” Gut Microbiome: Large Taxonomy Variation, Low Protein Family Variation Source: Nature, 486, 207-212 (2012) Over 200 People
  • 37. • Size • Heterogeneity • Curse of dimensionality • Ownership / Ethics • Falsification Issues The Data Challenge
  • 38. Issues Sources 180+ Bio/related ONLINE Databases [NAR 2016] Custom / Proprietary data Medical data The Data Challenge: heterogeneity • Many de-facto standards • No compatibility • Overlaps • Validation / Quality
  • 39. • Size • Heterogeneity • Curse of dimensionality • Ownership / Ethics • Falsification Issues The Data Challenge Overfitting Many variables & Few samples Machine Learning issues
  • 40. • Size • Heterogeneity • Curse of dimensionality • Ownership / Ethics • Falsification Issues The Data Challenge Non reproducible results No comparable approaches Difficult improvements Private ≠ Not sharable Many data sets are not shared
  • 41. • Size • Heterogeneity • Curse of dimensionality • Ownership / Ethics • Falsification Issues The Data Challenge
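The "many variables & few samples" problem can be illustrated with pure noise. In the sketch below, the sample count, feature count and seed are hypothetical; labels and features are random bits with no relationship to each other, yet some features reproduce the labels exactly.

```python
import random

rng = random.Random(42)
n_samples, n_features = 10, 20000  # few samples, many variables

# Random class labels and random binary "measurements": no real signal.
labels = [rng.randint(0, 1) for _ in range(n_samples)]
features = [[rng.randint(0, 1) for _ in range(n_samples)]
            for _ in range(n_features)]

# Features that happen to match the labels on every sample.
perfect = [j for j, column in enumerate(features) if column == labels]
expected = n_features / 2 ** n_samples  # chance matches expected on average

print(f"{len(perfect)} noise features match the labels exactly "
      f"(about {expected:.1f} expected by chance)")
```

A classifier that selects one of these features would score 100% on the training samples and perform at chance level on new data, which is the overfitting risk named on the slide.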
  • 42. • How does a biologist demonstrate in a paper that he/she actually performed an experiment? Falsification The Data Challenge: falsification
  • 43. What is a scientific fraud? Experiment was never performed Experiment was performed Fictitious data Altered data Duplicated data FABRICATION FALSIFICATION PLAGIARISM The Data Challenge: falsification
  • 44. Example 1: reusing panels The Data Challenge: falsification
  • 45. Example 2: ROI reusing The Data Challenge: falsification
  • 46. Example 3: repeated patterns The Data Challenge: falsification
  • 47. Image manipulation by country The manipulation rate is around 6% in general, around 17% for papers containing at least 1 gel image [Scatter plot: Manipulated Papers vs. Overall Examined Papers, R² = 0.7138] Enrico Bucci, PhD, SCI 2017 – Paestum, September 11, Experiment 1 1364 random papers (5000 images) from PMC, published in Jan 2014
  • 48. Evolution of manipulation rate over 5 years Increase from 6.5% to 13.1% of manipulated images content over 5 years Submission rate increases 10 times. 20 papers were found to reuse previously published images. Enrico Bucci, PhD, SCI 2017 – Paestum, September 11, Experiment 2 1546 papers published in Cell Death and Disease (NPG) from 2010-2014
  • 49. Manipulation frequency per originating country 25.3% 20.6% 17.9% 16.7% 8.3% 7.1% 3.7% % of submitted papers which were flagged (for countries submitting at least 10 manuscripts) Enrico Bucci, PhD, SCI 2017 – Paestum, September 11, Country 1 Country 2 Country 3 Country 4 Country 5 Country 6 Japan Experiment 3 Submitted manuscripts from Jun to Aug 2017 to 5 independent journals. 20% of manipulated images
  • 50. WE DO PRECISION GUESS WORK BASED ON (SOMETIMES) UNRELIABLE DATA!!! The Data Challenge
  • 52. • De-facto standards are dangerous The Data Challenge What happens when the standard is “unilaterally” changed by its “owner” ? Past versions may become unusable, and published work obsolete
  • 54. • Standardization Authority (like IEEE, ISO, …) for standardization of: ⁃ ACCESSION (Entrez GeneId, mirBASE accession, ….) ⁃ NAMING (Hugo) ⁃ DATA EXCHANGE (MIAME for Microarrays) ⁃ … The Data Challenge
  • 55. raw DATA MODEL The Systems Biology Lifecycle The Data Challenge SIMULATION decomposition and localization Dynamic modeling New Hypothesis New biological questions Model refinement Data recomposition
  • 56. • Common practice (in other fields) to compare algorithms on the same data sets (e.g. ISCAS circuits, SPEC software) • They increase competitiveness, comparability, reproducibility, REUSABILITY • There are efforts (BioPerf, BAliBASE, Affycomp), but there is a general lack of “benchmarking culture”. The Data Challenge
  • 57. raw DATA MODEL The Systems Biology Lifecycle The Data Challenge SIMULATION decomposition and localization Dynamic modeling New Hypothesis New biological questions Model refinement Data recomposition
  • 60. “A Whole-Cell Computational Model Predicts Phenotype from Genotype” A model of Mycoplasma genitalium, • 525 genes • Using 1,900 experimental observations from 900 studies, they created the software model, which requires 128 computers to run.
  • 61. KNOWLEDGE BASES vs. PREDICTION DATA-DRIVEN vs. HYPOTHESIS-BASED Modeling approaches The Modelling Challenge SYSTEMS BIOLOGY
  • 62. • Granularity • Pathways (KEGG, Reactome, Ingenuity, …) • Only Gene-2-Gene networks. Missing miRNA interactions, TF, lncRNA, … • Scalability • Boolean Networks • Two-valued logic to model a continuous phenomenon (expression) • But… simulation complexity grows exponentially!!! • 100 genes = 2^100 ≈ 10^30 states ➔ N^100 • The Data challenge! Issues The Modelling Challenge
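The scalability issue of Boolean networks can be seen on a toy example. The three-gene network and its update rules below are hypothetical, invented only to show a synchronous update and the size of the state space; they do not describe any real pathway.

```python
from itertools import product


def step(state: tuple[int, int, int]) -> tuple[int, int, int]:
    """One synchronous update of a hypothetical 3-gene Boolean network.

    Rules (assumed for illustration): A is repressed by C,
    B is activated by A, C needs both A and B.
    """
    a, b, c = state
    return (1 - c, a, a & b)


# Exhaustive simulation must visit every state: 2^n for n genes.
all_states = list(product((0, 1), repeat=3))
transitions = {s: step(s) for s in all_states}

print(f"3 genes   -> {len(all_states)} states")
print(f"100 genes -> {2 ** 100} states")
```

Enumerating eight states is trivial, but the same exhaustive approach on 100 genes would have to cover roughly 1.27 × 10^30 states, which is why the slide flags exponential growth as the limiting factor.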
  • 64. • Machine Learning is the ability of computer systems to infer their own knowledge, by extracting patterns from raw data. • Deep Learning (DL) avoids the need for human operators to formally specify all of the knowledge • DL achieves great power and flexibility by representing the world as a self-generated hierarchy of concepts. MODEL CHALLENGES The Modelling Challenge
  • 65. • Traditional ML approaches might have to be optimized to adapt to the peculiar characteristics of biological data (e.g. curse of dimensionality) • Often parameter-driven • Benchmarks are needed MODEL CHALLENGES The Modelling Challenge
  • 66. SHAPE Reinhardtius hippoglossoides Pleuronectes platessa How to make a decision? MORPH. DETAILS TEXTURE etc ... The Modelling Challenge
  • 67. F.I.S.HUB Knowledge-Based: features are hardcoded into the classifier. Examples (m is the metric): Sardina pilchardus vs. Sprattus sprattus: m = 1,129 Hippoglossus hippoglossus vs. Microstomus kitt: m = 1,612 Merlangius merlangus vs. Pollachius virens: m = 1,741 (wrong) F.I.S.HUB – Classifier results Deep Learning: features are discovered by the neural network The Modelling Challenge 25 species > 15k photos UK & IT Acc. > 92%
  • 68. Imagine ... • The number of variables is so huge that we can easily picture parts of the landscape that look (to us) almost identical, but may be different in small details. The Modelling Challenge
  • 69. raw DATA MODEL The Systems Biology Lifecycle The Modelling Challenge SIMULATION decomposition and localization Dynamic modeling New Hypothesis New biological questions Model refinement Data recomposition
  • 70. • Multidisciplinary Teams/Individuals • Planning Education • Good/Bad Practices • Fill the language gap MODEL CHALLENGES The Modelling Challenge
  • 71. • Benso A., Di Carlo S., Politano G., Savino A., Bucci E., Alice in "Bio-land": engineering challenges in the world of Life-Sciences, IT Professional, Vol. 16, pp. 38-47, ISSN: 1520-9202, DOI: 10.1109/MITP.2014.45 Related readings