A biologist in e-Science? by Marco Roos. Acknowledgements: Scott Marshall, Edgar Meij, Sophia Katrenko, Willem van Hage, Pieter Adriaans, Martijn Schuemie, Carole Goble, Dave de Roure, Katy Wolstencroft, Andy Gibson, the myGrid and myExperiment teams, many others who share their ideas, and… You! * Project or Area Liaison for OMII-UK (domain: Biology and Bioinformatics). BioAssist programmers meeting, November 17, 2008, Utrecht, The Netherlands
A priori: What does e-Science mean to you?
Introducing myself: A biologist
My prime interest Structure and function of DNA in the nucleus Escherichia coli Mouse fibroblast (skin) cells
My C.V.: before e-Science; e-Science since 2003
How did I end up here?
Why should a biologist be interested in e-Science?
My prime interest Structure and function of DNA in the nucleus Escherichia coli Mouse fibroblast (skin) cells
Components controlling structure & function of DNA
Connecting the dots (example: protein interaction network in yeast)
Biomedical knowledge repository: PubMed statistics (http://www.ncbi.nlm.nih.gov/entrez): >17 million citations; >400,000 added/year; ~70,000 searches/month … Does not compute. Does not fit.
1070 databases: Nucleic Acids Research, Jan 2008 (96 in Jan 2001)
What do I do? A needy biologist
‘Old school’ bioinformatics: A typical bioinformatician
‘Old school’ bioinformatics: A biologist behind a computer who (just) learned Perl
/*
 * determines ridges in htm expression table
 */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>
#include "ridge.h" /* expected to define TRUE/FALSE and declare the functions below */

/* runs the query and hands the result back through *htmtable */
int selecthtm(PGconn *conn, char *htmtablename, char *chromname, PGresult **htmtable)
{
    char querystring[256];
    snprintf(querystring, sizeof querystring,
             "SELECT * FROM %s WHERE chrom = %s ORDER BY genstart",
             htmtablename, chromname);
    *htmtable = PQexec(conn, querystring);
    return validquery(*htmtable, querystring);
}

/* determines if mincount genes in a row are (part of) a ridge */
/* pre: htmtable is valid and sorted on genStart (ascending) */
/* post: */
int is_ridge(PGresult *htmtable, int row, double exprthreshold, int mincount)
{
    if (mincount <= 0) return TRUE;
    if (row >= PQntuples(htmtable)) return FALSE;
    /* PQgetvalue returns text, so convert it before comparing */
    if (atof(PQgetvalue(htmtable, row, PQfnumber(htmtable, "movmed39expr"))) < exprthreshold) {
        return FALSE;
    }
    return is_ridge(htmtable, row + 1, exprthreshold, mincount - 1);
}

int main()
{
    PGconn *conn;          /* holds database connection */
    char querystring[256]; /* query string */
    PGresult *result;
    int status = EXIT_FAILURE;

    conn = PQconnectdb("dbname=htm port=6400 user=mroos password=geheim");
    if (PQstatus(conn) == CONNECTION_BAD) {
        fprintf(stderr, "connection to database failed.\n");
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        exit(1);
    }
    printf("Connection ok\n");
    snprintf(querystring, sizeof querystring, "SELECT * FROM chromosomes");
    printf("%s\n", querystring);
    result = PQexec(conn, querystring);
    if (validquery(result, querystring)) {
        printresults(result);
        status = EXIT_SUCCESS;
    }
    PQclear(result);
    PQfinish(conn);
    return status;
}

/* prints the first column of at most ten rows */
int printresults(PGresult *tuples)
{
    int i;
    for (i = 0; i < PQntuples(tuples) && i < 10; i++) {
        printf("%d, ", i);
        printf("%s\n", PQgetvalue(tuples, i, 0));
    }
    return TRUE;
}

/* TRUE if the query returned rows without error */
int validquery(PGresult *result, char *querystring)
{
    printf(" in validquery\n");
    if (PQresultStatus(result) != PGRES_TUPLES_OK) {
        printf("Query %s failed.\n", querystring);
        fprintf(stderr, "Query %s failed.\n", querystring);
        return FALSE;
    }
    return TRUE;
}
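The ridge test in the listing above can also be read apart from the database calls. The following is a minimal sketch, assuming the expression values have already been fetched into a plain array sorted by gene start; the name is_ridge_array and its parameters are made up for illustration and are not part of the original program.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if `mincount` consecutive genes, starting at index `row`,
 * all have an expression value of at least `threshold`; otherwise 0.
 * Assumption: `expr` holds `n` values sorted by genomic start position. */
int is_ridge_array(const double *expr, size_t n, size_t row,
                   double threshold, int mincount)
{
    assert(expr != NULL || n == 0);
    for (; mincount > 0; ++row, --mincount) {
        /* running off the end of the table, or a low value, ends the ridge */
        if (row >= n || expr[row] < threshold)
            return 0;
    }
    return 1;
}
```

A call such as is_ridge_array(expr, n, 0, 2.0, 3) asks whether the first three genes all reach a threshold of 2.0. The loop mirrors the recursion in the listing: a count of zero or less succeeds immediately, and a row past the end of the table fails.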
Theme Not an e-Science approach
The ‘spaghetti’ approach
Computational tools graveyard (rephrasing David Shotton)
Database survival: <20% ‘no problems’
Data graveyard (quoting David Shotton)
Why should a biologist be interested in e-Science?
Bridging biology and computer science
Empowering biologists and bioinformaticians
How could we be empowered?
Experiment 1: Model-based data integration. Example: UCSC genome browser (diagram labels: partOf, * *, Transcription Factor Binding Site)
Experiment 2 (Roos, Marshall, et al., ISMB/ECCB, Vienna, 2007)
An e-Science approach
Which diseases are associated with my protein of interest ‘EZH2’?
Biological knowledge extraction Biological question/model Computational experiment Extracted knowledge >17 million citations +400,000/yr
Combining expertise Edgar Meij Information retrieval expert
Combining expertise Sophia Katrenko Machine learning expert
Combining expertise Willem van Hage Semantic web expert (and bass guitar player)
Combining expertise Towards a knowledge framework Computer scientist and bioinformatician Scott Marshall
The AIDA toolbox: Web Services for knowledge extraction and knowledge management
e-Science collaboration: AIDA toolbox
“Collaboration through Web Services” Bio-text mining expert, BioSemantics group, Erasmus University Rotterdam: Martijn Schuemie
“Collaboration through Web Services” Biological database expert: Hideaki Sugawara
“Collaboration through Web Services” e-bioscientist
A nice experiment design
A not so nice experiment design
A workflow Protocol for a computational experiment
05/06/09 BioAID
Sharing and publishing my designs
BioAID Disease Discovery workflow (component labels: AIDA, AIDA, OMIM service (Japan), AIDA, Taverna ‘shim’, Taverna ‘shim’)
BioAID Disease Discovery workflow
An insightful computational experiment
e-Science leveraging the use of more brains: Want this…
e-Science leveraging the use of more brains: … need this
Publish and share research objects with myExperiment: >400 workflows; >1000 registered users (< 1 yr); run workflows without Taverna (expert feature); open to objects other than workflows; link out to other resources
Do I feel all powerful now? An e-biologist?
Tabular output
Underestimated: The brain bottleneck
Empower me with a ‘virtual brain’* (diagram labels: My ws, Your ws, My ws, Your ws, My ws) * From P.J. Verschure, Journal of Cellular Biochemistry 2006, vol. 99(1), pg 23-34
Workflow and Semantic Web. Workflow: Query; Retrieve documents from Medline; Extract proteins (Homo sapiens); Calculate ranking scores; Create biological cross references; Convert to table (html). Semantic model: Add documents (IDs) to semantic model; Add proteins to semantic model; Add scores to semantic model; Add cross references to semantic model; Add query to semantic model
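The pairing of workflow steps with additions to a semantic model can be pictured with a small sketch. It assumes the model is nothing more than a list of subject/predicate/object statements held in memory. The Triple and Model types, the function names, and the predicate names are hypothetical and stand in for whatever AIDA and the workflow use in practice.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_TRIPLES 32
#define MAX_TEXT    64

/* One statement in the (hypothetical) semantic model. */
typedef struct {
    char subject[MAX_TEXT];
    char predicate[MAX_TEXT];
    char object[MAX_TEXT];
} Triple;

typedef struct {
    Triple triples[MAX_TRIPLES];
    int count;
} Model;

/* Appends a statement; returns 0 on success, -1 if the model is full. */
int model_add(Model *m, const char *s, const char *p, const char *o)
{
    assert(m != NULL);
    if (m->count >= MAX_TRIPLES)
        return -1;
    Triple *t = &m->triples[m->count++];
    snprintf(t->subject, MAX_TEXT, "%s", s);
    snprintf(t->predicate, MAX_TEXT, "%s", p);
    snprintf(t->object, MAX_TEXT, "%s", o);
    return 0;
}

/* Counts the statements that use a given predicate. */
int model_count(const Model *m, const char *predicate)
{
    int n = 0;
    for (int i = 0; i < m->count; ++i)
        if (strcmp(m->triples[i].predicate, predicate) == 0)
            ++n;
    return n;
}

/* Records one workflow result: the query retrieved a document, the
 * document mentions a protein, and the protein received a score. */
int record_hit(Model *m, const char *query, const char *doc_id,
               const char *protein, double score)
{
    char score_text[MAX_TEXT];
    snprintf(score_text, sizeof score_text, "%.2f", score);
    if (model_add(m, query, "retrieved", doc_id) != 0) return -1;
    if (model_add(m, doc_id, "mentions", protein) != 0) return -1;
    return model_add(m, protein, "hasScore", score_text);
}
```

Each call to record_hit stores three statements, so the table shown to the biologist and the model behind it stay in step. Identifiers passed in, such as document or protein names, would come from the earlier workflow steps.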
Do I feel all powerful now? An e-biologist?
http://staff.science.uva.nl/~roos/ChromatinWorkgroup/
e-Laboratory factories
Conclusions: How do we know when e-Science has succeeded? Not just accelerated but new. A. When everyone is using Grid computing? B. When scientists make scientific advances that would not have happened otherwise? Slide from ‘The New e-Science’ by Dave de Roure
Conclusions
How would you like to be empowered?
Project and Area Liaison
How many brains do you want to use? – One?
Some?
Many?
Use your community myGrid/myExperiment OMII-UK You
End of presentation...

June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdfAdversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 
678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf
 

A biologist in e-Science

  • 1. A biologist in e-Science? by Marco Roos. Acknowledgements: Scott Marshall, Edgar Meij, Sophia Katrenko, Willem van Hage, Pieter Adriaans, Martijn Schuemie, Carole Goble, Dave de Roure, Katy Wolstencroft, Andy Gibson, the myGrid and myExperiment teams, many others who share their ideas, and… You! * Project or Area Liaison for OMII-UK (domain: Biology and Bioinformatics). BioAssist programmers meeting, November 17, 2008, Utrecht, The Netherlands
  • 2.
  • 4. My prime interest Structure and function of DNA in the nucleus Escherichia coli Mouse fibroblast (skin) cells
  • 5.
  • 6.
  • 7.
  • 8. My prime interest Structure and function of DNA in the nucleus Escherichia coli Mouse fibroblast (skin) cells
  • 10. Connecting the dots (example: protein interaction network in yeast)
  • 11. Biomedical knowledge repository. PubMed statistics (http://www.ncbi.nlm.nih.gov/entrez): >17 million citations; >400,000 added/year; ~70,000 searches/month; … Does not compute. Does not fit.
  • 12.
  • 13. What do I do? A needy biologist
  • 14. ‘Old school’ Bioinformatics: A typical bioinformatician
  • 15. ‘Old school’ Bioinformatics: A biologist behind a computer who (just) learned Perl
  • 16. /*
     * determines ridges in htm expression table
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <libpq-fe.h>
    #include "ridge.h"

    /* runs the query and hands the result back through htmtable
       (passed by pointer so the caller sees it; the prototype in ridge.h must match) */
    int selecthtm(PGconn *conn, char *htmtablename, char *chromname, PGresult **htmtable)
    {
        char querystring[256];

        /* write the query into querystring; the original call had no destination buffer */
        snprintf(querystring, sizeof querystring,
                 "SELECT * FROM %s WHERE chrom = %s ORDER BY genstart",
                 htmtablename, chromname);
        *htmtable = PQexec(conn, querystring);
        return(validquery(*htmtable, querystring));
    }

    int is_ridge(PGresult *htmtable, int row, double exprthreshold, int mincount)
    /* determines if mincount genes in a row are (part of) a ridge */
    /* pre: htmtable is valid and sorted on genStart (ascending) */
    /* post: */
    {
        if (mincount <= 0) return TRUE;
        if (row >= PQntuples(htmtable)) return FALSE;
        /* PQgetvalue returns text, so convert it before comparing with the threshold;
           read the current row rather than always row 0 */
        if (atof(PQgetvalue(htmtable, row, PQfnumber(htmtable, "movmed39expr"))) < exprthreshold)
        {
            return FALSE;
        }
        return(is_ridge(htmtable, row + 1, exprthreshold, mincount - 1));
    }

    int main()
    {
        PGconn *conn;          /* holds database connection */
        char querystring[256]; /* query string */
        PGresult *result;

        conn = PQconnectdb("dbname=htm port=6400 user=mroos password=geheim");
        if (PQstatus(conn) == CONNECTION_BAD)
        {
            fprintf(stderr, "connection to database failed.");
            fprintf(stderr, "%s", PQerrorMessage(conn));
            exit(EXIT_FAILURE);
        }
        else printf("Connection ok");

        snprintf(querystring, sizeof querystring, "SELECT * FROM chromosomes");
        printf("%s", querystring);
        result = PQexec(conn, querystring);
        if (validquery(result, querystring))
        {
            printresults(result);
        }
        else
        {
            PQclear(result);
            PQfinish(conn);
            return EXIT_FAILURE;
        }
        PQclear(result);
        PQfinish(conn);
        return EXIT_SUCCESS;
    }

    /* prints the first column of at most ten rows */
    int printresults(PGresult *tuples)
    {
        int i;
        for (i = 0; i < PQntuples(tuples) && i < 10; i++)
        {
            printf("%d, ", i);
            printf("%s", PQgetvalue(tuples, i, 0));
        }
        return TRUE;
    }

    /* returns TRUE if the query produced a result set */
    int validquery(PGresult *result, char *querystring)
    {
        printf(" in validquery");
        if (PQresultStatus(result) != PGRES_TUPLES_OK)
        {
            printf("Query %s failed.", querystring);
            fprintf(stderr, "Query %s failed.", querystring);
            return FALSE;
        }
        return TRUE;
    }
  • 17. Theme Not an e-Science approach
  • 19. Computational tools graveyard rephrasing David Shotton
  • 20. Database survival: <20% ‘no problems’
  • 21. Data graveyard quoting David Shotton
  • 22.
  • 23.
  • 24. Empowering biologists and bioinformaticians
  • 25.
  • 26. Experiment 1: Model-based data integration. Example: UCSC genome browser (diagram labels: partOf, Transcription Factor Binding Site)
  • 27.
  • 28.
  • 29. Which diseases are associated with my protein of interest, ‘EZH2’?
  • 30. Biological knowledge extraction: Biological question/model; Computational experiment; Extracted knowledge (>17 million citations, +400,000/yr)
  • 31. Combining expertise Edgar Meij Information retrieval expert
  • 32. Combining expertise Sophia Katrenko Machine learning expert
  • 33. Combining expertise Willem van Hage Semantic web expert (and bass guitar player)
  • 34. Combining expertise Towards a knowledge framework Computer scientist and bioinformatician Scott Marshall
  • 35. The AIDA toolbox, Web Services for knowledge extraction and knowledge management
  • 36. e-Science collaboration: AIDA toolbox
  • 37. “Collaboration through Web Services” Bio-text mining expert, BioSemantics group, Erasmus University Rotterdam: Martijn Schuemie
  • 38. “Collaboration through Web Services” Biological Database expert: Hideaki Sugawara
  • 39. “Collaboration through Web Services” e-bioscientist
  • 41. A not so nice experiment design
  • 42. A workflow Protocol for a computational experiment
  • 46. BioAID Disease Discovery workflow (05/06/09). Diagram labels: AIDA, OMIM service (Japan), Taverna ‘shim’
  • 47. BioAID Disease Discovery workflow
  • 48. BioAID Disease Discovery workflow
  • 49. An insightful computational experiment
  • 50. e-Science: leveraging the use of more brains. Want this…
  • 51. e-Science: leveraging the use of more brains. … need this
  • 52. Publish and share research objects: myExperiment. >400 workflows; >1000 registered users (< 1 yr); run workflows without Taverna (expert feature); open to objects other than workflows; link out to other resources
  • 53. Do I feel all powerful now? An e-biologist?
  • 54.
  • 56. Empower me with a ‘virtual brain’* (figure labels: My ws, Your ws). * From P.J. Verschure, Journal of Cellular Biochemistry 2006, vol. 99(1), pp. 23-34
  • 57. Workflow and Semantic Web: Query; Retrieve documents from Medline; Extract proteins (Homo sapiens); Calculate ranking scores; Create biological cross references; Convert to table (html); Add documents (IDs) to semantic model; Add proteins to semantic model; Add scores to semantic model; Add cross references to semantic model; Add query to semantic model
  • 58. Do I feel all powerful now? An e-biologist?
  • 61. Conclusions. How do we know when e-Science has succeeded? Not just accelerated but new. A. When everyone is using Grid computing? B. When scientists make scientific advances that would not have happened otherwise? (Slide from ‘The New e-Science’ by Dave de Roure)
  • 62.
  • 63.
  • 64.
  • 65. How many brains do you want to use? – One?
  • 66. Some?
  • 67. Many?
  • 68. Use your community myGrid/myExperiment OMII-UK You
  • 69.