From Laboratory to e-Laboratory

Presentation for Lab-J of the Human Genetics Department at the Leiden University Medical Centre.

Published in: Education, Technology

  1. From Laboratory to e-Laboratory?
     Introduction for ‘Lab-J’ of the LUMC Human Genetics Department
     Marco Roos
     Acknowledging the colleagues from BioSemantics, myGrid, OMII-UK, AID, and the LUMC BioInformatics Expertise Centre
  2. Introducing
     Me
  3. Liaison biology/bioinformatics – informatics
     Biologist and bioinformatician, e-(bio)science researcher
     Coordinator, BioSemantics group Leiden
     Human Genetics Department, Leiden University Medical Centre, and Informatics Institute, University of Amsterdam
     Project or Area Liaison (PAL), OMII-UK
     Member, BioAssist programme committee, NBIC
  4. also about
     You
  5. First about
     Me
  6. My C.V. before e-Science (before 2003)
     Molecular & Cellular biology (MSc)
     microscopy and image analysis of chromosome structure
     ‘minor’ computer science
     Image analysis methods to measure DNA content in bull sperm cells (civil service)
     Chromatin structure & function (PhD molecular cytology)
     F.I.S.H., microscopy, image analysis, statistics
     3-D chromosome structure during cell cycle (no luck)
     DNA movement in Escherichia coli (success)
     Human Transcriptome Map (post-doc)
     Gene expression to human genome sequence
     Analysis of regions of increased gene expression
  7. Motivation: Structure and function of DNA in the nucleus
     Escherichia coli
     Muntiacus muntjak
  8. Why bioinformatics?
     Lab-J suggests…
  9. Bioinformatics
     A typical bioinformatician
     (23/09/2009, BioAID)
  10. Bioinformatics
      A biologist behind a computer
      who (just) learned Perl
  11. C source code:

          /*
           * determines ridges in htm expression table
           */
          #include <stdio.h>
          #include <stdlib.h>
          #include <libpq-fe.h>
          #include "ridge.h"  /* presumably declares TRUE, FALSE, validquery() and printresults() */

          /* fetches all rows of one chromosome, sorted on genstart */
          /* the result is handed back through *htmtable, hence the PGresult ** */
          int selecthtm(PGconn *conn, char *htmtablename, char *chromname, PGresult **htmtable)
          {
              char querystring[256];

              /* chromname is a string value, so it is quoted in the query */
              snprintf(querystring, sizeof querystring,
                       "SELECT * FROM %s WHERE chrom = '%s' ORDER BY genstart",
                       htmtablename, chromname);
              *htmtable = PQexec(conn, querystring);
              return validquery(*htmtable, querystring);
          }

          /* determines if mincount genes in a row are (part of) a ridge */
          /* pre: htmtable is valid and sorted on genStart (ascending) */
          /* post: TRUE if mincount consecutive rows, starting at row, reach exprthreshold */
          int is_ridge(PGresult *htmtable, int row, double exprthreshold, int mincount)
          {
              if (mincount <= 0) return TRUE;
              if (row >= PQntuples(htmtable)) return FALSE;
              /* PQgetvalue returns text, so convert it before comparing */
              if (atof(PQgetvalue(htmtable, row, PQfnumber(htmtable, "movmed39expr"))) < exprthreshold)
              {
                  return FALSE;
              }
              return is_ridge(htmtable, row + 1, exprthreshold, mincount - 1);
          }

          int main()
          {
              PGconn *conn;           /* holds database connection */
              char querystring[256];  /* query string */
              PGresult *result;

              conn = PQconnectdb("dbname=htm port=6400 user=mroos password=geheim");
              if (PQstatus(conn) == CONNECTION_BAD)
              {
                  fprintf(stderr, "connection to database failed.\n");
                  fprintf(stderr, "%s", PQerrorMessage(conn));
                  PQfinish(conn);
                  exit(1);
              }
              else printf("Connection ok\n");
              sprintf(querystring, "SELECT * FROM chromosomes");
              printf("%s\n", querystring);
              result = PQexec(conn, querystring);
              if (validquery(result, querystring))
              {
                  printresults(result);
              }
              else
              {
                  PQclear(result);
                  PQfinish(conn);
                  return EXIT_FAILURE;
              }
              PQclear(result);
              PQfinish(conn);
              return EXIT_SUCCESS;
          }

          /* prints the first column of at most ten rows */
          int printresults(PGresult *tuples)
          {
              int i;
              for (i = 0; i < PQntuples(tuples) && i < 10; i++)
              {
                  printf("%d, ", i);
                  printf("%s\n", PQgetvalue(tuples, i, 0));
              }
              return TRUE;
          }

          /* reports whether a query returned rows without error */
          int validquery(PGresult *result, char *querystring)
          {
              printf(" in validquery\n");
              if (PQresultStatus(result) != PGRES_TUPLES_OK)
              {
                  printf("Query %s failed.\n", querystring);
                  fprintf(stderr, "Query %s failed.\n", querystring);
                  return FALSE;
              }
              return TRUE;
          }
  12. State-of-the-art applied computer science, to a biologist
  13. Why e-science? What is wrong with bioinformatics?
      Human geneticists think…
  14. Why should a biologist be interested in e-science?
      BioAssistants guessed…
      Involves computation
      Interpretation of results
      Biology isn’t that interesting
      Reduce reinvention of the wheel
      Current lack of standards
      Sharing results
      Reshaping biology
      Synergy between different sciences
      Emerging data-driven science
  15. Why e-Science?
      Lots of data to deal with
      Single tiny brain
      Lots of knowledge to deal with
      No computational superpowers
      Lots of methods and algorithms to try and combine
      A needy biologist
  16. 1070 databases, Nucleic Acids Research, Jan 2008 (96 in Jan 2001)
      Proteomics, Genomics, Transcriptomics, Protein sequence prediction, Phenotypic studies, Phylogeny, Sequence analysis, Protein structure prediction, Protein-protein interaction, Metabolomics, Model organism collections, Systems Biology, Epidemiology, etcetera …
      All with a splendid interface
      … all different, of course
  17. Traditional data integration in bioinformatics
      Local Database
      Local Database
  18. The ‘spaghetti’ approach
  19. Some of my observations
      Reinvention
      How many reannotation pipelines do you need?
      Little reuse of components
      Reproducibility
      Black boxes
      Emphasis not on clarity
      Can we understand bioinformatics as wet lab protocols?
      Focus on technicalities, not biological analysis
      Should bioinformaticians write ‘job submission’ scripts?
      Data graveyards
      Do we need >1000 databases?
      Can we understand our own data?
  20. How did I end up here?
      Marco Roos
      Biologist and bioinformatician, post-doc e-(bio)science
      Human Genetics Department, Leiden University Medical Centre, and Informatics Institute, University of Amsterdam
      Project or Area Liaison (PAL), OMII-UK
      Member, BioAssist programme committee, NBIC
  21. Some examples from the field of e-Science
  22. Enhancement 1: Workflows (Taverna workflow)
  23. Enhancement 2: exploiting brains
  24. Exploiting brains by Web Services
      Source: http://biocatalogue.org (launched at ISMB 2009)
      >1000 annotated services, >3000 known to Taverna
      Includes BioMart, R, text mining, KEGG, NCBI PubMed, Ensembl, etc.
      Web Services run remotely
  25. Exploiting more brains by sharing workflows
      Source: http://myExperiment.org
      Social community web site for scientists
      2300 registered users in two years
      750 workflows
  26. Bioinformatics and e-science
      Customized experiments with reusable components
      Single purpose, single person, black box application
      My component, Your component, My component, My component, Your component
  27. What do we know of our data?
      Sufficient?
      • Query discoveries?
      • Query across experiment?
      • Fit biological modelling?
      • Good basis for new experiments?
      • Flexible enough?
  28. Model-based data integration
      Computer readable model
      Biologist readable model
      Biological concepts (‘myModel’)
      Data
      Marshall et al., International Workshop on Knowledge Systems in Bioinformatics 2006
      Post et al., Bioinformatics 2007
  29. Model-based data integration. Example: UCSC genome browser
      partOf
  30. Semantic Web (Linked Open Data)
  31. Empower me with a ‘virtual brain’ *
      My ws, Your ws, My ws, My ws, Your ws
      * From P.J. Verschure, Journal of Cellular Biochemistry 2006, vol. 99(1), pp. 23-34
  32. Workflow and Semantic Web
      Query
      Add query to semantic model
      Retrieve documents from Medline
      Add documents (IDs) to semantic model
      Extract proteins (Homo sapiens)
      Add proteins to semantic model
      Calculate ranking scores
      Add scores to semantic model
      Create biological cross references
      Add cross references to semantic model
      Convert to table (html)
  33. Concept web from a user’s point of view
  34. e-Laboratories and e-Laboratory factories
  35. e-Galaxy for NBIC
      • Galaxy as front end
      • Workflows & Web Services
      • Grid enabled Taverna
      • MOLGENIS
      • Semantic/Concept Web
      • myExperiment/BioCatalogue
      • Scientific Research Objects
      Vacancy! (software engineer)
  36. SRO = a pack of models
      - Tool models
      - Data/ui models
      - Flow models
      + Attached data
      SRO enactment = a running e-laboratory
      [Diagram labels: Tools; Uses tools services; Model; SROs; my protocols; my data; Portal to workflows; 2.0 mashup data; Flows; mashup tools; e-biologist; e-bioinformatician; Uses data services; Data; programmatic interaction; user interfacing]
  37. e-Galaxy mock-up
      Suggestions by semantic components
      Your Scientific Research Object
      Underlying workflow
      Related research and documents
      [Mock-up labels: MOLGENIS; Convert; Import/Export; Research Objects; Store; Configure; Run]
  38. e-Science requirement: Reuse
      E-Laboratory component
  39. http://www.epigenius.org/ (mock-up)
  40. Research and development aims
      Automated support for hypothesis formation
      E.g. on epigenetic mechanisms
      Apply Workflow, Semantic Web, Concept Web
      Concept-based meta-analysis
      Automated triple creation from computational analysis
  41. Research and development ambitions
      Co-develop e-Laboratories
      e-Galaxy
      epiGenius
      BioBanking
      Help BEC with support environment
      Concept Web services
      Web services
      E-Laboratory components
      Transparent creation of triples
      Personal semantic repositories
  42. Liaison
      Bioinformatics Expertise Centre LUMC
      Statistical and computer science expertise
      Generic support
      NBIC
      BioAssist core software development
      Grid tools, Concept Web, e-Labs
      BioSemantics Rotterdam
      Text mining
      Concept profile meta-analysis
      AID, University of Amsterdam
      e-Science experts
      Grid tools
      You?
      OMII-UK
      Manchester, Southampton, Edinburgh
      (ca. 30 engineers)
      Taverna, myExperiment, e-Labs
      Concept Web
      Content, tools and infrastructure
      W3C Health Care & Life Sciences Interest Group
      Semantic Web experts
      Linked Open Data
  43. ‘e’ for enhance, not enforce
      Please help me to help you
      Register for:
      http://snipurl.com/biosemanticsusers
      (http://www.myexperiment.org/groups/211)
      Allows me to
      Give you preferential treatment
      Not spam everybody
      Keep you informed
      Ask your opinion (user driven development!)
  44. Visit the BioSemantics web site: http://www.biosemantics.org/
  45. Word of warning
      Computer scientists are scientists too!
      Need to publish
      Score by papers, not by software
      Addressed by OMII-UK and BioAssist
      Compare
      “How can I use it in the clinic?”
      “How can I use it in the lab?”
  46. Dissemination
      Come by for help or information
      Internal ‘mini-courses’?
      Send me suggestions!
      Course ‘Managing Life Science Information’ for PhD students, 2010
  47. Thank you for your attention
      Lots of accessible data
      Community brain power
      Knowledge bases to query
      Other people’s computational superpowers
      Web Services, Workflows, and their creators available
      An enhanced biologist
      Homo biologicus enhancis
  48. Demonstration
      SysMo-SEEK (e-Lab)
      BioCatalogue
      myExperiment
      Taverna
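
The ‘Workflow and Semantic Web’ slide lists steps that alternate between doing some work and recording the result in a semantic model. A minimal sketch of that pattern in C, the language of the code slide above, might look as follows. Everything in it is an assumption made for illustration: the `Model` type, `model_add`, `model_count_predicate`, `model_print`, `run_workflow`, the predicate names and the hard-coded document and protein identifiers are hypothetical stand-ins. No real Medline retrieval or text mining is performed, and only the first three pairs of steps are mimicked.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_TRIPLES 64

/* One subject-predicate-object statement in the semantic model. */
typedef struct {
    const char *subject;
    const char *predicate;
    const char *object;
} Triple;

/* A deliberately tiny in-memory model (hypothetical, fixed capacity). */
typedef struct {
    Triple triples[MAX_TRIPLES];
    int count;
} Model;

/* Adds a triple; returns 1 on success, 0 if the model is full. */
int model_add(Model *m, const char *s, const char *p, const char *o)
{
    assert(m != NULL);
    if (m->count >= MAX_TRIPLES) return 0;
    m->triples[m->count].subject = s;
    m->triples[m->count].predicate = p;
    m->triples[m->count].object = o;
    m->count++;
    return 1;
}

/* Counts the triples that use a given predicate. */
int model_count_predicate(const Model *m, const char *p)
{
    int i, n = 0;
    for (i = 0; i < m->count; i++)
        if (strcmp(m->triples[i].predicate, p) == 0) n++;
    return n;
}

/* Writes the model as a tab-separated table (the slide converts to HTML). */
void model_print(const Model *m, FILE *out)
{
    int i;
    for (i = 0; i < m->count; i++)
        fprintf(out, "%s\t%s\t%s\n", m->triples[i].subject,
                m->triples[i].predicate, m->triples[i].object);
}

/* Runs the steps in order and records each result in the model.
 * The "retrieved" documents and "extracted" proteins are made-up constants.
 * Returns the number of triples in the model afterwards. */
int run_workflow(Model *m, const char *query)
{
    static const char *docs[] = { "doc:1", "doc:2" };             /* placeholder IDs */
    static const char *proteins[] = { "protein:A", "protein:B" }; /* placeholder names */
    int i;

    m->count = 0;
    model_add(m, "query:1", "hasText", query);          /* add query to model */
    for (i = 0; i < 2; i++)
        model_add(m, "query:1", "retrieved", docs[i]);  /* add documents to model */
    for (i = 0; i < 2; i++)
        model_add(m, docs[i], "mentions", proteins[i]); /* add proteins to model */
    return m->count;
}
```

Calling `run_workflow(&m, "chromatin")` on a `Model m` leaves five triples in it, and `model_print(&m, stdout)` then writes them one per line. The point of the design is that every step leaves a queryable trace in the same model, so later steps (ranking, cross references) can build on earlier ones without bespoke glue code.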

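
The ‘Model based data integration’ slide labels a relation `partOf`. As a hedged illustration of why an explicit, computer readable relation helps, the sketch below answers “is X part of Y?” by following `partOf` links transitively. The facts in `FACTS` (exon, transcript, gene, chromosome) and the function `is_part_of` are assumptions chosen for the example; they are not taken from the slides or from the UCSC genome browser schema.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One "part partOf whole" statement in a biologist readable model. */
typedef struct {
    const char *part;
    const char *whole;
} PartOf;

/* Hypothetical example facts, for illustration only. */
static const PartOf FACTS[] = {
    { "exon", "transcript" },
    { "transcript", "gene" },
    { "gene", "chromosome" },
};
static const size_t N_FACTS = sizeof FACTS / sizeof FACTS[0];

/* Returns 1 if `part` is directly or transitively partOf `whole`, else 0.
 * max_depth caps the recursion as a safeguard against cyclic facts. */
int is_part_of(const char *part, const char *whole, int max_depth)
{
    size_t i;
    if (max_depth <= 0) return 0;
    for (i = 0; i < N_FACTS; i++) {
        if (strcmp(FACTS[i].part, part) != 0) continue;
        if (strcmp(FACTS[i].whole, whole) == 0) return 1;
        if (is_part_of(FACTS[i].whole, whole, max_depth - 1)) return 1;
    }
    return 0;
}
```

With these facts, `is_part_of("exon", "chromosome", 8)` follows three links and succeeds, while the reverse question `is_part_of("gene", "exon", 8)` fails. A real semantic model would hold such statements as data rather than in source code, which is what makes them reusable across tools.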