This document introduces Marco Roos and discusses his transition from traditional molecular biology and bioinformatics work to e-science. It describes how e-science approaches can help address challenges in biology by enabling greater data and knowledge sharing, reuse of tools and workflows, and integrated analysis across multiple data types and sources. Examples discussed include semantic web technologies, workflow systems, and proposed e-laboratory platforms to empower scientists with virtual collaborative environments and intelligent assistance. The goal is to help biologists better exploit computational resources and expertise through enhanced and standardized e-science frameworks.
From Laboratory to e-Laboratory
1. From Laboratory to e-Laboratory? Introduction for ‘Lab-J’ of the LUMC Human Genetics Department. Marco Roos. Acknowledging the colleagues from BioSemantics, myGrid, OMII-UK, AID, and the LUMC BioInformatics Expertise Centre.
3. Liaison biology/bioinformatics – informatics. Biologist and bioinformatician, e-(bio)science researcher. Coordinator, BioSemantics group Leiden, Human Genetics Department, Leiden University Medical Centre, and Informatics Institute, University of Amsterdam. Project or Area Liaison (PAL), OMII-UK. Member, BioAssist programme committee, NBIC.
6. My C.V. before e-Science (before 2003). Molecular & Cellular biology (MSc): microscopy and image analysis of chromosome structure; ‘minor’ computer science. Image analysis methods to measure DNA content in bull sperm cells (civil service). Chromatin structure & function (PhD, molecular cytology): F.I.S.H., microscopy, image analysis, statistics; 3-D chromosome structure during cell cycle (no luck); DNA movement in Escherichia coli (success). Human Transcriptome Map (post-doc): gene expression to human genome sequence; analysis of regions of increased gene expression.
10. 25/09/2009 BioAID. Bioinformatics: a biologist behind a computer who (just) learned Perl
11.
/*
 * determines ridges in htm expression table
 */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>
#include "ridge.h" /* not shown; assumed to define TRUE/FALSE and declare the functions below */

/* selects the rows of one chromosome, sorted on genstart; result is returned through *htmtable */
int selecthtm(PGconn *conn, char *htmtablename, char *chromname, PGresult **htmtable)
{
    char querystring[256];

    /* chromname is quoted as a string literal; names are trusted here, not escaped */
    snprintf(querystring, sizeof querystring,
             "SELECT * FROM %s WHERE chrom = '%s' ORDER BY genstart",
             htmtablename, chromname);
    *htmtable = PQexec(conn, querystring);
    return validquery(*htmtable, querystring);
}

int is_ridge(PGresult *htmtable, int row, double exprthreshold, int mincount)
/* determines if mincount genes in a row are (part of) a ridge */
/* pre: htmtable is valid and sorted on genStart (ascending) */
/* post: TRUE if the mincount rows starting at row all reach exprthreshold */
{
    if (mincount <= 0) return TRUE;
    if (row >= PQntuples(htmtable)) return FALSE;
    /* PQgetvalue returns text, so convert before comparing; read the current row, not row 0 */
    if (atof(PQgetvalue(htmtable, row, PQfnumber(htmtable, "movmed39expr"))) < exprthreshold) {
        return FALSE;
    }
    return is_ridge(htmtable, row + 1, exprthreshold, mincount - 1);
}

int main()
{
    PGconn *conn;          /* holds database connection */
    char querystring[256]; /* query string */
    PGresult *result;

    conn = PQconnectdb("dbname=htm port=6400 user=mroos password=geheim");
    if (PQstatus(conn) == CONNECTION_BAD) {
        fprintf(stderr, "connection to database failed.\n");
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        exit(1);
    } else
        printf("Connection ok\n");

    snprintf(querystring, sizeof querystring, "SELECT * FROM chromosomes");
    printf("%s\n", querystring);
    result = PQexec(conn, querystring);
    if (validquery(result, querystring)) {
        printresults(result);
    } else {
        PQclear(result);
        PQfinish(conn);
        return EXIT_FAILURE;
    }
    PQclear(result);
    PQfinish(conn);
    return EXIT_SUCCESS;
}

/* prints the first column of at most ten rows */
int printresults(PGresult *tuples)
{
    int i;

    for (i = 0; i < PQntuples(tuples) && i < 10; i++) {
        printf("%d, ", i);
        printf("%s\n", PQgetvalue(tuples, i, 0));
    }
    return TRUE;
}

/* TRUE if the query returned tuples; reports the failed query otherwise */
int validquery(PGresult *result, char *querystring)
{
    printf(" in validquery\n");
    if (PQresultStatus(result) != PGRES_TUPLES_OK) {
        printf("Query %s failed.\n", querystring);
        fprintf(stderr, "Query %s failed.\n", querystring);
        return FALSE;
    }
    return TRUE;
}
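The ridge check in the listing above needs a live PostgreSQL connection to try out. As a minimal sketch of the same idea without a database, the function below applies the check to an in-memory array. It assumes the expression values (standing in for the movmed39expr column) have already been loaded and sorted by gene start; the name is_ridge_at and its signature are hypothetical and not part of the original program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory variant of is_ridge.
 * expr[] stands in for the movmed39expr column and is assumed to be
 * sorted by gene start. Returns true if the mincount values starting
 * at index row all reach the threshold. */
bool is_ridge_at(const double *expr, size_t n, size_t row,
                 double threshold, size_t mincount)
{
    assert(expr != NULL || n == 0);

    if (mincount == 0) {
        return true;                 /* an empty run trivially qualifies */
    }
    if (row >= n || mincount > n - row) {
        return false;                /* not enough genes left in the table */
    }
    for (size_t i = row; i < row + mincount; i++) {
        if (expr[i] < threshold) {
            return false;            /* one gene below threshold breaks the run */
        }
    }
    return true;
}
```

A caller could slide row over the whole array to mark every position that starts a run of mincount highly expressed genes. The loop replaces the recursion of the original, which is a design choice of this sketch rather than a change in the result.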
12. State of the art applied computer science to a biologist
13. Why e-science? What is wrong with bioinformatics? Human geneticists think…
14. Why should a biologist be interested in e-science? BioAssistants guessed… Involves computation; Interpretation of results; Biology isn’t that interesting; Reduce reinvention of the wheel; Current lack of standards; Sharing results; Reshaping biology; Synergy between different sciences; Emerging data-driven science
15. Why e-Science? Lots of data to deal with; single tiny brain; lots of knowledge to deal with; no computational superpowers; lots of methods and algorithms to try and combine. A needy biologist
16. 1070 databases, Nucleic Acids Research, Jan 2008 (96 in Jan 2001): Proteomics, Genomics, Transcriptomics, Protein sequence prediction, Phenotypic studies, Phylogeny, Sequence analysis, Protein structure prediction, Protein-protein interaction, Metabolomics, Model organism collections, Systems Biology, Epidemiology, etcetera … All with a splendid interface … all different, of course
19. Some of my observations. Reinvention. How many reannotation pipelines do you need? Little reuse of components. Reproducibility. Black boxes. Emphasis not on clarity. Can we understand bioinformatics as wet lab protocols? Focus on technicalities, not biological analysis. Should bioinformaticians write ‘job submission’ scripts? Data graveyards. Do we need >1000 databases? Can we understand our own data?
20. How did I end up here? Marco Roos. Biologist and bioinformatician, post-doc e-(bio)science, Human Genetics Department, Leiden University Medical Centre, and Informatics Institute, University of Amsterdam. Project or Area Liaison (PAL), OMII-UK. Member, BioAssist programme committee, NBIC.
24. Exploiting Brains By Web Services (source: http://biocatalogue.org, launched at ISMB2009). >1000 annotated services, >3000 known to Taverna. Includes BioMart, R, text mining, KEGG, NCBI PubMed, Ensembl, etc. Web Services run remotely
25. Exploiting more brains by sharing workflows (source: http://myExperiment.org). Social community web site for scientists; 2300 registered users in two years; 750 workflows
26. Bioinformatics and e-science. Customized experiments with reusable components. Single purpose, single person, black box application. Diagram labels: My component, Your component, My component, My component, Your component
34. Empower me with a ‘virtual brain’ *. Diagram labels: My ws, Your ws, My ws, My ws, Your ws. * From P.J. Verschure, Journal of Cellular Biochemistry 2006, vol. 99(1), pp. 23-34
35. Workflow and Semantic Web. Query; Add query to semantic model; Retrieve documents from Medline; Add documents (IDs) to semantic model; Extract proteins (Homo sapiens); Add proteins to semantic model; Calculate ranking scores; Add scores to semantic model; Create biological cross references; Add cross references to semantic model; Convert to table (html)
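The workflow steps on this slide alternate between producing a result and adding it to a semantic model, and end with a conversion to an HTML table. The following is a minimal sketch of that pattern, assuming the model is a small in-memory list of subject-predicate-object triples. All names here (Model, model_add, run_workflow, the predicate strings) are hypothetical. The Medline retrieval and protein extraction services are not called; their outputs are passed in by the caller as placeholders. Ranking scores and cross references are left out and would follow the same add-to-model pattern.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_TRIPLES 64
#define MAX_TERM 64

/* One statement in the semantic model: subject, predicate, object. */
typedef struct {
    char s[MAX_TERM], p[MAX_TERM], o[MAX_TERM];
} Triple;

typedef struct {
    Triple items[MAX_TRIPLES];
    size_t count;
} Model;

/* Adds a triple; returns 0 on success, -1 if the model is full
 * or a term does not fit. */
int model_add(Model *m, const char *s, const char *p, const char *o)
{
    assert(m && s && p && o);
    if (m->count >= MAX_TRIPLES) return -1;
    if (strlen(s) >= MAX_TERM || strlen(p) >= MAX_TERM || strlen(o) >= MAX_TERM)
        return -1;
    Triple *t = &m->items[m->count++];
    strcpy(t->s, s);
    strcpy(t->p, p);
    strcpy(t->o, o);
    return 0;
}

/* Counts the triples that use a given predicate. */
size_t model_count(const Model *m, const char *p)
{
    size_t n = 0;
    for (size_t i = 0; i < m->count; i++)
        if (strcmp(m->items[i].p, p) == 0) n++;
    return n;
}

/* Runs the first steps of the slide in order. doc_ids and proteins stand
 * in for the outputs of the retrieval and extraction services. */
int run_workflow(Model *m, const char *query,
                 const char *const *doc_ids, size_t n_docs,
                 const char *const *proteins, size_t n_proteins)
{
    if (model_add(m, "workflow:run", "hasQuery", query) != 0) return -1;
    for (size_t i = 0; i < n_docs; i++)
        if (model_add(m, query, "retrievedDocument", doc_ids[i]) != 0) return -1;
    for (size_t i = 0; i < n_proteins; i++)
        if (model_add(m, query, "mentionsProtein", proteins[i]) != 0) return -1;
    return 0;
}

/* Final step: writes the model as an HTML table into out.
 * Returns the number of characters written, or -1 if out is too small.
 * Terms are assumed to contain no HTML special characters. */
int model_to_html(const Model *m, char *out, size_t cap)
{
    int n = snprintf(out, cap, "<table>\n");
    if (n < 0 || (size_t)n >= cap) return -1;
    size_t used = (size_t)n;
    for (size_t i = 0; i < m->count; i++) {
        const Triple *t = &m->items[i];
        n = snprintf(out + used, cap - used,
                     "<tr><td>%s</td><td>%s</td><td>%s</td></tr>\n",
                     t->s, t->p, t->o);
        if (n < 0 || (size_t)n >= cap - used) return -1;
        used += (size_t)n;
    }
    n = snprintf(out + used, cap - used, "</table>\n");
    if (n < 0 || (size_t)n >= cap - used) return -1;
    return (int)(used + (size_t)n);
}
```

Keeping every intermediate result in the model, rather than only the final table, is what lets later steps (or other people) query how a result was obtained. A real implementation would more likely use an RDF store than a fixed-size array.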
45. SRO = a pack of models: Tool models, Data/UI models, Flow models, plus attached data. SRO enactment = a running e-laboratory. Diagram labels: Tools; Uses tools services; Model SROs; my protocols; my data; my protocols; my data; Portal to workflows 2.0; mashup data; Flows; mashup tools; e-biologist; e-bioinformatician; Uses data services; Portal to workflows; Data; programmatic interaction; user interfacing
46. e-Galaxy mock-up. Suggestions by semantic components; Your Scientific Research Object; Underlying workflow; Related research and documents (placeholder text); MOLGENIS; Convert; Import/Export Research Objects; Store; Configure; Run
49. Research and development aims. Automated support for hypothesis formation, e.g. on epigenetic mechanisms. Apply Workflow, Semantic Web, Concept Web. Concept-based meta-analysis. Automated triple creation from computational analysis.
50. Research and development ambitions. Co-develop e-Laboratories; e-Galaxy; epiGenius; BioBanking; Help BEC with support environment; Concept Web services; Web services; e-Laboratory components; Transparent creation of triples; Personal semantic repositories
51. Liaison. Bioinformatics Expertise Centre LUMC: statistical and computer science expertise; generic support. NBIC BioAssist: core software development; Grid tools, Concept Web, e-Labs. BioSemantics Rotterdam: text mining; concept profile meta-analysis. AID, University of Amsterdam: e-Science experts; Grid tools. You? OMII-UK: Manchester, Southampton, Edinburgh (ca. 30 engineers); Taverna, myExperiment, e-Labs. Concept Web: content, tools and infrastructure. W3C Health Care & Life Sciences Interest Group: Semantic Web experts; Linked Open Data.
52. ‘e’ for enhance, not enforce. Please help me to help you. Register for: http://snipurl.com/biosemanticsusers (http://www.myexperiment.org/groups/211). Allows me to: give you preferential treatment; not spam everybody; keep you informed; ask your opinion (user driven development!)
54. Word of warning. Computer scientists are scientists too! Need to publish; scored by papers, not by software. Addressed by OMII-UK and BioAssist. Compare: “How can I use it in the clinic?” “How can I use it in the lab?”
55. Dissemination. Come by for help or information. Internal ‘mini-courses’? Send me suggestions! FYI: Course ‘Managing Life Science Information’ for PhD students, 2010
56. Key points. Liaising between technology contacts and you, the colleagues of Human Genetics. No obligations: try any new developments that we are involved in with our help, but don't feel obliged. Help us help you: express your wishes and problems, try things and give feedback, and be patient sometimes. Please join the biosemantics users group on myExperiment.org to help us communicate.
57. Thank you for your attention. Lots of accessible data; community brain power; knowledge bases to query; other people’s computational superpowers; Web Services, Workflows, and their creators available. An enhanced biologist: Homo biologicus enhancis