'A PAL's Life' for OMII-UK Board, May 2008


    'A PAL's Life' for OMII-UK Board, May 2008 - Presentation Transcript

    1. A PAL’s life: about a biologist in e-science. Presentation for the OMII-UK Board, Southampton, May 16, 2008
    2. My experience in e-(bio)science
        • Marco Roos
        • Biologist and bioinformatician
        • Post-doc e-(bio)science, University of Amsterdam
        • PAL OMII-UK
        • Member BioAssist steering group
    3. to here
    4. Biological motivation: Function and architecture of DNA in the cell. Escherichia coli; mouse fibroblast (skin) cells
    5. Many components... 10/06/09 BioAID
    6. Example: bioinformatics before e-science. Human Transcriptome Map (HTM) (Versteeg et al., Genome Research, 2003). Figure labels: Sage tag count (TU, Sage library); TU identifier; position; Transcriptional Unit (TU)
    7. Before e-science: HTM construction and RIDGE detection

        /*
         * determines ridges in htm expression table
         */
        #include <stdio.h>
        #include <stdlib.h>
        #include <libpq-fe.h>
        #include "ridge.h" /* assumed to declare validquery, printresults, TRUE and FALSE */

        int selecthtm(PGconn *conn, char *htmtablename, char *chromname, PGresult **htmtable)
        {
            char querystring[256];

            /* write the query into the buffer; chrom is a string, so quote it */
            snprintf(querystring, sizeof querystring,
                     "SELECT * FROM %s WHERE chrom = '%s' ORDER BY genstart",
                     htmtablename, chromname);
            /* hand the result back through the pointer argument */
            *htmtable = PQexec(conn, querystring);
            return validquery(*htmtable, querystring);
        }

        int is_ridge(PGresult *htmtable, int row, double exprthreshold, int mincount)
        /* determines if mincount genes in a row are (part of) a ridge */
        /* pre: htmtable is valid and sorted on genStart (ascending) */
        /* post: */
        {
            if (mincount <= 0) return TRUE;
            if (row >= PQntuples(htmtable)) return FALSE;
            /* PQgetvalue returns text: convert it, and read the current row */
            if (atof(PQgetvalue(htmtable, row, PQfnumber(htmtable, "movmed39expr"))) < exprthreshold) {
                return FALSE;
            }
            return is_ridge(htmtable, row + 1, exprthreshold, mincount - 1);
        }

        int main()
        {
            PGconn *conn;          /* holds database connection */
            char querystring[256]; /* query string */
            PGresult *result;

            conn = PQconnectdb("dbname=htm port=6400 user=mroos password=geheim");
            if (PQstatus(conn) == CONNECTION_BAD) {
                fprintf(stderr, "connection to database failed.\n");
                fprintf(stderr, "%s", PQerrorMessage(conn));
                exit(1);
            } else
                printf("Connection ok\n");

            sprintf(querystring, "SELECT * FROM chromosomes");
            printf("%s\n", querystring);
            result = PQexec(conn, querystring);
            if (validquery(result, querystring)) {
                printresults(result);
            } else {
                PQclear(result);
                PQfinish(conn);
                return EXIT_FAILURE; /* non-zero exit status signals the failed query */
            }
            PQclear(result);
            PQfinish(conn);
            return EXIT_SUCCESS;
        }

        int printresults(PGresult *tuples)
        {
            int i;

            /* print the first column of at most ten rows */
            for (i = 0; i < PQntuples(tuples) && i < 10; i++) {
                printf("%d, ", i);
                printf("%s\n", PQgetvalue(tuples, i, 0));
            }
            return TRUE;
        }

        int validquery(PGresult *result, char *querystring)
        {
            printf("in validquery\n");
            if (PQresultStatus(result) != PGRES_TUPLES_OK) {
                printf("Query %s failed.\n", querystring);
                fprintf(stderr, "Query %s failed.\n", querystring);
                return FALSE;
            }
            return TRUE;
        }

      • IT used: Perl, PostgreSQL, C, MS Excel + VBA, SPSS
      • No predefined development strategy
      • No design phase
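The ridge test on slide 7 can also be sketched without a database. This is a minimal illustration, not the code from the slide: it assumes the moving-median expression values (the movmed39expr column) have already been loaded into an array in genomic order, and the names is_ridge_at and count_ridge_starts are hypothetical.

```c
#include <stddef.h>

/*
 * Returns 1 if the mincount consecutive genes starting at position row
 * all have an expression value of at least threshold, and 0 otherwise.
 * expr holds n values sorted by genomic start position (an assumption
 * of this sketch, mirroring the precondition stated on the slide).
 */
int is_ridge_at(const double *expr, size_t n, size_t row,
                double threshold, int mincount)
{
    for (int i = 0; i < mincount; i++) {
        size_t pos = row + (size_t)i;
        /* running off the end of the table or dipping below the
           threshold both mean there is no ridge at this position */
        if (pos >= n || expr[pos] < threshold) {
            return 0;
        }
    }
    return 1; /* also reached when mincount <= 0, as in the recursive version */
}

/* Counts the positions at which a run of mincount qualifying genes starts. */
size_t count_ridge_starts(const double *expr, size_t n,
                          double threshold, int mincount)
{
    size_t count = 0;
    for (size_t row = 0; row < n; row++) {
        if (is_ridge_at(expr, n, row, threshold, mincount)) {
            count++;
        }
    }
    return count;
}
```

A loop replaces the recursion here only to keep the sketch short; the stopping conditions are the same three as in the function on the slide.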
    8. Bioinformatics A typical bioinformatician
    9. Bioinformatics A biologist behind a computer who (just) learned perl
    10. The ‘spaghetti’ approach
    11. Before e-science
      • Conclusion
      • State of the art in computing in life science is of the 1980s
      • (gross simplification)
    12. e-science
      • e-science motivation
      • Enhance the state of the art of computing in life science and bioinformatics
    13. Example An e-science approach to text mining
    14. Biological knowledge extraction: Biological question/model; Computational experiment; Extracted knowledge. “I want to do it my way” (Carole Goble’s me-scientist). >17 million citations, +400,000/yr
    15. Which diseases may be associated with my protein of interest, EZH2?
    16. Combining expertise Edgar Meij Information retrieval expert
    17. Combining expertise Sophia Katrenko Machine learning expert
    18. Combining expertise Willem van Hage Semantic web expert (and bass guitar player)
    19. Combining expertise Towards a knowledge framework Computer scientist and bioinformatician Scott Marshall
    20. The AIDA toolbox for knowledge extraction and knowledge management in a virtual laboratory for e-Science
    21. Combining web services
    22. “Collaboration through web services” Bio-text mining expert Martijn Schuemie
    23. “Collaboration through web services” Biological Database expert Hideaki Sugawara
    24. “Collaboration through web services” e-bioscientist
    25. A nice tool
    26. A not so nice tool
    27. (slide without text)
    28. Sharing
    29. BioAID Disease Discovery workflow. Diagram labels: AIDA services; OMIM service (Japan); Taverna ‘shim’
    30. BioAID Disease Discovery workflow
    31. BioAID Disease Discovery workflow from 100 abstracts: 29 proteins associated with 1280 diseases
    32. Summary so far
      • Application of myExperiment
      • Application of Taverna
      • Application of web services
      • Reuse of components from a text mining tool
      • Reuse of AIDA services in resource management tools (not shown)
      • Application of semantic web (not shown)
    33. Summary so far
      • Workflow enhances insight and reproducibility
      • Workflow as ‘computational experiment’
      • Feedback and development
      • Workflow enhances application of expertise
        • Components built by diverse experts
        • Collaboration through web services
        • Text mining experiment by a non-text-mining expert
    34. e-Science is about people: Want this…
    35. e-Science is about people: … need this
    36. Outreach
      • Successful as ‘schoolbook’ example of e-science approach
          • VL-e mid-term review (‘e-science that works’)
          • 3x NBIC
            • Bioinformatics symposium
            • Text mining workshop
            • Web services/workflow tutorial
          • ICT delta and eChallenges
          • ISMB/ECCB2007 Vienna
          • 2x OMII-UK workshops
      • Attracts bioinformaticians to e-science
          • Example: NBIC/BioAssist
    37. BioAssist
      • National bioinformatics support programme
        • Now based on e-science
        • Taverna as target platform
      • 5 (power)user communities (‘PAL’s pals’):
        • Integrated analysis of functional genomics data
        • Proteomics data management and analysis
        • Metabolomics data management and analysis
        • Biobanking
        • High throughput sequencing
        • System bioinformatics
      • Grid and super computing support from SARA
        • Collaboration with OMII-UK/myGrid for linking computing resources with Taverna for transparent processing of large datasets
    38. BioAssist as test-bed community
      • Life science/bioinformatics requirements
        • Taverna
        • Large data processing in Taverna
        • Running workflows without Taverna helps bioinformaticians help biologists
        • Web service repository (e.g. BioCatalogue)
      • Potential ‘companion’ tools for OMII-UK toolset
        • MolGenis for local data
        • vBrowser for browsing resources e.g. workflow results and data on grids
      • Collaborative effort to address requirements
        • Sharing code
    39. PAL’s future
      • Is this PAL satisfied? Not yet!
      • Uptake by bioinformatics: going well
      • Uptake by systems biology: progress
      • Uptake by life science: early days
    40. Full circle: a biological question… Could be running on a Grid or cluster
    41. Thank you for sending me this e-Experiment from myExperiment.org!
    42. Experiences and conclusions
      • VL-e, AID and OMII-UK have helped me reach out to the bioinformatics and life science communities
      • (Hopefully) I was helpful in getting the e-science of VL-e, AID, and OMII-UK across to the bioinformatics and life science communities
      • Collaborative spirit and win-win are unfamiliar to me-scientists
        • Dissemination requires a lot of time and energy
    43. Experiences and conclusions
      • OMII-UK and its members prove the concept of e-science
        • accomplish the hugely complicated task of being successful from core computer science to application science and back (imho the essence of e-science research)
        • Why? (my view)
          • Strong positive leadership
          • Successful approach, acknowledging social aspects
          • Large enough community
      • OMII-UK is a role model for e-science and for organisations adopting e-science
        • e-science (user) community needs a role model for some time to come
    44. Acknowledgements
      • AID team: Sophia Katrenko, Edgar Meij, Willem van Hage, …, Frans Verster, Machiel Jansen, Scott Marshall and Guus Schreiber, Maarten de Rijke, Pieter Adriaans
      • Jan Top, Nicole Koenderink, Food informatics, Wageningen University
      • Martijn Schuemie, Erasmus University Rotterdam
      • Hideaki Sugawara, Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics (http://xml.nig.ac.jp)
      • OMII-UK and the myGrid family, Katy Wolstencroft
      • E-science support team for NBIC
      • VL-e colleagues
      • The Netherlands BioInformatics Centre (NBIC)
      • W3C Semantic Web Health Care and Life Sciences Interest Group
      • iCapture team in Canada
      • My friends on myExperiment
      • This work was supported by the Dutch Ministry of Economic Affairs via VL-e and BioRange (BSIK grants), and OMII-UK
    45. A hopefully mutually felt warm and fuzzy feeling! Thank you for your attention
    46. Why should I adopt e-Science? I do not believe in e-Science, I only believe in Me-Science
    47. Why adopt e-science? For determined sinners: ‘The seven deadly sins of bioinformatics’ by Carole Goble http://www.slideshare.net/dullhunk/the-seven-deadly-sins-of-bioinformatics/

    Leiden University Medical Centre / University of Amsterdam
