Feasting On Brains With Taverna Public - Presentation Transcript
Feasting on Brains with Taverna
Tutorial and demonstration by Marco Roos
Acknowledging Carole Goble, Dave de Roure, Alan Williams, Jiten Bhagat, Katy Wolstencroft, Martijn Schuemie, Edgar Meij, Sophia Katrenko, Willem van Hage, Scott Marshall, Pieter Adriaans, NBIC, OMII-UK, the myGrid team
Feasting on your brain!
Please help me by filling out the form
preferably on
http://www.tinyurl.com/TavernaBrains
What can Taverna do for me?
Benelux bioinformaticians think…
Introducing myself: a biologist
My prime interest: structure and function of DNA in the nucleus
Escherichia coli
Mouse fibroblast (skin) cells
How did I end up here?
Marco Roos
Biologist and bioinformatician (e-bioscientist) at the Informatics Institute, University of Amsterdam (BioRange/VL-e)
Project or Area Liaison (PAL) OMII-UK/myGrid
Member BioAssist programme committee NBIC
Member UK All Hands e-Science Foundation
Components controlling structure & function of DNA
Connecting the dots (example: protein interaction network in yeast)
1070 databases Nucleic Acids Research Jan 2008 (96 in Jan 2001)
Proteomics, Genomics, Transcriptomics, Protein sequence prediction, Phenotypic studies, Phylogeny, Sequence analysis, Protein Structure prediction, Protein-protein interaction, Metabolomics, Model organism collections, Systems Biology, Epidemiology, etcetera …
All with a splendid interface … all different, of course
A typical biologist… A needy biologist
Tiny brain
Lots of data to deal with
Lots of methods and algorithms to try and combine
No computational superpowers
Lots of knowledge to deal with
Start at the beginning I have a computational question…
‘Old school’ Bioinformatics: a typical bioinformatician
‘Old school’ Bioinformatics: a biologist behind a computer who (just) learned Perl
/*
 * determines ridges in htm expression table
 */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>
#include "ridge.h" /* assumed to define TRUE/FALSE and declare the functions below */

/* htmtable is a pointer-to-pointer so the caller actually receives the result;
   the prototype in ridge.h must match */
int selecthtm(PGconn *conn, char *htmtablename, char *chromname, PGresult **htmtable)
{
    char querystring[256];

    /* write into querystring (the original call had no destination buffer)
       and quote chromname as a string literal; inputs are assumed trusted */
    snprintf(querystring, sizeof querystring,
             "SELECT * FROM %s WHERE chrom = '%s' ORDER BY genstart",
             htmtablename, chromname);
    *htmtable = PQexec(conn, querystring);
    return validquery(*htmtable, querystring);
}

int is_ridge(PGresult *htmtable, int row, double exprthreshold, int mincount)
/* determines if mincount genes in a row are (part of) a ridge */
/* pre: htmtable is valid and sorted on genStart (ascending) */
/* post: TRUE if the mincount rows starting at row all reach exprthreshold */
{
    if (mincount <= 0) return TRUE;
    if (row >= PQntuples(htmtable)) return FALSE;
    /* PQgetvalue returns text, so convert it before comparing;
       read the current row, not row 0 */
    if (atof(PQgetvalue(htmtable, row, PQfnumber(htmtable, "movmed39expr"))) < exprthreshold)
    {
        return FALSE;
    }
    return is_ridge(htmtable, row + 1, exprthreshold, mincount - 1);
}

int main()
{
    PGconn *conn;          /* holds database connection */
    char querystring[256]; /* query string */
    PGresult *result;

    conn = PQconnectdb("dbname=htm port=6400 user=mroos password=geheim");
    if (PQstatus(conn) == CONNECTION_BAD)
    {
        fprintf(stderr, "connection to database failed.\n");
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        exit(1);
    }
    else printf("Connection ok\n");

    snprintf(querystring, sizeof querystring, "SELECT * FROM chromosomes");
    printf("%s\n", querystring);
    result = PQexec(conn, querystring);
    if (validquery(result, querystring))
    {
        printresults(result);
    }
    else
    {
        PQclear(result);
        PQfinish(conn);
        return EXIT_FAILURE; /* non-zero exit status signals failure */
    }
    PQclear(result);
    PQfinish(conn);
    return EXIT_SUCCESS;
}

int printresults(PGresult *tuples)
{
    int i;

    /* print the first column of at most ten rows */
    for (i = 0; i < PQntuples(tuples) && i < 10; i++)
    {
        printf("%d, ", i);
        printf("%s\n", PQgetvalue(tuples, i, 0));
    }
    return TRUE;
}

int validquery(PGresult *result, char *querystring)
{
    printf(" in validquery\n");
    if (PQresultStatus(result) != PGRES_TUPLES_OK)
    {
        fprintf(stderr, "Query %s failed.\n", querystring);
        return FALSE;
    }
    return TRUE;
}
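The ridge test in the listing above can also be looked at without the database plumbing. The following is a minimal sketch, assuming the expression values have already been loaded into a plain array sorted by gene start. The name is_ridge_array and its parameters are hypothetical and do not appear in the original code; the loop is an iterative stand-in for the recursion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the original program.
 * Returns 1 if the `mincount` consecutive values starting at index `row`
 * all reach `threshold`, and 0 otherwise (including when the array ends
 * before `mincount` values have been checked). */
int is_ridge_array(const double *expr, size_t n, size_t row,
                   double threshold, int mincount)
{
    assert(expr != NULL);

    while (mincount > 0) {
        if (row >= n) return 0;              /* ran out of genes */
        if (expr[row] < threshold) return 0; /* expression too low */
        row++;
        mincount--;
    }
    return 1;
}
```

For example, with the values {1.0, 5.0, 6.0, 7.0, 2.0} and a threshold of 5.0, three genes in a row starting at index 1 form a ridge, but four do not.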
The ‘spaghetti’ approach
Computational tools graveyard rephrasing David Shotton
Database survival: <20% ‘no problems’
Data graveyard quoting David Shotton
Old school bioinformatics for biologists
Lots of data, knowledge, and methods to deal with
Bioinformaticians make spaghetti and graveyards
e-Science? ‘enhanced science’: research and development for enhancing science
What about…
e-Science?
‘enhanced science’
Research and development from the field of computer science to enhance science with its methodologies
Which diseases are associated with my protein of interest, ‘EZH2’?