
Talk1 ben sadi for_gmod_bosc_2011



Published in: Education, Technology

  1. SADI for GMOD: Bringing Model Organism Databases onto the Semantic Web
       Ben Vandervalk, Luke McCarthy, Edward Kawas, Mark Wilkinson
       James Hogg Research Centre, Heart + Lung Institute, University of British Columbia
  2. SADI for GMOD: Background
       SADI (Semantic Automated Discovery and Integration)
         • Standard for Web services that consume/generate RDF
         • Motivation: automated integration of bioinformatics data and software
       GMOD (Generic Model Organism Database)
         • Toolkit for building a model organism database and website
         • Collection of related open source projects: e.g. Chado, GBrowse, Pathway Tools
         • Many sites use GMOD components: FlyBase, BeetleBase, dictyBase, etc.
  3. SADI in a Nutshell
       • to invoke a SADI service:
           o HTTP POST an RDF document to the service URI
           o e.g. $ curl --data-binary @input.rdf
       • to get service metadata:
           o HTTP GET on service URL
           o returns an RDF document with service name, description, etc.
           o e.g. $ curl
       • structure of input/output data is described in OWL
           o service provider specifies one input OWL class and one output OWL class
       • strengths of SADI
           o no framework-specific messaging formats or ontologies
           o supports batch processing of inputs
           o supports long-running services (asynchronous services)
       more info:
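
The two HTTP interactions described on this slide (POST to invoke, GET for metadata) can be sketched in Python using only the standard library. This is a minimal illustration, not part of the SADI distribution: the function names are hypothetical, the service URL in the usage note is a placeholder (the real URLs are missing from the transcript), and the `application/rdf+xml` media type is an assumption about what a given service accepts.

```python
import urllib.request

# Assumption: the service accepts and returns RDF/XML. Adjust if the
# service you are calling expects a different RDF serialization.
RDF_XML = "application/rdf+xml"


def build_invoke_request(service_url: str, rdf_bytes: bytes,
                         content_type: str = RDF_XML) -> urllib.request.Request:
    """Build the HTTP POST that invokes a service with an RDF input document."""
    return urllib.request.Request(
        service_url,
        data=rdf_bytes,
        headers={"Content-Type": content_type, "Accept": content_type},
        method="POST",
    )


def build_metadata_request(service_url: str) -> urllib.request.Request:
    """Build the HTTP GET that asks the same URL for its service description."""
    return urllib.request.Request(
        service_url,
        headers={"Accept": RDF_XML},
        method="GET",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> bytes:
    """Send a prepared request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With a real service URL in hand, usage would look like `send(build_invoke_request(url, open("input.rdf", "rb").read()))`. Building the request separately from sending it keeps the sketch inspectable without network access.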
  4. SADI for GMOD
       • SADI services for accessing sequence feature data
       • implemented as Perl CGI scripts

       Service Name                      Input                 Relationship               Output
       get_feature_info                  database identifier   is about                   feature description
       get_features_overlapping_region   genomic coordinates   overlaps                   collection of feature descriptions
       get_sequence_for_region           genomic coordinates   is represented by          DNA, RNA, or amino acid sequence
       get_child_features                feature description   has part / derives into    collection of feature descriptions
       get_parent_feature                feature description   is part of / derives from  collection of feature descriptions
  5. SADI for GMOD: Structure of Service Input/Output RDF

       Input RDF (N3)

           @prefix lsrn: <> .
           @prefix GeneID: <> .

           GeneID:49962
               a lsrn:GeneID_Record;
               sio:SIO_000008 [                          # p = has attribute
                   a lsrn:GeneID_Identifier;
                   sio:SIO_000300 "49962"                # p = has value
               ] .

       HTTP POST to get_feature_info

       Output RDF (N3)

           @prefix lsrn: <> .
           @prefix GeneID: <> .
           @prefix FlyBase: < id=> .
           @prefix GenBank: <> .

           # p = is about
           GeneID:49962 sio:SIO_000332 FlyBase:FBgn0040037 .

           # feature
           FlyBase:FBgn0040037
               a SO:SO_0000704;                          # o = gene
               range:position [
                   a range:RangedSequencePosition;
                   sio:SIO_000053                        # p = has proper part
                       [ a range:StartPosition; sio:SIO_000300 26994 ];
                   sio:SIO_000053                        # p = has proper part
                       [ a range:EndPosition; sio:SIO_000300 32391 ];
                   range:in_relation_to _:minus_strand_seq
               ] .

           _:minus_strand_seq
               sio:SIO_000011 [                          # p = represents
                   a strand:MinusStrand;
                   sio:SIO_000093 GenBank:AE014135       # p = is proper part of
               ] .

           # reference feature (chromosome)
           GenBank:AE014135
               sio:SIO_000332 FlyBase:4 .                # chromosome 4

           FlyBase:4
               a SO:SO_0000105 .                         # o = chromosome arm
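
As a rough illustration of the input side of this slide, the sketch below assembles an N3 input document for a single Entrez Gene identifier by plain string formatting. The function name is hypothetical, and the namespace URIs are passed in as parameters because the transcript has lost the real prefix declarations; the values in the usage note are placeholders, not the real LSRN or SIO namespaces. It also assumes the input node is `GeneID:<id>`, matching the node that the output RDF is about.

```python
def build_input_n3(gene_id: str, lsrn_ns: str, geneid_ns: str, sio_ns: str) -> str:
    """Return an N3 document describing one GeneID record as service input.

    The namespace arguments are supplied by the caller; nothing here
    knows the real prefix URIs.
    """
    return (
        f"@prefix lsrn: <{lsrn_ns}> .\n"
        f"@prefix GeneID: <{geneid_ns}> .\n"
        f"@prefix sio: <{sio_ns}> .\n"
        "\n"
        f"GeneID:{gene_id}\n"
        "    a lsrn:GeneID_Record ;\n"
        "    sio:SIO_000008 [                  # has attribute\n"
        "        a lsrn:GeneID_Identifier ;\n"
        f'        sio:SIO_000300 "{gene_id}"     # has value\n'
        "    ] .\n"
    )
```

For example, `build_input_n3("49962", "http://example.org/lsrn/", "http://example.org/geneid/", "http://example.org/sio/")` yields a document with the same shape as the slide's input; real use would substitute the actual namespace URIs.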
  6. SADI for GMOD: Setting up the Services
       1. Load your GFF files into a Bio::DB::SeqFeature::Store database (mysql)
       2. Install SADI for GMOD dependencies with CPAN
       3. Download the SADI for GMOD tarball and unpack into cgi-bin
       4. Set DB connection parameters in cgi-bin/sadi.gmod/sadi.gmod.conf
            [GENERAL]
            db_adaptor = Bio::DB::SeqFeature::Store
            db_args = -adaptor DBI::mysql -dsn dbi:mysql:database=flybase
            base_url =
       5. Configure Dbxref mappings in cgi-bin/sadi.gmod/dbxref.conf
            [DBXREF_TO_LSRN]
            SwissProt = UniProt
            UniProtKB = UniProt
            SwissProt/TrEMBL = UniProt
            ...
       6. Register the services in public SADI registry:
       more info:
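
To make the role of the two configuration files concrete, here is a small Python sketch that parses INI-style text like the fragments above and applies the Dbxref-to-LSRN mapping. It is an illustration only: the services themselves are Perl CGI scripts, the function names are hypothetical, and both the assumption that the files are compatible with Python's `configparser` and the fallback of returning unmapped names unchanged are guesses rather than documented behaviour.

```python
import configparser

# Sample text copied from the slide; base_url is omitted because its
# value is missing from the transcript.
GENERAL_CONF = """
[GENERAL]
db_adaptor = Bio::DB::SeqFeature::Store
db_args = -adaptor DBI::mysql -dsn dbi:mysql:database=flybase
"""

DBXREF_CONF = """
[DBXREF_TO_LSRN]
SwissProt = UniProt
UniProtKB = UniProt
SwissProt/TrEMBL = UniProt
"""


def load_conf(text: str) -> configparser.ConfigParser:
    """Parse INI-style text, splitting only on '=' and keeping key case."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep 'SwissProt' distinct from 'swissprot'
    parser.read_string(text)
    return parser


def to_lsrn(dbxref_conf: configparser.ConfigParser, db_name: str) -> str:
    """Map a Dbxref database name to its LSRN namespace name.

    Assumption: names without a mapping are returned unchanged.
    """
    return dbxref_conf["DBXREF_TO_LSRN"].get(db_name, db_name)
```

Restricting the delimiter to `=` matters here because values such as `Bio::DB::SeqFeature::Store` contain colons, which `configparser` would otherwise also treat as a key/value separator candidate.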
  7. SADI Client Software
       SHARE Query Engine
         • SPARQL Query => SADI Workflow
       SADI Taverna Plugin
         • Design SADI workflows
         • 2010/05/03/sadi-taverna-plugin-tutorial/
  8. Acknowledgements
       Team
         • Mark Wilkinson: Principal Investigator
         • Luke McCarthy: Lead Programmer, SADI & SHARE
         • Edward Kawas: Perl Programmer, SADI
       Funding
         • Microsoft Research
  9. Extra Slides
  10. Demo with SHARE Query Engine
       SPARQL Query => SADI Workflow

       "What proteins are homologous to FlyBase protein FBpp0288804?"

           PREFIX FlyBase: <>
           PREFIX sio: <>
           SELECT ?homolog
           WHERE {
               # SIO_000332 = is about
               FlyBase:FBpp0288804 sio:SIO_000332 ?protein .
               # SIO_000205 = is represented by
               ?protein sio:SIO_000205 ?sequence .
               # SIO_010302 = is homologous to
               ?protein sio:SIO_010302 ?homolog .
           }

       online demo: