
Talk1 ben sadi for_gmod_bosc_2011


  1. SADI for GMOD: Bringing Model Organism Databases onto the Semantic Web
     Ben Vandervalk, Luke McCarthy, Edward Kawas, Mark Wilkinson
     James Hogg Research Centre, Heart + Lung Institute, University of British Columbia
     http://code.google.com/p/sadi/wiki/SADIforGMOD
  2. SADI for GMOD: Background
     SADI (Semantic Automated Discovery and Integration)
       • Standard for Web services that consume/generate RDF
       • Motivation: automated integration of bioinformatics data and software
     GMOD (Generic Model Organism Database)
       • Toolkit for building a model organism database and website
       • Collection of related open source projects, e.g. Chado, GBrowse, Pathway Tools
       • Many sites use GMOD components: FlyBase, BeetleBase, dictyBase, etc.
  3. SADI in a Nutshell
     • To invoke a SADI service:
         o HTTP POST an RDF document to the service URI
         o e.g. $ curl --data-binary @input.rdf http://sadiframework.org/examples/hello
     • To get service metadata:
         o HTTP GET on the service URL
         o returns an RDF document with service name, description, etc.
         o e.g. $ curl http://sadiframework.org/examples/hello
     • Structure of input/output data is described in OWL
         o service provider specifies one input OWL class and one output OWL class
     • Strengths of SADI
         o no framework-specific messaging formats or ontologies
         o supports batch processing of inputs
         o supports long-running services (asynchronous services)
     More info: http://sadiframework.org/
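The two curl calls on this slide can be sketched with Python's standard library. This is a minimal sketch, not the SADI project's own client code: the function names are hypothetical, and the `application/rdf+xml` media type in the `Content-Type` and `Accept` headers is an assumption (the slide's curl commands set no headers explicitly). The requests are only constructed here; pass one to `urllib.request.urlopen` to actually send it.

```python
import urllib.request

# Example service URL taken from the slide.
SERVICE_URL = "http://sadiframework.org/examples/hello"

# Assumption: the service accepts and returns RDF/XML.
RDF_XML = "application/rdf+xml"


def build_invoke_request(service_url, rdf_bytes):
    """Build the HTTP POST that invokes a SADI service with an RDF document."""
    return urllib.request.Request(
        service_url,
        data=rdf_bytes,
        headers={"Content-Type": RDF_XML, "Accept": RDF_XML},
        method="POST",
    )


def build_metadata_request(service_url):
    """Build the HTTP GET that fetches the service's RDF metadata."""
    return urllib.request.Request(
        service_url,
        headers={"Accept": RDF_XML},
        method="GET",
    )


# Usage (not executed here, since it needs network access):
#   with open("input.rdf", "rb") as f:
#       request = build_invoke_request(SERVICE_URL, f.read())
#   with urllib.request.urlopen(request) as response:
#       output_rdf = response.read()
```

Using the same URL for both invocation (POST) and metadata (GET) mirrors the slide: the HTTP method alone selects which behaviour the service performs.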
  4. SADI for GMOD
     • SADI services for accessing sequence feature data
     • Implemented as Perl CGI scripts

     Service Name                      Input                 Relationship                Output
     --------------------------------  --------------------  --------------------------  ----------------------------------
     get_feature_info                  database identifier   is about                    feature description
     get_features_overlapping_region   genomic coordinates   overlaps                    collection of feature descriptions
     get_sequence_for_region           genomic coordinates   is represented by           DNA, RNA, or amino acid sequence
     get_child_features                feature description   has part / derives into     collection of feature descriptions
     get_parent_feature                feature description   is part of / derives from   collection of feature descriptions
  5. SADI for GMOD: Structure of Service Input/Output RDF
     The input RDF is sent by HTTP POST to get_feature_info, which returns the output RDF.
     (Prefix declarations for sio:, SO:, range:, and strand: are omitted on the slide.)

     Input RDF (N3):

         @prefix lsrn: <http://purl.oclc.org/SADI/LSRN/> .
         @prefix GeneID: <http://lsrn.org/GeneID:> .

         GeneID:49962
             a lsrn:GeneID_Record ;
             sio:SIO_000008 [                    # p = has attribute
                 a lsrn:GeneID_Identifier ;
                 sio:SIO_000300 "49962"          # p = has value
             ] .

     Output RDF (N3):

         @prefix lsrn: <http://purl.oclc.org/SADI/LSRN/> .
         @prefix GeneID: <http://lsrn.org/GeneID:> .
         @prefix FlyBase: <http://flybase.org/cgi-bin/sadi.gmod/feature?id=> .
         @prefix GenBank: <http://lsrn.org/GB:> .

         GeneID:49962 sio:SIO_000332 FlyBase:FBgn0040037 .   # p = is about

         # feature
         FlyBase:FBgn0040037
             a SO:SO_0000704 ;                   # o = gene
             range:position [
                 a range:RangedSequencePosition ;
                 sio:SIO_000053                  # p = has proper part
                     [ a range:StartPosition ; sio:SIO_000300 26994 ] ;
                 sio:SIO_000053                  # p = has proper part
                     [ a range:EndPosition ; sio:SIO_000300 32391 ] ;
                 range:in_relation_to _:minus_strand_seq
             ] .

         _:minus_strand_seq
             sio:SIO_000011 [                    # p = represents
                 a strand:MinusStrand ;
                 sio:SIO_000093 GenBank:AE014135 # p = is proper part of
             ] .

         # reference feature (chromosome)
         FlyBase:4                               # chromosome 4
             a SO:SO_0000105 .                   # o = chromosome arm
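As an illustration of the input structure on this slide, the following sketch generates the N3 input document for an arbitrary Entrez Gene ID. The function and template names are hypothetical. The `sio:` prefix URI is an assumption: this slide omits it, and the value used here is taken from the SPARQL query on slide 10.

```python
# Template for the get_feature_info input shown on the slide.
# Assumption: sio: expands to the namespace declared on slide 10.
INPUT_TEMPLATE = """\
@prefix lsrn: <http://purl.oclc.org/SADI/LSRN/> .
@prefix GeneID: <http://lsrn.org/GeneID:> .
@prefix sio: <http://semanticscience.org/resource/> .

GeneID:{gene_id}
    a lsrn:GeneID_Record ;
    sio:SIO_000008 [                 # has attribute
        a lsrn:GeneID_Identifier ;
        sio:SIO_000300 "{gene_id}"   # has value
    ] .
"""


def geneid_input_n3(gene_id):
    """Return an N3 input document describing one GeneID record."""
    gene_id = str(gene_id)
    # Reject anything that is not a plain number, so the identifier
    # cannot inject extra N3 syntax into the document.
    if not gene_id.isdigit():
        raise ValueError(f"not a numeric GeneID: {gene_id!r}")
    return INPUT_TEMPLATE.format(gene_id=gene_id)


if __name__ == "__main__":
    print(geneid_input_n3(49962))
```

The identifier appears twice by design: once in the record's URI and once as the literal value of the identifier attribute, matching the slide's example.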
  6. SADI for GMOD: Setting up the Services
     1. Load your GFF files into a Bio::DB::SeqFeature::Store database (mysql)
     2. Install SADI for GMOD dependencies with CPAN
     3. Download the SADI for GMOD tarball and unpack into cgi-bin
     4. Set DB connection parameters in cgi-bin/sadi.gmod/sadi.gmod.conf

            [GENERAL]
            db_adaptor = Bio::DB::SeqFeature::Store
            db_args = -adaptor DBI::mysql -dsn dbi:mysql:database=flybase
            base_url = http://flybase.org/cgi-bin/sadi.gmod/

     5. Configure Dbxref mappings in cgi-bin/sadi.gmod/dbxref.conf

            [DBXREF_TO_LSRN]
            SwissProt = UniProt
            UniProtKB = UniProt
            SwissProt/TrEMBL = UniProt
            ...

     6. Register the services in the public SADI registry: http://sadiframework.org/registry
     More info: http://code.google.com/p/sadi/wiki/SADIforGMOD
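To make the two configuration files concrete, here is a minimal sketch that reads them with Python's `configparser`. It assumes the files are plain INI as they appear on the slide; the Perl services may parse them differently. The helper names `parse_ini` and `feature_uri` are hypothetical, and the `feature?id=` suffix is inferred from the `FlyBase:` prefix on slide 5. Only the three Dbxref mappings visible on the slide are included; the rest are elided there.

```python
import configparser

# Contents of sadi.gmod.conf as shown on the slide.
SADI_GMOD_CONF = """\
[GENERAL]
db_adaptor = Bio::DB::SeqFeature::Store
db_args = -adaptor DBI::mysql -dsn dbi:mysql:database=flybase
base_url = http://flybase.org/cgi-bin/sadi.gmod/
"""

# Contents of dbxref.conf as shown on the slide (remaining entries elided there).
DBXREF_CONF = """\
[DBXREF_TO_LSRN]
SwissProt = UniProt
UniProtKB = UniProt
SwissProt/TrEMBL = UniProt
"""


def parse_ini(text):
    """Parse INI text into {section: {key: value}}, preserving key case."""
    # Only "=" is a delimiter, because values contain ":" (e.g. DBI::mysql).
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep "SwissProt" distinct from "swissprot"
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


def feature_uri(general, feature_id):
    """Build a feature URI (assumed layout: base_url + 'feature?id=' + id)."""
    return general["base_url"] + "feature?id=" + feature_id


if __name__ == "__main__":
    general = parse_ini(SADI_GMOD_CONF)["GENERAL"]
    mapping = parse_ini(DBXREF_CONF)["DBXREF_TO_LSRN"]
    print(feature_uri(general, "FBgn0040037"))
    print(mapping)
```

The mapping shows why step 5 exists: several Dbxref database names used in GFF files collapse onto a single LSRN namespace (here, UniProt).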
  7. SADI Client Software
     SHARE Query Engine
       • SPARQL Query => SADI Workflow
       • http://biordf.net/cardioSHARE/query
     SADI Taverna Plugin
       • Design SADI workflows
       • http://sadiframework.org/content/2010/05/03/sadi-taverna-plugin-tutorial/
  8. Acknowledgements
     Team
       • Mark Wilkinson: Principal Investigator
       • Luke McCarthy: Lead Programmer, SADI & SHARE
       • Edward Kawas: Perl Programmer, SADI
     Funding
       • Microsoft Research
     http://sadiframework.org/
  9. Extra Slides
  10. Demo with SHARE Query Engine
      "What proteins are homologous to FlyBase protein FBpp0288804?"

      SPARQL Query:

          PREFIX FlyBase: <http://lsrn.org/FLYBASE:>
          PREFIX sio: <http://semanticscience.org/resource/>
          SELECT ?homolog
          WHERE {
              # SIO_000332 = is about
              FlyBase:FBpp0288804 sio:SIO_000332 ?protein .
              # SIO_000205 = is represented by
              ?protein sio:SIO_000205 ?sequence .
              # SIO_010302 = is homologous to
              ?protein sio:SIO_010302 ?homolog .
          }

      SADI Workflow: (figure)
      Online demo: http://biordf.net/cardioSHARE/query
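The predicates in this query are written as prefixed names. As a small aid to reading it, the sketch below extracts the `PREFIX` declarations from the query text and expands a prefixed name to its full URI. It is a plain string-level illustration with hypothetical helper names, not a SPARQL parser and not part of SHARE.

```python
import re

# The demo query from the slide.
QUERY = """\
PREFIX FlyBase: <http://lsrn.org/FLYBASE:>
PREFIX sio: <http://semanticscience.org/resource/>
SELECT ?homolog
WHERE {
    FlyBase:FBpp0288804 sio:SIO_000332 ?protein .
    ?protein sio:SIO_000205 ?sequence .
    ?protein sio:SIO_010302 ?homolog .
}
"""

# Matches lines such as: PREFIX sio: <http://semanticscience.org/resource/>
PREFIX_RE = re.compile(r"PREFIX\s+(\w+):\s*<([^>]*)>")


def read_prefixes(query):
    """Return {prefix: namespace URI} for every PREFIX declaration."""
    return dict(PREFIX_RE.findall(query))


def expand(prefixed_name, prefixes):
    """Expand e.g. 'sio:SIO_010302' to a full URI using the prefix table."""
    prefix, _, local = prefixed_name.partition(":")
    return prefixes[prefix] + local


if __name__ == "__main__":
    prefixes = read_prefixes(QUERY)
    for name in ("FlyBase:FBpp0288804", "sio:SIO_000332", "sio:SIO_010302"):
        print(name, "->", expand(name, prefixes))
```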