SADI for GMOD: Bringing Model Organism Data onto the Semantic Web

SADI for GMOD is a collection of ready-made SADI services for accessing sequence feature data in RDF form. The services were developed as an add-on for the GMOD (Generic Model Organism Database) project, which is a popular toolkit for building model organism databases and their associated websites (e.g. FlyBase).

Transcript

  • 1. SADI for GMOD: Bringing Model Organism Databases onto the Semantic Web
    Ben Vandervalk, Luke McCarthy, Edward Kawas, Mark Wilkinson
    James Hogg Research Centre, Heart + Lung Institute, University of British Columbia
    http://code.google.com/p/sadi/wiki/SADIforGMOD
  • 2. SADI for GMOD: Background
    SADI (Semantic Automated Discovery and Integration)
      • Standard for Web services that consume/generate RDF
      • Motivation: automated integration of bioinformatics data and software
    GMOD (Generic Model Organism Database)
      • Toolkit for building a model organism database and website
      • Collection of related open source projects: e.g. Chado, GBrowse, Pathway Tools
      • Many sites use GMOD components: FlyBase, BeetleBase, DictyBase, etc.
  • 3. SADI in a Nutshell
      • to invoke a SADI service:
          o HTTP POST an RDF document to the service URI
          o e.g. $ curl --data-binary @input.rdf http://sadiframework.org/examples/hello
      • to get service metadata:
          o HTTP GET on service URL
          o returns an RDF document with service name, description, etc.
          o e.g. $ curl http://sadiframework.org/examples/hello
      • structure of input/output data is described in OWL
          o service provider specifies one input OWL class and one output OWL class
      • strengths of SADI
          o no framework-specific messaging formats or ontologies
          o supports batch processing of inputs
          o supports long-running services (asynchronous services)
    more info: http://sadiframework.org/
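    The two curl calls above can be sketched in Python with the standard library. This is a minimal illustration, not part of the SADI distribution: the function names are hypothetical, and the `Content-Type`/`Accept` value of `application/rdf+xml` is an assumption (the slide's curl example sets no explicit headers).

```python
import urllib.request

# Example service URL taken from the slide.
HELLO_SERVICE = "http://sadiframework.org/examples/hello"


def build_invocation(service_url, rdf_bytes, content_type="application/rdf+xml"):
    """Prepare (but do not send) the HTTP POST that invokes a SADI service.

    The content type is an assumption; adjust it to the RDF serialization
    you actually send.
    """
    return urllib.request.Request(
        service_url,
        data=rdf_bytes,
        headers={"Content-Type": content_type, "Accept": content_type},
        method="POST",
    )


def build_metadata_request(service_url):
    """Prepare the HTTP GET that fetches the service description."""
    return urllib.request.Request(service_url, method="GET")


def invoke(service_url, rdf_bytes):
    """Send the POST and return the raw response body (needs network access)."""
    with urllib.request.urlopen(build_invocation(service_url, rdf_bytes)) as response:
        return response.read()
```

    Calling `invoke(HELLO_SERVICE, open("input.rdf", "rb").read())` would correspond to the first curl command; building the request separately from sending it keeps the sketch inspectable offline.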
  • 4. SADI for GMOD
      • SADI services for accessing sequence feature data
      • implemented as Perl CGI scripts

    Service Name                     Input                 Relationship               Output
    get_feature_info                 database identifier   is about                   feature description
    get_features_overlapping_region  genomic coordinates   overlaps                   collection of feature descriptions
    get_sequence_for_region          genomic coordinates   is represented by          DNA, RNA, or amino acid sequence
    get_child_features               feature description   has part / derives into    collection of feature descriptions
    get_parent_feature               feature description   is part of / derives from  collection of feature descriptions
  • 5. SADI for GMOD: Structure of Service Input/Output RDF
    (the input is sent by HTTP POST to get_feature_info, which returns the output; prefix declarations for sio, SO, range, and strand are not shown on the slide)

    Input RDF (N3)

        @prefix lsrn: <http://purl.oclc.org/SADI/LSRN/> .
        @prefix GeneID: <http://lsrn.org/GeneID:> .

        GeneID:49962
            a lsrn:GeneID_Record;
            sio:SIO_000008 [                      # p = has attribute
                a lsrn:GeneID_Identifier;
                sio:SIO_000300 "49962"            # p = has value
            ] .

    Output RDF (N3)

        @prefix lsrn: <http://purl.oclc.org/SADI/LSRN/> .
        @prefix GeneID: <http://lsrn.org/GeneID:> .
        @prefix FlyBase: <http://flybase.org/cgi-bin/sadi.gmod/feature?id=> .
        @prefix GenBank: <http://lsrn.org/GB:> .

        # p = is about
        GeneID:49962 sio:SIO_000332 FlyBase:FBgn0040037 .

        # feature
        FlyBase:FBgn0040037
            a SO:SO_0000704;                      # o = gene
            range:position [
                a range:RangedSequencePosition;
                sio:SIO_000053                    # p = has proper part
                    [ a range:StartPosition; sio:SIO_000300 26994 ];
                sio:SIO_000053                    # p = has proper part
                    [ a range:EndPosition; sio:SIO_000300 32391 ];
                range:in_relation_to _:minus_strand_seq
            ] .

        _:minus_strand_seq
            sio:SIO_000011 [                      # p = represents
                a strand:MinusStrand;
                sio:SIO_000093 GenBank:AE014135   # p = is proper part of
            ] .

        # reference feature (chromosome)
        FlyBase:4                                 # chromosome 4
            a SO:SO_0000105 .                     # o = chromosome arm
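    As a rough illustration of the input structure on this slide, the sketch below generates the get_feature_info input document for a given GeneID. The function name is hypothetical, and the `sio:` namespace IRI is an assumption filled in from the Semanticscience Integrated Ontology, since the slide omits that prefix declaration.

```python
LSRN_NS = "http://purl.oclc.org/SADI/LSRN/"    # from the slide
GENEID_NS = "http://lsrn.org/GeneID:"          # from the slide
SIO_NS = "http://semanticscience.org/resource/"  # assumption: not shown on the slide


def gene_id_input_n3(gene_id):
    """Return an N3 document describing one GeneID record as service input."""
    if not gene_id.isdigit():
        raise ValueError("GeneID must be a non-empty string of digits: %r" % gene_id)
    return f"""\
@prefix lsrn: <{LSRN_NS}> .
@prefix GeneID: <{GENEID_NS}> .
@prefix sio: <{SIO_NS}> .

GeneID:{gene_id}
    a lsrn:GeneID_Record ;
    sio:SIO_000008 [                  # p = has attribute
        a lsrn:GeneID_Identifier ;
        sio:SIO_000300 "{gene_id}"    # p = has value
    ] .
"""


if __name__ == "__main__":
    print(gene_id_input_n3("49962"))
```

    Validating the identifier before templating keeps stray characters out of the generated N3; a real client would more likely build the graph with an RDF library.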
  • 6. SADI for GMOD: Setting up the Services
      1. Load your GFF files into a Bio::DB::SeqFeature::Store database (mysql)
      2. Install SADI for GMOD dependencies with CPAN
      3. Download the SADI for GMOD tarball and unpack into cgi-bin
      4. Set DB connection parameters in cgi-bin/sadi.gmod/sadi.gmod.conf

           [GENERAL]
           db_adaptor = Bio::DB::SeqFeature::Store
           db_args = -adaptor DBI::mysql -dsn dbi:mysql:database=flybase
           base_url = http://flybase.org/cgi-bin/sadi.gmod/

      5. Configure Dbxref mappings in cgi-bin/sadi.gmod/dbxref.conf

           [DBXREF_TO_LSRN]
           SwissProt = UniProt
           UniProtKB = UniProt
           SwissProt/TrEMBL = UniProt
           ...

      6. Register the services in public SADI registry: http://sadiframework.org/registry
    more info: http://code.google.com/p/sadi/wiki/SADIforGMOD
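    To make the purpose of the Dbxref mapping in step 5 concrete, here is a small Python sketch that reads a `[DBXREF_TO_LSRN]` section and normalizes database names. It is not how the Perl services read the file; it assumes the file is INI-compatible as shown on the slide, the function names are hypothetical, and returning unmapped names unchanged is an assumed fallback.

```python
import configparser

# The three mappings shown on the slide (the slide's "..." is left out).
SAMPLE_DBXREF_CONF = """\
[DBXREF_TO_LSRN]
SwissProt = UniProt
UniProtKB = UniProt
SwissProt/TrEMBL = UniProt
"""


def load_dbxref_map(conf_text):
    """Parse the [DBXREF_TO_LSRN] section into a plain dict."""
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep database names case-sensitive
    parser.read_string(conf_text)
    return dict(parser["DBXREF_TO_LSRN"])


def to_lsrn_namespace(db_name, mapping):
    """Map a Dbxref database name to its LSRN namespace.

    Assumed fallback: names without a mapping are returned unchanged.
    """
    return mapping.get(db_name, db_name)


if __name__ == "__main__":
    mapping = load_dbxref_map(SAMPLE_DBXREF_CONF)
    for name in ("SwissProt", "SwissProt/TrEMBL", "FlyBase"):
        print(name, "->", to_lsrn_namespace(name, mapping))
```

    The point of the mapping is that several source-database spellings collapse onto one LSRN namespace, so identifiers from different GFF files resolve to the same URIs.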
  • 7. SADI Client Software
      • SHARE Query Engine
          o SPARQL Query => SADI Workflow
          o http://biordf.net/cardioSHARE/query
      • SADI Taverna Plugin
          o Design SADI workflows
          o http://sadiframework.org/content/2010/05/03/sadi-taverna-plugin-tutorial/
  • 8. Acknowledgements
    Team
      • Mark Wilkinson: Principal Investigator
      • Luke McCarthy: Lead Programmer, SADI & SHARE
      • Edward Kawas: Perl Programmer, SADI
    Funding
      • Microsoft Research
    http://sadiframework.org/