Introduction   Data as a document   What it gets you   Development and maintainability   Acknowledgments




             AmiGO 2: a document-oriented approach to
           ontology software and escaping the heartache of
                                SQL.

                Seth Carbon (with Chris Mungall and Heiko Dietze)

                              Berkeley BOP (http://berkeleybop.org),
                                  Lawrence Berkeley National Lab



                                           13 July 2012



Outline


      1    Introduction

      2    Data as a document

      3    What it gets you

      4    Development and maintainability

      5    Acknowledgments



Introduction




Where we were at: The Software

      AmiGO (http://amigo.geneontology.org) is an open-source web
      application that allows users to query, browse, and visualize
      ontologies and related gene product annotation data.

      The basic things that we have to do:

               Get information about gene products and terms.
               Search by text in various fields.
               Find direct annotations for term, filtered by...
               Find all inferred annotations and/or genes to a term, filtered
               by ...



AmiGO 1.8 (term details page)



Where we were at: The Problem


      More, more, more.

      After years of clinging to a core SQL backend, with an increasing
      number of tricks, extensions, and caches to keep the performance
      at an acceptable level, things had to change...

               Complicated queries: enrichment, subsets, search, reports, etc.
               Data: ~1,500,000 -> ~13,000,000 -> ~80,000,000 -> ???
               Provided services



How the graph was in SQL



      Give me all of the genes in Drosophila annotated to
      'neurogenesis'.




      [Chado users should be familiar with our ontology model.]



Our Solution: Solr



      Solr: a specialized HTTP server over the Lucene document store.

      AmiGO 2 has greatly improved in flexibility, speed, reliability, and
      development turnaround time over its SQL predecessor.

      For example: a deep text search went from ~30s down to ~0.3s.

      It has also made things that were previously not feasible easy to do.



Data as a document




One minute overview of Solr



               It is a document store.

               Each document can have any number of named fields.

               These field names do not need to be unique: a name that
               appears multiple times behaves like a list.

               The values of these fields can be any number of atomic types
               (if they are a string).



Example to this point

        document_category:                 ontology_class
        is_obsolete:                       false
        label:                             neurogenesis
        label_searchable:                  neurogenesis
        id:                                GO:0022008
        source:                            biological_process
        description:                       Generation of cells within the nervous system.
        description_searchable:            Generation of cells within the nervous system.
        comment:                           This term was added by GO_REF:0000021.
        comment_searchable:                This term was added by GO_REF:0000021.
        synonym:                           nervous system cell generation
        synonym:                           neural cell differentiation
        synonym_searchable:                nervous system cell generation
        synonym_searchable:                neural cell differentiation
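
One way to picture the document above in client code is a plain dictionary in which a repeated field name becomes a list. The sketch below is only an illustration under that assumption; `field_pairs` is a hypothetical helper, not part of Solr or AmiGO.

```python
# A Solr-style document modeled as a plain dict; a list value stands for
# a field name that appears more than once in the document.
doc = {
    "document_category": "ontology_class",
    "id": "GO:0022008",
    "label": "neurogenesis",
    "synonym": [
        "nervous system cell generation",
        "neural cell differentiation",
    ],
}


def field_pairs(document):
    """Flatten a document into (field, value) pairs, repeating the
    field name once per value of a multi-valued field."""
    pairs = []
    for name, value in document.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            pairs.append((name, v))
    return pairs


pairs = field_pairs(doc)
```

Flattening gives back the repeated `synonym` lines seen in the listing, which is roughly how the "multiple fields behave like a list" rule reads from the client side.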



Conversion: how the graph was in SQL



      Give me all of the genes in Drosophila annotated to
      'neurogenesis'.



Conversion: the graph with Solr



      A graph aspect in Solr:

               [GO:0048770           GO:0031988          GO:0005623          GO:0031410
                GO:0005575           GO:0031982          GO:0043231          GO:0005622
                GO:0044444           GO:0005737          GO:0043226          GO:0043227
                GO:0044464           GO:0016023          GO:0043229          GO:0044424]


      More on this later...
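
A flat list of term IDs like the one above can be produced by walking the ontology graph once at load time and storing every reachable ancestor on the document. The following is a minimal sketch of that idea, not the loader's actual code: the toy graph and its `T:n` IDs are placeholders, and including the term itself in its own closure is an assumption.

```python
# Hypothetical toy graph: child -> list of direct parents over is_a/part_of.
# The IDs are placeholders, not real GO terms.
parents = {
    "T:4": ["T:3"],
    "T:3": ["T:2", "T:1"],
    "T:2": ["T:1"],
    "T:1": [],
}


def closure(term, parents):
    """Return the term plus every ancestor reachable through `parents`."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return sorted(seen)


# The flattened list can be stored as a multi-valued field on the document.
doc = {"id": "T:4", "isa_partof_closure": closure("T:4", parents)}
```

With the closure stored on each document, a "transitively related to" question becomes a simple field match instead of a graph traversal at query time.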



In for a penny, in for a pound



      In fact, why not cram in everything we can? Lucene is designed for
      data rather larger than what we have.

               JSON maps of ids to labels and labels to ids.
               Rich graph segments as non-indexed JSON blobs.
               Anything that might have been cached.
               Want just direct annotations? Add another field.



A final schema (annotation)

        document_category                     annotation
        id                                    MGI:MGI:107940_:_GO:0014013
        bioentity                             MGI:MGI:107940
        bioentity_label                       Ezh2
        bioentity_label_searchable            Ezh2
        source                                MGI
        date                                  20110523
        taxon                                 NCBITaxon:10090
        taxon_label                           Mus musculus
        taxon_label_searchable                Mus musculus
        reference                             MGI:MGI:4833736|PMID:20798045
        evidence_type                         IMP
        evidence_with                         MGI:MGI:2661097
        annotation_class                      GO:0014013
        annotation_class_label                regulation of gliogenesis
        annotation_class_label_searchable     regulation of gliogenesis
        isa_partof_closure_map                {GO:0051239:regulation of multicellular organismal process,GO:0
        isa_partof_closure                    GO:0051239
        isa_partof_closure                    GO:0009987
        ...                                   ...
        isa_partof_closure_label              regulation of multicellular organismal process
        isa_partof_closure_label              cellular process
        isa_partof_closure_label              regulation of multicellular organismal development
        ...                                   ...
        isa_partof_closure_label_searchable   regulation of multicellular organismal process
        isa_partof_closure_label_searchable   cellular process
        isa_partof_closure_label_searchable   regulation of multicellular organismal development
        ...                                   ...



What it gets you




Graph example in SQL
      Give me all of the genes in Drosophila annotated to
      'neurogenesis'.

      SELECT term.name AS superterm_name,
             term.acc AS superterm_acc,
             term.term_type AS superterm_type,
             association.*,
             gene_product.symbol AS gp_symbol,
             gene_product.symbol AS gp_full_name,
             dbxref.xref_dbname AS gp_dbname,
             dbxref.xref_key AS gp_acc,
             species.genus,
             species.species,
             species.ncbi_taxa_id,
             species.common_name
      FROM term
        INNER JOIN graph_path ON (term.id = graph_path.term1_id)
        INNER JOIN association ON (graph_path.term2_id = association.term_id)
        INNER JOIN gene_product ON (association.gene_product_id = gene_product.id)
        INNER JOIN species ON (gene_product.species_id = species.id)
        INNER JOIN dbxref ON (gene_product.dbxref_id = dbxref.id)
      WHERE



Graph example in Solr


      Give me all of the genes in Drosophila annotated to
      'neurogenesis'.

      Add to URL:

        what we want            query arg
        any doc                 q=*:*
        in genes                fq=document_category:"bioentity"
        in closure              fq=isa_partof_closure_label:"neurogenesis"
        with fly                fq=taxon_label_searchable:"Drosophila"
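
The query arguments in the table can be assembled into a request URL with nothing more than the standard library. This is a sketch under assumptions: `BASE_URL` is a hypothetical local endpoint, not the address of the AmiGO 2 store, and the parameters are exactly the ones listed on the slide.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical endpoint; substitute the address of a real Solr instance.
BASE_URL = "http://localhost:8983/solr/select"

# A list of pairs (not a dict) so that `fq` can be repeated.
params = [
    ("q", "*:*"),
    ("fq", 'document_category:"bioentity"'),
    ("fq", 'isa_partof_closure_label:"neurogenesis"'),
    ("fq", 'taxon_label_searchable:"Drosophila"'),
]

# urlencode percent-escapes the quotes, colons, and asterisks.
url = BASE_URL + "?" + urlencode(params)

# Decoding the query string again recovers the original arguments.
decoded = parse_qs(urlsplit(url).query)
```

Each `fq` narrows the result set independently, so adding or removing one filter is a one-line change rather than a rewritten join.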



Text search example in SQL

      Any ontology term that contains a reference to 'pigment', giving
      certain fields more weight than others, scored and ordered by
      relevance, transitively related to 'organelle', with the relevant parts
      highlighted if I want them.




      ... I'll get back to you.



Text search example in Solr

      Any ontology term that contains a reference to 'pigment', giving
      certain fields more weight than others, scored and ordered by
      relevance, transitively related to 'organelle', with the relevant parts
      highlighted if I want them.

      Add to URL:

        what we want            query arg
        only term               fq=document_category:"ontology_class"
        has "pigment"           defType=edismax & q=pigment
        weights                 qf=[...] label_searchable^2 id^2 [...]
        in closure              fq=isa_partof_closure_label:"organelle"
        highlighting            hl.simple.pre=<em class="hilite"> & hl=true
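
The field weights are the one part of this query that benefits from a small helper, since `qf` is a single space-separated string of `field^boost` entries. The sketch below assumes only the two boosts visible on the slide; the elided fields are left out, and `format_qf` is a hypothetical helper name.

```python
from urllib.parse import urlencode, parse_qs


def format_qf(boosts):
    """Render {field: boost} as an edismax `qf` string such as 'a^2 b^1'."""
    return " ".join(f"{field}^{boost}" for field, boost in boosts.items())


# Only the two boosts visible on the slide; the elided fields are left out.
qf = format_qf({"label_searchable": 2, "id": 2})

params = [
    ("defType", "edismax"),
    ("q", "pigment"),
    ("qf", qf),
    ("fq", 'document_category:"ontology_class"'),
    ("fq", 'isa_partof_closure_label:"organelle"'),
    ("hl", "true"),
    ("hl.simple.pre", '<em class="hilite">'),
]

# The encoded string is what gets appended to the Solr select URL.
query_string = urlencode(params)
decoded = parse_qs(query_string)
```

Keeping the boosts in a dictionary makes it easy to tune relevance without touching the rest of the request.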



Ease of Exploration I
      Facets can make life a lot easier.
      From the neurogenesis example, a couple of facets we get are:

      IMP:      1370
      ISO:      495
      IGI:      356
      IDA:      290
      IBA:      104
      ISS:      36
      TAS:      7
      NAS:      6
      ISA:      4
      IEP:      3
      IEA:      2
      IRD:      2

      ...                                                    ...
      neuron projection morphogenesis:                       636
      regulation of neuron differentiation:                  533
      locomotion:                                            447
      central nervous system development:                    408
      response to stimulus:                                  355
      ...                                                    ...
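
A facet is a count of how many matching documents carry each value of a field. Solr returns these counts itself when a request includes `facet=true` and a `facet.field`; the sketch below only imitates the result in plain Python to show what the numbers mean. The documents are hypothetical, and `facet_counts` is an illustrative helper, not a Solr call.

```python
from collections import Counter

# Hypothetical annotation documents; only the faceted field is shown.
docs = [
    {"id": "a1", "evidence_type": "IMP"},
    {"id": "a2", "evidence_type": "IMP"},
    {"id": "a3", "evidence_type": "IGI"},
    {"id": "a4", "evidence_type": "IDA"},
    {"id": "a5", "evidence_type": "IMP"},
]


def facet_counts(documents, field):
    """Count how many documents carry each value of `field`, most common first."""
    return Counter(d[field] for d in documents if field in d).most_common()


counts = facet_counts(docs, "evidence_type")
```

Because the counts arrive with the search results, a user interface can show the available filters and their sizes without issuing extra queries.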



A user interface:



Ease of Exploration II
      Also, with a small amount of work, calculations like information
      content become possible.
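
The slide does not say how information content is computed. A common definition is IC(t) = -log2(p(t)), where p(t) is the fraction of all annotations that reach term t through the closure. The sketch below assumes that definition and uses made-up counts, which in practice might be read from facet counts on a closure field.

```python
import math


def information_content(term_count, total_count):
    """IC = -log2(p), where p is the fraction of annotations that reach the term."""
    if term_count <= 0 or total_count <= 0:
        raise ValueError("counts must be positive")
    return -math.log2(term_count / total_count)


# Hypothetical counts, e.g. read from facet counts on a closure field.
ic_specific = information_content(8, 1024)    # rare term
ic_general = information_content(512, 1024)   # common term
```

Rarely used terms get a high score and broad terms a low one, so the value is a rough measure of how specific a term is.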



Caching, speed, and data as a resource




               Liberal field creation alleviates the need for a lot of caches of
               queries.
               UI seeding data (facets).
               All over HTTP: very easy to add a reverse proxy server in front.
               Easy and direct data access for third parties (HTTP clients).
               Reusable components such as term completion and spellcheck.



All folded into AmiGO 2



Development and maintainability




Cons


               Some contortions to get at the data that we want (e.g. no
               joins).
                   We can get by without them with a little thought. Also, more
                   Solr features are coming.
               Loading is enabled through a parallel software stack.
                   But we can leverage a lot of the stuff already out there
                   (e.g. OWL API, SolrJ, etc.).
               There is more overhead in the creation and maintenance of the
               various fields necessary to make this all work out.
                   We use a configuration and loading manager in
                   OWLTools and AmiGO 2.



OWLTools FlexLoader (YAML-based)
        id: bbop_ont
        description: Test mapping of ontology class for GO.
        display_name: Ontology
        document_category: ontology_class
        weight: 40
        boost_weights: id^2.0 label^2.0 description^1.0 comment^0.5 synonym
        result_weights: label^10.0 id^8.0 description^6.0 source^4.0 synonym^3
        fields:
          - id: label
            description: Common term name.
            display_name: Term
            type: string
            property: [getAnnotationPropertyValues, label]
            searchable: true
          - id: ...
        ...
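
A configuration like this can drive which fields end up on each document. The sketch below is a guess at one such rule, not the FlexLoader's actual behavior: it assumes that a field marked `searchable: true` also gets a `<id>_searchable` copy, as the earlier document listing suggests. The `source` entry and the function name are hypothetical, and the config is written as a Python dict to avoid depending on a YAML parser.

```python
# A fragment of the configuration above, written as a Python dict.
# The `source` entry is a hypothetical addition for contrast.
config = {
    "id": "bbop_ont",
    "document_category": "ontology_class",
    "fields": [
        {"id": "label", "type": "string", "searchable": True},
        {"id": "source", "type": "string", "searchable": False},
    ],
}


def solr_field_names(conf, suffix="_searchable"):
    """List the field names a loader might create (assumed rule:
    searchable fields also get a `<id>_searchable` copy)."""
    names = []
    for field in conf["fields"]:
        names.append(field["id"])
        if field.get("searchable", False):
            names.append(field["id"] + suffix)
    return names


names = solr_field_names(config)
```

Deriving the field list from one file keeps the loader and the client in agreement about what exists in the store.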



Pros

               It's very very fast.
               No more long running queries.
               No ORMs (or SQL).
               Development and debugging easier for clients: everything is in
               a uniform return schema/type.
               Decreased number of layers necessary to complete a client
               program: you just need an HTTP client.
               New classes of features like JavaScript APIs, autocomplete,
               and spellcheck...
               Trivial to offer web APIs.
               Scales nicely (for example, can chop up store between different
               servers).



The software involved

               Solr over Jetty (the store)
               License: Apache 2
               https://lucene.apache.org/solr/
               AmiGO 2 (the clients)
               Seth Carbon, Chris Mungall, Shahid Manzoor, Heiko Dietze,
               Gene Ontology Consortium
               License: Modified BSD
               http://wiki.geneontology.org/index.php/AmiGO_2
               http://amigo2.berkeleybop.org
               OWLTools (the loader)
               Chris Mungall, Heiko Dietze, Gene Ontology Consortium
               License: BSD 2-Clause
               https://code.google.com/p/owltools/



Acknowledgments




      Berkeley Bioinformatics Open-source Projects

      The Gene Ontology Consortium

      Saccharomyces Genome Database

      All the users of AmiGO

      All the future users of AmiGO 2



AmiGO 2
