KUPKB: Sharing, Connecting and
 Exposing Kidney and Urinary
Knowledge using RDF and OWL




              www.kupkb.org

        Julie Klein & Simon Jupp
      Bio-health informatics group
        University of Manchester
The problem domain

Thousands of studies have been conducted by the kidney research community

          • On different species
                                            human   mouse

          • On different materials
                                            urine   tissue    cell

          • On different biological levels
                                            gene    protein

      Large diversity → Integration of the knowledge is complex
Where does the data go?
      Bespoke kidney laboratory databases
                                             Research Papers




            Generalist databases




Scattered, hidden in figures, coming in different formats
                Most of the data is lost!
The Kidney and Urinary Pathway Knowledge Base:

                                        SHARE AND CONNECT

The iKUP Browser:

                                        EXPOSE


                        www.kupkb.org
Structure

 Populous
                           Experimental data


KUP Ontology
  (schema)
                                  RightField




               RDF triple store


                                               iKUP Browser
                KUP Knowledge Base
Ontologies provide the schema
                  What has been observed, where and when?



      Mouse anatomy                                             Experimental factors
         ontology

                                     Gene Ontology


                                                                  Animal model
      Cell type ontology
                                                                Disease ontology




             We needed to connect these reference ontologies.
Creation of a specialized Kidney and Urinary Pathway Ontology (KUPO)
                           http://www.e-lico.org/public/kupo/
Ontologies by stealth
                 The domain experts are the experts, so get them to build it

                 Cells (CTO)      Anatomy (MAO)      Biological processes (GO)

Spreadsheet
  OPPL Scripts




 Ontology

                 Populous generates simple Excel based templates
                          http://www.e-lico.eu/populous.html
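
The spreadsheet-to-ontology step can be pictured with a minimal Python sketch. It does not reproduce Populous or OPPL: the column names (`term`, `parent`), the `EX` namespace and the helper `rows_to_triples` are hypothetical, and the output is only a list of subclass triples printed in an N-Triples-like form.

```python
import csv
import io

# Hypothetical namespace; the real KUPO IRIs are not given on the slides.
EX = "http://example.org/kupo#"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def to_iri(label):
    """Turn a spreadsheet label into an IRI in the illustrative namespace."""
    return EX + "_".join(label.strip().split())


def rows_to_triples(csv_text):
    """Read 'term,parent' rows and emit one subclass triple per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    triples = []
    for row in reader:
        triples.append((to_iri(row["term"]), RDFS_SUBCLASS, to_iri(row["parent"])))
    return triples


# Illustrative sheet content, standing in for an Excel template filled in by a domain expert.
SHEET = "term,parent\nproximal tubule cell,kidney cell\nkidney cell,cell\n"

for s, p, o in rows_to_triples(SHEET):
    print(f"<{s}> <{p}> <{o}> .")
```

The point of the sketch is the division of labour: the expert only fills in rows, and a script turns each row into an axiom.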
Describing/Collecting experimental data
Gathering good meta-data AND data, again by stealth, using RightField




                      Content of the meta-data cells is constrained to
                      the relevant set of KUPO terms




                  http://www.sysmo-db.org/rightfield
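
Constraining meta-data cells to a fixed set of terms amounts to a membership check per column. The Python sketch below illustrates that idea only; it is not RightField's implementation. The column names and the helper `invalid_cells` are hypothetical, and the allowed values are borrowed from the problem-domain slide rather than from the actual KUPO term lists.

```python
# Hypothetical allowed values per metadata column; the real KUPO term
# sets are not listed on the slides.
ALLOWED_TERMS = {
    "species": {"human", "mouse"},
    "material": {"urine", "tissue", "cell"},
    "level": {"gene", "protein"},
}


def invalid_cells(row):
    """Return the (column, value) pairs whose value is outside the allowed set."""
    problems = []
    for column, allowed in ALLOWED_TERMS.items():
        value = row.get(column, "").strip().lower()
        if value not in allowed:
            problems.append((column, value))
    return problems


good = {"species": "Mouse", "material": "urine", "level": "protein"}
bad = {"species": "rat", "material": "urine", "level": "protein"}
print(invalid_cells(good))
print(invalid_cells(bad))
```

Rejecting a value at entry time is what keeps the later integration step free of ad hoc spellings and synonyms.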
Mashing it all together



 Kidney and Urinary Pathway Ontology                          Experimental data
~1800 classes (~40,000 after imports closure)         220 KUP experiments integrated


                                        OWL reasoning




              RDF triple store
               ~35M triples
                                         KUP Knowledge Base
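
One thing reasoning over the ontology buys at query time is that a question about a general class also retrieves data annotated with its subclasses. The Python sketch below shows only that transitive subclass expansion, which is a small fraction of what an OWL reasoner does. The class names, the experiment identifiers and both helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical class hierarchy: (subclass, superclass) pairs.
SUBCLASS_OF = [
    ("proximal tubule cell", "kidney cell"),
    ("podocyte", "kidney cell"),
    ("kidney cell", "cell"),
]

# Hypothetical annotations: experiment id -> class it was annotated with.
ANNOTATIONS = {"exp1": "podocyte", "exp2": "proximal tubule cell", "exp3": "cell"}


def descendants(cls, pairs):
    """Return cls plus every class below it in the subclass hierarchy."""
    children = defaultdict(set)
    for sub, sup in pairs:
        children[sup].add(sub)
    seen, stack = {cls}, [cls]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def experiments_about(cls):
    """Experiments annotated with cls or with any of its subclasses."""
    wanted = descendants(cls, SUBCLASS_OF)
    return sorted(exp for exp, c in ANNOTATIONS.items() if c in wanted)


print(experiments_about("kidney cell"))
```

Asking for "kidney cell" returns the podocyte and proximal tubule experiments even though neither was annotated with that exact term.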
SPARQLing results
Make it all RDF/OWL and expose a SPARQL endpoint…
                                       …then we are done, right?

   • We can now ask queries that span several databases
   • We can exploit OWL semantics for intelligent answers


     BUT!
   • Easy-to-use application…
                                       …this is what biologists really want
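
For a sense of what using a raw SPARQL endpoint involves, here is a minimal Python sketch that builds a query and the corresponding request URL. The endpoint address, the `ex:` vocabulary (`ex:observed`, `ex:inTissue`, `ex:symbol`) and both helper functions are hypothetical, since the slides do not give the real KUPKB schema. No network request is made.

```python
from urllib.parse import urlencode

# Hypothetical endpoint and vocabulary; the slides do not give the real ones.
ENDPOINT = "http://example.org/sparql"
PREFIXES = "PREFIX ex: <http://example.org/kupkb#>\n"


def build_query(gene_symbol):
    """Build a SELECT query for experiments that observed a given gene."""
    # Escape backslashes and quotes so the symbol stays inside the string literal.
    safe = gene_symbol.replace("\\", "\\\\").replace('"', '\\"')
    return (
        PREFIXES
        + "SELECT ?experiment ?tissue WHERE {\n"
        + "  ?experiment ex:observed ?gene ;\n"
        + "              ex:inTissue ?tissue .\n"
        + f'  ?gene ex:symbol "{safe}" .\n'
        + "}"
    )


def request_url(query):
    """URL for a SPARQL protocol GET request carrying the query."""
    return ENDPOINT + "?" + urlencode({"query": query})


query = build_query("CALR")
print(query)
print(request_url(query))
```

Writing this by hand requires knowing the schema and the query language, which is the gap a browser front end is meant to close.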
The iKUP browser




Built as an easy-to-use and light Google Web Toolkit application
To expose data from the KUPKB
Doing some biology
1. A biological question
   Can calreticulin be associated with the development of human kidney disease?

2. No answer with classical tools
   Searches in PubMed and Google do not return any relevant result!

3. Querying the KUPKB

4. Validation in the wet-lab
   KUPKB in silico result confirmed.

5. Publish an innovative result
   Accepted for publication in the FASEB J!
Reusing and Building


Ontologies provide the schema                        Experimental data


                                OWL reasoning




       RDF triple store
                                KUP Knowledge Base
Reusing and Building


    Ontologies provide the schema                          Experimental data
Kidney and Urinary Pathway Ontology                 Annotations, homogenization
   Tool to facilitate ontology building            Tool to facilitate data annotation
                                        OWL reasoning




            RDF triple store                                     iKUP Browser
                                     KUP Knowledge Base
What next


 • User study and evaluation experiments ongoing with
   Manchester Web Ergonomics Lab

 • Application to other biological domains
     • Change the domain model in the ontologies and we can construct any
       organ knowledge base in this way
     • Interest already expressed in gut, liver, heart and metabolic diseases
Acknowledgments
•   Simon Jupp

•   Stuart Owen, Matthew Horridge, Katy Wolstencroft and Carole Goble @
    University of Manchester for RightField

•   Joost Schanstra, Panagiotis Moulos, Jean-Loup Bascands @ Renal Fibrosis
    Lab, Toulouse, France

•   Aristidis Charonis, Bénédicte Buffin-Meyer, Myriem Fernandez for the CALR
    example

•   e-LICO FP7 project and EuroKUP

•   Robert Stevens, ontology development, University of Manchester

    Open Source License: GNU Lesser General Public License
    Code: http://code.google.com/p/kupkb-dev/
Thank you for listening…




www.kupkb.org
Some rough stats…
• 195 KUP experiments integrated
• KUPKB RDF store ~35M triples
• KUP Ontology ~1800 classes, ~40,000 after imports closure



Architecture
• Sesame and BigOWLIM for the RDF store
• Web site developed with Google Web Toolkit
• OWL API and HermiT reasoner for classification and faceted browsing
Summary
   • The KUPKB RDF store is a mashup of biological knowledge relating to the
     KUP domain

   • Ontologies provide the schema and a consistent data annotation mechanism

   • We expose this knowledge base through a simple web interface that real
     biologists can use, the iKUP

   • iKUP and KUPKB provide a faster mechanism for the biologist to survey the
     data in biological publications and help the hypothesis generation process

   • It is a testament to the tools and APIs that such applications are now being
     delivered at relatively low cost


Editor's Notes

  • #3 Renal physiology Human urinary protein map Renal pathophysiology Biomarker discovery