Customisable Query Resolution in
     Biology and Medicine


                Peter Ansell
Microsoft Queensland University ...
Outline

●   What data is out there
●   Current data formats
●   RDF based system
●   Biology and Medicine case study




Current data formats

●   FASTA
●   EMBL
●   GFF
●   BSML
●   Genbank
●   Many other formats, including custom XML

Brisbane   Health Informatics and Knowledge Management Workshop   21 Jan 2010
Linked Data
1) Use URIs as names for things
2) Use HTTP URIs so that people can look up
those names.
3) When someone looks...
Bio2RDF distributed queries
●   Assign namespaces to providers and create
    URIs based on the namespace
●   Just using ...
Bio2RDF workflow

                    Resolved URI: http://bio2rdf.org/label/go:0000345



           Host name: http://bi...
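
The workflow on this slide can be sketched in a few lines: strip the host name from the resolved URI to get the query, then match the query against the regular expression for the label query type. This is a minimal sketch, not the Bio2RDF implementation. The function name `parse_resolved_uri` is hypothetical, and the pattern assumes the slide's expression used the character class `[\w-]` (the transcript shows `[w-]`, which would not match `go`, so a backslash was probably lost in extraction).

```python
import re

HOST = "http://bio2rdf.org/"
# Assumed form of the slide's pattern, with the backslash restored.
LABEL_PATTERN = re.compile(r"label/([\w-]+):(.+)")


def parse_resolved_uri(uri, host=HOST):
    """Split a resolved URI into (query, namespace, identifier).

    Returns None when the URI is not under the host or when the
    query part does not match the label pattern.
    """
    if not uri.startswith(host):
        return None
    query = uri[len(host):]
    match = LABEL_PATTERN.fullmatch(query)
    if match is None:
        return None
    namespace, identifier = match.groups()
    return query, namespace, identifier


print(parse_resolved_uri("http://bio2rdf.org/label/go:0000345"))
# ('label/go:0000345', 'go', '0000345')
```

The captured namespace (`go` here) is what lets a resolver decide which providers and query types apply to the request.
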
Demo background

●   The background for this hypothetical
    demonstration is a patient who has not been
    responding w...
Genomics demo
●   http://bio2rdf.org/drugbank_drugs:DB01247
●   Isocarboxazid
●   http://bio2rdf.org/drugbank_targets:3939...
Drug effects demo
●   http://bio2rdf.org/links/drugbank_drugs:DB01247
●   http://bio2rdf.org/drugbank_druginteractions:DB0...
Alternative drugs demo

●   http://bio2rdf.org/drugbank_targets:3939
●   http://bio2rdf.org/pfam:PF01593
               – ...
Private and public data

●   Private information could be provided using
    current or future access models
●   Public in...
Conclusion

●   Many large distributed data sources
●   Single interface, RDF
●   Distribute queries efficiently across the...

HIKM2010 - Query Resolution for Biology and Medicine

  1. Customisable Query Resolution in Biology and Medicine. Peter Ansell, Microsoft Queensland University of Technology eResearch Centre, p.ansell@qut.edu.au
  2. Outline ● What data is out there ● Current data formats ● RDF based system ● Biology and Medicine case study (Brisbane Health Informatics and Knowledge Management Workshop, 21 Jan 2010)
  3. Current data formats ● FASTA ● EMBL ● GFF ● BSML ● Genbank ● Many other formats, including custom XML
  6. Linked Data 1) Use URIs as names for things. 2) Use HTTP URIs so that people can look up those names. 3) When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL). 4) Include links to other URIs, so that they can discover more things. http://www.w3.org/DesignIssues/LinkedData.html
  7. Bio2RDF distributed queries ● Assign namespaces to providers and create URIs based on the namespace ● Just using RDF is not enough; the URIs have to be transparent enough to be used and referenced ● Query across the relevant providers for a user's query and get the results in a single RDF document ● Aggregate all results into a single RDF document and return it to the user
  8. Bio2RDF workflow Resolved URI: http://bio2rdf.org/label/go:0000345 Host name: http://bio2rdf.org/ Query: label/go:0000345 Regular expression: label/([\w-]+):(.+) http://bio2rdf.org/query:labelsearch http://bio2rdf.org/query:labelsearchforgo
  9. Demo background ● The background for this hypothetical demonstration is a patient who has not been responding well to a particular drug, Isocarboxazid, as a treatment for their depression ● The goal is to determine what information is available to a doctor in changing the treatment
  10. Genomics demo ● http://bio2rdf.org/drugbank_drugs:DB01247 ● Isocarboxazid ● http://bio2rdf.org/drugbank_targets:3939 ● http://bio2rdf.org/hgnc:6834 ● http://bio2rdf.org/geneid:4129 – MAOB ● http://bio2rdf.org/pubmed:10653595 – Localisation of MAOA and MAOB in pancreas, thyroid and adrenal glands
  11. Drug effects demo ● http://bio2rdf.org/links/drugbank_drugs:DB01247 ● http://bio2rdf.org/drugbank_druginteractions:DB00176_DB01247 – Possible adverse effects with Fluvoxamine ● http://bio2rdf.org/sider_drugs:3759 ● http://bio2rdf.org/sider_sideeffects:C0027813 – Known possible side effect of Neuritis
  12. Alternative drugs demo ● http://bio2rdf.org/drugbank_targets:3939 ● http://bio2rdf.org/pfam:PF01593 – Amino oxidase protein family ● http://bio2rdf.org/drugbank_targets:3041 – Similar protein, L-amino-acid oxidase ● http://bio2rdf.org/drugbank_drugs:DB03147 – Drug for similar protein, Flavin-Adenine Dinucleotide
  13. Private and public data ● Private information could be provided using current or future access models ● Public information can be linked to make explicit the links from the private patient or clinical information to the wider set of biological and chemical databases
  14. Conclusion ● Many large distributed data sources ● Single interface, RDF ● Distribute queries efficiently across the endpoints ● Allow for private data to remain private, but be linked out to public information
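
The distributed query steps on slide 7 (namespaces assigned to providers, a query sent only to the relevant providers, results aggregated into one document) can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the `Provider` class, the `resolve` function, the endpoint names and the placeholder labels are hypothetical, and a set of triples stands in for the single RDF document. Only the `http://bio2rdf.org/namespace:identifier` URI shape comes from the slides.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


@dataclass(frozen=True)
class Provider:
    """Hypothetical stand-in for a remote endpoint serving some namespaces."""
    name: str
    namespaces: FrozenSet[str]
    fetch: Callable[[str, str], Iterable[Triple]]


def bio2rdf_uri(namespace: str, identifier: str) -> str:
    # URI shape used throughout the slides, e.g. http://bio2rdf.org/geneid:4129
    return f"http://bio2rdf.org/{namespace}:{identifier}"


def resolve(namespace: str, identifier: str,
            providers: List[Provider]) -> Set[Triple]:
    """Query only providers registered for the namespace and merge results."""
    merged: Set[Triple] = set()
    for provider in providers:
        if namespace in provider.namespaces:
            merged.update(provider.fetch(namespace, identifier))
    return merged


def placeholder_fetch(label_text: str) -> Callable[[str, str], List[Triple]]:
    # Returns made-up data; a real provider would query its endpoint here.
    def fetch(namespace: str, identifier: str) -> List[Triple]:
        return [(bio2rdf_uri(namespace, identifier), RDFS_LABEL, label_text)]
    return fetch


providers = [
    Provider("endpoint-a", frozenset({"go"}), placeholder_fetch("placeholder label A")),
    Provider("endpoint-b", frozenset({"go", "geneid"}), placeholder_fetch("placeholder label B")),
    Provider("endpoint-c", frozenset({"pubmed"}), placeholder_fetch("placeholder label C")),
]

merged = resolve("go", "0000345", providers)
for triple in sorted(merged):
    print(triple)
```

Holding the merged result as a set means that identical triples returned by two providers collapse into one, which is one simple way to aggregate results; the slides do not say how Bio2RDF itself handles duplicates.
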
