NCBO DBP


  1. MCW Driving Biological Project. Simon Twigger, PhD. Monday, September 27, 2010
  2. Rat Genome Database
  3. What's the problem? • Large-scale repositories with unused or inaccessible information • How can these databases be made more useful? • How to help researchers find and use this information to connect genes to disease?
  4. Rat researchers ask... • What tissue is this gene expressed in? • What expression data is known for SD (aka SD/NHsd, Harlan Sprague Dawley, Sprague Dawley) rats? • Are any of these genes associated with my phenotype? • Has this gene been seen in the brain? • What rat expression studies have been done on Mammary Cancer (aka breast neoplasms/breast cancer/cancer of the breast, breast carcinoma...)?
  5. What's the strategy? • Focus on GEO (microarray) • Use NCBO annotator to mark up text, review annotations and then use for tools and visualization • Combine annotations with biological data to derive new insights. Workflow diagram: GEO Records → Create Annotation Jobs & Queue Up → Q-Out (RabbitMQ) → 1..n Annot. Workers → Index text at OBA → Parse Results and put results into queue for save → Q-In → Results saved to GMiner database
  6. Current Ontologies: http://bioportal.bioontology.org/
  7. (no slide text)
  8. (no slide text)
  9. Progress
  10. Linking annotations to data. Genes: Tm2d1, RGD1306410, Svs4, Hbb, Scgb2a1, Alb
  11. Linking annotations to data. Genes: Tm2d1, RGD1306410, Svs4, Hbb, Scgb2a1, Alb. Statements: Hbb is_expressed_in rat kidney; Tm2d1 is_expressed_in rat kidney. Platforms: Human (U133, U133v2), Mouse (430, U74, U95) and Rat (U34a/b/c, 230, 230v2). 62,000 samples x ca. 25,000 genes/sample ≈ 1.5B data points
  12. Probeset results on GMiner: Probeset L08490cds_at for Gabra1 - gamma-aminobutyric acid (GABA) A receptor, alpha 1; Hs GABRA1
  13. Diagram labels: QTL; G (genes); Hypertensive Phenotype; Strain 1 != Strain 2; Pathway; Anatomy (Kidney); Component; Function; Process; Hypertension
  14. QTL Gene ‘Highlighter’. Diagram labels: QTL; G (genes); AllegroGraph; Disease/Pheno.; GMiner; RGD; OBO etc.
  15. RDF/OWL sources • Cell Ontology: http://www.berkeleybop.org/ontologies/obo-all/cell/cell.owl • Mouse Adult Gross Anatomy: http://www.berkeleybop.org/ontologies/obo-all/adult_mouse_anatomy/adult_mouse_anatomy.owl • Mammalian Phenotype: http://www.berkeleybop.org/ontologies/obo-all/mammalian_phenotype/mammalian_phenotype.owl • GO Function: http://www.berkeleybop.org/ontologies/obo-all/molecular_function/molecular_function.owl • GO Process: http://www.berkeleybop.org/ontologies/obo-all/biological_process/biological_process.owl • GO Component: http://www.berkeleybop.org/ontologies/obo-all/cellular_component/cellular_component.owl
  16. Rat Genome Database: wide variety of data types, genomic and physiological, many with corresponding ontologies
  17. (no slide text)
  18. RGD->RDF: existing RGD ‘object types’ & mappings to SO
  19. RGD Gene
  20. RGD QTL
  21. QTL Highlighter • Rails source code will be available on GitHub • RDFizer (ruby): http://github.com/simont/MCW-RDF
  22. Next Steps • Register PURL for RGD • Create RGD core object ontology (OWL/RDF) • Select appropriate URIs for RGD data • Ontology annotations: how best to represent in triple store? • Export GMiner data to RDF -> Triple Store • Document & refine biological use cases related to candidate gene selection/evaluation • Identify additional data required for candidate gene selection, RDFize as appropriate, load into triple store • Connections to other RDF collections/LOD, etc.?
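
The queue-based workflow on slide 5 (jobs queued, 1..n annotation workers, results queued for saving) can be sketched roughly as follows. This is a minimal in-process illustration, not the project's code: Python's standard `queue` and `threading` modules stand in for RabbitMQ, a toy term list stands in for the NCBO annotator, a plain dictionary stands in for the GMiner database, and the record IDs and texts are invented.

```python
import queue
import threading

# Toy term list standing in for the NCBO annotator (assumption: the real
# annotator is a web service and is not called here).
TERMS = ("kidney", "hypertension", "brain")


def annotate(text):
    """Return the sorted toy terms found in `text` (case-insensitive)."""
    lowered = text.lower()
    return sorted(term for term in TERMS if term in lowered)


def worker(q_out, q_in):
    """Take jobs from q_out, annotate them, and push results onto q_in."""
    while True:
        job = q_out.get()
        if job is None:  # sentinel: no more jobs for this worker
            break
        record_id, text = job
        q_in.put((record_id, annotate(text)))


def run_pipeline(records, n_workers=2):
    """Queue one job per record, fan out to workers, collect the results."""
    q_out, q_in = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(q_out, q_in))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for job in records.items():
        q_out.put(job)
    for _ in threads:
        q_out.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    saved = {}  # stands in for the GMiner database
    while not q_in.empty():
        record_id, terms = q_in.get()
        saved[record_id] = terms
    return saved


if __name__ == "__main__":
    # Hypothetical record IDs and descriptions, for illustration only.
    demo = {
        "REC1": "Rat kidney tissue, hypertension model",
        "REC2": "Whole brain, control animals",
    }
    print(run_pipeline(demo))
```

Separating the outbound and inbound queues means slow annotation calls never block the step that saves results, which is presumably why the slide shows both Q-Out and Q-In.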

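The statements on slide 11 (for example `Hbb is_expressed_in rat kidney`) and the QTL gene highlighter idea on slides 13 and 14 can be illustrated with a small triple-based sketch. Several things here are assumptions: the triples live in a Python list rather than a triple store such as AllegroGraph, the `located_in_qtl` predicate and the QTL names are hypothetical, and the base URI is a placeholder rather than a registered RGD PURL.

```python
from typing import NamedTuple
from urllib.parse import quote


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The two is_expressed_in statements appear on the slides; the QTL
# membership statements and QTL names are invented for illustration.
TRIPLES = [
    Triple("Hbb", "is_expressed_in", "rat kidney"),
    Triple("Tm2d1", "is_expressed_in", "rat kidney"),
    Triple("Hbb", "located_in_qtl", "ExampleQTL1"),    # hypothetical
    Triple("Alb", "located_in_qtl", "ExampleQTL1"),    # hypothetical
    Triple("Tm2d1", "located_in_qtl", "ExampleQTL2"),  # hypothetical
]

BASE = "http://example.org/rgd/"  # placeholder base URI, not an RGD PURL


def subjects(triples, predicate, obj):
    """Return the set of subjects that carry the given predicate and object."""
    return {t.subject for t in triples if t.predicate == predicate and t.obj == obj}


def highlight_genes(triples, qtl, tissue):
    """Return genes located in `qtl` that are also expressed in `tissue`."""
    in_qtl = subjects(triples, "located_in_qtl", qtl)
    in_tissue = subjects(triples, "is_expressed_in", tissue)
    return sorted(in_qtl & in_tissue)


def to_ntriples(triple, base=BASE):
    """Serialize one triple as an N-Triples line with URI-safe terms."""
    s, p, o = (quote(term.replace(" ", "_")) for term in triple)
    return f"<{base}{s}> <{base}{p}> <{base}{o}> ."


if __name__ == "__main__":
    print(highlight_genes(TRIPLES, "ExampleQTL1", "rat kidney"))
    print(to_ntriples(TRIPLES[0]))
```

With real data, the same intersection would typically be expressed as a query against the triple store instead of set operations in application code.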