PhD thesis presentation

  1. Next-generation text mining applied to toxicogenomics data analysis. Kristina Hettne. PhD thesis defense, 20 December 2012
  2. Toxicogenomics: study whether a chemical causes damage to genes. Text mining: teach a computer to “read” articles and extract explicit information. Next-generation text mining: teach a computer to find implicit information in articles.
  3. Drug safety is essential! But… how to minimize animal testing? Image source: The Independent, July 12, 2012
  4. Toxicogenomics data: interpretation using knowledge from manually curated databases. Image sources: Verhallen and Piersma, 2011; de Jong et al., 2011; http://www.flickr.com/photos/jseita/3764113525/
  5. Toxicogenomics data: interpretation using knowledge from manually curated databases. Not sufficient in coverage. We hypothesize that next-generation text mining can increase the information coverage. Image sources: Verhallen and Piersma, 2011; de Jong et al., 2011; http://www.flickr.com/photos/jseita/3764113525/
  6. Next-generation text mining = concept profile matching. Figure labels: information cloud for a gene concept; shared concepts; information cloud for a chemical concept. Image source: Herman van Haagen
  7. Concepts come from a thesaurus and are identified in text with concept identification software. A good thesaurus = the basis for good concept identification. Image source: Herman van Haagen
  8. Research objectives: investigate information coverage in public biomedical and chemical thesauri and databases; provide methods to improve the quality and coverage; give recommendations for use; investigate the added value of next-generation text mining when interpreting toxicogenomics data.
  9. Results
  10. A thesaurus of chemical concepts [1] and methods [1,2,3] to prepare a thesaurus to be used with concept identification software. http://www.biosemantics.org/casper http://www.biosemantics.org/jochem [1] Hettne et al. Bioinformatics, 2009. [2] Hettne et al. Journal of Biomedical Semantics, 2010. [3] Hettne et al. Journal of Cheminformatics, 2010.
  11. A next-generation text mining-based method for interpreting biological data. Diagram labels: biological data; next-generation text mining; statistical test. This method gives more, and more specific, results [1] than other available tools. http://www.biosemantics.org/weightedglobaltest [1] Jelier R, Goeman JJ, Hettne KM, Schuemie MJ, den Dunnen JT, 't Hoen PA. Briefings in Bioinformatics, 2011.
  12. Application to toxicogenomics. Hettne et al. (submitted) http://www.biosemantics.org/index.php?page=chemicalresponse-specific-gene-sets
  13. See developmental defects in stem cells instead of in animal embryos. Figure labels: embryonic structure; posterior neuropore open; A) control group rat embryo; B) triazole-exposed rat embryo. Image sources: 1. Verhallen and Piersma, 2011; 2. De Jong et al., 2012
  14. Toxicity class prediction (case study: triazoles). A chemical-gene matrix 25 times larger than the one obtained from manual curation (Comparative Toxicogenomics Database). Figure label: chemical. Image source: Verhallen and Piersma, 2011
  15. Conclusions: next-generation text mining combined with statistical tests complements, and is sometimes superior to, manually curated databases in: relating chemical information to gene expression data; identifying toxic effects already at the gene expression stage; discriminating between different classes of chemicals.
  16. Future: (1) make the method easier to use (currently being worked on); (2) apply the method to new drugs with unknown toxicity. Early prediction of toxicity -> less animal testing and safer drugs.
  17. Thank you to all who made this possible!
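
The slide on concept profile matching describes two "information clouds", one for a gene concept and one for a chemical concept, linked by the concepts they share. The idea can be illustrated with a minimal sketch. It assumes each profile is a dictionary mapping concept names to weights, and that the match score is the sum of weight products over the shared concepts; the slides do not specify the scoring, so this is an illustrative choice rather than the method used in the thesis. The concept names, the weights, and the functions `shared_concepts` and `match_score` are all hypothetical.

```python
# Hypothetical sketch of concept profile matching.
# A profile maps concept names to association weights (invented values).


def shared_concepts(profile_a, profile_b):
    """Return the set of concepts present in both profiles."""
    return set(profile_a) & set(profile_b)


def match_score(profile_a, profile_b):
    """Sum the products of weights over shared concepts (assumed scoring)."""
    return sum(
        profile_a[concept] * profile_b[concept]
        for concept in shared_concepts(profile_a, profile_b)
    )


# Invented example profiles for one gene concept and one chemical concept.
gene_profile = {
    "neural tube defects": 0.5,
    "embryonic development": 0.25,
    "retinoic acid": 0.25,
}
chemical_profile = {
    "neural tube defects": 0.5,
    "antifungal agents": 0.25,
    "retinoic acid": 0.5,
}

if __name__ == "__main__":
    print(sorted(shared_concepts(gene_profile, chemical_profile)))
    print(match_score(gene_profile, chemical_profile))
```

In this toy example the gene and the chemical never need to be mentioned in the same article: the link is implied by the concepts both profiles contain, which is the sense of "implicit information" used in slide 2.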
