First year present

Published in: Education, Technology

  1. Using Text Mining to Explore Concept Complexity in Obesity through Concept Maps. George Karystianis, School of Computer Science. Supervisors: Goran Nenadic, Iain Buchan. Advisor: Andrea Schalk.
  2. Motivation ● Complex nature of obesity. ● Wide range of biomedical data sources available: – implementation of biomedical text/data mining. ● Possible to reveal hidden links between obesity and other diseases. ● Partially completed knowledge representation models of obesity. ● A systematic approach is required for: – analysis and interpretation of clinical knowledge.
  3. Concept Maps ● Knowledge representation models. ● Consist of: – nodes (concepts), – links (relationships between the nodes). ● Aim: gather, understand, explore knowledge. ● Variety of users. ● No explicit detail. ● Implemented primarily in education.
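
The node-and-link structure described on this slide can be sketched as a small labelled directed graph. This is only an illustration: the class name `ConceptMap`, its methods, and the example concepts and relation labels are assumptions made for the sketch and do not come from the presentation.

```python
from collections import defaultdict


class ConceptMap:
    """Directed graph: nodes are concepts, links carry a relationship label."""

    def __init__(self):
        self._concepts = set()
        self._links = defaultdict(list)  # source -> [(relation, target), ...]

    def add_link(self, source, relation, target):
        # Adding a link implicitly registers both concepts as nodes.
        self._concepts.update((source, target))
        self._links[source].append((relation, target))

    def concepts(self):
        return sorted(self._concepts)

    def links_from(self, concept):
        # .get avoids creating an empty entry for unknown concepts.
        return list(self._links.get(concept, []))


# Hypothetical example content, chosen only to show the structure.
cmap = ConceptMap()
cmap.add_link("obesity", "increases risk of", "type 2 diabetes")
cmap.add_link("physical activity", "reduces", "obesity")
```

Keeping the relation as a free-text label mirrors the "no explicit detail" property of concept maps; a stricter model would restrict labels to a fixed vocabulary.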
  4. Concept Map Example
  5. Aim ● To design a framework to build/enhance medical concept maps. ● To improve the understanding of health care concept complexity. ● To assist medical professionals in the representation, exploration and validation of their expert knowledge. ● To improve clinical health care.
  6. Objectives ● Design and implement methods for health care concept detection. ● Organise concepts in a concept map form. ● Generate methods for concept map updates. ● Build a framework for the design/enhancement/validation of medical concept maps. ● Evaluate the methodology through the health problem of obesity: – validation of obesity-related concepts against currently available structured obesity information, – identification of gaps in clinical knowledge.
  7. Research Hypothesis & Questions ● The analysis required to extract health care concepts. ● The approach to build and enhance a concept map. ● The contribution of the concept map to the representation/validation of knowledge. ● How the text mining results help to understand/explore clinical problems. [Diagram labels: Biomedical Text Mining, Scientific literature, Concept map, Improvement of health care, Framework]
  8. Obesity ● Worldwide problem. ● Epidemic proportions: – WHO figures (2005): 1.6 billion overweight, 400 million obese. ● Associations with various diseases. ● Complex risk factors and complications. ● Various aspects. ● Lots of research.
  9. [Slide with no transcribed text]
  10. Biomedical Text Mining ● Extraction of information from unstructured biomedical data. ● Discovery of new, previously unknown knowledge. ● Performed on documents with complex/specific terminology and expressions. ● Challenges: – language ambiguity, – variation of language expression. ● Various tools and applications (TerMine, Whatizit, GATE). ● Adaptation to users' tasks and requirements.
  11. What are we looking for? ● Risk Factors ● Causal Factors ● Confounding Factors ● Outcomes ● Complications ● Interventions ● ...
  12. Methodology Overview 1. Document retrieval. 2. Term/concept extraction. 3. Feature engineering and information extraction: – application of classification/clustering techniques. 4. Concept map design.
  13. Evaluation: Obesity Case Study ● Comparison: – What? ● biomedical text mining results, ● concept map information. – How? ● concepts and relationships, ● new ones. ● Examination/manipulation/validation of new knowledge by experts. ● Enhancement of the concept map.
  14. Progress so far (1) ● Corpus collection. ● Application of Automated Term Recognition (ATR). ● C-value method. ● Single-word ATR: – terminological head identification, – the head is the word of a multi-word term that defines the term class, – example: ● “Childhood diabetes type II”, ● terminological head: “diabetes”.
  15. Progress so far (2) ● Ranking head measures: – total head frequency, – single head frequency, – maximum and average C-value, – abstract frequency, – ratio of single head frequency/total head frequency, – tf*idf (term frequency*inverse document frequency).
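
Of the measures listed on this slide, tf*idf is the easiest to show in isolation. The sketch below uses one common variant (raw term count times the natural log of N/df); the slides do not say which variant was used, so the weighting, the function name `tf_idf` and the toy documents are all assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists. Returns one {term: score} dict per document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    result = []
    for tokens in documents:
        counts = Counter(tokens)
        result.append({term: count * math.log(n_docs / doc_freq[term])
                       for term, count in counts.items()})
    return result


# Toy "abstracts", already tokenised; purely illustrative.
docs = [
    ["obesity", "diabetes", "obesity"],
    ["obesity", "hypertension"],
]
weights = tf_idf(docs)
```

In this variant a word that appears in every document, such as "obesity" in an obesity corpus, receives a weight of zero, so the measure favours heads that are concentrated in a subset of abstracts.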
  16. Results [Bar chart: number of keywords per statistical measure (tf*idf, total freq, single freq, abstract freq, word freq, max_c, aver_c, ratio)]
  17. Progress so far (3) ● Pattern extraction from abstracts for: – risk, confounding and causal factors, – interventions, – complications, – outcomes. ● Example: “Obesity risk is increased among women with psychiatric disorders” → potential risk factor.
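
A lexical pattern of the kind this slide describes can be approximated with a regular expression. The sketch below is hypothetical: the pattern `RISK_PATTERN` and the function `find_risk_factors` were written only to match the example sentence on the slide and are not the rules used in the project.

```python
import re

# Hypothetical pattern: "... risk is increased among/in <candidate factor>".
RISK_PATTERN = re.compile(
    r"\brisk\s+is\s+increased\s+(?:among|in)\s+(?P<factor>[^.;]+)",
    re.IGNORECASE,
)


def find_risk_factors(sentence):
    """Return the text spans that the pattern flags as potential risk factors."""
    return [m.group("factor").strip() for m in RISK_PATTERN.finditer(sentence)]


example = "Obesity risk is increased among women with psychiatric disorders."
matches = find_risk_factors(example)
```

A single surface pattern like this misses paraphrases, so a realistic rule set would need many such patterns, or rules over part-of-speech tags, with the output treated as candidates for expert review.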
  18. Example [Figure labels: potential risk factors, potential interventions, potential complications]
  19. Future plan [Gantt chart; highlighted phase: Year 2 (1/2): Concept extraction] ● Timeline markers: October 2010, April 2011, November 2011, May 2012 (Year 2, Year 3). ● Tasks: – Species identification in obesity corpus (LINNAEUS). – Exploration of single-word term ATR. – Calculation of z-score. – Integration of single and multi-word terms. – Lexical/semantic analysis of the existing concept map. – Paper preparation for the extraction of single terms in text. – Pattern extraction from manual analysis. – Pattern rule design with MinorThird. – Feature engineering. – Clustering. – Classification. – Paper preparation for the classification of disease descriptors. – Paper preparation for the clustering of health care concepts. – Integration of the results. – Preparation of the second year interview/report. – Design of concept map relationships (exploration). – Application of visual mapping tools. – Update of the new concept map. – Comparison and validation of knowledge. – Exploration of concept complexity in obesity. – Paper preparation for the automatic design of clinical concept maps. – Produced generic framework of the methodology. – Writing the thesis.
  20. Future plan [same Gantt chart and task list as slide 19; highlighted phase: Year 2 (2/2): Concept structuring]
  21. Future plan [same Gantt chart and task list as slide 19; highlighted phase: Year 3: Design of the medical concept map]
  22. Summary ● Framework creation for clinical concept map building and enhancement. ● Improved understanding of health care concept complexity. ● So far: – comprehension of literature review, – methodology design, – single-word ATR, – pattern design.
  23. End. Acknowledgements: 1. Medical Research Council. 2. School of Computer Science, University of Manchester.
