Knowledge Engineering in Oncology

  1. Knowledge Engineering in Oncology
      Andre Dekker, PhD, Medical Physicist, MAASTRO Clinic
  2. The five components of Radiation Oncology: Clinic, Biology, Molecular Imaging, Physics, Computer science. © MAASTRO 2013
  3. Contents
      09:00-09:45 Knowledge Engineering: The need and the data
        • Problem: "Current quality of Medical Decisions"
        • Rapid Learning – How to get the data?
      09:45-10:00 Break
      10:00-10:45 Knowledge Engineering: From data to decision support
        • Methodology & an example
        • Demo of machine learning
        • Evaluating decision support
      10:45-11:00 Q&R
  4. Quality of medical decisions
  5. Experiment [figure: scale from lowest survival probability to highest survival probability]
  6. Experiment [figure: examples with AUC 1.00, AUC 0.72 and AUC 0.50]
  7. Prediction by MDs?
      • Non Small Cell Lung Cancer, 2 year survival, 30 patients, 8 MDs, retrospective: AUC 0.57
      • Non Small Cell Lung Cancer, 2 year survival, 158 patients, 5 MDs, prospective: AUC 0.56
      Cary Oberije et al.
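The AUC values on slides 6 and 7 can be read as the probability that a randomly chosen survivor received a higher predicted survival probability than a randomly chosen non-survivor. The sketch below computes that quantity by counting pairs. The function name and the six-patient data set are invented for illustration and are not the data behind the slides.

```python
from itertools import product

def auc(scores, outcomes):
    """Probability that a random positive case outranks a random negative case."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative outcome")
    # Count concordant pairs; a tie counts as half a concordant pair.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes: 1 = alive at 2 years, 0 = deceased.
outcomes = [1, 1, 1, 0, 0, 0]
perfect = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # every survivor ranked above every non-survivor
coin_flip = [0.5] * 6                      # same prediction for everyone

print(auc(perfect, outcomes))    # 1.0
print(auc(coin_flip, outcomes))  # 0.5
```

Read this way, an AUC of 0.56 or 0.57 means the ranking is only slightly better than the 0.50 obtained by giving every patient the same prediction.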
  8. Question: Why do you think doctors are bad at predicting outcomes?
  9. The doctor is drowning
      • Explosion of data
      • Explosion of decisions
      • Explosion of 'evidence'*
      • 3% in trials, bias
      • Sharp knife
      * 2010: 1574 & 1354 articles on lung cancer & radiotherapy = 7.5 per day
      Half-life of knowledge estimated at 7 years
      J Clin Oncol 2010;28:4268; JMI 2012 Friedman, Rigby
  10. Problem definition
      • Personalized medicine is the future
      • Prediction of the outcome for treatment a, b or c is needed
      • The current method of getting evidence is too costly and unsustainable
      • It is unethical to ask a person to make that prediction
      J Clin Oncol 28:4268-4274
  11. Improve medical decisions
  12. Rapid Learning
      In [..] rapid-learning [..] data routinely generated through patient care and clinical research feed into an ever-growing [..] set of coordinated databases. J Clin Oncol 2010;28:4268
      [..] rapid learning [..] where we can learn from each patient to guide practice, is [..] crucial to guide rational health policy and to contain costs [..]. Lancet Oncol 2011;12:933
  13. Engineering Knowledge [figure]. Lambin, P. et al. Nat. Rev. Clin. Oncol. 10, 27–40 (2013)
  14. [figure]
  15. CAT ~ 2005 = MAASTRO Knowledge Engineering
      Build Decision Support Systems to individualize patient care
      by using machine learning to extract multifactorial personalized prediction models
      from existing databases containing all data on all patients
      that are validated in external datasets
      Theme 1: Data (95%); Theme 2: Learning (5%)
  16. Barriers to sharing data and a way to overcome these
  17. Question: What do you think would prevent people from sharing data?
  18. Barriers to sharing data
      [..] the problem is not really technical [..]. Rather, the problems are ethical, political, and administrative. Lancet Oncol 2011;12:933
      1. Administrative (time to capture, time to curate)
      2. Political (value, authorship)
      3. Ethical (privacy)
      4. Technical
  19. Basic concepts in data sharing
      • Syntax and Semantics
      • Centralized and Federated
  20. Syntactics and Semantics – A story
  21. Semantic interoperability – Patrick story
      To explain and distinguish the 4 different levels, consider the following scenario: 56 year old Patrick recently moved from Ireland to Spain to take up his new job in a multinational IT company. A few weeks after arriving, he falls ill, consults his local (Spanish) GP and is transferred to the next hospital for further tests.
      SemanticHEALTH Report, January 2009
  22. Semantic interoperability – Level 0
      Level 0 (no interoperability at all): Patrick has to undergo a full set of lengthy investigations for the doctors to find out the cause of his severe pain. Unfortunately, results from the local GP as well as from his Irish GP are not available at the point of care within the hospital due to the missing technical equipment.
      SemanticHEALTH Report, January 2009
  23. Semantic interoperability – Level 1
      Level 1 (technical and syntactical interoperability): Patrick's doctor in the hospital is able to receive electronic documents that were released from the Irish GP as well as his local GP upon request. Widely available applications supporting syntactical interoperability (such as web browsers and email clients), allow the download of patient data and provide immediate access. Unfortunately, none of the available doctors in the hospital is able to translate the Irish document, and only human intervention allows interpreting the information submitted by the local GP for adding into the hospital's information system.
      SemanticHEALTH Report, January 2009
  24. Semantic interoperability – Level 2
      Level 2 (partial semantic interoperability): The Spanish hospital doctor is able to securely access via the Internet parts of Patrick's Electronic Health Record released by his Irish GP as well as the local GP that he visited just hours earlier. Although both documents contain mostly free text, fragments of high importance (such as demographics, allergies, diagnoses, and parts of medical history) are encoded using international coding schemes, which the hospital information system can automatically detect, interpret and meaningfully present to the attending physician.
      SemanticHEALTH Report, January 2009
  25. Semantic interoperability – Level 3
      Level 3 (full semantic interoperability, co-operability): In this ideal situation and after thorough authentication took place, the Spanish hospital information system is able to automatically access, interpret and present all necessary medical information about Patrick to the physician at the point of care. Neither language nor technological differences prevent the system from seamlessly integrating the received information into the local record and providing a complete picture of Patrick's health as if it had been collected locally. Further, the anonymised data feeds directly into the tools of public health authorities and researchers.
      SemanticHEALTH Report, January 2009
  26. Question: At what semantic interoperability level do you think medicine is at the moment in NL & Europe?
  27. Central versus Federated Data Sharing
  28. Centralized Data (e.g. for Research) [diagram]
      • Hospital 1 (HIS, PACS, LIS) and Hospital 2 (HIS, PACS, LIS) send data to one Research System
      • Data domains in the Research System: clinical (Open Clinica), imaging (NBIA), biobanking (e.g. caTissue)
      • Integrated data: e.g. tranSMART
  29. Federated Data (e.g. for Research) [diagram]
      • Hospital 1 (HIS, PACS, LIS) keeps its own integrated data, e.g. euroCAT
      • Hospital 2 (HIS, PACS, LIS) keeps its own integrated data, e.g. euroCAT
  30. Question: What do you think are the pros and cons of centralized vs. federated?
  31. An example data sharing project: euroCAT
  32. Example: MAASTRO's euroCAT approach
      euroCAT is a research project in which we develop an IT infrastructure -> technical
      to make radiotherapy centers (Maastricht, Liege, Aachen, Eindhoven, Hasselt) semantic interoperable (SIOp*) / machine readable -> administrative
      while the data stays inside your hospital -> ethical
      under your full control -> political
      * SIOp level 3 = Machine Readable -> Data in common syntax and with common meaning
  33. Components [diagram labels: CTMS, PACS, ETL, Export, Deident. & Filter, Ontology, Distributed Learning Application, Sharing, Query, XML, DICOM]
  34. Ontology – International Coding System
      1. Select the local term, e.g. 声门下区 (Chinese for the subglottic region)
      2. Search the ontology for the matching concept
      3. Map the local term to the ontology
      4. See the result of your mapping
  35. Ontology use
      • Ontology is a set of terms & their relationships
      • Retrospective analysis: "Xerostomia", all head & neck cancer patients
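The mapping steps on slide 34 and the retrospective query on slide 35 can be sketched as follows. This is a toy illustration only: the concept identifiers, center names, local terms and patient records are all invented, and a real deployment would use codes from an established terminology rather than this in-memory dictionary.

```python
# Hypothetical shared ontology: concept identifier -> preferred label.
# The identifiers are invented for illustration, not taken from a real coding system.
ONTOLOGY = {
    "C001": "Subglottis",
    "C002": "Xerostomia",
}

def search_ontology(query):
    """Step 2: find concepts whose label contains the query (case-insensitive)."""
    q = query.lower()
    return [cid for cid, label in ONTOLOGY.items() if q in label.lower()]

# Step 3: each center keeps its own table of local term -> concept identifier.
local_mappings = {
    "center_a": {"声门下区": "C001", "口干": "C002"},
    "center_b": {"subglottic larynx": "C001", "dry mouth": "C002"},
}

def to_concept(center, local_term):
    """Translate a local term into the shared concept, or None if unmapped."""
    return local_mappings[center].get(local_term)

def select_patients(records, concept_id):
    """Retrospective query: patients whose local term maps to the given concept."""
    return [pid for center, pid, term in records
            if to_concept(center, term) == concept_id]

# Hypothetical records: (center, patient id, locally recorded term).
records = [
    ("center_a", "A-1", "口干"),
    ("center_a", "A-2", "声门下区"),
    ("center_b", "B-1", "dry mouth"),
]

xerostomia = search_ontology("xero")[0]
print(select_patients(records, xerostomia))  # ['A-1', 'B-1']
```

Because the query is phrased in terms of the shared concept, the same question can be asked at every center regardless of the language or wording used locally.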
  36. Data extraction system - Federated [figure]
  37. Distributed Learning Architecture [diagram]
      • Each Model Server (MAASTRO, RTOG, Roma): Learn Model from Local Data, then Send Model Parameters to the Central Server
      • Central Server: Update Model, then Send Average Consensus Model back to each Model Server
      • Result: Final Model Created
      Only aggregate data is exchanged between the Central Server and the local Servers
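The loop on slide 37 can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the euroCAT implementation: the slide names neither the model type nor the averaging rule, so the logistic regression, the patient-count weighting, the function names and the tiny site data sets below are all assumptions made for illustration.

```python
import math

def local_update(weights, records, lr=0.1, epochs=20):
    """Runs at one hospital: refine the model on local data, return only parameters."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in records:
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))        # predicted probability
            for j, xj in enumerate(x):
                grad[j] += (p - y) * xj           # logistic-loss gradient
        w = [wi - lr * g / len(records) for wi, g in zip(w, grad)]
    return w, len(records)                        # aggregate data only

def average(updates):
    """Runs at the central server: patient-count-weighted mean of site parameters."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[j] * n for w, n in updates) / total for j in range(dim)]

# Hypothetical site data: ([intercept term, one feature], outcome).
sites = {
    "site_a": [([1.0, 0.2], 0), ([1.0, 0.9], 1), ([1.0, 0.8], 1), ([1.0, 0.1], 0)],
    "site_b": [([1.0, 0.3], 0), ([1.0, 0.7], 1)],
    "site_c": [([1.0, 0.6], 1), ([1.0, 0.4], 0), ([1.0, 0.95], 1)],
}

consensus = [0.0, 0.0]
for _ in range(50):                               # rounds of send / average / update
    updates = [local_update(consensus, records) for records in sites.values()]
    consensus = average(updates)
print(consensus)
```

Note that `local_update` returns model parameters and a patient count, never patient records, which is the property the slide emphasizes.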
  38. Lecture key points
      • The decision problem is caused by
        • Too much data on an individual patient for a human being to process
        • Not enough evidence in the literature to make a decision for an individual patient
        • Not enough patients in trials
        • Patients selected for these trials do not represent the patient for whom a decision needs to be taken
      • This problem will get worse as the number of decisions rises in personalized medicine
      • Rapid learning, or learning from each patient, is hampered by barriers to sharing data
      • The most important barrier is the administrative effort needed to capture & share data
      • Semantic interoperability and federation may overcome some of these barriers
  39. BREAK
  40. How to go from data to decision support
  41. Engineering: Data>Model>Decision Support [figure]
  42. Data>Model>Decision Support
      1. Modeling: "Learn a model from data"
      2. Validation: "Estimate model performance"
      3. Decision Support: "Impact of the model on clinical practice"
  43. Learn a model from data
      • Training cohort – 322 patients (MAASTRO)
      • Clinical variables
      • Support Vector Machines
      • Nomogram
      Dehing-Oberije (MAASTRO), IJROBP 2009;74:355. Cary Oberije et al.
  44. Estimate model performance
      • INDEPENDENT Validation cohort – 36 patients (Leuven) – 65 patients (Ghent)
      • Discrimination, Calibration, Reclassification
      • AUC 0.75
      Dehing-Oberije (MAASTRO), IJROBP 2009;74:355. Cary Oberije et al.
  45. Decision Support [figure: Stage IIIA 10 (14%), Stage IIIB 13 (19%), T4 12 (17%)]. Cary Oberije et al.
  46. How to learn better?
      • More patients
      • Diversity
      • More variables
        • Overfitting
      • Different machine learning methods
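The overfitting warning on slide 46 can be made concrete with pure noise. In the sketch below (function name, cohort size and variable counts are all invented for illustration), outcomes and candidate variables are random coin flips, so no variable carries real information. Even so, the best-looking variable agrees with the outcome more and more often in the training cohort as the number of candidates grows.

```python
import random

def best_apparent_accuracy(n_patients, n_variables, rng):
    """Best training accuracy obtained by picking one pure-noise binary variable."""
    outcome = [rng.randint(0, 1) for _ in range(n_patients)]
    best = 0.0
    for _ in range(n_variables):
        noise = [rng.randint(0, 1) for _ in range(n_patients)]
        agree = sum(a == b for a, b in zip(noise, outcome)) / n_patients
        best = max(best, agree, 1.0 - agree)   # use the variable or its inverse
    return best

# Same seed each time, so a larger run contains the variables of the smaller runs.
for n_variables in (2, 20, 200):
    acc = best_apparent_accuracy(20, n_variables, random.Random(0))
    print(n_variables, "noise variables -> best apparent accuracy", acc)
```

On new patients such a variable would be right about half the time, which is why slide 46 pairs "more variables" with "overfitting", and why external validation matters.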
  47. Let's build a model from data
  48. Lung cancer staging system [figure]
  49. Hugin [figure]
  50. Discussion model
      • There is always missing data
      • Some states (N2) have a low number of observations
      • Machine learning without input from domain experts gives models that do not make sense
        • T->N->M & TNM->Stage
      • There is always bias in data (e.g. T4N0M0->IIIB, T2N3M0->IIIA) -> Most difficult to solve
      • You learn something new: T->N relation
      • (Low numbers give bad models)
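The point about states with few observations can be illustrated with a conditional probability table for the T->N relation. The sketch below is not the Hugin model from the demo: the counts are invented, and the pseudo-count smoothing is one common remedy introduced here as an assumption, not something the slides prescribe.

```python
from collections import Counter

def conditional_table(pairs, child_states, pseudo_count=0.0):
    """Estimate P(child | parent) from (parent, child) observations."""
    counts = Counter(pairs)
    parents = sorted({p for p, _ in pairs})
    table = {}
    for p in parents:
        total = (sum(counts[(p, c)] for c in child_states)
                 + pseudo_count * len(child_states))
        table[p] = {c: (counts[(p, c)] + pseudo_count) / total
                    for c in child_states}
    return table

# Hypothetical (T, N) observations: T1 is common, T4 has only two patients.
observations = ([("T1", "N0")] * 8 + [("T1", "N1")] * 2
                + [("T4", "N0")] * 1 + [("T4", "N2")] * 1)
n_states = ["N0", "N1", "N2"]

raw = conditional_table(observations, n_states)
smoothed = conditional_table(observations, n_states, pseudo_count=1.0)
print(raw["T4"])       # two patients decide the whole row; N1 gets probability 0
print(smoothed["T4"])  # pseudo-counts keep unseen states possible
```

With only two T4 patients, the raw table declares T4 with N1 impossible. Smoothing softens that claim but cannot replace the missing patients.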
  51. Why is there a bias, any ideas? [figure]
  52. Validation
  53. Validation of a survival model
      • Discrimination: Is the model able to classify the population into two or more groups with different observed survival in an external validation set?
      • Calibration: Is the estimated probability of survival equal to the observed survival probability in an external validation set?
      • Clinical usefulness: Are the training and external validation sets representative for my patient and is the predicted outcome clinically relevant for my patient?
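The calibration question on slide 53 can be checked by grouping patients by predicted probability and comparing the mean prediction with the observed survival fraction in each group. The sketch below uses invented predictions and outcomes for eight hypothetical patients in an external validation set. The function name and the equal-size grouping are assumptions for illustration.

```python
def calibration_table(predicted, observed, n_groups=2):
    """Per risk group: (group size, mean predicted probability, observed fraction)."""
    ranked = sorted(zip(predicted, observed))
    rows = []
    for g in range(n_groups):
        start = g * len(ranked) // n_groups
        stop = (g + 1) * len(ranked) // n_groups
        group = ranked[start:stop]
        if not group:
            continue
        mean_pred = sum(p for p, _ in group) / len(group)
        obs_frac = sum(y for _, y in group) / len(group)
        rows.append((len(group), mean_pred, obs_frac))
    return rows

# Hypothetical validation set: predicted 2 year survival and observed outcome (1 = alive).
predicted = [0.2, 0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.8]
observed = [0, 0, 0, 1, 1, 1, 1, 0]

for size, mean_pred, obs_frac in calibration_table(predicted, observed):
    print(f"n={size}  predicted={mean_pred:.2f}  observed={obs_frac:.2f}")
```

In this toy set the low-risk group is predicted at 0.20 and observed at 0.25, and the high-risk group is predicted at 0.80 and observed at 0.75. A well-calibrated model keeps these pairs close in data it has not seen.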
  54. Laryngeal carcinoma model
      994 MAASTRO patients, 1990-2005
      www.predictcancer.org
      Input parameters
        – Age
        – Hemoglobin
        – T-stage
        – Radiotherapy Dose (Gy)
        – Gender
        – N+
        – Tumor location
      Output parameters
        – Overall survival
  55. Validation @ RTOG (Trial 0522) - Result [figure]
  56. Data>Model>Decision Support
      Prediction Models: Revolutionary in Principle, But Do They Do More Good Than Harm?
      "we are drowning in prediction models [..] more than 100 prediction models on prostate cancer alone"
      "currently [..] a large number of models [..] are not independently validated at all"
      J Clin Oncol 2011;29:2951
  57. Models built & validated
      Lung cancer
        – Survival
        – Lung dyspnea
        – Lung dysphagia
      Rectal cancer
        – Tumor response
        – Local recurrences
        – Distant metastases
        – Overall survival
      Larynx cancer
        – Local recurrences
        – Overall survival
      www.predictcancer.org
  58. Lecture key points
      • The process to go from data to decision support is through Modeling & External(!) Validation
      • Adding more patients and more diverse patients to your training set is always good
      • Beware of adding more variables (overfitting) without adding more patients
      • When evaluating decision support systems look at
        • Discrimination: Classify into two or more groups with different event probability
        • Calibration: Observed event probability in the external validation
        • Clinical usefulness: Representative/relevant patient population
  59. Thank you for your attention
      More info on: www.eurocat.info, www.predictcancer.org, www.cancerdata.org, www.mistir.info, www.maastro.nl