Towards Trustable AI for Complex Systems

  1. 1. OPEN DATA SCIENCE CONFERENCE London | Nov. 19 - Nov. 22 2019
  2. 2. Towards Trustable AI for Complex Systems: a Life Science and AIOps Perspective. Xian Yang, Research Fellow, Data Science Institute, Imperial College London
  3. 3. Overview: Towards Trustable AI for Complex Systems. Background. Ways to achieve trustable AI: make data trustable; make a good understanding of systems; make AI algorithm trustable. Conclusion.
  4. 4. Background
  5. 5. Complex system: a system of systems (connected, communicated, complicated). Complex life systems: complex networks of biologically relevant entities (figure: the signal transduction pathway in a cell). Complex IT systems: all related computer hardware, software, firmware, and data for the communication, transmission, processing, manipulation, storage, or protection of information (figure: the deployment diagram of a large-scale IT system). Source: Wikipedia, GREATOPS
  6. 6. AI for complex life systems and IT systems. Missions of AI: understand systems, diagnose systems, control systems. Medical AI: use of complex algorithms and software to emulate human cognition in the analysis of complicated medical data. AIOps: automate and enhance IT operations by 1) analysing big data collected from various tools and devices via analytics and machine learning; 2) automatically spotting and reacting to issues in real time. Source: Itgsopedia, Riverbed
  7. 7. AI for complex life systems and IT systems: AI-based diagnosis. Medical AI: disease diagnosis. AIOps: failure diagnosis. [Figure: failure diagnosis model. Inputs: failure description (title, description i-1, description i, description i+1), engineers' discussion, failure's characteristics (categorical feature). Layers: embedding mapping layer, a chain of recurrent cells built from σ and tanh units, classification layer. Output: failure type.] Source: ReferralMD
  8. 8. Elements of AI in complex systems. AI applications, Medical AI (efficiency; quality; cost): surgical robot, AI CT scan reader, AI nurse; pathologic analysis, efficacy analysis; clinical pathway optimization, hospital bed management, disease prevention. AI applications, AIOps (efficiency; quality; cost): AIOps component library, intelligent prediction, chatbot; anomaly detection & prediction, root cause analysis; performance optimization, defragmentation, cost analysis, capacity management. Big data platform: data standardization, data acquisition, data channel, data cleaning, ETL, meta data management, offline computation, realtime computation, feature engineering. AI algorithms: regression analysis, time series analysis, causal inference, dynamical model construction, correlation analysis, differential feature selection, cluster analysis, component analysis.
  9. 9. Categories of AI algorithms [figure axes: ease of experiments, power of explanation]
  10. 10. My focus: Trustable AI for complex systems. What's AI's holdup? It is not technical; the barrier is trust. If it is not trustable, then it is not useful. How do we get humans to use our AI technology and rely on it? My focus: working towards trustable AI.
  11. 11. Ways to achieve trustable AI. 1. Make data trustable: 1.1 Deeper; 1.2 Wider; 1.3 Bigger. 2. Make a good understanding of systems: 2.1 A holistic view: simplification vs. complication. 3. Make AI algorithm trustable: 3.1 Moving beyond correlation; 3.2 Moving beyond AI black-box.
  12. 12. 1.1 Make data trustable. Deeper: extracting detailed information from the data. Extract information from raw data and use the extracted features as the inputs of the AI model. Examples: phenotype annotation from electronic health record; cohort selection for precision medicine.
  13. 13. Example: Phenotype annotation from electronic health record. Problem: a disease code cannot fully represent the medical information in electronic health records (EHRs). Case study: two patients with the same ICD code have different levels of severity. Both EHRs are diagnosed as ICD 430 (subarachnoid hemorrhage) only. Case 1 (admission_id=198908, subject_id=28912): "brief hospital course the patient was seen in the emergency room at the request of neurology and the emergency room staff at doctor last name family she received a dilantin load on arrival to hospital a ct was obtained which showed subarachnoid blood the patient was neurologically intact except for some mild confusion about her location stating she was still at doctor last name family hospital her films were reviewed by the neurosurgery staff and the decision was made to take her for cerebral angiogram the angiogram showed an aneurysm at the vertebral basilar junction mm to mm it was unable to be treated endovascularly an open repair was considered to be complicated given the location and the patients age dr first name stitle elected to transfer the patient to hospital to dr last name stitle for further evaluation and possible intervention". Case 2 (admission_id=188170, subject_id=56707): "brief hospital course patient was admitted to neurosurgery on for further management she underwent the above stated procedure please review dictated operative report for details she had a negative angiogram and was to intensive care unit in stable condition she had an uncomplicated intensive care unit course and was transferred to floor in stable condition throughout her hospital course she remained neurologically stable and intact she complained of a mild headache that worsens when she sits up and walks around now dod patient is vss and neurologically stable patient s pain is well controlled and the patient is tolerating a good oral diet pt s incision is clean dry and inctact without evidence of infection patient is ambulating without issues she is set for discharge home in stable condition and will follow up in month for mri a brain with dr first name stitle". Phenotypes (including severity) are marked in red on the slide. Phenotype (HPO) terms can better characterize patients by providing deeper information.
  14. 14. Example: Phenotype annotation from electronic health record. EHR note shown: admission_id=188170, subject_id=56707, ICD 430: subarachnoid hemorrhage (the same text as Case 2 on slide 13). Problem: HPO terms cannot be fully found using the keyword search method, because of synonyms and implicit information. Synonyms: subarachnoid blood == subarachnoid hemorrhage; vertebral basilar == vertebrobasilar. Implicit information: terms marked in blue on the slide. Solution: apply AI to do automatic phenotype annotation, using unsupervised learning with no labelled data.
  15. 15. Example: Phenotype annotation from electronic health record. There are two types of data sources: (1) a collection of EHRs, where each EHR consists of textual notes written by clinicians; (2) a standardized general category of human phenotypic abnormalities provided by HPO. The HPO also provides additional subclasses. J. Zhang, X. Zhang, K. Sun, X. Yang, C. Dai, and Y. Guo, “Unsupervised Annotation of Phenotypic Abnormalities via Semantic Latent Representations on Electronic Health Records”, IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM), 2019.
  16. 16. Example: Phenotype annotation from electronic health record. Assumptions: (1) The semantics of a general phenotype is represented by a prior distribution. (2) The prior distributions of the phenotypes should be ‘distinct’ enough from each other. (3) The semantics of an EHR is a composition of the semantics of phenotypes. [Figure labels: phenotype vector, sampled from a prior; represented by; generated by; ‘distinct’ enough from other priors.] J. Zhang, X. Zhang, K. Sun, X. Yang, C. Dai, and Y. Guo, “Unsupervised Annotation of Phenotypic Abnormalities via Semantic Latent Representations on Electronic Health Records”, IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM), 2019.
  17. 17. Example: Phenotype annotation from electronic health record. Loss 1: text reconstruction of EHRs. Loss 2: text reconstruction of the general phenotypic abnormalities. Loss 3: text reconstruction of the phenotype subclasses. Loss 4: classification of latent vectors; if the latent vectors sampled from different priors can be classified into different classes, the priors are considered ‘distinct’ enough. J. Zhang, X. Zhang, K. Sun, X. Yang, C. Dai, and Y. Guo, “Unsupervised Annotation of Phenotypic Abnormalities via Semantic Latent Representations on Electronic Health Records”, IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM), 2019.
  18. 18. 1.2 Make data trustable. Wider: integrating multi-modal data. Combine data from different modalities to provide a comprehensive view of patients and make more accurate clinical decisions. Example: pan-cancer classification based on multi-omics analysis.
  19. 19. Example: Pan-cancer classification based on multi-omics analysis. Our method: we combine the variational autoencoder with a classification network to achieve task-oriented feature extraction and multi-class classification. [Figure: inputs, outputs.] X. Zhang, J. Zhang, K. Sun, X. Yang, C. Dai, and Y. Guo, “Integrated Multi-omics Analysis Using Variational Autoencoders: Application to Pan-cancer Classification”, (short paper) IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM), 2019.
  20. 20. 1.3 Make data trustable. Bigger: augmenting data. Augment data for limited samples and imbalanced data to increase the volume of training samples, e.g. in rare disease studies and system failure studies. Example: synthetic medical image augmentation.
  21. 21. Example: Synthetic medical image augmentation. Traditional image augmentation methods: flip (flip images horizontally and vertically); rotation (rotate images by angles); scale (scale images outward or inward); crop (randomly sample a section from the original image); translation (move the image along the X or Y direction); Gaussian noise (add noise to enhance the learning capability). Advanced image augmentation method: GAN. Frid-Adar, Maayan, et al. "GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification." Neurocomputing 321 (2018): 321-331.
  22. 22. 2 Make a good understanding of systems. A holistic view: investigate a complex system as a whole. System of systems: biological system, cloud system. Investigate the system in a holistic view; study all anomaly/failure signals across the whole system.
  23. 23. 2 Make a good understanding of systems. Simplification vs. complication: keep a balance in between. Model networks of complex systems for better understanding; effective modeling of entities and inter-connections in large-scale systems. Example: inferring a model for a large-scale biological network.
  24. 24. Example: Inferring large-scale biological network. Simplification: easy to model, but hard to mimic the real behavior of the system. Complication: hard to model, but good at mimicking the real behavior of the system. Problem: how to strike a balance between simplification and complication? A. Holehouse, X. Yang, I. Adcock, and Y. Guo, “Developing a novel integrated model of p38 MAPK and glucocorticoid signalling pathways”, Computational Intelligence and Bioinformatics and Computational Biology (CIBCB), 2012.
  25. 25. Example: Inferring large-scale biological network. Key observations of large-scale networks: separation of timescales (sparsity of variations); cross-reactivity (combined measurement). Prerequisites of sparse signal recovery: the signal is sparse in some domain; a measurement is a weighted linear combination of several points of the signal. My suggestion: we can study the complex network under different timescales. Within each timescale, only some entities have dynamic changes. Thus, we can apply sparse learning to infer a model under each timescale. Then, we combine the models obtained from all timescales. L. Nie, X. Yang, I. Adcock, Z. Xu, and Y. Guo, “Inferring cell-scale signalling networks via compressive sensing”, PLoS One, vol. 9, no. 4, 2014.
  26. 26. 3.1 Make AI algorithm trustable. Moving beyond correlation: explore towards causation. A signal’s predictive power does not necessarily imply that the signal is actually related to or explains the phenomena being predicted. For an AI model, moving from correlation to causation is especially important for understanding under what conditions it may fail, how long we can expect it to be predictive, and how widely applicable it may be. Applications: deriving functional connectivity from brain fMRI data; root cause diagnosis for system failures.
  27. 27. From correlation to causation. Pearson correlation: the covariance of the two variables divided by the product of their standard deviations. Partial correlation: measures the degree of association between two random variables, with the effect of a set of controlling random variables removed; this removes indirect relationships. Minimum partial correlation: the minimum of all absolute values of partial correlations obtained by controlling on all possible subsets of other nodes. PC algorithm: fast computation of minimum partial correlation; the choice of a hyperparameter, the significance threshold, greatly influences the results. Elastic PC algorithm: automatically increase the significance threshold within a given time limit to maximally approach the minimum partial correlation; avoid repeating partial correlations already computed with a lower significance threshold. L. Nie, X. Yang, P. M. Matthews, Z. Xu, and Y. Guo, “Inferring functional connectivity in fMRI using minimum partial correlation”, International Journal of Automation and Computing, 2017.
  28. 28. 3.2 Make AI algorithm trustable. Moving beyond AI black-box: explore towards explainable AI. Explainability is the process of giving explanations to humans. Why do we need explainable AI? Demands from industry and society; desires of the human brain. My future research direction: towards AI algorithm explainability.
  29. 29. Towards AI algorithm explainability: (1) auto feature engineering; (2) combination of physics/math/traditional ML models with advanced DL; (3) learning laws from data rather than curve fitting. Auto feature engineering: automatically generate explainable features in the model construction process. Example: features (height, weight) -> label (health) becomes features (height, weight, BMI = w/h^2) -> label (health).
  30. 30. Conclusion. The future of AI for complex systems depends on trust. To achieve this, we need to work beyond the algorithm aspect and work on 1) trustable data, 2) good understanding of systems, and 3) trustable AI algorithms. Moving towards causation is crucial for making AI algorithms trustable. There is still a long way to go before fully trustable AI. We must work on it now!
  31. 31. Thanks. Xian Yang. Towards Trustable AI for Complex Systems: a Life Science and AIOps Perspective
  32. 32. Q&A
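
Illustration for slide 21 (traditional image augmentation). The sketch below is not from the talk or from Frid-Adar et al.; it is a minimal NumPy version of the listed operations on a 2-D grayscale array. The function names `augment` and `translate_x`, the fixed 2x zoom, the half-size crop, the 2-pixel shift, and the noise level 0.05 are all illustrative assumptions.

```python
import numpy as np


def translate_x(image, shift):
    """Shift the image right by `shift` pixels (shift > 0), zero-filling the gap."""
    out = np.zeros_like(image)
    out[:, shift:] = image[:, :-shift]
    return out


def augment(image, rng):
    """Return simple augmented variants of a 2-D grayscale image (hypothetical helper)."""
    h, w = image.shape
    # Crop: randomly sample a section (here half the height and half the width).
    top = rng.integers(0, h - h // 2 + 1)
    left = rng.integers(0, w - w // 2 + 1)
    # Scale outward: 2x nearest-neighbour zoom, then centre-crop back to (h, w).
    zoomed = np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)
    r0, c0 = h // 2, w // 2
    return {
        "flip_horizontal": np.fliplr(image),
        "flip_vertical": np.flipud(image),
        "rotate_90": np.rot90(image),
        "scale_2x": zoomed[r0:r0 + h, c0:c0 + w],
        "crop": image[top:top + h // 2, left:left + w // 2],
        "translate_x": translate_x(image, 2),
        # Gaussian noise with an assumed standard deviation of 0.05.
        "gaussian_noise": image + rng.normal(0.0, 0.05, size=image.shape),
    }
```

Calling `augment(img, np.random.default_rng(0))` on one image yields seven extra training samples; a GAN-based approach, as on the slide, would instead learn to synthesize new images.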
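
Illustration for slide 25 (sparse signal recovery). The slide's two prerequisites, a sparse signal and measurements that are weighted linear combinations of its entries, can be sketched with a generic solver. The code below uses ISTA (iterative soft-thresholding) for the lasso problem; this is a swapped-in textbook method chosen for brevity, not the algorithm of Nie et al., and the function name `ista` and the defaults `lam` and `n_iter` are assumptions.

```python
import numpy as np


def ista(A, y, lam=0.01, n_iter=3000):
    """Minimise 0.5 * ||A x - y||^2 + lam * ||x||_1 by iterative soft-thresholding."""
    # Step size 1/L, where L is the squared spectral norm of A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        # Gradient step on the least-squares term.
        z = x - step * (A.T @ (A @ x - y))
        # Soft-thresholding keeps the estimate sparse.
        x = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)
    return x
```

In the slide's setting, one such recovery would be run per timescale, with each row of `A` playing the role of a combined measurement, and the per-timescale models then merged.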
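
Illustration for slide 27 (Pearson, partial, and minimum partial correlation). The sketch below follows the definitions on the slide using NumPy. It computes the minimum partial correlation by brute force over all subsets of the other variables, which is exponential in the number of variables; it is not the PC or Elastic PC algorithm, which exist precisely to avoid this cost. The function names are hypothetical.

```python
from itertools import combinations

import numpy as np


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    return np.cov(x, y)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))


def partial_corr(data, i, j, controls):
    """Partial correlation of columns i and j of `data`, controlling on `controls`."""
    idx = [i, j, *controls]
    # Invert the covariance of the selected columns to get their precision matrix.
    prec = np.linalg.inv(np.cov(data[:, idx], rowvar=False))
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])


def min_partial_corr(data, i, j):
    """Minimum absolute partial correlation over all subsets of the other columns."""
    others = [k for k in range(data.shape[1]) if k not in (i, j)]
    return min(
        abs(partial_corr(data, i, j, subset))
        for r in range(len(others) + 1)
        for subset in combinations(others, r)
    )
```

For a chain x -> y -> z, `pearson(x, z)` is large, while `partial_corr(data, 0, 2, [1])` is close to zero, which is the "remove indirect relationship" effect described on the slide.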
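
Illustration for slide 29 (explainable feature engineering). The slide's example adds BMI = w/h^2 to the features height and weight. A minimal sketch of that single derived feature follows; the record layout, the key names `height_m` and `weight_kg`, and the choice of metres and kilograms are assumptions, since the slide gives only the formula.

```python
def add_bmi(records):
    """Return copies of the records with an added 'bmi' feature (weight / height^2)."""
    return [
        {**record, "bmi": record["weight_kg"] / record["height_m"] ** 2}
        for record in records
    ]
```

The derived feature is explainable because a clinician can read it directly, unlike a learned latent combination of height and weight.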

Editor's Notes


  • Source:
    Wikipedia
    https://en.wikipedia.org/wiki/File:Signal_transduction_pathways.png#filelinks

    GREATOPS
    http://www.gaowei.vip/m/Library/detail?no=94991143


  • Riverbed
    https://www.riverbed.com/faq/what-is-aiops.html




  • ReferralMD
    https://getreferralmd.com/2018/12/big-data-and-ai-in-healthcare-marketing/
