This document discusses interpretable machine learning and explainable AI. It begins with definitions of key terms and an overview of interpretable methods. Deep learning models are often treated as "black boxes" that are difficult to interpret. Interpretability can be achieved by using inherently interpretable models like linear models or decision trees, adding attention mechanisms, or interpreting models before, during or after building them. Later sections discuss specific interpretable techniques like understanding data through examples, MMD-Critic for learning prototypes and criticisms, and visualizing convolutional neural networks to understand predictions. The document emphasizes the importance of interpretability and explains several approaches to make machine learning models more transparent to humans.
https://telecombcn-dl.github.io/2018-dlai/
Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both algorithmic and computational perspectives.
Introduction to Graph Neural Networks: Basics and Applications - Katsuhiko Ishiguro, Preferred Networks
This presentation explains basic ideas of graph neural networks (GNNs) and their common applications. Primary target audiences are students, engineers and researchers who are new to GNNs but interested in using GNNs for their projects. This is a modified version of the course material for a special lecture on Data Science at Nara Institute of Science and Technology (NAIST), given by Preferred Networks researcher Katsuhiko Ishiguro, PhD.
An Introduction to XAI! Towards Trusting Your ML Models! - Mansour Saffar
Machine learning (ML) is currently disrupting almost every industry and is being used as the core component in many systems. The decisions made by these systems may have a great impact on society and specific individuals and thus the decision-making process has to be clear and explainable so humans can trust it. Explainable AI (XAI) is a rather new field in ML in which researchers try to develop models that are able to explain the decision-making process behind ML models. In this talk, we'll learn about the fundamentals of XAI and discuss why we need to start to integrate XAI with our ML models!
Presented in Edmonton DataScience Meetup on October 2nd, 2019. Learn more: https://youtu.be/gEkPXOsDt_w
PR-409: Denoising Diffusion Probabilistic Models - Hyeongmin Lee
This paper is Denoising Diffusion Probabilistic Models (DDPM), the work that first popularized the currently hot diffusion models. It neatly resolved several practical issues of diffusion, which was first proposed at ICML 2015, and marked the beginning of the trend. We will look at the various branches of generative models, diffusion, and what changed in DDPM.
Paper link: https://arxiv.org/abs/2006.11239
Video link: https://youtu.be/1j0W_lu55nc
The use of AI and machine learning models is likely to become more commonplace as larger swaths of the economy embrace automation and data-driven decision-making. While these predictive systems can be quite accurate, in the past they have been treated as inscrutable black boxes that produce only numeric predictions with no accompanying explanations. Unfortunately, recent studies and recent events have drawn attention to mathematical and sociological flaws in prominent weak AI and ML systems, yet practitioners usually don’t have the right tools to pry open machine learning black boxes and debug them.
This presentation introduces several new approaches that increase transparency, accountability, and trustworthiness in machine learning models. If you are a data scientist or analyst and you want to explain a machine learning model to your customers or managers (or if you have concerns about documentation, validation, or regulatory requirements), then this presentation is for you!
Introductory presentation to Explainable AI, defending its main motivations and importance. We briefly describe the main techniques available as of March 2020 and share many references to allow the reader to continue their studies.
Slide for Arithmer Seminar given by Dr. Daisuke Sato (Arithmer) at Arithmer inc.
The topic is "explainable AI".
"Arithmer Seminar" is held weekly; professionals from within and outside our company give lectures on their respective areas of expertise.
The slides are made by the lecturer from outside our company, and shared here with his/her permission.
Arithmer Inc. is a mathematics company born out of the Graduate School of Mathematical Sciences at the University of Tokyo. We apply modern mathematics to bring new, advanced AI systems into solutions across a wide range of fields. Our job is to think about how to use AI skillfully to make work more efficient and to produce results that are useful to people.
Arithmer began at the University of Tokyo Graduate School of Mathematical Sciences. Today, our research of modern mathematics and AI systems has the capability of providing solutions when dealing with tough complex issues. At Arithmer we believe it is our job to realize the functions of AI through improving work efficiency and producing more useful results for society.
The Transformer is an established architecture in natural language processing built around a self-attention framework within a deep learning approach.
This presentation was delivered under the mentorship of Mr. Mukunthan Tharmakulasingam (University of Surrey, UK), as a part of the ScholarX program from Sustainable Education Foundation.
Interpreting deep learning and machine learning models is not just another regulatory burden to be overcome. Scientists, physicians, researchers, and analysts who use these technologies for their important work have the right to trust and understand their models and the answers they generate. This talk is an overview of several techniques for interpreting deep learning and machine learning models and telling stories from their results.
Speaker: Patrick Hall is a Data Scientist and Product Engineer at H2O.ai. He’s also an Adjunct Professor at George Washington University in the Department of Decision Sciences. Prior to joining H2O, Patrick spent many years as a Senior Data Scientist at SAS and has worked with many Fortune 500 companies on their data science and machine learning problems. https://www.linkedin.com/in/jpatrickhall
Explainable AI (XAI) is becoming a must-have non-functional requirement (NFR) for most AI-enabled product or solution deployments. Keen to hear viewpoints and explore collaboration opportunities.
Interpretable Machine Learning describes the process of revealing the causes of predictions and explaining a derived decision in a way that is understandable to humans. The ability to understand the causes that lead to a certain prediction enables data scientists to ensure that the model is consistent with the domain knowledge of an expert. Furthermore, interpretability is critical for building trust in a model and for tackling problems like unfair biases or discrimination against particular subgroups. This talk covers an introduction to the concept of interpretability and an overview of popular interpretability techniques.
Speaker: Marcel Spitzer, inovex
Event: Kaggle Munich Meetup, 20.11.2018
Mehr Tech-Vorträge: www.inovex.de/vortraege
Mehr Tech-Artikel: www.inovex.de/blog
This was presented at the London Artificial Intelligence & Deep Learning Meetup.
https://www.meetup.com/London-Artificial-Intelligence-Deep-Learning/events/245251725/
Enjoy the recording: https://youtu.be/CY3t11vuuOM.
- - -
Kasia discussed the complexities of interpreting black-box algorithms and how these may affect some industries. She presented the most popular methods of interpreting machine learning classifiers, for example feature importance, partial dependence plots and Bayesian networks. Finally, she introduced the Local Interpretable Model-Agnostic Explanations (LIME) framework for explaining predictions of black-box learners - including text- and image-based models - using breast cancer data as a specific case scenario.
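As a rough illustration of how LIME is typically applied to tabular data (a minimal sketch using the open-source lime package and scikit-learn's built-in breast cancer dataset; the model and parameters are illustrative assumptions, not necessarily what was used in the talk):

# Minimal LIME sketch on the breast cancer dataset (illustrative model and settings).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

data = load_breast_cancer()
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(data.data,
                                 feature_names=list(data.feature_names),
                                 class_names=list(data.target_names),
                                 mode="classification")

# Explain one prediction by fitting a local, interpretable surrogate around it.
exp = explainer.explain_instance(data.data[0], clf.predict_proba, num_features=5)
print(exp.as_list())  # top features pushing the prediction towards / away from the class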
Kasia Kulma is a Data Scientist at Aviva with a soft spot for R. She obtained a PhD (Uppsala University, Sweden) in evolutionary biology in 2013 and has been working on all things data ever since. For example, she has built recommender systems, customer segmentations, and predictive models, and she is now leading an NLP project at the UK’s leading insurer. In her spare time she tries to relax by hiking & camping, but if that doesn’t work ;) she co-organizes R-Ladies meetups and writes a data science blog, R-tastic (https://kkulma.github.io/).
https://www.linkedin.com/in/kasia-kulma-phd-7695b923/
[Video recording available at https://www.youtube.com/playlist?list=PLewjn-vrZ7d3x0M4Uu_57oaJPRXkiS221]
Artificial Intelligence is increasingly playing an integral role in determining our day-to-day experiences. Moreover, with the proliferation of AI-based solutions in areas such as hiring, lending, criminal justice, healthcare, and education, the resulting personal and professional implications of AI are far-reaching. The dominant role played by AI models in these domains has led to growing concern regarding potential bias in these models, and a demand for model transparency and interpretability. In addition, model explainability is a prerequisite for building trust in and adoption of AI systems in high-stakes domains requiring reliability and safety, such as healthcare and automated transportation, and in critical industrial applications with significant economic implications, such as predictive maintenance, exploration of natural resources, and climate change modeling.
As a consequence, AI researchers and practitioners have focused their attention on explainable AI to help them better trust and understand models at scale. The challenges for the research community include (i) defining model explainability, (ii) formulating explainability tasks for understanding model behavior and developing solutions for these tasks, and finally (iii) designing measures for evaluating the performance of models in explainability tasks.
In this tutorial, we present an overview of model interpretability and explainability in AI, key regulations / laws, and techniques / tools for providing explainability as part of AI/ML systems. Then, we focus on the application of explainability techniques in industry, wherein we present practical challenges / guidelines for effectively using explainability techniques and lessons learned from deploying explainable models for several web-scale machine learning and data mining applications. We present case studies across different companies, spanning application domains such as search & recommendation systems, hiring, sales, and lending. Finally, based on our experiences in industry, we identify open problems and research directions for the data mining / machine learning community.
In this talk, Dmitry shares his approach to feature engineering, which he has used successfully in various Kaggle competitions. He covers common techniques used to convert features into the numeric representations consumed by ML algorithms.
Optimized Neural Network for Classification of Multispectral Images - IDES Editor
The proposed work involves the multiobjective PSO-based optimization of an artificial neural network structure for the classification of multispectral satellite images. The neural network is used to classify each image pixel into various land cover types like vegetation, waterways, man-made structures and road networks. It is per-pixel supervised classification using spectral bands (the original feature space). Use of a neural network for classification requires selection of the most discriminative spectral bands and determination of the optimal number of nodes in the hidden layer. We propose a new methodology based on multiobjective particle swarm optimization (MOPSO) to determine the discriminative spectral bands and the number of hidden layer nodes simultaneously. The result obtained using such an optimized neural network is compared with that of traditional classifiers like the MLC and Euclidean classifiers. The performance of all classifiers is evaluated quantitatively using the Xie-Beni and β indexes. The results show the superiority of the proposed method.
Get hands-on with Explainable AI at Machine Learning Interpretability (MLI) Gym! - Sri Ambati
This meetup took place in Mountain View on January 24th, 2019.
Description:
With effort and contributions from researchers and practitioners in academia and industry, machine learning interpretation has become a young sub-field of ML. However, the norms around its definition and understanding are still in their infancy, and numerous different approaches are emerging rapidly. There also seems to be a lack of a consistent explanation framework to evaluate and benchmark different algorithms - evaluating them against the interpretation, completeness and consistency of the algorithms.
The idea of the Gym is to provide a controlled, interactive environment for all forms of machine learning algorithms - initially focusing on supervised predictive modeling problems - to allow analysts and data scientists to explore, debug and generate an insightful understanding of the models by:
1. Model Validation: Ways to explore and validate black-box ML systems, enabling model comparison both globally and locally and identifying biases in the training data through interpretation.
2. What-if Analysis: An interactive environment where communication can happen, i.e. learning through interaction, with the user able to conduct "what-if" analysis on the effect of single or multiple features and their interactions.
3. Model Debugging: Ways to analyze the misbehavior of a model by exploring counterfactual examples (adversarial examples and training).
4. Interpretable Models: The ability to build natively interpretable models, with the goal of simplifying complex models to enable better understanding.
The central concept of the MLI Gym is to have an interactive environment where one can explore and simulate variations in the world (the world after a model is operationalized) beyond point estimates of the usual model metrics, e.g. ROC-AUC, confusion matrix, RMSE, R2 score and others.
Speaker's Bio:
Pramit is a Lead Data Scientist at H2O.ai. His areas of interest include building statistical/machine learning models (Bayesian and frequentist modeling techniques) to help businesses realize their data-driven goals.
Currently, he is exploring model interpretation as a means to efficiently understand the true nature of predictive models and enable model robustness and security. He believes effective model inference coupled with adversarial training could lead to trustworthy models with known blind spots. He has started an open-source project, Skater: https://github.com/datascienceinc/Skater, to address the need for model inference (the project is still in its early stages of development, but check it out - feedback is always welcome).
A Parallel Framework For Multilayer Perceptron For Human Face Recognition - CSCJournals
Artificial neural networks have already shown their success in face recognition and similar complex pattern recognition tasks. However, a major disadvantage of the technique is that it is extremely slow during training for larger classes and hence not suitable for real-time complex problems such as pattern recognition. This is an attempt to develop a parallel framework for the training algorithm of a perceptron. In this paper, two general architectures for a Multilayer Perceptron (MLP) have been demonstrated. The first architecture is All-Class-in-One-Network (ACON) where all the classes are placed in a single network and the second one is One-Class-in-One-Network (OCON) where an individual single network is responsible for each and every class. Capabilities of these two architectures were compared and verified in solving human face recognition, which is a complex pattern recognition task where several factors affect the recognition performance like pose variations, facial expression changes, occlusions, and most importantly illumination changes. Experimental results show that the proposed OCON structure performs better than the conventional ACON in terms of network training convergence speed and which can be easily exercised in a parallel environment.
Image segmentation and classification tasks in computer vision have proven to be highly effective using neural networks, specifically convolutional neural networks (CNNs). These tasks have numerous practical applications, such as in medical imaging, autonomous driving, and surveillance. CNNs are capable of learning complex features directly from images and achieving outstanding performance across several datasets. In this work, we have utilized three different datasets to investigate the efficacy of various pre-processing and classification techniques in accurately segmenting and classifying different structures within MRI and natural images. We have utilized both sample gradient and Canny edge detection methods for pre-processing, and K-means clustering has been applied to segment the images. Image augmentation improves the size and diversity of datasets for training the models for image classification. This work highlights transfer learning's effectiveness in image classification using CNNs and VGG 16, and provides insights into the selection of pre-trained models and hyperparameters for optimal performance. We have proposed a comprehensive approach for image segmentation and classification, incorporating pre-processing techniques, the K-means algorithm for segmentation, and deep learning models such as CNN and VGG 16 for classification.
Utilizing XAI Technique to Improve Autoencoder based Model for Computer Netwo... - IJCNCJournal
Machine learning (ML) and deep learning (DL) methods are being adopted rapidly, especially in computer network security, for tasks such as fraud detection, network anomaly detection, intrusion detection, and much more. However, the lack of transparency of ML- and DL-based models is a major obstacle to their adoption, and they are criticized for their black-box nature despite their impressive results. Explainable Artificial Intelligence (XAI) is a promising area that can improve the trustworthiness of these models by giving explanations and interpreting their output. If the internal working of ML- and DL-based models is understandable, this can further help to improve their performance. The objective of this paper is to show how XAI can be used to interpret the results of a DL model, the autoencoder in this case, and, based on that interpretation, to improve its performance for computer network anomaly detection. The kernel SHAP method, which is based on Shapley values, is used as a novel feature selection technique: it identifies only those features that are actually causing the anomalous behaviour of the set of attack/anomaly instances. These feature sets are then used to train and validate the autoencoder, but on benign data only. Finally, the built SHAP_Model outperformed the other two models proposed based on the feature selection method. The whole experiment was conducted on a subset of the latest CICIDS2017 network dataset. The overall accuracy and AUC of SHAP_Model are 94% and 0.969, respectively.
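As a rough sketch of the kernel-SHAP-style feature selection the abstract describes (using the open-source shap package on synthetic data; the model, background set, and top-k threshold are illustrative assumptions, not the paper's exact pipeline):

# Illustrative kernel SHAP feature-selection sketch (not the paper's exact setup).
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=15, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# Kernel SHAP explains any black-box function against a background sample.
f = lambda x: model.predict_proba(x)[:, 1]          # probability of the "anomalous" class
background = X[np.random.default_rng(0).choice(len(X), 50, replace=False)]
explainer = shap.KernelExplainer(f, background)
shap_values = explainer.shap_values(X[:20], nsamples=100)   # shape (20, n_features)

# Keep only the features that contribute most, on average, to the explained instances.
mean_abs = np.abs(shap_values).mean(axis=0)
selected = np.argsort(-mean_abs)[:5]
print("selected feature indices:", selected)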
Machine learning in science and industry — day 4 - arogozhnikov
- tabular data approach to machine learning and when it didn't work
- convolutional neural networks and their application
- deep learning: history and today
- generative adversarial networks
- finding optimal hyperparameters
- joint embeddings
Variational Continual Learning
Cuong V. Nguyen, Yingzhen Li, Thang D. Bui, Richard E. Turner
Published at International Conference on Learning Representations (ICLR) 2018
6. Types of Interpretable Methods
We can interpret a model before building it, while building it, or after it is built.
Most interpretation methods for DNNs interpret the model after it is built.
9. Attention Mechanisms
Attention mechanisms guide deep neural networks to focus on relevant input features, which makes it possible to interpret how the model made certain predictions.
[Bahdanau et al. 15] Neural Machine Translation by Jointly Learning to Align and Translate, ICLR 2015
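To make this concrete, here is a minimal NumPy sketch of Bahdanau-style additive attention; the dimensions and weight names are hypothetical, and the point is simply that the softmax weights are an inspectable quantity that can be read as "where the model looked":

import numpy as np

def additive_attention(query, keys, W_q, W_k, v):
    """Bahdanau-style additive attention (a minimal sketch).
    Returns the context vector and the attention weights; the weights
    are what we inspect for interpretation."""
    # score_i = v^T tanh(W_q q + W_k k_i)
    scores = np.tanh(query @ W_q + keys @ W_k) @ v    # one score per input time step
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                          # softmax over time steps
    context = weights @ keys                          # weighted sum of the inputs
    return context, weights

# Toy example with hypothetical sizes: 5 input time steps, 4-dim states, 8-dim scorer.
rng = np.random.default_rng(0)
keys = rng.normal(size=(5, 4))           # encoder states (one per input token)
query = rng.normal(size=4)               # decoder state
W_q, W_k, v = rng.normal(size=(4, 8)), rng.normal(size=(4, 8)), rng.normal(size=8)

context, weights = additive_attention(query, keys, W_q, W_k, v)
print(weights)   # which input tokens the model attended to for this prediction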
10. Limitation of Conventional Attention Mechanisms
Conventional attention models may allocate attention inaccurately, since they are trained in a weakly supervised manner.
The problem becomes more prominent when a task has no one-to-one mapping from inputs to the final predictions.
11. Limitation of Conventional Attention Mechanisms
This is because conventional attention mechanisms do not consider uncertainties in the model and the input, which often leads to overconfident attention allocations.
Such unreliability may lead to incorrect predictions and/or interpretations, which can have fatal consequences for safety-critical applications.
13. Uncertainty-Aware Attention (UA)
[Table: multi-class classification performance on the three health-records datasets]
14. Info-GAN
There are structures in the noise vectors that have meaningful and consistent effects on the output of the generator.
However, there is no systematic way to find these structures. The only thing affecting the generator output is the noise input, so we have no idea how to modify the noise to generate the images we expect.
15. Info-GAN
The idea is to provide a latent code that has meaningful and consistent effects on the output - a disentangled representation.
The hope is that if you keep the code the same and randomly change the noise, you get variations of the same digit.
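A minimal, hypothetical PyTorch sketch of that idea: a categorical code c is concatenated with the noise, and an auxiliary head Q is trained to recover c from the generated image, which lower-bounds the mutual information between code and output. The layer sizes and the 10-way code are illustrative assumptions, not the paper's exact architecture, and the usual GAN losses are omitted.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical sizes: 62-dim noise + a 10-way categorical code (as in the MNIST setup).
NOISE_DIM, CODE_DIM = 62, 10

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(NOISE_DIM + CODE_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, 28 * 28), nn.Tanh())
    def forward(self, z, c):
        # The latent code c is concatenated to the noise and fed to the generator.
        return self.net(torch.cat([z, c], dim=1))

class QHead(nn.Module):
    """Auxiliary network that tries to recover the code c from the generated image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(),
                                 nn.Linear(128, CODE_DIM))
    def forward(self, x):
        return self.net(x)   # logits over the categorical code

G, Q = Generator(), QHead()
z = torch.randn(16, NOISE_DIM)
c_idx = torch.randint(0, CODE_DIM, (16,))
c = F.one_hot(c_idx, CODE_DIM).float()

fake = G(z, c)
# The variational lower bound on the mutual information I(c; G(z, c)) reduces to a
# cross-entropy term for a categorical code; it is added to the usual GAN losses.
mi_loss = F.cross_entropy(Q(fake), c_idx)
mi_loss.backward()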
18. Understanding Black-Box Predictions
Given a high-accuracy black-box model and a prediction from it, can we answer why the model made that prediction?
[Koh and Liang 17] tackle this question by tracing a model's prediction through its learning algorithm and back to the training data.
To formalize the impact of a training point on a prediction, they ask the counterfactual: what would happen if we did not have this training point, or if its value were slightly changed?
[Koh and Liang 17] Understanding Black-box Predictions via Influence Functions, ICML 2017
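For reference, the central quantity of that paper (as stated in the published formulation) is the influence of up-weighting a training point z on the loss at a test point z_test:

\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}}) = -\nabla_\theta L(z_{\text{test}}, \hat\theta)^{\top} \, H_{\hat\theta}^{-1} \, \nabla_\theta L(z, \hat\theta), \qquad H_{\hat\theta} = \frac{1}{n}\sum_{i=1}^{n} \nabla_\theta^2 L(z_i, \hat\theta)

Removing z corresponds to up-weighting it by \varepsilon = -\tfrac{1}{n}, so the test loss changes by approximately -\tfrac{1}{n}\mathcal{I}_{\text{up,loss}}(z, z_{\text{test}}), which lets the counterfactual be answered without retraining.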
19. Interpretable Mimic Learning
This framework is mainly based on knowledge distillation from neural networks [Hinton et al. 15].
However, they use gradient boosting trees (GBT) instead of another neural network as the student model, since GBT satisfies their requirements for both learning capacity and interpretability.
[Che et al. 2016] Z. Che, S. Purushotham, R. Khemani, and Y. Liu. Interpretable Deep Models for ICU Outcome Prediction, AMIA 2016.
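The pipeline can be sketched in a few lines (a toy illustration with scikit-learn on synthetic data, not the paper's ICU setup or models; the idea is only that the tree student regresses on the teacher's soft predictions):

# A minimal mimic-learning sketch: fit a neural net, then train a gradient-boosted
# tree on the net's soft predictions so the tree can be inspected in its place.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

teacher = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300, random_state=0)
teacher.fit(X, y)

# Soft targets: the teacher's predicted probability of the positive class.
soft_targets = teacher.predict_proba(X)[:, 1]

# Student: a gradient-boosted tree regresses on the soft targets.
student = GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=0)
student.fit(X, soft_targets)

# The student can now be interpreted, e.g. via feature importances or partial dependence.
print(student.feature_importances_)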
20. Interpretable Mimic Learning
The resulting simple model works even better than the best deep learning model - perhaps due to suppression of overfitting.
[Che et al. 2016] Z. Che, S. Purushotham, R. Khemani, and Y. Liu. Interpretable Deep Models for ICU Outcome Prediction, AMIA 2016.
21. Visualizing Convolutional Neural Networks
Proposes a deconvolutional network (deconvnet) to map feature activations back to pixel space, and provides a sensitivity analysis to point out which regions of an image affect the decision-making process the most.
[Zeiler and Fergus 14] Visualizing and Understanding Convolutional Networks, ECCV 2014
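One of the sensitivity analyses in that paper can be sketched as an occlusion sweep (a minimal illustration; `model` is assumed to be any callable that returns class probabilities for a batch of images, and the patch/stride values are arbitrary):

import numpy as np

def occlusion_sensitivity(model, image, target_class, patch=16, stride=8):
    """Slide a grey patch over the image and record how the predicted probability
    of the target class drops; low values mark regions the prediction depends on."""
    h, w, _ = image.shape
    heatmap = np.zeros(((h - patch) // stride + 1, (w - patch) // stride + 1))
    for i, y in enumerate(range(0, h - patch + 1, stride)):
        for j, x in enumerate(range(0, w - patch + 1, stride)):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch, :] = 0.5   # grey patch
            probs = model(occluded[None])                 # assumed shape (1, n_classes)
            heatmap[i, j] = probs[0, target_class]
    return heatmap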
22. Prediction Difference Analysis
The visualization method shows which pixels of a specific input image are evidence for or against a prediction.
[Zintgraf et al. 2017] Visualizing Deep Neural Network Decisions: Prediction Difference Analysis, ICLR 2017
[Figure: evidence for (red) and against (blue) the prediction. The facial features of the cockatoo are most supportive of the decision, while parts of the body seem to constitute evidence against it.]
24. Understanding Data Through Examples
[Kim et al. 16] propose to interpret the given data by providing examples that show the full picture - majorities + minorities.
[Kim et al. 16] Examples are not Enough, Learn to Criticize! Criticism for Interpretability
47. Pilot Study with Human Subjects
Definition of interpretability: a method is interpretable if a user can correctly and efficiently predict the method's results.
Task: assign a new data point to one of the groups using 1) all images, 2) prototypes, 3) prototypes and criticisms, or 4) a small set of randomly selected images.
50. Take-home messages
• There are three types of interpretable methods, but most interpret the model after it is built.
• Criticisms and prototypes are equally important and are a step towards improving interpretability of complex data distributions.
• MMD-critic learns prototypes + criticisms; the criticisms highlight aspects of the data that are overlooked by the prototypes.
51. Discussion
• If we have insight into a dataset, can we really build a better model? Human intuition is biased and not reliable!
52. Gap in Interpretable ML research
• There is limited work explaining the operation of RNNs; most work targets CNNs. Attention mechanisms are not enough. Especially for multimodal networks (CNN + RNN), this kind of research is all the more necessary.
As a result of the success of deep learning over the past decade, many models succeed at and even surpass human performance on classification tasks. However, it still remains a secret how deep learning models actually work.
DL models are usually considered black boxes.
First and foremost, I would like to provide a bird's-eye view of XAI.
To deal with this, interpretations should be given to support the operation of DL models. However, interpretability is not a well-defined concept.
Generally speaking, interpretable methods are divided into three categories: before building the model, while building it, or after building the model. However, most interpretation methods for DNNs interpret the model after it is built.
First, when building a new model, we can use inherently interpretable building blocks.
An intuitive example is to use a sparse model, which is easy to understand. In addition, decision trees support human intuition, as we can see the decision made at each stage.
Another solution is to use an attention mechanism, since at each time step we can see the focal point in the input.
The next category is interpretation after building a model, which covers almost all of the papers in this course.
In the paper Understanding Black-box Predictions via Influence Functions, Koh and Liang address the question of why the model made a certain prediction
by tracing a model's prediction through its learning algorithm and back to the training data. To formalize the impact of a training point on a prediction, they ask the counterfactual: what would happen if we did not have this training point, or if its value were slightly changed?
In the paper Visualizing and Understanding Convolutional Networks, the authors proposed visualizing the learned representations in convolutional neural networks using deconvolution and maximally activating images.
Another paper, which most of you know, Visualizing Deep Neural Network Decisions: Prediction Difference Analysis, highlights areas in a given input image that provide evidence for or against a certain class.
The paper I am going to present today falls into a different category: interpretation before building a model.
This paper explores data analysis through examples.
Now I will introduce the paper: Examples are not Enough, Learn to Criticize! Criticism for Interpretability.
The AI community invents millions of different DL models, but essentially AI is data-driven: what we get depends on what we have. This means the model will behave according to the data we provide.
So it would be nice to know what we have before building any models.
Imagine you are given a giant dataset that contains one billion data points. Before modeling, you want to get a sense of what the data looks like. Of course you don't have time to look at all one billion images, so you might sample from this group.
A lot of images look like this.
Another group shows that this kind of image is popular.
But the problem is that prototype images don't give you the full picture. There are also groups like this, and we need to look at them to get the full picture. Then the question is: which groups should we look at?
We want to look at important minorities. Others you can ignore.
Like this one, an animal lying on a keyboard. These groups are small but not ignorable.
Or this one. They are different from the prototypes we have seen so far.
So you finally want to come up with an algorithm to efficiently select majorities and important minorities.
This paper is about an algorithm of that kind. The idea is to select not only prototypes but also important minorities. This helps humans get better insight into a complex, high-dimensional dataset.
Now, coming to the related work of this paper.
Humans tend to over-generalize, and this cartoon illustrates that over-generalization. The algorithm in this paper helps us minimize over-generalization via prototypes + criticisms.
However, examples are not enough. Relying only on examples to explain a model's behavior can lead to over-generalization and misunderstanding. Examples alone may be sufficient when the distribution of data points is "clean" - in the sense that there exists a set of prototypical examples which sufficiently represent the data. However, this is rarely the case in real-world data. For instance, fitting models to complex datasets often requires the use of regularization.
Here, "fitting models to complex datasets often requires the use of regularization" means that when training, the added regularization smooths over both the prototypical and the critical regions, so we can no longer see the real distribution of the data.
There are a number of methods to select prototypes, but none of them focus on minorities. There are outlier detection methods that consider minorities; however, they mostly focus on detecting abnormalities rather than representing the whole distribution.
Now we will explore how MMD-critic works.
Technically speaking, this work selects prototypes generated from the distribution p, and criticisms from …
Here, to measure the distance between the distributions, the authors propose to use MMD.
MMD (Maximum Mean Discrepancy) calculates the discrepancy between two distributions P and Q via this witness function. However, the function is intractable; as a result, we need to approximate it by sampling, as in this function.
To motivate this measure further, the authors draw on Bayesian model criticism and two-sample tests.
Prototypes: minimize, because the representatives will then lie close to the data.
Criticisms: maximize, because the two distributions will be far apart there.
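A minimal sketch of the two quantities involved (the empirical squared MMD and the witness function), with an RBF kernel on toy data; the greedy selection and the log-determinant diversity term used in the actual MMD-critic algorithm are omitted, and all names here are illustrative:

import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd_squared(X, Z, gamma=1.0):
    """Empirical MMD^2 between the data X and a candidate prototype set Z."""
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Z, Z, gamma).mean()
            - 2 * rbf_kernel(X, Z, gamma).mean())

def witness(points, X, Z, gamma=1.0):
    """Witness f(x) = mean_k(x, data) - mean_k(x, prototypes); large |f| marks
    regions the prototypes fail to cover, i.e. criticism candidates."""
    return rbf_kernel(points, X, gamma).mean(1) - rbf_kernel(points, Z, gamma).mean(1)

# Toy usage: pretend 5 prototypes were already selected, then rank criticisms by |witness|.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Z = X[:5]
print("MMD^2 between data and prototypes:", mmd_squared(X, Z))
scores = np.abs(witness(X, X, Z))
criticisms = X[np.argsort(-scores)[:3]]      # the 3 points least explained by the prototypes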
Now jumping to the experiments.
This paper conducts three experiments, both qualitative and quantitative.
Competitive performance with PS, a classifier algorithm that uses nearest neighbors to classify (clustering).
Measure how well people did and how quickly they gave back a response. Talking about speed first, people work fastest using prototypes (which makes sense, since the number of samples in the prototype condition is the smallest)…
Conclusion: when criticism is given together with prototypes, the human pilot study suggests that people are better able to perform a predictive task that requires the data distribution to be well explained. This suggests that criticisms and prototypes are a step towards improving interpretability of complex data distributions. (Group 3 performs best because group 2's prototypes were already known.) The fact that prototypes + criticisms work best suggests that human intuition works best when the dataset contains only prototypes + criticisms; we can therefore filter the data down to prototypes + criticisms, at which point humans have good insight and can build a better model.