This document discusses interpreting deep neural networks (DNNs) using decision trees. It describes experiments on several datasets comparing neural networks with 1-15 hidden layers against decision trees built from the outputs of their hidden layers. The results show that the decision trees approximate the neural networks with less than 1% difference in accuracy in most cases. Tree size generally decreases as hidden layers are added, though for some problems it stops shrinking or even grows. The work proposes using tree size as a criterion for determining how many hidden layers a given problem needs. Future work includes analyzing the effects of training parameters and testing on larger datasets.
Interpreting Deep Neural Networks Based on Decision Trees
1. Interpreting Deep Neural Networks Based on Decision Trees
University of Aizu
System Intelligence Laboratory
s1240183 Tsukasa Ueno
Supervised by Qiangfu Zhao
3. Background
・Since the 1980s, neural networks (NNs) have been studied and used successfully to solve many problems.
・Since the 2010s, deep neural networks (DNNs) have attracted attention with good results.
・DNNs are becoming a core of machine learning.
・Image recognition, voice recognition, anomaly detection
4. Background
・However, it is difficult for humans to understand why a DNN produces its outputs.
・This is called "the black-box problem".
・Therefore, it is difficult to apply DNNs to problems that must be handled carefully
・Medicine, politics, judicature, etc.
5. Background
・3 types of approaches to interpretation
・Decompositional approach [1]
・Transforms each neuron, one by one, into a logic formula
・Computational cost is high (exponential in the number of inputs)
・Pedagogical approach [2][3]
・Uses the trained NN as a teacher and trains another, interpretable model such as a decision tree
・Computational cost is low, but generalization ability is poor
[1] H. Tsukimoto, "Extracting rules from trained neural networks," IEEE Transactions on Neural Networks, Vol. 11, No. 2, pp. 377-389, 2000.
[2] S. Ardiansyah, M. A. Majid, and J. M. Zain, "Knowledge of extraction from trained neural network by using decision tree," 2nd IEEE International Conference on Science in Information Technology (ICSITech), pp. 220-225, 2016.
[3] M. Sato and H. Tsukimoto, "Rule extraction from neural networks via decision tree induction," Proceedings of the International Joint Conference on Neural Networks (IJCNN'01), No. 3, pp. 1870-1875, 2001.
6. Background
・The third approach
・Eclectic approach
・Combines the decompositional and pedagogical approaches
・Strikes a balance between computational cost and performance
・Our approach belongs to this category
・The pedagogical approach treats the whole NN as the teacher
・Our approach treats the outputs of a hidden layer as the teacher
7. Experiment
・This experiment tries to interpret a DNN using a decision tree (DT).
・DTs are known as interpretable models.
・We create a DT from the outputs of the hidden neurons of the DNN, as sketched below.
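As a concrete illustration, here is a minimal sketch of fitting a scikit-learn decision tree on the activations of a chosen hidden layer. The arrays H and y are illustrative stand-ins, not from the slides; in the actual experiment they would hold real hidden-layer activations and dataset labels.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# H: activations of the chosen hidden layer, shape (n_samples, n_hidden)
# y: class labels, shape (n_samples,) -- toy stand-ins below
rng = np.random.default_rng(0)
H = rng.standard_normal((200, 14))
y = (H[:, 0] > 0).astype(int)

dt = DecisionTreeClassifier(random_state=0)
dt.fit(H, y)
print("accuracy on training data:", dt.score(H, y))
print("tree size (node count):", dt.tree_.node_count)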
8. Experiment
・A preceding study used 1-5 hidden layers
・A DT extracted from a hidden layer closer to the output layer can be more accurate
・And the DT can be simpler, in the sense that the number of nodes is smaller
・This shows the possibility of extracting more accurate and more understandable knowledge from a well-trained DNN.
・This study is an extension of the preceding study.
・Here, we study NNs with 1-15 hidden layers using more databases
・5 layers were not enough to observe the trend
9. Experiment
・Experimental flow (see the sketch after this list)
・Step 1: Train the NN using back-propagation
・Step 2: Create a DT from the outputs of the hidden layer closest to the output layer
・Step 3: Add a new hidden layer between the existing layers and the output layer
- Before that, we fix the weights of the existing layers
・We repeat these steps until the number of hidden layers reaches 15
・We would like to confirm how much the accuracy of the DNNs differs from that of the DTs
・and whether the tree size depends on the number of hidden layers.
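A minimal sketch of this three-step loop, assuming PyTorch and scikit-learn, with Tanh standing in for the bi-polar sigmoid and synthetic data in place of the UCI datasets; all names are illustrative, not the authors' code.

import torch
import torch.nn as nn
from sklearn.tree import DecisionTreeClassifier

torch.manual_seed(0)
X = torch.randn(200, 14)                  # toy stand-in for a UCI dataset
y = (X[:, 0] > 0).long()                  # synthetic binary labels
n_features, n_classes = X.shape[1], 2

def train(model, X, y, epochs=200, lr=0.05):
    # Step 1: back-propagation with plain SGD (only unfrozen parameters)
    params = [p for p in model.parameters() if p.requires_grad]
    opt = torch.optim.SGD(params, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()

hidden = [nn.Sequential(nn.Linear(n_features, n_features), nn.Tanh())]
for L in range(1, 16):                    # 1 to 15 hidden layers
    model = nn.Sequential(*hidden, nn.Linear(n_features, n_classes))
    train(model, X, y)
    with torch.no_grad():                 # Step 2: DT from the last hidden layer
        H = nn.Sequential(*hidden)(X).numpy()
    dt = DecisionTreeClassifier(random_state=0).fit(H, y.numpy())
    print(L, "hidden layers -> tree size:", dt.tree_.node_count)
    for p in model.parameters():          # Step 3: freeze existing layers, then
        p.requires_grad_(False)           # append a new hidden layer
    hidden.append(nn.Sequential(nn.Linear(n_features, n_features), nn.Tanh()))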
10. Experiment
・Datasets
・From the UCI Machine Learning Repository

Data        Classes  Features  Instances
australian     2        14        690
cancer         2        24        683
german         2        24       1000
BHP            4        22       1075
statlog        7        19       2310
wine           3        13        178
11. Experiment
・NN settings
・Number of hidden layers: 1 to L_max (in this study, L_max = 15)
・Activation function: bi-polar sigmoid (defined below)
・Solver: SGD
・Learning rate: 0.05
・Number of hidden neurons: same as the number of features of the data
・Validation: 10-fold cross-validation
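For reference, the bi-polar sigmoid maps inputs to (-1, 1); a one-line NumPy version, algebraically identical to tanh(x / 2):

import numpy as np

def bipolar_sigmoid(x):
    # maps to (-1, 1); equal to np.tanh(x / 2)
    return 2.0 / (1.0 + np.exp(-x)) - 1.0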
13. Result (NN)
・From this result, deep NNs do not improve the performance significantly compared with shallow NNs.
・The only exception is the dataset BHP
・For this dataset, the accuracy becomes approximately 100% when the number of hidden layers is above 6.
17. Result (difference between NN and DT)
・The difference in most cases is smaller than 1%
・This means that the DTs can approximate the original NNs very closely.
19. Result (tree size)
・The size decreases as the number of hidden layers increases
・When the number of hidden layers reaches a certain point, however, the tree size often stops changing.
・In some cases, the tree size may even increase.
35. Discussion
・The performance of the NNs is almost the same as that of the DTs.
・When there are enough hidden layers, the tree size does not decrease anymore.
・We can use the tree size as a criterion to determine the number of layers needed for solving a given problem (a sketch of this criterion follows).
・For example, for most of the datasets considered here, the proper number of hidden layers should be less than 6 or 7.
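One possible way to operationalize this criterion (an assumption, not from the slides): stop adding layers once the tree size no longer shrinks meaningfully. The tolerance and the example sizes below are illustrative.

def layers_needed(sizes, tol=0.05):
    # sizes[i] = DT node count when the NN has i + 1 hidden layers
    for i in range(1, len(sizes)):
        if sizes[i - 1] - sizes[i] < tol * sizes[i - 1]:
            return i              # plateau begins: i hidden layers suffice
    return len(sizes)

print(layers_needed([120, 80, 55, 40, 39, 38, 38]))  # -> 4 with these toy numbers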
36. Future Work
・Investigate the effect of training parameters
・number of hidden neurons per layer
・number of epochs
・Experiment with larger datasets or datasets for regression
・Define the meaning of the hidden neurons' outputs