Amirkabir University of Technology (Tehran Polytechnic)
Department of Computer Engineering and Information Technology
Advanced Database Course, Conference Presentation
A Review of Data Mining and Its Techniques
Supervisor: Dr. Alireza Bagheri
Azar 1395 (November 2016)
Slides in English; presented in Persian
8. DM Methods
What are classification, clustering, association rules, and regression in data mining?
(Agenda: Introduction · DM Methods · Complementary Information · Conclusion)
10. Classification
Problem: Given a training set, a labeled set of N input-output pairs D = {(x^(i), y^(i))}, 1 ≤ i ≤ N, with y ∈ {1, …, K}.
Goal: Given an input x as test data, assign it to one of the K classes.
Examples:
▸ Spam filter
▸ Shape recognition
11. Learning and Decision Boundary
Assume that the training data is perfectly linearly separable.
We seek w such that
wᵀx ≥ 0 when y = +1
wᵀx < 0 when y = −1
i.e., wᵀx_n y_n ≥ 0 for every correctly classified sample n. The perceptron criterion penalizes the misclassified samples: E(w) = −Σ_{n∈ℳ} wᵀx_n y_n, where ℳ is the set of misclassified samples.
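A minimal perceptron-style training loop for this criterion: whenever a sample violates wᵀx_n y_n > 0, nudge w toward it. The toy dataset, epoch count, and bias-absorption trick are illustrative assumptions, not part of the slides.

```python
import numpy as np

# Two well-separated Gaussian blobs (made-up data for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (30, 2)), rng.normal(2, 0.5, (30, 2))])
X = np.hstack([X, np.ones((60, 1))])   # absorb the bias term into w
y = np.array([-1] * 30 + [1] * 30)

w = np.zeros(3)
for _ in range(100):                   # epochs; separable data converges quickly
    for xn, yn in zip(X, y):
        if yn * (w @ xn) <= 0:         # misclassified (or on the boundary)
            w += yn * xn               # step along the negative gradient of E(w)
```

After convergence every training sample satisfies y_n (wᵀx_n) > 0.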
14. Margin
Which line is better to select as the boundary to provide more generalization capability? A larger margin provides better generalization to unseen data: we want a hyperplane that is farthest from all training samples. The largest-margin hyperplane has equal distances to the nearest sample of both classes.
17. Beyond Linear Separability
Noise in the linearly separable classes
Overlapping classes that can be approximately separated by a linear boundary
18. Beyond Linear Separability: Soft-Margin SVM
Soft margin: maximize the margin while trying to minimize the distance between misclassified points and their correct margin plane.
SVM with slack variables: allows samples to fall within the margin, but penalizes them.
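The slack-variable idea has a standard compact formulation (as in the SVM literature, e.g. Bishop, cited in the references): minimize the margin term plus C times the total slack.

```latex
\min_{\mathbf{w},\,b,\,\boldsymbol{\xi}} \;\;
\frac{1}{2}\lVert\mathbf{w}\rVert^{2} + C \sum_{n=1}^{N} \xi_n
\qquad \text{s.t.} \quad
y_n\left(\mathbf{w}^{\mathsf{T}}\mathbf{x}_n + b\right) \ge 1 - \xi_n,
\quad \xi_n \ge 0 .
```

Samples with ξ_n = 0 are on or outside their margin plane; 0 < ξ_n ≤ 1 means inside the margin but correctly classified; ξ_n > 1 means misclassified.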
19. Soft-Margin SVM
C is a tradeoff parameter:
a small C allows the margin constraints to be easily ignored: large margin
a large C makes the constraints hard to ignore: narrow margin
C = ∞ enforces all constraints: hard margin
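A rough sketch of how C enters training, using subgradient descent on the hinge-loss form of the soft-margin objective. The dataset, learning rate, and iteration count are made-up illustration values, not from the slides.

```python
import numpy as np

# Toy data: two Gaussian blobs (made up for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

def train_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Subgradient descent on (1/2)||w||^2 + C * sum_n max(0, 1 - y_n(w.x_n + b))."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                               # samples inside the margin
        grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = train_svm(X, y, C=1.0)
acc = np.mean(np.sign(X @ w + b) == y)
```

A larger C weights the slack penalty more heavily (narrower margin, closer to hard margin); a smaller C lets the regularizer shrink w and widen the margin.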
20. Support Vectors
Hard-margin support vectors: SVs = {x^(i) | α_i > 0}
The direction of the hyperplane can be found based only on the support vectors:
w = Σ_{i: α_i > 0} α_i y^(i) x^(i)
24. Clustering
Problem: We have a set of unlabeled data points {x^(i)}, 1 ≤ i ≤ N, and we intend to find groups of similar objects (based on the observed features).
25. K-means Clustering
Given: the number of clusters K and a set of unlabeled data 𝒳 = {x^(1), …, x^(N)}
Goal: find groups of data points 𝒞 = {𝒞_1, 𝒞_2, …, 𝒞_K}
Hard partitioning:
∀j: 𝒞_j ≠ ∅
∀i ≠ j: 𝒞_i ∩ 𝒞_j = ∅
Intra-cluster distances are small (compared with inter-cluster distances)
26. Distortion Measure
Our goal is to find 𝒞 = {𝒞_1, 𝒞_2, …, 𝒞_K} and centroids {μ_1, …, μ_K} so as to minimize the distortion
J = Σ_{k=1}^{K} Σ_{x^(i) ∈ 𝒞_k} ‖x^(i) − μ_k‖²
27. K-means Algorithm
Select K random points μ_1, μ_2, …, μ_K as the clusters' initial centroids.
Repeat until convergence (or another stopping criterion):
for i = 1 to N: assign x^(i) to the closest cluster
for k = 1 to K: update the centroid μ_k as the mean of the points assigned to 𝒞_k
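The two alternating steps above fit in a few lines of numpy. The two-blob dataset, K = 2, and the fixed iteration budget are made-up illustration choices.

```python
import numpy as np

def kmeans(X, K, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), K, replace=False)]  # K random data points
    for _ in range(iters):
        # assignment step: each point goes to its closest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: centroid = mean of the points assigned to it
        centroids = np.array([X[labels == k].mean(axis=0) for k in range(K)])
    return labels, centroids

# Two well-separated blobs (made up), so K-means should recover them.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (40, 2)), rng.normal(5, 0.5, (40, 2))])
labels, centroids = kmeans(X, K=2)
```

Note that this sketch runs a fixed number of iterations rather than testing convergence, and does not guard against a cluster becoming empty.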
30. Summary of the First Part
Data mining:
Labeled, linearly separable data →
  Hard-margin SVM: maximizing the margin
  Soft-margin SVM: handling noisy data and overlapping classes
Unlabeled data → clustering
  K-means: assigning data to clusters, centroid update
38. Apriori Algorithm
Used to find frequent item sets
Uses candidate generation, prior knowledge, a level-wise search, and a minimum support threshold
Apriori property: all nonempty subsets of a frequent item set must also be frequent.
At level k, the frequent k-item sets are found; these are then used to explore the (k+1)-item sets.
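The level-wise search can be sketched as follows; the toy transactions and the min_support threshold are made up for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    transactions = [frozenset(t) for t in transactions]
    def support(itemset):
        return sum(itemset <= t for t in transactions)
    # level 1: frequent single items
    items = {i for t in transactions for i in t}
    frequent = [{frozenset([i]) for i in items
                 if support(frozenset([i])) >= min_support}]
    k = 1
    while frequent[-1]:
        # candidate generation: join frequent k-item sets into (k+1)-item sets
        candidates = {a | b for a in frequent[-1] for b in frequent[-1]
                      if len(a | b) == k + 1}
        # Apriori property: prune candidates having an infrequent k-subset
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent[-1]
                             for s in combinations(c, k))}
        frequent.append({c for c in candidates if support(c) >= min_support})
        k += 1
    return [s for level in frequent for s in level]

txns = [{"bread", "milk"}, {"bread", "beer"},
        {"bread", "milk", "beer"}, {"milk", "beer"}]
result = apriori(txns, min_support=2)
```

With these four transactions, all three single items and all three pairs are frequent, while the triple appears only once and is rejected at the support check.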
46. Regression
Unlike the previous tasks, where classification labels are discrete, regression is used to predict a numeric, continuous value.
It models the relation between independent and dependent variables.
49. Regression (continued)
The correlation coefficient can be used to check whether the class label (response) is related to an attribute.
Many nonlinear regressions can be converted to linear form.
Logistic regression is a generalized linear model; it works with probabilities.
Decision trees become regression trees by predicting continuous values rather than class labels.
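A minimal linear-regression fit on synthetic data, showing prediction of a continuous value from an independent variable. The data, true coefficients, and noise level are made up for illustration.

```python
import numpy as np

# Synthetic data: y = 3x + 1 plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 3.0 * x + 1.0 + rng.normal(0, 0.5, 100)

A = np.column_stack([x, np.ones_like(x)])       # design matrix [x, 1]
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)  # least squares: min ||A·coef − y||²
```

The fitted slope a and intercept b should be close to the true values 3 and 1.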
52. DM Tools
Business software:
IBM Intelligent Miner
SAS Enterprise Miner
Microsoft SQL Server 2005
SPSS Clementine
…
Open-source software:
Rapid-I RapidMiner
Weka
…
53. DM Usage
Banking: financial issues; high data quality; granting loans; financial services; reducing risks; money laundering and financial damages
Marketing: massive, fast-increasing data; e-commerce; shopping patterns; service quality; customer satisfaction; advertising; discounts
Bioinformatics: laboratory information; protein structures (genes); a massive number of sequences; the need for accurate computer algorithms to analyze them
54. DM Types
Text mining
• No tables; books, articles, texts
• Semi-structured data
• Data retrieval and databases
• Keywords
• Massive data and text
Web mining
• Massive unstructured, semi-structured, and multimedia data
• Links, advertisements
• Poor quality, constantly changing
• Web structure, content, and web-usage mining
• Search engines
Multimedia mining
• Voice, image, video
• Nature of the data
• Keywords, or patterns and shapes
Graph mining
• Electronic circuits, images, the web, …
• Graph search algorithms
• Difference, index
• Social network analysis
Spatial mining
• Medical images, VLSI layers
• Location based
• Efficient techniques
56. Challenges
Individual or single-purpose systems
Scalable and interactive systems
Standardization of data mining languages
Complex data
Distributed and real-time data mining
58. References
C. M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.
Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 2nd ed., Elsevier, 2006.
J. Fürnkranz et al., Foundations of Rule Learning, Cognitive Technologies, Springer-Verlag Berlin Heidelberg, 2012.
Abraham Silberschatz, Henry F. Korth, and S. Sudarshan, Database System Concepts, 6th ed., McGraw-Hill, 2010.
Esmaeili, Mahdi, Data Mining Concepts and Techniques (in Persian), Danesh Niaz, Tir 1391 (2012).