2. CONTENT
01 WHAT IS MINING?
02 TYPES OF DATA MINING
03 TASKS OF DATA MINING
04 APPLICATION
05 ARCHITECTURE OF DATA MINING
3. WHAT IS DATA MINING?
Data mining refers to extracting or mining knowledge from large amounts of
data.
Thus, data mining would more appropriately have been named knowledge
mining, a term that emphasizes mining knowledge from large amounts of data.
It is the computational process of discovering patterns in large data sets,
using methods at the intersection of artificial intelligence, machine learning,
statistics, and database systems.
The overall goal of the data mining process is to extract information from a
data set and transform it into an understandable structure for further use.
5. Types of data that can be mined
1. Data stored in the database
• A database system is also called a database management system (DBMS).
• Every DBMS stores a collection of data that are related to one another in
some way.
• It also has a set of software programs that are used to manage the data and
provide easy access to it.
• These software programs define the database structure, keep the stored
information secure and consistent, and manage different types of data
access, such as shared, distributed, and concurrent.
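As a minimal sketch of related data managed through a DBMS, the example below uses SQLite via Python's standard sqlite3 module. The table names, columns, and rows are hypothetical and invented purely for illustration.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two related tables (hypothetical schema): each purchase refers to a customer.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE purchase ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customer(id), "
    "amount REAL NOT NULL)"
)

# Hypothetical sample rows.
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany(
    "INSERT INTO purchase VALUES (?, ?, ?)",
    [(1, 1, 20.0), (2, 1, 5.0), (3, 2, 12.5)],
)
conn.commit()

# The DBMS resolves the relationship between the tables at query time.
rows = conn.execute(
    "SELECT c.name, SUM(p.amount) "
    "FROM customer c JOIN purchase p ON p.customer_id = c.id "
    "GROUP BY c.id ORDER BY c.name"
).fetchall()
```

Here rows holds one (name, total amount) pair per customer, which is the kind of structured access that a mining step would build on.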
6. 2. Data warehouse
• It is a single data storage location that collects data from multiple
sources and stores it under a unified schema.
• When data is stored in a data warehouse, it undergoes cleaning,
integration, loading, and refreshing.
• Data stored in a data warehouse is organized around major subjects.
• Historical data, for example data stored 6 or 12 months ago, is
typically provided in summarized form.
7. 3. Transactional data
• A transactional database stores records that are
captured as transactions.
• Examples of transactions include a flight booking,
a customer purchase, and a click on a website.
• Every transaction record has a unique ID. It
also lists the items that make up the
transaction.
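A transaction record of this kind can be sketched as follows. The Transaction class, its field names, and the sample data are assumptions made for illustration and do not describe any particular database product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transaction:
    trans_id: str   # unique identifier of the transaction
    items: tuple    # items that make up the transaction

# Hypothetical sample data, invented for illustration only.
transactions = [
    Transaction("T100", ("bread", "milk")),
    Transaction("T200", ("bread", "butter", "milk")),
    Transaction("T300", ("eggs",)),
]

# Check that every transaction ID is unique.
ids = [t.trans_id for t in transactions]
ids_are_unique = len(ids) == len(set(ids))

# Look up the items of one transaction by its ID.
by_id = {t.trans_id: t for t in transactions}
items_t200 = by_id["T200"].items
```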
8. 4. Other types of data
• Many other types of data are also mined. They are known for
their structure, semantic meaning, and versatility, and they are
used in many applications.
• Examples include data streams, engineering design data,
sequence data, graph data, spatial data, and multimedia data.
9. TASKS OF DATA MINING
Association rule learning (Dependency modelling) –
• Searches for relationships between variables.
• For example, a supermarket might gather data on customer purchasing
habits. Using association rule learning, the supermarket can determine
which products are frequently bought together and use this information
for marketing purposes. This is sometimes referred to as market basket
analysis.
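A minimal sketch of market basket analysis, restricted to pairs of items, is shown below. Support and confidence are the two standard measures for association rules. The baskets are hypothetical, and real systems typically use algorithms such as Apriori to handle larger itemsets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

item_counts = Counter()   # how many baskets contain each item
pair_counts = Counter()   # how many baskets contain each pair of items
for basket in baskets:
    item_counts.update(basket)
    pair_counts.update(combinations(sorted(basket), 2))

def support(a, b):
    """Fraction of all baskets that contain both a and b."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)

def confidence(antecedent, consequent):
    """Fraction of baskets containing the antecedent that also contain the consequent."""
    both = pair_counts[tuple(sorted((antecedent, consequent)))]
    return both / item_counts[antecedent]

bread_milk_support = support("bread", "milk")
butter_to_bread = confidence("butter", "bread")
bread_to_butter = confidence("bread", "butter")
```

In this toy data, every basket with butter also contains bread, so the rule "butter implies bread" has higher confidence than the reverse rule.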
Clustering
• It is the task of discovering groups and structures in the data that are in
some way or another "similar", without using known structures in the
data.
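One common way to discover such groups is k-means, which is used here as an illustrative choice and is not a method named in the slides. The sketch below runs it on one-dimensional data, and the values and starting centers are assumptions.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on 1-D data; returns (centers, clusters)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical data with two visibly separate groups.
values = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
```

No labels are supplied. The two groups emerge only from the distances between the values.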
10. Classification –
• It is the task of generalizing known structure to apply to new
data.
• For example, an e-mail program might attempt to classify an e-
mail as "legitimate" or as "spam".
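The spam example can be sketched with a small naive Bayes classifier, which is one possible technique among many. The training messages are hypothetical, and a real filter would need far more data and preprocessing.

```python
import math
from collections import Counter

# Hypothetical labelled messages (the "known structure").
training = [
    ("win cash prize now", "spam"),
    ("cheap prize offer", "spam"),
    ("meeting agenda for monday", "legitimate"),
    ("project report attached", "legitimate"),
]

word_counts = {"spam": Counter(), "legitimate": Counter()}
label_counts = Counter()
for text, label in training:
    label_counts[label] += 1
    word_counts[label].update(text.split())

vocabulary = {word for counts in word_counts.values() for word in counts}

def classify(text):
    """Return the label with the highest naive Bayes log-score."""
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        # Log prior of the label.
        score = math.log(label_counts[label] / sum(label_counts.values()))
        for word in text.split():
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

label_a = classify("cash prize")
label_b = classify("monday project meeting")
```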
Regression
• It attempts to find a function which models the data with the
least error.
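A minimal sketch of regression is shown below, assuming that "least error" means least squared error and that the function is a straight line. The data points are hypothetical and lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

# Use the fitted function to predict a value for unseen input.
predicted_at_5 = slope * 5.0 + intercept
```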
Summarization
• It provides a more compact representation of the data set,
including visualization and report generation.
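As a minimal sketch of summarization, the example below reduces a list of records to a few per-group statistics. The sales records and region names are hypothetical.

```python
import statistics
from collections import defaultdict

# Hypothetical (region, amount) records.
sales = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 100.0),
    ("south", 60.0),
    ("north", 110.0),
]

# Group the amounts by region.
by_region = defaultdict(list)
for region, amount in sales:
    by_region[region].append(amount)

# Replace each group with a compact summary.
summary = {
    region: {
        "count": len(amounts),
        "total": sum(amounts),
        "mean": statistics.mean(amounts),
    }
    for region, amounts in by_region.items()
}
```

A report or chart would then be generated from summary instead of from the raw records.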
11. APPLICATIONS
1. Healthcare
• It can be used to identify best practices based on data and analytics,
which can help healthcare facilities to reduce costs and improve
patient outcomes.
• Data mining, along with machine learning, statistics, data
visualization, and other techniques, can be used to forecast
patient volumes in different categories.
• This helps ensure that patients receive intensive care when and
where they need it.
• Data mining can also help healthcare insurers to identify fraudulent
activities.
12. 2. Education
• The use of data mining in education aims to develop techniques that
explore data from educational environments to discover knowledge.
• It provides techniques to study how educational support impacts
students, to support the future learning needs of students, and to
promote the science of learning, among others.
• Educational institutions can use these techniques not only to
predict how students will perform in examinations but also to
make more accurate decisions.
• With this knowledge, these institutions can focus more on their
teaching pedagogy.