4. What is data mining?
Data mining, also called knowledge
discovery in databases (KDD), is the
extraction of useful patterns from data
sources, e.g., databases, texts, the web,
and images.
Patterns must be:
valid, novel, potentially useful, and
understandable.
5. Why data mining now?
The data is abundant.
The data is being warehoused.
The computing power is affordable.
The competitive pressure is strong.
Data mining tools have become
available.
6. DATA WAREHOUSING
Data warehousing is the electronic storage of a large
amount of information by a business.
Warehoused data must be
stored in a manner that is
secure, reliable, easy to
retrieve and easy to manage.
7. DATABASE
A database is a collection of information
that is organized so that it can
easily be accessed, managed,
and updated.
8. DIFFERENCE BETWEEN DATA WAREHOUSING & DATA MINING
Data warehousing:
The process of compiling
and organizing data into
a common database.
It takes place in the engineering phase;
no business users
are involved.
The process of designing
how data is stored in order to
improve reporting.
Data mining:
The process of extracting
meaningful data from
a database.
It is carried out by business users
with the assistance of
engineers.
Statistical analysis:
analysts use technical tools
to query and sort
through terabytes of
data, looking for
patterns.
9. DIFFERENCE BETWEEN DATABASE & DATA MINING
Database:
It involves day-to-day
processing.
It is used to run the
business.
It provides primitive
and highly detailed
data.
It is based on the
Entity-Relationship model.
It contains current
data.
Data mining:
It is the extraction of
previously unknown
and interesting
information from raw
data.
It is driven by the
exponential growth of
data, especially in
areas such as
business.
It converts this large
wealth of data into
business intelligence.
11. Data mining applications
Marketing, customer profiling and
retention, identifying potential
customers, market segmentation.
Fraud detection:
identifying credit card fraud, intrusion
detection.
Scientific data analysis
Text and web mining
Any application that involves a large
amount of data …
12. ADVANTAGES OF DATA
MINING
Marketing/Retailing: Data mining can aid
direct marketers by providing them with useful and
accurate trends about their customers’ purchasing
behavior.
Banking/Crediting: Data mining can assist
financial institutions in areas such as credit
reporting and loan information.
13. ADVANTAGES OF DATA
MINING (cont…)
Law enforcement: Data mining can aid
law enforcers in identifying criminal suspects
as well as apprehending these criminals by
examining trends in location, crime type, habits,
and other patterns of behavior.
Researchers: Data mining can assist
researchers by speeding up their data analysis
process, thus allowing them more time to work on
other projects.
14. DISADVANTAGES OF DATA
MINING
Privacy issues: For example, according to
the Washington Post, in 1998 CVS sold its
patients' prescription purchases to a different
company ….
Security issues: Although companies have
a lot of personal information about us available
online, they do not have sufficient security systems
in place to protect that information.
15. DISADVANTAGES OF DATA
MINING (cont….)
Misuse of information: Some
companies will answer your phone call based on your
purchase history. If you have spent a lot of money
or bought
a lot of products from one company, your call will be
answered very soon. So you should not assume that
your call is being answered in the order in
which it was received.
16. DATA MINING TECHNIQUES
Classification:
Classification is learning rules that can be applied to
new data and will typically include following steps:
preprocessing of data, model design,
learning/feature selection, and evaluation/validation.
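The steps above can be sketched with a very small classifier. This is only an illustration under stated assumptions: the toy data, the feature meanings, and the choice of a nearest-centroid model are all hypothetical and not from the slides, and feature selection is skipped because there are only two features.

```python
# Toy labeled data (hypothetical): (income, purchases) -> spending label.
train = [((20.0, 1.0), "low"), ((25.0, 2.0), "low"),
         ((80.0, 9.0), "high"), ((90.0, 8.0), "high")]
test = [((22.0, 1.5), "low"), ((85.0, 9.5), "high")]


def fit_scaler(rows):
    """Preprocessing: learn the per-feature min and max from training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(row, bounds):
    """Rescale each feature with the learned bounds (training data maps to [0, 1])."""
    return tuple((x - lo) / (hi - lo) if hi > lo else 0.0
                 for x, (lo, hi) in zip(row, bounds))


def fit_centroids(rows, labels):
    """Learning: compute the mean feature vector (centroid) of each class."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return {label: tuple(sum(col) / len(col) for col in zip(*members))
            for label, members in groups.items()}


def predict(row, centroids):
    """Apply the learned rule: choose the class whose centroid is nearest."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist2(row, centroids[label]))


bounds = fit_scaler([x for x, _ in train])
centroids = fit_centroids([scale(x, bounds) for x, _ in train],
                          [y for _, y in train])

# Evaluation/validation: measure accuracy on data not used for learning.
predictions = [predict(scale(x, bounds), centroids) for x, _ in test]
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
print(predictions, accuracy)
```

The scaler and centroids are fitted on the training rows only, so the held-out rows give an honest (if tiny) estimate of how the rule behaves on new data.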
Association:
Association is looking for relationships between
variables.
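One common way to quantify such a relationship is with support and confidence over a set of transactions. The sketch below assumes a hypothetical shopping-basket data set; the item names and the helper functions `support` and `confidence` are illustrative only.

```python
# Hypothetical shopping baskets; each transaction is a set of items.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing `antecedent`, the fraction that also
    contain `consequent`."""
    return (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))


rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
print(rule_support, rule_confidence)
```

Here the rule "bread implies butter" holds in two of the three baskets that contain bread, so a miner would keep or discard it depending on the thresholds it sets for support and confidence.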
Clustering:
Clustering is identifying similar groups from
unstructured data.
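As a rough illustration of grouping similar records, here is a minimal k-means sketch. K-means is only one of many clustering methods and is chosen here as an assumption; the points and the starting centroids are hypothetical.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(col) for col in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

# Deterministic start: use the first and last point as initial centroids.
centroids, clusters = kmeans(points, [points[0], points[-1]])
print(centroids)
print(clusters)
```

No labels are supplied anywhere: the two groups emerge only from the distances between points, which is what separates clustering from classification.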
17. DATA MINING
TECHNIQUES (cont…)
Sequential pattern mining:
A sequential rule A → B says that event A will be
immediately followed by event B with a certain
confidence.
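The confidence of such a rule can be estimated from an event log. The sketch below assumes one particular definition (occurrences of A immediately followed by B, divided by all occurrences of A); the event sequence and the function name `rule_confidence` are hypothetical.

```python
def rule_confidence(events, a, b):
    """Estimate the confidence of the sequential rule a -> b:
    how often an occurrence of `a` is immediately followed by `b`."""
    occurrences = sum(1 for e in events if e == a)
    if occurrences == 0:
        return 0.0  # the rule was never triggered
    followed = sum(1 for x, y in zip(events, events[1:])
                   if x == a and y == b)
    return followed / occurrences


# Hypothetical event log.
events = ["A", "B", "C", "A", "B", "A", "C", "B"]

conf_a_b = rule_confidence(events, "A", "B")
print(conf_a_b)
```

In this log A occurs three times and is immediately followed by B twice, so the rule A → B holds with a confidence of about 0.67.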
Deviation detection:
Deviation detection is discovering the most significant changes in data.
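One simple way to flag significant changes is to measure how far each value lies from the mean in units of standard deviation. This is only a sketch under assumptions: the z-score approach, the threshold of 2.0, and the sales figures are all illustrative choices, not from the slides.

```python
from statistics import mean, pstdev


def deviations(values, threshold=2.0):
    """Return (index, value) pairs whose distance from the mean exceeds
    `threshold` population standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing deviates
    return [(i, v) for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]


# Hypothetical daily sales with one unusual day.
daily_sales = [100, 102, 98, 101, 99, 250, 100, 103]

flagged = deviations(daily_sales)
print(flagged)
```

Only the sixth day stands out, because every other value stays well within one standard deviation of the mean.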
Regression:
Regression is finding functions with minimal error
to model data.
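For the simplest case, a straight line fitted by least squares, the function with minimal squared error has a closed form. The sketch below assumes a single input variable; the data points and the function name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that lies exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
squared_error = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys))
print(slope, intercept, squared_error)
```

With noisy real data the squared error would not reach zero; the fitted line is simply the one that makes it as small as possible.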