SlideShare a Scribd company logo
1 of 42
Follow Us:1
Importance of Data Mining in
IT Industry
Follow Us:2
Data , Data everywhere..
Follow Us:3
Introduction
 What is Data?
Facts and statistics collected together for
reference or analysis.
As per Moore’s Law,
The information density on silicon integrated
circuits double every 18 to 24 months.
Follow Us:4
 There has been a huge increase in the amount
of data being stored in database over past
twenty years.
 All that user wants is more sophisticated
information.
 This surges up the demand of Data mining.
Follow Us:5
What is Data mining?
 Data mining refers to extracting or “mining”
Knowledge from large amount of data.
- By Jiawei Han, Micheline Kamber,
Follow Us:6
 Data mining focuses on extraction of information
from a large set of data and transforms it into an
easily interpretable structure for further use.
 As data is growing at very remarkable rate, there
comes a need to analyze large, complex and
information rich data sets to gain the hidden
information. This may result into greater customer
satisfaction and remarkable turn over for the firm.
Follow Us:
Why Data mining?
7
 I am not able to find the data I need.
(Data is dispersed over the network)
 The data I have access to is poorly documented.
(Proper information is missing)
 I am not able to use the data I have.
(Unexpected results)
Follow Us:
What is Data Warehouse?
8
 Data Warehouse is an integrated, subject oriented
and time-varying collection of data that is used
primarily in organizational decision making.
-Bill Inmon, Building the Data Warehouse 1996
 Data Warehousing:
Data Warehousing is a process of managing data
from various sources in order of answering the
Business questions.
Follow Us:9
Examples of Data warehousing:
 Wal-Mart – Terabytes (10^12 bytes)
 Geographic Information System – Peta bytes(10^15 bytes)
 National Medical Records – Extra bytes (10^18 bytes)
 Weather Images- Zettabyte(10^21 bytes)
 Intelligent video agency- Yottabyte (10^24 bytes)
Follow Us:
How data mining works with
warehouse data?
10
 Data warehouse avails enterprise with memory.
 Data warehouse avails enterprise with intelligence.
Follow Us:
Example of Data mining in IT:
11
 A company that offers online services in several
countries faces a problem in times of loading the
homepage. Even IT department employees could
not solve the bug. Based on customer feedback and
last mile measurements, company gets to know that
IT services are suffering from insufficient time for
loading the homepage for some users.
Follow Us:12
 Management has the high interest in solving
the problem because the response time of the
site directly affects the online sales, IT
department has already looked in to log files
with the help of mining tool.
 It has been discovered that there’s no such
single type of typical problem. Instead data
mining program discovers that there are
various problem clusters.
Follow Us:13
 The patterns for this problem is determined by a
combination of country code , operating system, and
the version of a browser that the customers use.
 One another pattern shows that the same bad
performance in loading times of the website is
caused by a different problem.
Follow Us:14
 Speed of internet and cookie settings also lead
to greater time for loading the site. This is how
data mining helps in indentifying the problem
and hence solving the problem becomes quick.
 This is just one example of how data mining
can be so useful.
Follow Us:15
Data mining Models and Tasks
Data mining is widely divided into two
parts:
 Predictive Data mining
 Descriptive Data mining
Follow Us:16
Follow Us:17
Predictive Data mining
 The objective of predictive tasks is to use the
values of some variable to predict the values
of other variable.
 Ex: Web mining is used by the online
marketers to predict the purchase by online
user on a website.
Follow Us:18
 Classification is used to map data in a predefined
groups.
 Regression maps a data item to a real valued
prediction variable.
 Clustering forms a similar data together.
 Summarization is used to map data in a subsets.
 Link Analysis defines relationships among data.
Follow Us:19
Descriptive Data mining
 The objective of descriptive tasks is to find
human readable patterns which describes the
relationships between data.
Follow Us:20
Architectures of a data mining
system
There are four Data mining architecture:
I. No- Coupling
II. Loose Coupling
III.Semi tight Coupling
IV.Tight coupling
Follow Us:21
• In this architecture, data mining system doesn’t use
any functionality of a database or data warehouse
system.
• Data is retrieved from data sources like file system and
processed using data mining algorithms which are
stored into file system.
• This architecture is considered as a poor architecture
for data mining system as it does not take any
advantages of database or data warehouse.
• However it is used for simple data mining
processes
No-coupling:
Follow Us:22
 The loose coupling data mining system uses
database or data warehouse for data retrieval.
 In this architecture, data mining system retrieves
data from database or data warehouse, processes
data using data mining algorithms and stores the
result in those systems.
 Loose coupling architecture is for memory-based
data mining system which does not require high
scalability and high performance.
Loose Coupling:
Follow Us:23
In semi-tight coupling data mining architecture, it
not only links it to database or data warehouse
system, but it also uses several features of
database or data warehouse systems which
perform some data mining tasks like sorting and
indexing etc. Moreover the intermediate result can
also be stored in database or data warehouse
system for better performance.
Semi-tight Coupling:
Follow Us:24
Tight-coupling:
 In this architecture, database or data warehouse is
treated as an information retrieval component.
 Tight-coupling data mining architecture provides
scalability, high performance and integrated
information.
Follow Us:25
Three tiers of Tight-coupling
Data mining Architecture:
i. Data Layer
ii. Data mining application layer
iii.Front-end Layer
Follow Us:26
Data layer can be database or data warehouse
systems which is an interface for all data
sources. It stores results in data layer so it can
be presented to end-user in form of reports.
Datalayer
Follow Us:27
This layer is used to retrieve data from
database. And by performing transformation
routine, we can get data into desired format.
Then data is processed using various data
mining algorithms.
Data mining application layer
Follow Us:28
Front-end layer provides visceral and user
friendly interface for end-user to interact with
data mining system.
The result of data mining is presented in
visualization form in front-end layer.
Front-end layer:
Follow Us:29
 Although data mining is still in its early stage;
companies in a wide range of industries such
as retail, heath care, manufacturing, finance,
and transportation are already using data
mining techniques to take advantage of
historical data.
 Data mining helps analysts to recognize
significant facts, relationships, trends, patterns
and anomalies which might go unnoticed
otherwise.
Application Area of Data mining:
Follow Us:30
 Data mining uses pattern recognition technologies
and mathematical techniques to sift through
warehoused information.
 In business, data mining is useful for discovering
patterns and relationships in data to help make
better decisions.
 Data mining helps in developing smarter
marketing campaigns and to predict customer
loyalty.
Follow Us:31
Advantages of Data Mining
 Marketing /Retail :
 Data mining helps marketing companies to build
models based on historical data which will precisely
predict responders to the new marketing campaigns.
 Marketers will have appropriate approach for targeted
customers.
 Data mining helps retail companies as well. By using
market basket analysis, a store can have an
appropriate arrangement in such a way that customers
can purchase frequent buying products together with
pleasant. It also helps the retail companies to offer
certain discounts which will attract more customers.
Follow Us:32
 Finance / Banking
By building a model from historical customer’s
data of loans, the bank officials and financial
institution can determine good and bad loans.
Data mining also helps banks to detect fraudulent
credit card transactions.
Follow Us:33
 Manufacturing
Data mining is useful in operational engineering
data which can detect faulty equipments and
determines optimal control parameters.
Data mining can determine the range of control
parameters which leads to the production of perfect
product. Hence optimal control parameters can
provide the desired quality.
Follow Us:34
 Governments
Data mining helps government agency to analyze
records of financial transaction which will help in
building patterns that can detect money laundering
or criminal activities.
Follow Us:35
 Market segmentation - Data mining helps to
identify the common characteristics of customers
who buy the same products from your company.
 Customer anticipation - It helps to predict which
customers may leave your company and go to a
competitor.
 Fraud detection - It indentifies which transactions
are most likely to be fraudulent.
Specific use of data mining
include:
Follow Us:36
 Direct marketing - Direct marketing identifies
which prospects should be included to obtain the
highest response rate.
 Interactive marketing - It is useful for predicting
what each user on a Web site is most likely
interested in seeing.
 Market basket analysis - It helps to understand
what products or services are commonly
purchased together.
 Trend analysis - Trend analysis identifies the
difference between a typical customer this month
and last.
Follow Us:37
Disadvantages of data mining
Privacy Issues
 The internet is booming with social networks, e-
commerce, blogs etc, the concerns about the
personal privacy has been increasing.
 This worries the users as the information might be
collected and used in unethical way which can
potentially cause a lot of troubles.
 Businesses collect the information of its users for
setting up the marketing strategies but there are
chances that business might be taken by other firms
or gets shut down and that’s where a concern of
misusing or leaking the personal information arises.
Follow Us:38
 Security issues
Security is the biggest concern in data mining.
Businesses own all the information of their
employees which even includes personal and
financial information, there are the chances of
misusing data by hackers and which cause
serious trouble to the organization and its
employees.
Follow Us:39
 Misuse of information/inaccurate information
The information collected by the data mining may
be exploited by unethical people or businesses in
order to take benefits of vulnerable people.
Data mining techniques is not totally accurate. So
inaccurate information may lead the wrong
decision-making which may cause serious
consequences.
Follow Us:40
Challenges of Data Mining
 Scalability: Scalable techniques are needed for
handling massive size of datasets that are now
being created.
 Poor efficiency: Such large datasets require the
use of efficient methods for storing, indexing, and
retrieving data from secondary or even tertiary
storage systems.
 In order to perform the data mining task in timely
manner, data mining techniques are needed.
Follow Us:41
 Complexity: Such techniques can dramatically
increase the size of the datasets which can be
handled and for that it requires new designs and
algorithms.
 Dimensionality: Some domains have number of
dimensions which are very large and makes the data
analyzing difficult. That is called as 'curse of
dimensionality’. For example, bioinformatics.
Follow Us:42

More Related Content

What's hot

Data Visualization
Data VisualizationData Visualization
Data Visualizationsimonwandrew
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data scienceMahir Haque
 
Data Mining & Applications
Data Mining & ApplicationsData Mining & Applications
Data Mining & ApplicationsFazle Rabbi Ador
 
Data mining concepts and work
Data mining concepts and workData mining concepts and work
Data mining concepts and workAmr Abd El Latief
 
Data warehousing and data mart
Data warehousing and data martData warehousing and data mart
Data warehousing and data martAmit Sarkar
 
Knowledge Discovery and Data Mining
Knowledge Discovery and Data MiningKnowledge Discovery and Data Mining
Knowledge Discovery and Data MiningAmritanshu Mehra
 
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...Simplilearn
 
Data Mining : Concepts
Data Mining : ConceptsData Mining : Concepts
Data Mining : ConceptsPragya Pandey
 
Importance of data analytics for business
Importance of data analytics for businessImportance of data analytics for business
Importance of data analytics for businessBranliticSocial
 
kinds of analytics
kinds of analyticskinds of analytics
kinds of analyticsBenila Paul
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processingVijayasankariS
 
Introduction of Data Science
Introduction of Data ScienceIntroduction of Data Science
Introduction of Data ScienceJason Geng
 

What's hot (20)

Data Visualization
Data VisualizationData Visualization
Data Visualization
 
Market baasket analysis
Market baasket analysisMarket baasket analysis
Market baasket analysis
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 
Data Mining & Applications
Data Mining & ApplicationsData Mining & Applications
Data Mining & Applications
 
Data mining concepts and work
Data mining concepts and workData mining concepts and work
Data mining concepts and work
 
Data warehousing and data mart
Data warehousing and data martData warehousing and data mart
Data warehousing and data mart
 
Data analytics
Data analyticsData analytics
Data analytics
 
web mining
web miningweb mining
web mining
 
Knowledge Discovery and Data Mining
Knowledge Discovery and Data MiningKnowledge Discovery and Data Mining
Knowledge Discovery and Data Mining
 
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beg...
 
Data Mining : Concepts
Data Mining : ConceptsData Mining : Concepts
Data Mining : Concepts
 
Data mining
Data mining Data mining
Data mining
 
Importance of data analytics for business
Importance of data analytics for businessImportance of data analytics for business
Importance of data analytics for business
 
Three Big Data Case Studies
Three Big Data Case StudiesThree Big Data Case Studies
Three Big Data Case Studies
 
kinds of analytics
kinds of analyticskinds of analytics
kinds of analytics
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processing
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
Data mining
Data miningData mining
Data mining
 
Introduction of Data Science
Introduction of Data ScienceIntroduction of Data Science
Introduction of Data Science
 
Introduction data mining
Introduction data miningIntroduction data mining
Introduction data mining
 

Viewers also liked

Significance of Data Mining
Significance of Data MiningSignificance of Data Mining
Significance of Data Mining8trackweb
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSINGKing Julian
 
The Importance Of Data Mining By Musa Mohd. Nordin, Noor
The Importance Of Data Mining By Musa Mohd. Nordin, NoorThe Importance Of Data Mining By Musa Mohd. Nordin, Noor
The Importance Of Data Mining By Musa Mohd. Nordin, Noormuzkara
 
Importance of Normalization
Importance of NormalizationImportance of Normalization
Importance of NormalizationShwe Yee
 
Bpr implementation process an analysis of key success & failure factors
Bpr implementation process an analysis of key success & failure factorsBpr implementation process an analysis of key success & failure factors
Bpr implementation process an analysis of key success & failure factorsSana Fatima
 
Top10 algorithms data mining
Top10 algorithms data miningTop10 algorithms data mining
Top10 algorithms data miningAsad Ahamad
 
Logistics strategy & planning, Customer Service & Products
Logistics strategy & planning, Customer Service & ProductsLogistics strategy & planning, Customer Service & Products
Logistics strategy & planning, Customer Service & ProductsFahad Ali
 
Data mining PPT
Data mining PPTData mining PPT
Data mining PPTKapil Rode
 
DAMA Webinar: The Data Governance of Personal (PII) Data
DAMA Webinar: The Data Governance of  Personal (PII) DataDAMA Webinar: The Data Governance of  Personal (PII) Data
DAMA Webinar: The Data Governance of Personal (PII) DataDATAVERSITY
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingankur bhalla
 
Data mining fp growth
Data mining fp growthData mining fp growth
Data mining fp growthShihab Rahman
 
Data Processing-Presentation
Data Processing-PresentationData Processing-Presentation
Data Processing-Presentationnibraspk
 
Types of databases
Types of databasesTypes of databases
Types of databasesPAQUIAAIZEL
 

Viewers also liked (20)

Significance of Data Mining
Significance of Data MiningSignificance of Data Mining
Significance of Data Mining
 
Data mining
Data miningData mining
Data mining
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
The Importance Of Data Mining By Musa Mohd. Nordin, Noor
The Importance Of Data Mining By Musa Mohd. Nordin, NoorThe Importance Of Data Mining By Musa Mohd. Nordin, Noor
The Importance Of Data Mining By Musa Mohd. Nordin, Noor
 
Importance of Normalization
Importance of NormalizationImportance of Normalization
Importance of Normalization
 
Bpr implementation process an analysis of key success & failure factors
Bpr implementation process an analysis of key success & failure factorsBpr implementation process an analysis of key success & failure factors
Bpr implementation process an analysis of key success & failure factors
 
Top10 algorithms data mining
Top10 algorithms data miningTop10 algorithms data mining
Top10 algorithms data mining
 
Logistics strategy & planning, Customer Service & Products
Logistics strategy & planning, Customer Service & ProductsLogistics strategy & planning, Customer Service & Products
Logistics strategy & planning, Customer Service & Products
 
Multimedia Database
Multimedia DatabaseMultimedia Database
Multimedia Database
 
Data mining PPT
Data mining PPTData mining PPT
Data mining PPT
 
DAMA Webinar: The Data Governance of Personal (PII) Data
DAMA Webinar: The Data Governance of  Personal (PII) DataDAMA Webinar: The Data Governance of  Personal (PII) Data
DAMA Webinar: The Data Governance of Personal (PII) Data
 
Data Preprocessing
Data PreprocessingData Preprocessing
Data Preprocessing
 
Data mining
Data miningData mining
Data mining
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data mining fp growth
Data mining fp growthData mining fp growth
Data mining fp growth
 
Data Mining: Data processing
Data Mining: Data processingData Mining: Data processing
Data Mining: Data processing
 
Bpr ppt
Bpr pptBpr ppt
Bpr ppt
 
Types dbms
Types dbmsTypes dbms
Types dbms
 
Data Processing-Presentation
Data Processing-PresentationData Processing-Presentation
Data Processing-Presentation
 
Types of databases
Types of databasesTypes of databases
Types of databases
 

Similar to Importance of Data Mining

A Survey on Data Mining
A Survey on Data MiningA Survey on Data Mining
A Survey on Data MiningIOSR Journals
 
krithi-talk-impact.ppt
krithi-talk-impact.pptkrithi-talk-impact.ppt
krithi-talk-impact.pptKRISHNARAJ207
 
UNIT - 1 : Part 1: Data Warehousing and Data Mining
UNIT - 1 : Part 1: Data Warehousing and Data MiningUNIT - 1 : Part 1: Data Warehousing and Data Mining
UNIT - 1 : Part 1: Data Warehousing and Data MiningNandakumar P
 
Data warehouse,data mining & Big Data
Data warehouse,data mining & Big DataData warehouse,data mining & Big Data
Data warehouse,data mining & Big DataRavinder Kamboj
 
Big data destruction of bus. models
Big data destruction of bus. modelsBig data destruction of bus. models
Big data destruction of bus. modelsEdgar Revilla Lavado
 
Data Mining - Presentation.pptx
Data Mining - Presentation.pptxData Mining - Presentation.pptx
Data Mining - Presentation.pptxfahadusman23
 
Modern trends in information systems
Modern trends in information systemsModern trends in information systems
Modern trends in information systemsPreeti Sontakke
 
Big Data Analytics Research Report
Big Data Analytics Research ReportBig Data Analytics Research Report
Big Data Analytics Research ReportIla Group
 
QuickView #3 - Big Data
QuickView #3 - Big DataQuickView #3 - Big Data
QuickView #3 - Big DataSonovate
 
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate Oomph! Recruitment
 
Big Data in Distributed Analytics,Cybersecurity And Digital Forensics
Big Data in Distributed Analytics,Cybersecurity And Digital ForensicsBig Data in Distributed Analytics,Cybersecurity And Digital Forensics
Big Data in Distributed Analytics,Cybersecurity And Digital ForensicsSherinMariamReji05
 
Data mining concepts
Data mining conceptsData mining concepts
Data mining conceptsBasit Rafiq
 
13500892 data-warehousing-and-data-mining
13500892 data-warehousing-and-data-mining13500892 data-warehousing-and-data-mining
13500892 data-warehousing-and-data-miningNgaire Taylor
 
Introduction to big data
Introduction to big dataIntroduction to big data
Introduction to big dataHari Priya
 

Similar to Importance of Data Mining (20)

A Survey on Data Mining
A Survey on Data MiningA Survey on Data Mining
A Survey on Data Mining
 
krithi-talk-impact.ppt
krithi-talk-impact.pptkrithi-talk-impact.ppt
krithi-talk-impact.ppt
 
krithi-talk-impact.ppt
krithi-talk-impact.pptkrithi-talk-impact.ppt
krithi-talk-impact.ppt
 
UNIT - 1 : Part 1: Data Warehousing and Data Mining
UNIT - 1 : Part 1: Data Warehousing and Data MiningUNIT - 1 : Part 1: Data Warehousing and Data Mining
UNIT - 1 : Part 1: Data Warehousing and Data Mining
 
Ch03
Ch03Ch03
Ch03
 
DWM
DWMDWM
DWM
 
Data warehouse,data mining & Big Data
Data warehouse,data mining & Big DataData warehouse,data mining & Big Data
Data warehouse,data mining & Big Data
 
Big data destruction of bus. models
Big data destruction of bus. modelsBig data destruction of bus. models
Big data destruction of bus. models
 
Big data
Big dataBig data
Big data
 
Data Mining - Presentation.pptx
Data Mining - Presentation.pptxData Mining - Presentation.pptx
Data Mining - Presentation.pptx
 
Modern trends in information systems
Modern trends in information systemsModern trends in information systems
Modern trends in information systems
 
Big Data Analytics Research Report
Big Data Analytics Research ReportBig Data Analytics Research Report
Big Data Analytics Research Report
 
QuickView #3 - Big Data
QuickView #3 - Big DataQuickView #3 - Big Data
QuickView #3 - Big Data
 
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate
 
Big Data in Distributed Analytics,Cybersecurity And Digital Forensics
Big Data in Distributed Analytics,Cybersecurity And Digital ForensicsBig Data in Distributed Analytics,Cybersecurity And Digital Forensics
Big Data in Distributed Analytics,Cybersecurity And Digital Forensics
 
Data mining concepts
Data mining conceptsData mining concepts
Data mining concepts
 
13500892 data-warehousing-and-data-mining
13500892 data-warehousing-and-data-mining13500892 data-warehousing-and-data-mining
13500892 data-warehousing-and-data-mining
 
Introduction to big data
Introduction to big dataIntroduction to big data
Introduction to big data
 
Unit-1 introduction to Big data.pdf
Unit-1 introduction to Big data.pdfUnit-1 introduction to Big data.pdf
Unit-1 introduction to Big data.pdf
 
Unit-1 introduction to Big data.pdf
Unit-1 introduction to Big data.pdfUnit-1 introduction to Big data.pdf
Unit-1 introduction to Big data.pdf
 

Recently uploaded

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 

Recently uploaded (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 

Importance of Data Mining

  • 1. Follow Us:1 Importance of Data Mining in IT Industry
  • 2. Follow Us:2 Data , Data everywhere..
  • 3. Follow Us:3 Introduction  What is Data? Facts and statistics collected together for reference or analysis. As per Moore’s Law, The information density on silicon integrated circuits double every 18 to 24 months.
  • 4. Follow Us:4  There has been a huge increase in the amount of data being stored in database over past twenty years.  All that user wants is more sophisticated information.  This surges up the demand of Data mining.
  • 5. Follow Us:5 What is Data mining?  Data mining refers to extracting or “mining” Knowledge from large amount of data. - By Jiawei Han, Micheline Kamber,
  • 6. Follow Us:6  Data mining focuses on extraction of information from a large set of data and transforms it into an easily interpretable structure for further use.  As data is growing at very remarkable rate, there comes a need to analyze large, complex and information rich data sets to gain the hidden information. This may result into greater customer satisfaction and remarkable turn over for the firm.
  • 7. Follow Us: Why Data mining? 7  I am not able to find the data I need. (Data is dispersed over the network)  The data I have access to is poorly documented. (Proper information is missing)  I am not able to use the data I have. (Unexpected results)
  • 8. Follow Us: What is Data Warehouse? 8  Data Warehouse is an integrated, subject oriented and time-varying collection of data that is used primarily in organizational decision making. -Bill Inmon, Building the Data Warehouse 1996  Data Warehousing: Data Warehousing is a process of managing data from various sources in order of answering the Business questions.
  • 9. Follow Us:9 Examples of Data warehousing:  Wal-Mart – Terabytes (10^12 bytes)  Geographic Information System – Peta bytes(10^15 bytes)  National Medical Records – Extra bytes (10^18 bytes)  Weather Images- Zettabyte(10^21 bytes)  Intelligent video agency- Yottabyte (10^24 bytes)
  • 10. Follow Us: How data mining works with warehouse data? 10  Data warehouse avails enterprise with memory.  Data warehouse avails enterprise with intelligence.
  • 11. Follow Us: Example of Data mining in IT: 11  A company that offers online services in several countries faces a problem in times of loading the homepage. Even IT department employees could not solve the bug. Based on customer feedback and last mile measurements, company gets to know that IT services are suffering from insufficient time for loading the homepage for some users.
  • 12. Follow Us:12  Management has the high interest in solving the problem because the response time of the site directly affects the online sales, IT department has already looked in to log files with the help of mining tool.  It has been discovered that there’s no such single type of typical problem. Instead data mining program discovers that there are various problem clusters.
  • 13. Follow Us:13  The patterns for this problem is determined by a combination of country code , operating system, and the version of a browser that the customers use.  One another pattern shows that the same bad performance in loading times of the website is caused by a different problem.
  • 14. Follow Us:14  Speed of internet and cookie settings also lead to greater time for loading the site. This is how data mining helps in indentifying the problem and hence solving the problem becomes quick.  This is just one example of how data mining can be so useful.
  • 15. Follow Us:15 Data mining Models and Tasks Data mining is widely divided into two parts:  Predictive Data mining  Descriptive Data mining
  • 17. Follow Us:17 Predictive Data mining  The objective of predictive tasks is to use the values of some variable to predict the values of other variable.  Ex: Web mining is used by the online marketers to predict the purchase by online user on a website.
  • 18. Follow Us:18  Classification is used to map data in a predefined groups.  Regression maps a data item to a real valued prediction variable.  Clustering forms a similar data together.  Summarization is used to map data in a subsets.  Link Analysis defines relationships among data.
  • 19. Follow Us:19 Descriptive Data mining  The objective of descriptive tasks is to find human readable patterns which describes the relationships between data.
  • 20. Follow Us:20 Architectures of a data mining system There are four Data mining architecture: I. No- Coupling II. Loose Coupling III.Semi tight Coupling IV.Tight coupling
  • 21. Follow Us:21 • In this architecture, data mining system doesn’t use any functionality of a database or data warehouse system. • Data is retrieved from data sources like file system and processed using data mining algorithms which are stored into file system. • This architecture is considered as a poor architecture for data mining system as it does not take any advantages of database or data warehouse. • However it is used for simple data mining processes No-coupling:
  • 22. Follow Us:22  The loose coupling data mining system uses database or data warehouse for data retrieval.  In this architecture, data mining system retrieves data from database or data warehouse, processes data using data mining algorithms and stores the result in those systems.  Loose coupling architecture is for memory-based data mining system which does not require high scalability and high performance. Loose Coupling:
  • 23. Follow Us:23 In semi-tight coupling data mining architecture, it not only links it to database or data warehouse system, but it also uses several features of database or data warehouse systems which perform some data mining tasks like sorting and indexing etc. Moreover the intermediate result can also be stored in database or data warehouse system for better performance. Semi-tight Coupling:
  • 24. Follow Us:24 Tight-coupling:  In this architecture, database or data warehouse is treated as an information retrieval component.  Tight-coupling data mining architecture provides scalability, high performance and integrated information.
  • 25. Follow Us:25 Three tiers of Tight-coupling Data mining Architecture: i. Data Layer ii. Data mining application layer iii.Front-end Layer
  • 26. Follow Us:26 Data layer can be database or data warehouse systems which is an interface for all data sources. It stores results in data layer so it can be presented to end-user in form of reports. Datalayer
  • 27. Follow Us:27 This layer is used to retrieve data from database. And by performing transformation routine, we can get data into desired format. Then data is processed using various data mining algorithms. Data mining application layer
  • 28. Follow Us:28 Front-end layer provides visceral and user friendly interface for end-user to interact with data mining system. The result of data mining is presented in visualization form in front-end layer. Front-end layer:
  • 29. Follow Us:29  Although data mining is still in its early stage; companies in a wide range of industries such as retail, heath care, manufacturing, finance, and transportation are already using data mining techniques to take advantage of historical data.  Data mining helps analysts to recognize significant facts, relationships, trends, patterns and anomalies which might go unnoticed otherwise. Application Area of Data mining:
  • 30. Follow Us:30  Data mining uses pattern recognition technologies and mathematical techniques to sift through warehoused information.  In business, data mining is useful for discovering patterns and relationships in data to help make better decisions.  Data mining helps in developing smarter marketing campaigns and to predict customer loyalty.
  • 31. Follow Us:31 Advantages of Data Mining  Marketing /Retail :  Data mining helps marketing companies to build models based on historical data which will precisely predict responders to the new marketing campaigns.  Marketers will have appropriate approach for targeted customers.  Data mining helps retail companies as well. By using market basket analysis, a store can have an appropriate arrangement in such a way that customers can purchase frequent buying products together with pleasant. It also helps the retail companies to offer certain discounts which will attract more customers.
  • 32. Follow Us:32  Finance / Banking By building a model from historical customer’s data of loans, the bank officials and financial institution can determine good and bad loans. Data mining also helps banks to detect fraudulent credit card transactions.
  • 33. Follow Us:33  Manufacturing Data mining is useful in operational engineering data which can detect faulty equipments and determines optimal control parameters. Data mining can determine the range of control parameters which leads to the production of perfect product. Hence optimal control parameters can provide the desired quality.
  • 34. Follow Us:34  Governments Data mining helps government agency to analyze records of financial transaction which will help in building patterns that can detect money laundering or criminal activities.
  • 35. Follow Us:35  Market segmentation - Data mining helps to identify the common characteristics of customers who buy the same products from your company.  Customer anticipation - It helps to predict which customers may leave your company and go to a competitor.  Fraud detection - It indentifies which transactions are most likely to be fraudulent. Specific use of data mining include:
  • 36. Follow Us:36  Direct marketing - Direct marketing identifies which prospects should be included to obtain the highest response rate.  Interactive marketing - It is useful for predicting what each user on a Web site is most likely interested in seeing.  Market basket analysis - It helps to understand what products or services are commonly purchased together.  Trend analysis - Trend analysis identifies the difference between a typical customer this month and last.
  • 37. Follow Us:37 Disadvantages of data mining Privacy Issues  The internet is booming with social networks, e- commerce, blogs etc, the concerns about the personal privacy has been increasing.  This worries the users as the information might be collected and used in unethical way which can potentially cause a lot of troubles.  Businesses collect the information of its users for setting up the marketing strategies but there are chances that business might be taken by other firms or gets shut down and that’s where a concern of misusing or leaking the personal information arises.
  • 38. Follow Us:38  Security issues Security is the biggest concern in data mining. Businesses own all the information of their employees which even includes personal and financial information, there are the chances of misusing data by hackers and which cause serious trouble to the organization and its employees.
  • 39. Follow Us:39  Misuse of information/inaccurate information The information collected by the data mining may be exploited by unethical people or businesses in order to take benefits of vulnerable people. Data mining techniques is not totally accurate. So inaccurate information may lead the wrong decision-making which may cause serious consequences.
  • 40. Follow Us:40 Challenges of Data Mining  Scalability: Scalable techniques are needed for handling massive size of datasets that are now being created.  Poor efficiency: Such large datasets require the use of efficient methods for storing, indexing, and retrieving data from secondary or even tertiary storage systems.  In order to perform the data mining task in timely manner, data mining techniques are needed.
  • 41. Follow Us:41  Complexity: Such techniques can dramatically increase the size of the datasets which can be handled and for that it requires new designs and algorithms.  Dimensionality: Some domains have number of dimensions which are very large and makes the data analyzing difficult. That is called as 'curse of dimensionality’. For example, bioinformatics.