DEFINITION
Data mining, the extraction of
hidden predictive information
from large databases, is a
powerful new technology with great
potential to help companies focus on the
most important information in their data
warehouses.
Extract, transform, and load transaction data
onto the data warehouse system.
Store and manage the data in a
multidimensional database system.
Provide data access to business analysts
and information technology professionals.
Analyze the data with application software.
Present the data in a useful format, such as
a graph or table.
Classes
Clusters
Associations
Sequential patterns
Classes: Stored data is used to locate data in
predetermined groups. For example, a
restaurant chain could mine customer
purchase data to determine when customers
visit and what they typically order. This
information could be used to increase traffic
by offering daily specials.
Clusters: Data items are grouped according to logical
relationships or consumer preferences. For
example, data can be mined to identify market
segments or consumer affinities.
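
The grouping described above can be sketched with k-means, one common clustering technique (the slides do not name a specific algorithm). This is a minimal one-dimensional sketch; the spend figures, the starting centers, and the function name `kmeans_1d` are all assumptions made up for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical annual spend per customer, in dollars (invented for illustration).
ANNUAL_SPEND = [120.0, 150.0, 130.0, 900.0, 950.0, 1000.0]


def kmeans_1d(
    values: Sequence[float], centers: Sequence[float], iterations: int = 10
) -> Tuple[List[float], List[List[float]]]:
    """Group values around the nearest center, then move each center to its group mean."""
    centers = list(centers)
    groups: List[List[float]] = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for value in values:
            # Assign the value to the closest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
            groups[nearest].append(value)
        # Recompute each center; an empty group keeps its old center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Starting centers are an arbitrary guess: one low-spend and one high-spend segment.
centers, segments = kmeans_1d(ANNUAL_SPEND, centers=[100.0, 1000.0])
print(segments)
```

With this toy data the customers separate into a low-spend and a high-spend segment. Real customer data would have many attributes and would need a multi-dimensional distance measure.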
Associations: Data can be mined to identify associations.
The often-cited beer-diaper example, in which the two items
are found to be purchased together, illustrates association mining.
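
An association such as beer and diapers is usually quantified with support, confidence, and lift. The sketch below computes these three measures over a handful of hypothetical baskets; the basket contents and function names are invented for illustration and are not taken from the slides.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical market baskets (invented for illustration).
BASKETS: List[FrozenSet[str]] = [
    frozenset({"beer", "diapers", "chips"}),
    frozenset({"beer", "diapers"}),
    frozenset({"diapers", "milk"}),
    frozenset({"beer", "chips"}),
    frozenset({"milk", "bread"}),
    frozenset({"beer", "diapers", "milk"}),
    frozenset({"bread", "chips"}),
    frozenset({"diapers", "beer", "bread"}),
]


def support(itemset: Iterable[str], baskets: List[FrozenSet[str]]) -> float:
    """Fraction of baskets that contain every item in the itemset."""
    items = frozenset(itemset)
    return sum(1 for basket in baskets if items <= basket) / len(baskets)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               baskets: List[FrozenSet[str]]) -> float:
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent: Iterable[str], consequent: Iterable[str],
         baskets: List[FrozenSet[str]]) -> float:
    """Confidence relative to how often the consequent appears overall; above 1 suggests association."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


rule_support = support({"diapers", "beer"}, BASKETS)
rule_confidence = confidence({"diapers"}, {"beer"}, BASKETS)
rule_lift = lift({"diapers"}, {"beer"}, BASKETS)
print(round(rule_support, 2), round(rule_confidence, 2), round(rule_lift, 2))
```

In these made-up baskets, four of the five diaper purchases also include beer, so the rule "diapers implies beer" has confidence 0.8 and a lift above 1.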
Sequential patterns: Data is mined to anticipate behavior patterns
and trends. For example, an outdoor
equipment retailer could predict the likelihood
of a backpack being purchased based on a
consumer's purchase of sleeping bags and
hiking shoes.
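
One simple way to estimate that likelihood is to count, among customers who have bought both sleeping bags and hiking shoes, how many went on to buy a backpack afterwards. The sketch below does this over hypothetical, time-ordered purchase histories; the histories and the function name `follows_rate` are assumptions for illustration only.

```python
from typing import List, Set

# Hypothetical purchase histories, each in chronological order (invented for illustration).
HISTORIES: List[List[str]] = [
    ["sleeping bag", "hiking shoes", "backpack"],
    ["hiking shoes", "sleeping bag", "backpack", "tent"],
    ["sleeping bag", "hiking shoes"],
    ["backpack", "sleeping bag"],
    ["hiking shoes", "sleeping bag", "stove", "backpack"],
    ["tent", "stove"],
]


def follows_rate(histories: List[List[str]], prerequisites: Set[str], target: str) -> float:
    """Share of customers who bought the target after completing all prerequisite purchases."""
    qualified = 0
    followed = 0
    for history in histories:
        seen: Set[str] = set()
        ready_at = None
        for position, item in enumerate(history):
            seen.add(item)
            if prerequisites <= seen:
                ready_at = position  # first moment all prerequisites have been bought
                break
        if ready_at is None:
            continue  # this customer never bought every prerequisite
        qualified += 1
        if target in history[ready_at + 1:]:
            followed += 1
    return followed / qualified if qualified else 0.0


backpack_rate = follows_rate(HISTORIES, {"sleeping bag", "hiking shoes"}, "backpack")
print(backpack_rate)
```

Four of the six invented customers bought both prerequisite items, and three of those later bought a backpack, giving an estimate of 0.75. The order of purchases matters here, which is what distinguishes a sequential pattern from a plain association.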
Evolutionary Step: Data Collection (1960s)
Business Question: "What was my total revenue in the last five years?"
Enabling Technologies: Computers, tapes, disks
Product Providers: IBM, CDC
Characteristics: Retrospective, static data delivery

Evolutionary Step: Data Access (1980s)
Business Question: "What were unit sales in New England last March?"
Enabling Technologies: Relational databases (RDBMS), Structured Query Language (SQL), ODBC
Product Providers: Oracle, Sybase, Informix, IBM, Microsoft
Characteristics: Retrospective, dynamic data delivery at record level

Evolutionary Step: Data Warehousing & Decision Support (1990s)
Business Question: "What were unit sales in New England last March? Drill down to Boston."
Enabling Technologies: On-line analytic processing (OLAP), multidimensional databases, data warehouses
Product Providers: Pilot, Comshare, Arbor, Cognos, MicroStrategy
Characteristics: Retrospective, dynamic data delivery at multiple levels

Evolutionary Step: Data Mining (Emerging Today)
Business Question: "What’s likely to happen to Boston unit sales next month? Why?"
Enabling Technologies: Advanced algorithms, multiprocessor computers, massive databases
Product Providers: Pilot, Lockheed, IBM, SGI, numerous startups (nascent industry)
Characteristics: Prospective, proactive information delivery
Techniques
Neural Network
Decision Tree
Visualisation
Link Analysis
Neural Network
• Neural networks are used in a black-box fashion.
• One creates a training data set, lets the neural
network learn patterns based on known
outcomes, then sets the neural network loose on
huge amounts of data.
• For example, a credit card company has 3,000
records, 100 of which are known fraud records.
• The data set is used to train the neural network so
that it learns the difference between the fraud
records and the legitimate ones.
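
The training loop described above can be sketched with a single sigmoid neuron, the smallest possible neural network, trained by gradient descent. Everything below is an assumption made for illustration: the 3,000 records are synthetic, the two features stand for hypothetical standardized scores, and a real fraud model would use more inputs, hidden layers, and held-out data for evaluation.

```python
import math
import random
from typing import List, Tuple

Record = Tuple[Tuple[float, float], int]  # ((feature 1, feature 2), 1 = fraud, 0 = legitimate)


def make_records(n_total: int = 3000, n_fraud: int = 100, seed: int = 0) -> List[Record]:
    """Build synthetic records; fraud records cluster around higher feature values."""
    rng = random.Random(seed)
    records = []
    for i in range(n_total):
        is_fraud = 1 if i < n_fraud else 0
        center = 4.0 if is_fraud else 0.0
        records.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), is_fraud))
    return records


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(records: List[Record], epochs: int = 200,
                 learning_rate: float = 0.5) -> Tuple[float, float, float]:
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    w1 = w2 = b = 0.0
    n = len(records)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (x1, x2), label in records:
            error = sigmoid(w1 * x1 + w2 * x2 + b) - label  # known outcome drives the update
            g1 += error * x1
            g2 += error * x2
            gb += error
        w1 -= learning_rate * g1 / n
        w2 -= learning_rate * g2 / n
        b -= learning_rate * gb / n
    return w1, w2, b


def predict(model: Tuple[float, float, float], features: Tuple[float, float]) -> int:
    w1, w2, b = model
    return 1 if sigmoid(w1 * features[0] + w2 * features[1] + b) >= 0.5 else 0


records = make_records()
model = train_neuron(records)
flags = [predict(model, features) for features, _ in records]
fraud_total = sum(label for _, label in records)
# Scored on the training records for brevity; real use needs separate test data.
recall = sum(1 for f, (_, y) in zip(flags, records) if f == 1 and y == 1) / fraud_total
accuracy = sum(1 for f, (_, y) in zip(flags, records) if f == y) / len(records)
print(f"recall={recall:.2f} accuracy={accuracy:.3f}")
```

Once trained, `predict` can be applied to any number of new records. The learned weights themselves say little about why a record is flagged, which is the black-box character noted above.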
Link analysis
• This is another technique for associating like
records.
• It is not widely used, but some tools have been
created specifically for it.
• As the name suggests, the technique tries to
find links among customers, transactions, or both,
and to demonstrate those links.
Visualisation
• Helps users understand their data.
• Bridges text-based and graphical
presentation.
• Decision tree, rule, cluster, and
pattern visualization help users see data
relationships rather than read about them.
• Many of the stronger data mining programs
have made strides in improving their visual
content over the past few years.
Decision Tree
• Decision trees use real data mining algorithms.
• Decision trees help with classification and produce
information that is very descriptive, helping users to
understand their data.
• A decision tree process will generate the rules followed
in a process.
• For example, a lender at a bank goes through a set of
rules when approving a loan.
• Based on the loan data a bank has, the outcomes of
the loans, and the limits of acceptable levels of default, the
decision tree can set up the guidelines for the lending
institution.
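
To illustrate how a tree turns past loan outcomes into readable rules, the sketch below grows a small tree using Gini impurity to choose splits and prints each leaf as an if-then rule. The loan records, the two features, and the depth limit are hypothetical, and Gini-based splitting is one common choice rather than the method the slides prescribe.

```python
from collections import Counter
from typing import List, Optional, Tuple

Loan = Tuple[float, float, str]  # (annual income in $1000s, debt-to-income %, outcome)

# Hypothetical past loans (invented for illustration).
LOANS: List[Loan] = [
    (85, 10, "repaid"), (60, 15, "repaid"), (72, 30, "repaid"),
    (95, 45, "repaid"), (40, 12, "repaid"), (30, 40, "defaulted"),
    (25, 20, "defaulted"), (38, 50, "defaulted"), (55, 55, "defaulted"),
    (48, 25, "repaid"),
]
FEATURES = ["income", "debt_ratio"]


def gini(rows: List[Loan]) -> float:
    """Gini impurity: 0 when every row has the same outcome."""
    counts = Counter(row[-1] for row in rows)
    return 1.0 - sum((c / len(rows)) ** 2 for c in counts.values())


def best_split(rows: List[Loan]) -> Optional[Tuple[float, int, float]]:
    """Return (weighted impurity, feature index, threshold) of the purest split, if any."""
    best = None
    for f in range(len(FEATURES)):
        values = sorted({row[f] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [r for r in rows if r[f] <= threshold]
            right = [r for r in rows if r[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


def build_rules(rows: List[Loan], conditions: Tuple[str, ...] = (),
                depth: int = 0, max_depth: int = 2) -> List[str]:
    """Split recursively and emit one rule per leaf, predicting the majority outcome."""
    majority = Counter(row[-1] for row in rows).most_common(1)[0][0]
    split = best_split(rows) if depth < max_depth and gini(rows) > 0 else None
    if split is None:
        return [f"if {' and '.join(conditions) or 'always'} then {majority}"]
    _, f, threshold = split
    left = [r for r in rows if r[f] <= threshold]
    right = [r for r in rows if r[f] > threshold]
    return (build_rules(left, conditions + (f"{FEATURES[f]} <= {threshold}",), depth + 1, max_depth)
            + build_rules(right, conditions + (f"{FEATURES[f]} > {threshold}",), depth + 1, max_depth))


rules = build_rules(LOANS)
for rule in rules:
    print(rule)
```

Each printed rule reads like a lending guideline, which is why decision trees are considered descriptive. The thresholds come entirely from the toy data and carry no real-world meaning.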
PROCESS STAGES
1 The initial exploration
2 Model building or pattern identification with validation/verification
3 Deployment
Stage 1: Exploration
• This stage usually starts with data preparation,
which may involve cleaning data, data
transformations, selecting subsets of records,
and, in the case of data sets with large numbers
of variables ("fields"), performing preliminary
feature selection to bring the number of variables
to a manageable range.
Stage 2: Model building and validation
• This stage involves considering various models
and choosing the best one based on their
predictive performance, i.e. explaining the
variability in question and producing stable
results across samples.
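
Choosing between candidate models by predictive performance is often done with k-fold cross-validation, which also shows whether results are stable across samples. The sketch below compares two deliberately simple candidates on made-up data; the data, the candidate models, and the fold count are assumptions for illustration.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float]

# Hypothetical observations: roughly y = 2x + 1 plus small fixed disturbances.
NOISE = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.1, 0.2, -0.2]
DATA: List[Point] = [(float(x), 2.0 * x + 1.0 + e) for x, e in zip(range(12), NOISE)]


def fit_mean(train: List[Point]) -> Callable[[float], float]:
    """Baseline candidate: always predict the average outcome."""
    average = mean(y for _, y in train)
    return lambda x: average


def fit_line(train: List[Point]) -> Callable[[float], float]:
    """Second candidate: least-squares straight line."""
    x_bar = mean(x for x, _ in train)
    y_bar = mean(y for _, y in train)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in train)
             / sum((x - x_bar) ** 2 for x, _ in train))
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def cross_validate(fit: Callable[[List[Point]], Callable[[float], float]],
                   data: List[Point], folds: int = 3) -> List[float]:
    """Mean squared error on each held-out fold; similar scores suggest stable results."""
    scores = []
    for k in range(folds):
        held_out = data[k::folds]
        train = [row for i, row in enumerate(data) if i % folds != k]
        model = fit(train)
        scores.append(mean((model(x) - y) ** 2 for x, y in held_out))
    return scores


CANDIDATES = {"mean": fit_mean, "line": fit_line}
results: Dict[str, List[float]] = {name: cross_validate(fit, DATA)
                                   for name, fit in CANDIDATES.items()}
best = min(results, key=lambda name: mean(results[name]))
print(best, [round(score, 3) for score in results[best]])
```

The candidate with the lowest average held-out error is selected, and the per-fold scores give a rough view of how much performance varies from sample to sample.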
Process Models
CRISP-DM: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment
DMAIC (Six Sigma): Define, Measure, Analyze, Improve, Control
SEMMA (SAS): Sample, Explore, Modify, Model, Assess
Stage 3: Deployment
• The final stage involves taking the model
selected as best in the previous stage and
applying it to new data in order to generate
predictions or estimates of the expected
outcome.
• KDnuggets and Rexer Analytics run surveys
asking people involved in data mining which
software they use most.
• While the most popular software is not
necessarily the best for a particular purpose,
the surveys can help guide us in choosing
which software to evaluate.
• Includes a wide variety of methods.
• Easy-to-use interface makes it accessible
for general users.
• Flexibility and extensibility make it
suitable for academic users.
• Is written in Java and released under the
GNU General Public Licence (GPL).
• Can be run on Windows, Linux, Mac, and
other platforms.
• Part of the SAS suite of analysis software; uses a
client-server architecture with a Java-based client,
allowing parallel processing and grid computing.
• Can be deployed on both Windows and
Linux/Unix platforms.
• User interface: easy-to-use data-flow GUI.
• Can integrate code written in the SAS language.
• Data mining package with multiple techniques
and a data-flow interface.
Data mining
