Overview of Data Mining
Meeting of WP Data Mining, April 28, 2008
Bowo Prasetyo
http://www.scribd.com/prazjp
http://www.slideshare.net/bowoprasetyo
What Is Data Mining?
1) Berry and Linoff, Data Mining Techniques for Marketing, Sales and Customer Support (Book), 1997
Does It Differ from Statistics?
[Figure labels: Statistics, Artificial Intelligence, Database, Data Mining]
16) D. Pregibon, Data Mining: Statistical Computing and Graphics, p. 7-8, 1997
Statistics, AI, Database
Why Use Data Mining?
2) ESG Research, New ESG Research Finds Large Organizations Experiencing Explosive Growth in Log Data Collection, Analysis, and Storage, 2007 (http://www.enterprisestrategygroup.com/_documents/NewsEvent/NewsEvent439.pdf)
3) EMC/IDC Research, The Expanding Digital Universe: A Forecast of Worldwide Information Growth Through 2010, 2006 (http://www.emc.com/about/destination/digital_universe/)
What Can Data Mining Do? Examples
On Business and Network Security
4) G. Adomavicius and A. Tuzhilin, Using data mining methods to build customer profiles, in Computer magazine, p. 74-82, 2001
5) Z. Huang, H. Chen, C. Hsu, W. Chen, S. Wu, Credit rating analysis with support vector machines and neural networks: a market comparative study, in Journal of Decision Support Systems, p. 543-558, 2004
6) T. Fawcett and F. Provost, Adaptive Fraud Detection, in Journal of Data Mining and Knowledge Discovery, p. 291-316, 2004
7) W. Lee and S. J. Stolfo, Data Mining Approaches for Intrusion Detection, in Proceedings of the 7th USENIX Security Symposium, 1998
On The Web
8) R. Cooley, B. Mobasher, J. Srivastava, Web Mining: Information and Pattern Discovery on the World Wide Web, in Proceedings of 9th International Conference on Tools with Artificial Intelligence (ICTAI), p. 0558, 1997
9) Larry Page, Sergey Brin, R. Motwani, T. Winograd, The PageRank Citation Ranking: Bringing Order to the Web, 1998 (http://citeseer.ist.psu.edu/page98pagerank.html)
10) M. Eirinaki and M. Vazirgiannis, Web mining for web personalization, in ACM Transactions on Internet Technology (TOIT), p. 1-27, 2003
11) S. W. Changchien and T. Lu, Mining association rules procedure to support on-line recommendation by customers and products fragmentation, in Journal of Expert Systems with Applications, v. 20-4, p. 325-335, 2001
On Environment
12) J. Han, K. Koperski, N. Stefanovic, GeoMiner: a system prototype for spatial data mining, in Proceedings of ACM SIGMOD international conference on Management of data, p. 553-556, 1997
13) Z. Nazeri and J. Zhang, Mining aviation data to understand impacts of severe weather on airspace system performance, in Proceedings of International Conference on Coding and Computing, p. 518-523, 2002
14) V. Kumar, M. Steinbach, P. Tan, S. Klooster, C. Potter, A. Torregrosa, Mining Scientific Data: Discovery of Patterns in the Global Climate System, in Proceedings of the Joint Statistical Meetings, p. 5-9, 2001
15) M. Steinbach, P. Tan, V. Kumar, S. Klooster, C. Potter, Data Mining for the Discovery of Ocean Climate Indices, in Proceedings of the 5th Workshop on Scientific Data Mining, p. 7-16, 2002
Methods in Data Mining Basic Methods
Classification, Clustering, Association Rules
Classification
17) http://en.wikipedia.org/wiki/Naive_Bayes_classifier
Clustering
18) J. A. Hartigan and M. A. Wong, A k-means clustering algorithm, in Applied Statistics, 28 (1), p. 100-108, 1979
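The clustering slide cites k-means. As a rough illustration only, the Python sketch below runs the simpler Lloyd-style iteration (assign each point to its nearest centroid, then recompute the centroids), which is not the Hartigan-Wong variant named in the reference. The function name, the 2-D sample points, and the choice of the first k points as initial centroids are all assumptions made for this example.

```python
def kmeans(points, k, iterations=100):
    """Cluster 2-D points with Lloyd-style k-means; returns (centroids, labels)."""
    # Assumption: the first k points serve as initial centroids (deterministic).
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: (x - centroids[j][0]) ** 2
                                        + (y - centroids[j][1]) ** 2)
            for x, y in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                updated.append((sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated
    return centroids, labels


# Hypothetical sample data: two well-separated groups of points.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
```

Real data would normally call for a better initialization and several restarts, since the result of this iteration depends on the starting centroids.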
Association Rules
19) R. Agrawal, R. Srikant, Fast Algorithms for Mining Association Rules, in Proc. 20th Int. Conf. Very Large Data Bases, VLDB, 1994
Association Rules
(C_k: candidate itemset of size k)
(L_k: frequent itemset of size k whose support >= minsup)
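The C_k / L_k notation above matches the level-wise search in the Apriori algorithm of reference 19. The Python sketch below is one minimal way to write that loop, not the authors' implementation. It assumes support is an absolute transaction count, and the function name and the sample baskets are hypothetical.

```python
from itertools import combinations


def apriori(transactions, minsup):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        # One pass over the data: count the transactions containing each candidate.
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # L_1: frequent itemsets of size 1.
    items = {frozenset([i]) for t in transactions for i in t}
    L_k = {c: n for c, n in count(items).items() if n >= minsup}
    frequent = dict(L_k)

    k = 2
    while L_k:
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        C_k = {a | b for a in L_k for b in L_k if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        C_k = {c for c in C_k
               if all(frozenset(s) in L_k for s in combinations(c, k - 1))}
        # L_k: candidates whose support reaches minsup.
        L_k = {c: n for c, n in count(C_k).items() if n >= minsup}
        frequent.update(L_k)
        k += 1
    return frequent


# Hypothetical department-store baskets (not the data from the slides).
baskets = [
    {"rice", "cooking oil", "sugar"},
    {"rice", "cooking oil"},
    {"sugar", "tea"},
    {"cooking oil", "sugar"},
]
frequent = apriori(baskets, minsup=2)
```

With these baskets, {rice, sugar} appears only once and is dropped from L_2, so the candidate {rice, cooking oil, sugar} is pruned before its support is ever counted.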
Association Rules
Visualization Of Mining Results
Case Study: Association Rules in a Department Store
Items and Transactions
[Figure labels: transaction, item]
Frequent Items
[Figure labels: support, minimum support]
n-Length Items (n-Items)
[Figure labels: 2-length item, 3-length item]
Association Rules
rice => cooking oil
confidence = support(cooking oil & rice) / support(rice) = 2/2 = 1
[Figure labels: antecedent, consequent]
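The confidence of the rule rice => cooking oil (beras => minyak goreng) is the support of both items together divided by the support of the antecedent. The Python sketch below works through that arithmetic. The transaction list is hypothetical, chosen only so that support(rice) = 2 as in the slide, and the helper names are assumptions.

```python
def support(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def confidence(antecedent, consequent, transactions):
    """confidence(A => C) = support(A and C) / support(A)."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


# Hypothetical transactions chosen so that support(rice) = 2, as on the slide.
transactions = [
    {"rice", "cooking oil", "sugar"},
    {"rice", "cooking oil"},
    {"sugar", "tea"},
]
rule_confidence = confidence({"rice"}, {"cooking oil"}, transactions)  # 2/2 = 1.0
```

The helper assumes the antecedent occurs in at least one transaction; otherwise the division would fail.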
Complete Association Rules
Mining Environmental Data Examples
Explosion in Environmental Data
Geo-spatial Database
GeoMiner
Earth Science
Regions that are covered by the highly correlated pattern, FPAR-Hi => NPP-Hi
Shrubland regions
FPAR: Fractional Intercepted Photosynthetically Active Radiation
NPP: Net Primary Production
Earth Science
Two clusters for NPP (land) and two clusters for SST (ocean). The clusters approximate the northern and southern hemispheres, for land and ocean.
SST: sea surface temperature
Earth Science
The ocean cluster near the Philippines (SST) and the land clusters of Eastern Brazil, Southern Africa, and a small part of Australia (NPP) are highly correlated (0.47). In particular, this sea region is highly correlated (0.66) with SOI, a climate index related to El Niño, and parts of Southern Africa and Australia are known to experience droughts related to El Niño.
Conclusion
